
Ative at Work

Agile software development


  • The Five Whys of Lean

    Root cause problem resolution is one of the core practices in agile. If the engine requires new oil every 500 km we don't just top it up with oil. We fix the engine. Adding oil is just a hack that leaves technical debt in the system and slows down our journey. It reduces our sustainable pace.

    Good operations people get this. Many developers, however, do not. We would happily work overtime, walking with a stone in our shoe, rather than stop for a moment to remove it.

    To increase awareness of removing the cause rather than treating the symptoms of a problem, Taiichi Ohno of Toyota implemented the simple practice of asking "Why?" five times when something went wrong. As we keep asking, we get closer to the true cause of the problem, not just its symptoms.

    It is a simple technique and it works very well.

    In my career I have seen many systems that take weeks or months to integrate and deploy after the developers declare them "done". When integration is very expensive, people tend to put it off and build up huge batches of work-in-progress - software that the developers have declared done but which cannot yet be deployed to production.

    From a Lean perspective that is a waste.

    Taiichi Ohno's "5 whys" technique proves quite useful in a situation like this.

    "Why not release to production at the end of each sprint instead of just once per year?" "Well, yeah, that's a good idea, but we can't do that."

    "Why?". "It is very expensive".

    "Why?". "We have to test it all and get all the bugs out."

    "Why does that take so long?". "It's a manual test, and we have to deploy to a very complex environment and there are a lot of bugs."

    "Why don't you do automated testing, then?". "Oh, we tried that - it does not work."

    "Why?". "You see, we create tests but when we run them two months later they are all red....".

    "Why?". "Time passes and so the monthly batch jobs run and update the data. This breaks the tests... so we can't do automated testing.".

    "Why not let the tests create a set a known test data when they start and then test against that?". "No that wouldn't work. You see the test data is a snapshot of the production data."

    "Why?" "So that we have examples of all the kind of data we need to test on. Creating the test data is very complex and takes too long."

    "Why?" "Well, there is so much data to enter."

    "Then why not automate it then?". "We cannot do that - you see, the legacy system does not really have the hooks for that."

    "Why? - who wrote it?" "Don't be stubborn. We wrote it... but , you see management doesn't want us to change the legacy system since it costs so much to test."

    "But that's the problem we are trying to solve!"

    Getting to the root of the problem and fixing it is the core of lean and agile practice. Next time you encounter a problem, try asking the five whys, and ask yourself: "are we treating the symptom, or removing the cause?" This is the way to go faster and improve quality.

  • Why Going Faster Matters

    I gave a talk on Value Stream Mapping from Lean last Thursday at the Danish Agile User Group meeting (slides in Danish are available at http://tech.groups.yahoo.com/group/DanishAgileUserGroup/files/ - registration required).

    One of the many interesting questions that came up was why faster is better - could it be that slower would be more efficient?

    The context was that faster could be more expensive - i.e. adding more people to fix the problem. In that case there is a chance that slower is, in a sense, better. Dogmatic software development even holds that resources, timeline, quality and feature set are interrelated: if you want to go faster or increase the quality, it costs more.

    The underlying assumption is that the process is perfect and therefore that all the work is essential and valuable.

    That, however, is a myth.

    From the perspective of the Seven Wastes that I blogged about recently I would like to rephrase the question to "is less waste always better?"

    I think the reason many people connect lean with faster is that the most obvious waste in value stream maps is the waste of waiting. So at first glance lean is all about eliminating waiting and keeping extra resources available to solve tasks immediately when they arise, with no delay.

    That would be wonderful, but if there is great variation in the workload we take on a huge overhead to keep peak capacity available at all times. Since this would be a waste when demand is slow, lean has a practice of balancing work and capacity to keep the variation low.

    In case there is extra capacity available we use it for kaizen - process improvement. In software, for example, we could spend it on refactoring a legacy system to a cleaner state so we can work faster in the future. This would enable us to take on extra demand with the same number of resources.

    Now, there are other wastes than waiting, and it is by looking at them that we see that lean is actually not a cause of stress, as many seem to believe. The aim is not to work faster doing the same thing. The goal is to do only the work that is actually valuable.

    This means getting to the goal faster by doing less. It does not mean burnout or overtime.

    From this perspective I believe that lean is a more humane approach than the obvious alternative - the status quo, where speeding up comes from the project manager pressuring and threatening employees to work overtime, cut quality and so on. In that traditional context faster means evil, but in the agile world, where respect for people is the centerpiece, faster also means better. It means less meaningless work. It means freeing up all the untapped human talent of the team and bringing it to bear on exciting work - spending our working hours creating something that will please the customer, faster and at better quality than ever before.

    That is the essence of agile - and it's the source of great job satisfaction.

  • The Waste of Defects - Bugs are Stop-the-Line Issues

    "Don't clean it, " my grandmother used to say; "keep it clean.". 

    She probably learned it long before the computer era. Yet for some reason her advice did not spread to the software industry. We still have a tendency to build up a big mess and put off cleaning it up until much later. I am thinking about the waste of defects - the lean principle of preventing trouble from creeping in rather than struggling to get it out after the fact.

    In software, as in lean, bugs are stop-the-line issues.
     
    Once you find them you have to fix them before moving on. Period. 
     
    I have heard many excuses for accepting mediocre quality, but no good ones. If it doesn't have to work, why do we build it at all? In fact, if we explicitly want to build something that does not have to work, the easiest way is not to build it at all!
     
    So, if we want it built it is fair to say that we want it to work. Therefore, we have to make sure that it works when we build it, and that it keeps working - namely, that we fix bugs as soon as they appear.
     
    A side benefit of this is that we might not even need a bug tracking system - after all, it only exists to manage all the defects that we should not have allowed to linger in the system in the first place. If we fix the issues as soon as we find them, we can easily track the number of open bugs on a single post-it note.

    We did this on a mainframe migration project. We built it test-driven and fixed the odd uncaught defect on the spot. In the end, we had some 7000 automated test cases to keep the system in a known good state. And no bug tracking system.
     
    So, in many ways a bug tracking system can be considered an indicator of waste in the organisation. However, for an organisation with low software quality, the absence of a bug tracking system is an indicator of even greater problems.
     
    The transition to treating defects as stop-the-line issues will definitely be painful for many organisations. Remember that it took Toyota about a month to build the first car after introducing this concept in the NUMMI factory they acquired from General Motors. In software it is often worse.
     
    First of all, some teams - namely those with the highest technical debt or the lowest output quality - will appear to produce nothing at all. They will suddenly spend all their time cleaning up the house rather than adding new features.
     
    The downside is that the low quality suddenly becomes painfully visible to the whole organisation. Since many organisations measure success by new features or delivery on schedule, and shy away from measuring quality, this will create a sense of crisis.
     
    The upside is that the quality becomes painfully visible. The teams that produce low quality will be stopped from producing more low-quality work, whereas the teams with higher output quality can dedicate a larger proportion of their effort to building new software. This creates a virtuous circle where all new software is produced by the teams capable of building the highest-quality software. The other teams will be busy cleaning up their mess.
     
    The net result is a legacy code base that improves and new software of higher quality.
     
    The pain of transition is so great that many organisations shy away and prefer to run up their technical debt instead.

    Sometimes they try to hide it with frequent changes of "strategic platform": when the mainframe was replaced in favour of Java, the new platform initially appeared much more productive, but eventually the organisational habit of producing bugs rather than fixing them built up enough technical debt that the Java platform deteriorated to the same level as the mainframe. Then came the time for another "strategic platform shift", to .NET - and development productivity soared, only to decline over time as the code base atrophied, making the organisation ready for "the next big platform".

    The underlying issue is that the technology or platform is not the root cause of the unacceptable productivity; the organisational culture of accepting low quality is. It cannot be helped by replacing the technology. The remedy is all about behaviour.
     
    We can change this behaviour top-down or bottom-up, but we need to fix it. There is no excuse to wait. We have to stop the line and stop building bugs into our software - not only for the quality, but also for the productivity. This is what lean software development is all about.

  • Lean Principle Number 1 - Eliminate Waste

    The key principle in lean manufacturing is to eliminate waste.

    The Toyota Production System names seven major sources of waste. Mary and Tom Poppendieck list these in their book Lean Software Development - an Agile Toolkit (Addison-Wesley, 2003) with the translation to software development:

    The Seven Wastes of Manufacturing    The Seven Wastes of Software Development
    ---------------------------------    ----------------------------------------
    Inventory                            Partially Done Work
    Extra Processing                     Extra Processes
    Overproduction                       Extra Features
    Transportation                       Task Switching
    Waiting                              Waiting
    Motion                               Motion
    Defects                              Defects

    Partially Done Work

    Partially Done Work is a frequent cause of failure in many projects. It usually begins with the best of intentions, such as creating expert teams for each layer in the application. Put a database team in one room, the middleware team in another and the application team in the next building, plan a single big-bang integration phase at the end of the project, and you have a recipe for disaster. The experience from lean manufacturing is that it is faster to do single-piece flow and finish a feature completely rather than take on the overhead of managing huge batches of work-in-progress. It also takes out a lot of the risk, since we are always 100% done with x% of the application - not x% done with 100% of the application. This permits us to realize the cost savings or extra revenue of a feature sooner rather than later. The key is to integrate early and often, deliver incrementally and manage the product lifecycle by thinking in small releases.

    Extra Processes

    Partially Done Work also relates to Extra Processes. Many of the practices of "professional" software development organisations are needed only to provide a sense of control while not addressing the underlying problem. It is like repeatedly shooting yourself in the foot and then priding yourself on how good your organisation's surgeons are at patching it up afterwards.

    Examples of extra processes include all kinds of requirements, analysis, design, and test documents that no-one will ever read, the endless meetings with more people than are really needed or all the bureaucratic inventions to prevent change - Change Control Boards, reviews, complex ISO and CMMI processes that make simple work hard. It also includes the manual tasks that should be automated but are not (see Setting a Minimal Professional Standard http://community.ative.dk/blogs/ative/archive/2006/12/10/Going-Agile-_2D00_-Setting-a-Minimal-Professional-Standard.aspx). In short, extra processes are all the work that intelligent people wouldn't do without being told.

    For example, one particular system we worked on had a rigorous ISO process for change control that made it practically impossible to fix anything that had gone through all the reviews. In one case, we discovered a private member of a class that should have been publicly accessible - something that is very easy to fix when there are no extra processes. However, since the module was already reviewed and "finished", the process required us to change the code, rerun all the test cases, copy the output into a Word document, go through a review meeting and update a host of documents. All this for changing the word "private" to "public" in a single source file. Needless to say, processes like this drive up the cost of software development dramatically, and in many cases the only business value is being applauded by the ISO auditors for following the process very rigorously.

    Extra Features

    The easiest way to finish faster is to write less software. The oft-quoted Standish Group study presented at XP 2002 concluded that 45% of the features in a typical application were never used and only 20% were used often or always. A good Product Owner with an iterative, incremental process can leverage this to get dramatic cost savings: simply build the most important features first and stop the development process when the marginal value of extra features no longer justifies further investment. Also, by delivering incrementally we can start generating ROI on the most valuable features sooner - and learn about the system so we are better able to prioritize the remaining work. A further benefit of delivering only the essentials is that we end up with a smaller, simpler code base which is easier to change and evolve - which means we can go faster in the future, since we are no longer carrying a huge burden of non-productive weight.

    Task Switching

    Task switching is another common productivity killer. Programmers know the state of "flow" where work is focused, efficient and effortless. It takes a while to get into this state. This is one of the reasons that single-piece flow is such a great way to work: we start something, we focus and we finish. There are no interruptions or context switches to keep us out of the state of flow, and we free up all the resources we would otherwise spend tracking our bits of work-in-progress.

    In many organizations task switching is caused by sharing people across multiple teams. For example, on one project I worked with an analyst who was 50% allocated to another team. After a while he came to our status meeting and told the project manager not to expect much output from him on either team: he was almost fully booked just attending all the meetings of the two teams. Here, focusing on one team for one sprint at a time was much more efficient.

    Waiting

    Waiting is another waste. If it takes three hours of work to accomplish a certain feature, you should not have to wait four months to have it delivered. The antidote provided by lean is value stream analysis: look at the whole development process from the customer's perspective and compare the hours of work done to the waiting time (calendar time) from the moment the concept is conceived to the moment the feature is delivered. In many organizations the efficiency of the production cycle - the work effort compared to the time it takes to complete the process - is less than 1%. Most features spend their lives sitting in queues: waiting for the next Change Control Board meeting, living in requirements documents, waiting in analysis documents before being implemented, waiting for system integration, waiting for the testers to test them, waiting in a bug tracking system to be corrected, or waiting for the next bi-annual deployment window. All this waiting accumulates Partially Done Work, causes us to introduce complex procedures to manage it and adds no business value to the customer. Therefore, it must be eliminated.

    The approach here is simple: if you need it, build it now. Select only the most valuable features, then deliver them quickly. Match the in-flow of tasks to the production capability. Think about your team the way you dimension your server hardware and don't try to run it at peak utilization. In fact, the highest efficiency is generally found when the organization runs at no more than 80% utilization, since this provides much better latency.
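    To see why, here is a minimal sketch of the queueing argument (our illustration, not from the original text; it assumes a simple M/M/1-style model where lead time grows as 1/(1 - utilization)):

        using System;

        // Sketch: average lead time of a task explodes as utilization nears 100%.
        double workHours = 3.0; // hands-on work per task

        foreach (double utilization in new[] { 0.5, 0.8, 0.95 })
        {
            double leadTimeHours = workHours / (1.0 - utilization);
            Console.WriteLine($"{utilization:P0} utilization: {leadTimeHours:F0} hours lead time");
        }
        // 50% -> 6 hours, 80% -> 15 hours, 95% -> 60 hours of calendar time
        // for the same 3 hours of work - waiting dominates near full capacity.

    At 80% utilization a three-hour task spends five times its work time in the system; at 95% it spends twenty times.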

    Motion

    Motion is another of the wastes. It occurs when we need to go somewhere else to solve a problem - if the customer or an expert is not at hand when we need answers, or if the team is spread out over multiple offices, floors, buildings or even time zones. (We have written about this in Refactoring the Physical Workspace http://community.ative.dk/blogs/ative/archive/2006/10/13/Refactoring-The-Physical-Workspace.aspx.) It also happens during the many hand-offs that are so typical of waterfall development. One person writes the requirements, then an analyst translates them into something for the technical team, then another person takes over and interprets them into code, and so on. The documents that are handed over at each step will never be adequate. The cure is not to write more detailed documents, since this takes even longer and only provides a false sense of security. The best cure is simply to eliminate the hand-offs altogether and have developers and customers sit together to figure out what needs to be done, capture it as an acceptance test and implement the code to make it pass quickly.

    Defects

    This also leads to the last of the wastes: defects. As a rule of thumb, the longer a defect is allowed to stay in the system, the more it costs to fix. First there is the risk of change: we might have to do a costly recall, repeat a lot of manual testing, revise the documentation and so on. If, on the other hand, we fix the bug immediately - or better yet, never let it into the system in the first place - we reduce the cost of fixing defects to almost zero. The key is to realize that software with defects is work-in-progress. This is the reason that test-driven development and automated regression test suites are among the key practices of lean software development teams. In fact, preventing defects is so essential that we will devote a whole article to it shortly. Stay tuned.

     

  • Lean Software Development

    Today the lean meme is in every business newspaper. It has been a long time coming. Some companies, like Toyota, have lived these principles since the 1950s, when their lack of capital forced them to improve their production processes radically. In the West it appeared on the radar with the books by Womack & Jones published from 1990 onwards. Now the ideas have gained mainstream mindshare and are being applied in many fields. Applied to software development, “lean” provides a great toolbox of agile methods to help radically improve development efficiency.

    Deliver High Customer Value Quickly at Low Cost

    The lean starting point is to optimize the production from a customer-centric perspective. The goal is to provide the customer with maximum value quickly and at a low cost. Therefore the first step is to define the value of the product (or features) from the customer’s perspective. Translated to Scrum this is the job of the Product Owner.

    Eliminate Waste

    The central point of lean is to eliminate "muda", or waste. Muda is defined as all the activities and steps in a production process that add no value to the customer. In software, examples are work-in-progress, defects, unnecessary features, the bureaucratic hindrances of traditional software development organizations and all the speculative over-generalization that developers love to build (“we might need it later”) even when a much simpler solution will suffice.

    Since there is so much waste a typical lean transition begins by mapping the value stream for the complete production process from the customer's perspective. Then we reduce it to only the steps that add value to the customer. This is usually a very small subset.

    Introduce Single-Piece Flow

    Once we know the steps that add value, we organize them into the best sequence and introduce “single-piece flow”: completing the production of a whole but small product in one continuous sequence with no delays.

    Use Pull Scheduling

    At the integration points between the single-piece flow “cells”, the lean method of scheduling is “pull”. When a downstream consumer is ready to process a new part, it requests one from its upstream supplier. This makes scheduling very simple: there is no inventory to keep and no partially made items to track. The synchronization mechanism is simply to keep everyone working at the same “takt time”, producing new parts at just the rate they are needed, so no waiting occurs. This is a simple way to optimize throughput without complex planning systems.
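    As a minimal sketch of the idea in code (our analogy, not from the original text): a bounded buffer lets the downstream consumer pull parts while blocking the upstream supplier from over-producing.

        using System;
        using System.Collections.Concurrent;
        using System.Threading.Tasks;

        // Capacity 1 means the supplier can only work when the consumer has pulled.
        var buffer = new BlockingCollection<string>(boundedCapacity: 1);

        var supplier = Task.Run(() =>
        {
            foreach (var part in new[] { "part-1", "part-2", "part-3" })
            {
                buffer.Add(part); // blocks until there is room - no inventory builds up
                Console.WriteLine($"produced {part}");
            }
            buffer.CompleteAdding();
        });

        foreach (var part in buffer.GetConsumingEnumerable()) // the downstream pull
            Console.WriteLine($"consumed {part}");

        await supplier;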

    Improve The Process Continuously
     
    After the initial radical reengineering of the production process we continuously focus on improving it as we learn and innovate, so that the effort and time needed to do the work keep falling. This produces a virtuous cycle. In the words of Womack & Jones, lean is a method for doing more and more with less and less.

    Stop-the-Line Root Cause Problem Fixing

    A lean process stops when there is a problem and it does not restart until the root cause of the problem has been fixed.

    One of the legendary lean examples is when Toyota took control of the NUMMI car factory in the USA. The US tradition was to let defects slip through and fix them in QA later. Instead, Toyota simply told the workers to do good work and to stop the assembly line whenever they could not do their work properly.

    It took about a month to produce the first vehicle.

    Since then, however, their quality and productivity have been outstanding and they have earned an impressive number of awards.

    To teach people the stop-the-line culture, Taiichi Ohno of Toyota used the principle of “five whys”: when we encounter a problem we ask why five times to unveil its true root cause rather than just treating the symptoms. In software, very good operations people have a natural tendency to do this, while most developers faced with the same problem just advise to “restart the server” or apply some other fix that neither unveils the nature of the problem nor prevents it from recurring. The stop-the-line practice improves quality - the other merely accepts low quality without fixing it. One of the central lessons of lean is that you have to improve quality to go faster, so by continuously removing the root causes of problems we also improve the efficiency of the entire production system. In fact, from a customer perspective, testing to find defects is “muda”, whereas testing to prevent defects from occurring is not. In software terms this makes the case for test-driven development and identifies test-later as an expensive, wasteful practice.

    In the coming months we will explore lean and its applications to software development in this blog. Until then - Happy New Year and keep the comments coming. We really appreciate your participation.


    Further Reading
     

    • Taiichi Ohno - one of the greatest industrial innovators of the 20th century and the father of the Toyota Production System. After spending his career relentlessly optimizing manufacturing at Toyota he wrote Toyota Production System: Beyond Large-Scale Production, which describes his work.
    • Womack & Jones - Their books are great and it is well worth to read them all to see a lot of the principles and case studies for lean thinking. Also, it is quite interesting to see that software development is now rediscovering some of the things that manufacturing learned much earlier - in the case of Toyota as early as in the 1950s and 1960s. Begin your studies with The Machine That Changed the World, a five-year study of the global auto industry from MIT and go on with the Lean Thinking and Lean Solutions. They give a fascinating perspective on manufacturing and plenty of examples of the lean principles and they applications. 
    • Mary and Tom Poppendieck - with a background in both manufacturing and software, they have translated the concepts of lean to software development. They have written two great books, Implementing Lean Software Development and Lean Software Development - an Agile Toolkit. Both are well worth reading and present both the principles and a lot of cases in a friendly, colloquial manner. Mary Poppendieck is a frequent conference speaker - we saw her give some very impressive lectures at Agile 2006 (www.agile2006.org) - and there is plenty of opportunity to see her on the conference circuit. Highly recommended!

    Updated January 11, 2007: fixed grammar and typos.

    Posted Jan 08 2007, 09:06 by Martin Jul
  • Going Agile - Introducing Inspect-and-Adapt Cycles

    Many large IT organizations are so inefficient that helping them go agile may seem like an incredible amount of work.

    The key to not being discouraged is taking the long view and following the advice for how to eat an elephant: it is “one bite at a time”. If we try to solve all the problems at once we will simply be overwhelmed by complexity. Therefore we take an incremental approach.

    The key is to introduce a reflective inspect-and-adapt cycle into the process.

    Scrum addresses this problem in a very simple manner. We keep track of the impediments experienced by the team in an impediment list. Hand in hand with this list goes the “1-day rule”: any impediment must be addressed within one day. Even if the root problem cannot be fixed immediately, the continuous application of these principles will keep the focus on process improvement. Day by day things will get better.

    Addressing impediments is quite painful for large organizations, since the process is extremely good at making organizational dysfunctions very visible. Therefore, Scrum also comes with the warning that “a dead ScrumMaster is a useless ScrumMaster”. We have to adapt to the pace the organization is capable of absorbing. In the case of a bottom-up implementation this may be a very long process - especially if the organization is financially sound. Ironically, in going agile it is often much simpler to treat a dying patient, where “business as usual” is no longer an option and the motivation for fixing the problems is much higher.

  • Going Agile - Setting a Minimal Professional Standard

    Many agile processes are just that: Processes. However, one of the keys to success that is often overlooked is the technical project infrastructure, and the discipline and craftsmanship required by the team. While agile is lightweight it also sets the entry level for professional standards higher than many organizations are used to.

    It is all about setting a standard for quality and craftsmanship with zero tolerance for defects.

    First of all, we gain a big win by moving every artefact required to build the project into a source control system and creating a build script to assemble it all. Gone will be the days of “but it works on my machine”. It may seem obvious, but many projects lack even this.

    The second step is making the system installable. This means eliminating the waste of lengthy instructions on how to set up the system properly and replacing them with a script. This way it is easy to deploy the project frequently and without error, and the configuration problems that frequently pop up in complex environments with a lot of manual configuration will be curbed.

    Together, this enables us to know what we have, build it and deploy it in a controlled, repeatable manner with a single click or command.

    So far, it does not take rocket science - just a bit of discipline.

    Then, the next level of professional decency is automated unit testing. Taking a test-first approach will give us a massive quality boost and, as a secondary effect, a much better architecture: poorly designed systems are so hard to test that test-first forces us to create better designs.

    Needless to say, the tests should be run as part of the build and a failed test treated as a stop-the-line issue. There is no point in completing the build and deploying a system to a test environment when we already know that the application does not work. Instead, bugs should be fixed immediately when they are discovered.
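    As a minimal sketch of what this looks like (the class under test is hypothetical, and we assume NUnit as the test framework - the post does not name one):

        using NUnit.Framework;

        // Written first; it fails until OrderCalculator exists and behaves correctly.
        [TestFixture]
        public class OrderCalculatorTests
        {
            [Test]
            public void TotalIsUnitPriceTimesQuantity()
            {
                var calculator = new OrderCalculator();
                Assert.AreEqual(30m, calculator.Total(unitPrice: 10m, quantity: 3));
            }
        }

        // The simplest implementation that makes the test pass.
        public class OrderCalculator
        {
            public decimal Total(decimal unitPrice, int quantity) => unitPrice * quantity;
        }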

    All these steps can be done by the developers alone.

    The next step is to introduce automated integration or acceptance testing. Here we engage the customer in defining the test cases. These tests are also run as part of the build. There are many levels of this, spanning from integration testing below the GUI level using tools like FIT (http://fitnesse.org/) to testing through the user interface using frameworks like Selenium (for web applications - http://www.openqa.org/selenium/). The secret is to use simple tools and avoid the kind of brittle tests that break when the underlying data changes or someone rearranges a few controls in the UI.
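    For illustration, here is a sketch of a GUI-level test of the kind Selenium enables, written against the current Selenium WebDriver C# API (which postdates this post); the URL and element ids are hypothetical:

        using NUnit.Framework;
        using OpenQA.Selenium;
        using OpenQA.Selenium.Chrome;

        [TestFixture]
        public class LoginSmokeTest
        {
            [Test]
            public void UserCanLogIn()
            {
                using var driver = new ChromeDriver();
                driver.Navigate().GoToUrl("http://localhost/intranet/login"); // hypothetical

                // Locate elements by stable ids, not layout, to keep the test robust.
                driver.FindElement(By.Id("username")).SendKeys("demo");
                driver.FindElement(By.Id("password")).SendKeys("secret");
                driver.FindElement(By.Id("login-button")).Click();

                Assert.That(driver.Title, Does.Contain("Dashboard"));
            }
        }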

    Applying automated acceptance testing is a long process. Many times it is practical to start with a small piece, like a “smoke test” that exercises a few central use cases covering the key elements of the system, to provide a first-order estimate of its quality. For example, when we worked on a system with a distributed object database we had a smoke test that validated that changes made on one workstation were replicated to its peers in the local network and to other nodes at remote sites. This allowed us to quickly discard a lot of bad builds before spending time and effort on the more complex test scenarios.

    The key is to reduce the cycle time by not allowing ourselves to build bugs into the code: the sooner we know that the software is broken, the sooner we can fix it. Fewer bugs mean more finished software, less work-in-progress, less risk, lower costs, shorter time-to-market and higher flexibility.

    All in all that is not a bad result for applying a little professional discipline. It is the starting point for agile development.
     

    (Updated 13. Dec 2006 - improved formatting.)

  • Myths about SCRUM - "it's a daily stand-up meeting"

    Recently I have visited three different project teams that all claimed to be doing Scrum.

    When I quizzed them it turned out that what they did was just a short daily status meeting.

    Now, if that was all there is to Scrum we could have saved the money we spent on ScrumMaster certifications for everybody in Ative. It is a good start, but there is much more to Scrum.

    The daily status meeting - colloquially called "The Daily Scrum" - is a key activity, however.

    Every project - Scrum or not - can benefit from it, since it focuses the team on the situation and raises awareness of the problems that need to be addressed. For truly dysfunctional teams, just the fact that it provides a few minutes where everybody takes off their headphones and talks about how to achieve the goal is a big boost to the project.

    Three simple rules distinguish the Scrum meeting from an old-fashioned status meeting. Its short agenda is simply that every team member must answer the following three questions:

    1. What did I do yesterday?
    2. What will I do today?
    3. What is blocking me from working efficiently towards our sprint goal?

    That's all. Timebox it to around two minutes per person. Use an hourglass ("minute glass") if necessary so you don't fall into the trap where some old-school project manager type does all the talking and concludes by asking if anybody has anything to add.

    At Maersk Data Defence we also used two additional questions from Craig Larman's "Agile & Iterative Development":

    1. Do you have any new items to add to the Sprint backlog?
    2. Have you learned or decided anything new of relevance to the team (technical, tools, requirements, smarter ways of working...)?

    The essence of Scrum is a self-directing, self-organising team. This means that the meeting is not about reporting status to a manager; the goal is for the team to self-organise around achieving its goal and removing any obstacles in the way.

  • Great Quote on Testing

    While I was paging through the great Uncle Bob presentation "The Prime Directive: Don't Be Blocked" I noticed a great quote on testing from Kent Beck:

     "You only need to test the things you want to work."

     

    Posted Nov 19 2006, 12:55 by Martin Jul
  • Saved by the ITimestampProvider

    If you are doing any kind of timestamping on your data, testability requires you to get the timestamps from a mockable provider, rather than using unpredictable and thus untestable values from the system.

    For this purpose we usually inject an ITimestampProvider with methods to get the UTC time into any classes that need it.
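    A minimal sketch of the pattern (the interface name is from the post; the member names and the consuming class are our assumptions):

        using System;

        public interface ITimestampProvider
        {
            DateTime GetUtcNow();
        }

        // Production implementation: a thin wrapper over the system clock.
        public class SystemTimestampProvider : ITimestampProvider
        {
            public DateTime GetUtcNow() => DateTime.UtcNow;
        }

        // Test double: returns a fixed, predictable time so assertions are exact.
        public class FixedTimestampProvider : ITimestampProvider
        {
            private readonly DateTime _fixedUtcTime;
            public FixedTimestampProvider(DateTime fixedUtcTime) => _fixedUtcTime = fixedUtcTime;
            public DateTime GetUtcNow() => _fixedUtcTime;
        }

        // Classes that need timestamps take the provider in their constructor.
        public class AuditLog
        {
            private readonly ITimestampProvider _timestamps;
            public AuditLog(ITimestampProvider timestamps) => _timestamps = timestamps;

            public string Record(string message) => $"{_timestamps.GetUtcNow():o} {message}";
        }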

    Earlier we worked on a military project that did not do this. Instead it relied on a combination of the application and the database assigning timestamps.

    Unfortunately the SQL Server DateTime type does not offer the same precision as System.DateTime in .NET, meaning that timestamps were truncated to a lower precision when written, so an object read back from the database would not equal the object that was written.

    This is bad for testing. The OR-mapping code in the system was legacy code (meaning: code without tests), so they did not discover this until very late in the project. At that point, fixing it incurred a great cost.

    On our current project we are using an ITimestampProvider and assigning timestamps in the application. The datastore is used for just that - storing data.

    A side-effect of this is that when we discovered some failing tests in the persistence layer due to timestamps being truncated by the database, we only had to modify the application in one place: have the timestamp provider produce values that match the precision of the datastore (we don't need more than second precision anyway).
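    That one-place fix might look something like this (an assumed implementation, truncating to whole seconds):

        using System;

        // Matches the datastore's precision so timestamps round-trip unchanged.
        public class SecondPrecisionTimestampProvider : ITimestampProvider
        {
            public DateTime GetUtcNow()
            {
                DateTime now = DateTime.UtcNow;
                // Drop the sub-second ticks; second precision is all we need.
                return new DateTime(now.Ticks - (now.Ticks % TimeSpan.TicksPerSecond),
                                    DateTimeKind.Utc);
            }
        }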

    In effect, the requirement to make the code testable forced us to introduce a better, more loosely coupled design, which in turn saved us a lot of work downstream.

    In this way test-driven development is not just about line-by-line code quality. It also drives the system towards a higher-quality design.

  • Code Reviews and the Developer Handbook

    We’re six months into a project. The code base is 53,000 statements, 2/3 of which are tests.

    We have been working for six months according to a set of standards: test-driven development with unit and integration tests, a model-view-controller architecture, NHibernate OR-mapping for the persistence layer, and the Castle container for weaving it all together.

    Then an external consultant shows up. He has been hired to check the quality of the work to ensure that a “normal” developer can take the code and work with it inside a “reasonable” amount of time.

    I convince him to look at a few architecture diagrams before he starts looking at the code. I try to show him how to find the implementation code corresponding to the tests, but he is not interested in the tests. His goal is to ensure that the code is properly documented - and in his world this means the implementation only, not the tests.

    Instead he starts at the top with one of the client applications, a web-based intranet for editing data in the system. He starts browsing the code – looking through the container and some of the controller and model classes that it uses.

    A few weeks later we get a report.

    There should be more comments in the code (our guideline is to write simple, readable code rather than spaghetti with comments).

    Oh, and member variables should be prefixed with an m, value parameters with a v and reference parameters with an r. And the type of a variable must be derivable from its name.

    He also raises a number of concerns about the use of NHibernate and Castle. Basically he wants us to write documentation for these as well (never mind they come with documented source and books and on-line articles).

    More or less, the audit measures the project against his personal idea of what constitutes a good project. So basically he is testing the documentation quality against an arbitrary set of requirements.

    We need to devote a lot of effort to convince him.

    So, note to self – a pragmatic solution for documentation and auditing:

    • Write a simple developers handbook with coding and documentation guidelines etc. and have the customer sign it off. 
    • Involve the auditor early to check the guidelines. 
    • Use the handbook as the guideline for audits. 
    • In any event have the auditor provide a set of “functional requirements” for the documentation, e.g. “it should be possible to easily find out what a method is doing” rather than arcane rules like “there should be two lines of comments for every branch statement”. 
    • Create a HOWTO document for each of these requirements and add it to the developer handbook: for example, “The requirements are captured in integration tests using the Selenium tool, they are named so and so.” etc.. 

    Update Nov. 6, 2006: Please note that I'm not advocating a heavy-weight developer's handbook, just a small document that shows the file structure, naming conventions, how to get the source from version control, how to build and run the tests, etc. I've spent time on a project with a big guideline on whether or not to put curly braces on a separate line, and it's a waste of time. For code layout I recommend just writing readable code that looks like the rest of the code in the system and relying on the default formatting in the IDE.

  • Iterative Means Baby Steps

    I am working on an agile development project in tandem with a database team using a traditional somewhat iterative waterfall approach.

    The application has some metadata that are basically a bunch of enumerations in the database.

    In order to support multiple languages in the GUI we had added a table to the development database with translations of the language-specific text for one type of metadata. This way we have a language-neutral domain model and can ask for translations of various bits of the metadata (say, for populating dropdown lists with something humanly readable).
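    A hypothetical sketch of that design (the names are invented for illustration): the domain model stores only a language-neutral key, and the GUI resolves display text through the translation table.

        // The domain model knows nothing about languages - only neutral keys.
        public class MetadataValue
        {
            public string NeutralKey { get; set; } // e.g. "STATUS_ACTIVE"
        }

        // Backed by the translation table in the database.
        public interface IMetadataTranslator
        {
            string DisplayText(string neutralKey, string languageCode);
        }

    Populating a dropdown then becomes a matter of calling DisplayText for each metadata value with the user's language code.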

    This step taken, we asked the database team to incorporate the translation table into future versions of the database.

    Then we continued working on the steps for fulfilling the requirements involving that particular dropdown list.

    Some days later the DBAs asked us about a general approach for translating all metadata. We discussed it briefly and they set off to work.

    When they delivered the database the week after, we discovered that generalizing from that single example had been a very poor idea. Meanwhile we had uncovered more requirements with the customer, and that particular solution was very poor for a lot of the other data.

    In other words they had implemented the wrong thing for every piece of metadata except that first table. Waterfall at its best.

    This of course forced the database group to undo a lot of work and come up with another solution, spending valuable resources along the way. We had tried to think too far ahead, making uninformed decisions up-front instead of informed, on-demand decisions. And predictably, the waterfall approach once again proved itself a loser.

    Now, on the application side no rework was needed: by focusing on solving only the specific, known problem we had produced no code to solve the unknown requirements. No code needed to be changed.

    Agile development done right is a big timesaver, and it underlines the fact that iterative development is not about week- or day-long iterations, but about "baby step" iterations of minutes and seconds.

    Posted Oct 27 2006, 11:45 by Martin Jul
  • Great Quote on Iterative Development

    "A complex system that works is invariably found to have evolved from a simple system that worked…."

    "A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."

    - Jason Fried, 37 signals

     

  • Performance and Scalability Myths

    "When I hear the word performance I reach for my gun".

    In the fuzzy front end of a project when unknowns abound people need a sense of stable footing.

    Since it usually takes a long (calendar) time to understand the requirements and the domain, discussions tend to focus on more concrete things like the application architecture, and eventually everybody is obsessed with performance. It is a safe harbour for “productive discussions” when the waters are full of unknown monsters like fuzzy or non-existent requirements and a vague understanding of the domain.

    So performance discussions ensue. Many times the customer wants to know the hardware requirements up front so we order heaps of multi-CPU servers, SANs, middle-tier application servers, front-end application servers, load-balancers etc.

    The next natural step is to think up an intricate object or component distribution strategy to ensure “scalability”.

    I don’t know of anyone who ever got fired for “designing a system for high performance and scalability” and building complex, buzzword compliant distributed application architectures.

    But maybe someone should be. Just to set an example.

    When I worked with the outstanding architect Bjarne Hansen he would often mutter, “Fowler, page 87”.

    He was referring to Martin Fowler’s book, Patterns of Enterprise Application Architecture. Page 87 has a section title in big bold letters: “The allure of distributed objects” – and it addresses precisely that.

    We worked on a system designed to deliver high performance. It had a set of multi-CPU servers, a fiber SAN, a separate multi-CPU application server, handwritten OR mappers using high-performance stored procedures and a complex distributed caching object database on top to keep all the clients in sync and take the load off the database server.

    As for performance…

    It took some 80 people around two years to build the system. I worked with Martin Gildenpfenning for a total of about two man-weeks to optimize away the bottlenecks in the application. At that point it ran faster on a developer desktop with all the components deployed on a single machine including client, server, database server and GIS system than on the high-powered multi-server deployment set-up.

    It turned out network latency was the limiting factor.

    The system had been designed to optimize a scenario that did not happen in practice.

    As Donald Knuth put it, more often than not premature optimization is the root of all evil.

    The optimization work we did was quite simple. We did it as bottlenecks became evident near the end of the development cycle. At that point we had a real working application and knew the use cases to optimize. Armed with a set of profilers, the task was quite easy.

    In fact, the big lesson was that once the app is there and the real data is there, it is easy to find the bottlenecks - and most performance bottlenecks can be solved in a very local manner: most could be removed by refactoring a single module. Mostly it was a matter of replacing an algorithm with a better (non-quadratic!) one or introducing a bit of caching, for example to avoid loading the same object more than once from the database during the same transaction.
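    As a sketch of that caching fix (hypothetical names; the idea is an identity map scoped to a transaction or unit of work):

        using System.Collections.Generic;

        public class Customer { public int Id { get; set; } }

        public class CustomerRepository
        {
            // Identity map: each distinct id is loaded at most once per unit of work.
            private readonly Dictionary<int, Customer> _loaded = new Dictionary<int, Customer>();

            public Customer GetById(int id)
            {
                if (_loaded.TryGetValue(id, out var cached))
                    return cached; // repeat lookups skip the database round-trip

                var customer = LoadFromDatabase(id); // one round-trip per distinct id
                _loaded[id] = customer;
                return customer;
            }

            private Customer LoadFromDatabase(int id) =>
                new Customer { Id = id }; // stand-in for the real query
        }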

    Discussing performance early in a project embodies the fallacies of the waterfall project model. It does not work. It is broken. Given a good design, optimizing performance is a simple, measurement-driven task that belongs at the end of every iteration in the project.

    1) “It is easier to optimize correct code than to correct optimized code.” (Bill Harlan, http://billharlan.com/pub/papers/A_Tirade_Against_the_Cult_of_Performance.html)

    2) Don’t pretend you can spec the hardware for a system before you even know the requirements.

    Someone said on the Ruby on Rails IRC channel:
    > “PHP5 is faster than Rails”
    < “Not if you’re a programmer”

    Google and O’Reilly’s “Hacker of the Year” award recipient for 2005, David Heinemeier Hansson, related how they built and deployed the first major Rails application on an 800 MHz Intel Celeron server running the full web-server, application-server and database stack. It maxed out when they had about 20,000 users on the application, and at that point it was easy to scale out.

    At the same time, Rails is widely claimed “not to be scalable” by members of the J2EE “enterprise app” community - the same people who designed our multi-tier architecture that only maxed out its CPU idle time when it was put into production.

    So, here is a checklist for your next project:

    1. Get some hard, measurable targets for performance and expected transaction volume. 
    2. If performance comes up, ask for the CPU, network and I/O utilization statistics for the current system (if it exists). At least this will provide some guidance as to whether performance will be an issue.
    3. If you are asked to design the hardware, suggest building the app on a single server with plenty of RAM and doing some measurements later. Even if performance becomes an issue, you will benefit from some additional months of Moore’s law to get a better deal on those Itanium boxes.
    4. Create a simple design with no optimizations: use an off-the-shelf O/R-mapper, resist the urge to build complex caches, keep everything in the same process, and fight the urge to write those “performance enhancing” stored procedures. Figure out the common use cases and implement a spike with them. Even if the tools and the simple design introduce a few milliseconds of overhead, you will have saved plenty of man-months in development - enough to pay for an extra CPU to compensate.
    5. Scale the application out, not the components. The big lesson from the big web sites is that buying a load balancer and a rack of cheap servers each running the full application is good enough. In fact, for most applications you don’t even need more than one server.
    6. Measure early, measure often.
  • ScrumMaster Certifications

    As of today everybody in Ative is a Certified ScrumMaster. It was a great course, taught by Jens Østergaard and Jeff Sutherland, "the father of Scrum". We look forward to bringing the full Scrum arsenal to bear in helping our clients create better software faster.

     

© Ative Consulting ApS