in

Ative at Work

Agile software development

Ative at Work

januar 2007 - Posts

  • The Waste of Defects - Bugs are Stop-the-Line Issues

    "Don't clean it, " my grandmother used to say; "keep it clean.". 

    She probably learned it long before the computer era. Yet for some reason her advice did not spread to the software industry. We still have a tendency to build up a big mess and put off cleaning it up until much later. I am thinking about the waste of defects - the lean principle of preventing trouble from creeping in rather than struggling to get it out after the fact.

    In software, as in lean, bugs are stop-the-line issues.
     
    Once you find them you have to fix them before moving on. Period. 
     
    I have heard many excuses for accepting mediocre quality, but no good ones. If it doesn't have to work, why do we build it at all? In fact, if we explicitly want to build something that does not have to work, the easiest way is not to build it at all!
     
    So, if we want it built it is fair to say that we want it to work. Therefore, we have to make sure that it works when we build it, and that it keeps working - namely, that we fix bugs as soon as they appear.
     
    A side benefit of this that we might not even need a bug tracking system - after all, it only exists to manage all the defects that we should not have allowed to linger in the system in the first place. If we fix the issues as soon as we find them, we can easily track the number of open bugs on a single post-it note. 

    We did this on a mainframe migration projects. We built it test-driven and fixed the odd uncaught defect on the spot. In the end, we had some 7000 automated test cases to keep the system in a known good state. And no bug tracking system.
     
    So, in many ways a bug tracking system could be considered an indicator of waste in the organisation. However, for an organisation with low software quality, the non-existence of a bug-tracking system is an indicator for even greater problems.
     
    The transition to treating defects as a stop-the-line issue will definitely be painful for many organisations. Remeber that it took Toyota about a month to build the first car after introducing this concept in the NUMMI factory they acquired from General Motors. In software it is often worse.
     
    First of all, some teams - namely the teams with the highest technical debt or lowest output quality will appear to not produce anything. They will suddenly spend all their time cleaning up the house rather than adding new features.
     
    The downside is that the low quality suddenly becomes painfully visible to the whole organisation. Since many organisations measure new features or delivering on schedule as success and shy away from measuring quality this will create a sense of crisis.
     
    The upside is that the quality becomes painfully visible. The teams that produce low quality will be stopped from producing more low-quality stuff whereas the teams with a higher output quality can dedicate a higher proportion of their effort to building new software. This creates a virtuous circle where all new software is produced by the teams that are capable of building the highest quality software. The other teams will be busy cleaning up their mess.
     
    The net result is legacy software base that improves and higher quality new software.
     
    The pain of transition is so great that many organisations shy away and prefer to run up their technical debt instead.

    Sometime they try to hide it by frequent changes of "strategic platform": when the mainframe was replaced in favour of Java it initially appeared to be much more productive but eventually the organisational bent to produce bugs rather than fixing them built up enough technical debt that the Java platform deteriorated to the same level as the mainframe. Then came the time for another "strategic platform shift" to .NET - and development productivity soared, only to decline over time as the code-base atrophied, making the organisation ready for "the next big platform."

    The underlying issue is that the technology or platform is not the root cause of the unacceptable productivity levels. It is the organisational culture of accepting low quality that is the cause. This cannot be helped by replacing the technology. The remedy is all about behaviour.
     
    We change this behaviour top-down or bottom-up. But we need to fix it. There is no excuse to wait. We have to stop the line, and stop building bugs into our software. Not only for the quality, but also for the productivity. This is what lean software development is all about.

  • Lean Principle Number 1 - Eliminate Waste

    The key principle in lean manufacturing is to eliminate waste.

    The Toyota Production System names seven major sources of waste. Mary and Tom Poppendieck list these in their book Lean Software Development - an Agile Toolkit (Addison-Wesley, 2003) with the translation to software development:

    The Seven Wastes of Manufacturing

    The Seven Wastes of Software Development

    Inventory

    Partially Done Work

    Extra Processing

    Extra Processes

    Overproduction

    Extra Features

    Transportation

    Task Switching

    Waiting

    Waiting

    Motion

    Motion

    Defects

    Defects

    Partially Done Work

    Partially Done Work is a frequent cause of failure in many projects. It usually begins with the best intentions such as creating expert teams for each layer in the application. Put a database team in one room and the middleware team in another room and the application team in the next building and plan single big-bang integration phase in the end of the project and you have a recipe for disaster. The experience from lean manufacturing is that it is faster to do single-piece flow and finish a feature completely rather than taking in the overhead of managing huge batches of work-in-progress. It also takes out a lot of the risk since we are always 100% done with x% of the application - not x% done with 100% of the application. This permits us to realize the cost savings or extra revenue of the feature early rather than later. The key is to integrate early and often, deliver incrementally and manage the product lifecycle thinking in small releases.

    Extra Processes

    It also relates to Extra Processes. Many of the practices of "professional" software development organsations are needed only to provide a sense of control while not addressing the underlying problem. It is like repeatedly shooting yourself in the foot and then priding yourself on how your organisation's surgeons are really good at patching it up afterwards.

    Examples of extra processes include all kinds of requirements, analysis, design, and test documents that no-one will ever read, the endless meetings with more people than are really needed or all the bureaucratic inventions to prevent change - Change Control Boards, reviews, complex ISO and CMMI processes that make simple work hard. It also includes the manual tasks that should be automated but are not (see Setting a Minimal Professional Standard http://community.ative.dk/blogs/ative/archive/2006/12/10/Going-Agile-_2D00_-Setting-a-Minimal-Professional-Standard.aspx). In short, extra processes are all the work that intelligent people wouldn't do without being told.

    For example, one particular system we worked on had a rigorous ISO process for change control that made it practically impossible to fix anything that had gone through all the reviews. In one case, we discovered a private member of a class that should have been publicly accessible - something that is very easy to fix if there are no extra processes. However, since the module was already reviewed and "finished" the process required us to change the code, rerun all the test cases, copy the output into a Word document, go through a review meeting and update a host of documents. All this for changing the word "private" to "public" in a single source file. Needless to say processes like this drives up the cost of software development dramatically and in many cases the only business value is being applauded by the ISO auditors for following the process very rigorously.   

    Extra Features

    The easiest way to finish faster is to write less software. The oft quoted Standish Group study from XP 2002 concluded that 45% of features in a typical application were never used and only 20% of features were used often or always. A good Product Owner with an iterative, incremental process can leverage this to get dramatic cost savings by simply building the most important features first and stopping the development process when the marginal value of extra features no longer justifies further investment. Also, by delivering incrementally we can start learning generating ROI on the most valuable features faster - and learn about the system so we are better able to prioritize the remaining work. A further benefit of delivering only the essentials is that we end up with a smaller, simpler code-base which is easier to change and evolve - which means we can go faster in the future since we are no longer carrying a huge burden of non-productive weight.

    Task Switching

    Task switching is another common productivity killer. Programmers know the state of "flow" where work is focused, efficient and effortless. It takes a while to get into this state. This is one of the reasons that single-piece flow is such a great way to work: we start something, we focus and we finish. There are no interruptions and context switching to disturb us from getting into the state of flow and we free up all the resources we would otherwise use to track all our bits of work-in-progress. 

    In many organizations task switching is caused by sharing people across multiple teams. For example, on one project I worked with an analyst who was 50% allocated to another team. After a while he came to our status meeting and told the project manager that they should not expect much output from him on any of the teams: he almost was fully booked just by attending all the meetings in the two teams. Here, focusing on one team for one sprint at a time was much more efficient.

    Waiting

    Waiting is another waste. If it takes three hours of work to accomplish a certain feature you should not have to wait four months to have it delivered. This is a waste. The anti-dote provided by lean is to use value stream analysis to look at the value stream of the whole development process from the customer's perspective and comparing the hours of work done to the waiting time (calendar time) from the moment the concept is conceived to the moment the feature is delivered. In many organizations the efficiency of the production cycle - the work effort compared to the time it takes to complete the process - is less than 1%. Most of the features spend their lives sitting in queues, waiting for the next Change Control Board meeting, living in Requirements documents, waiting in Analysis documents before being implemented, waiting for system integration, waiting for the testers to test them, or waiting in a bug tracking system to be corrected or waiting for the next bi-annual deployment window. All this waiting time accumulates Partially Done Work, causes us to introduce complex procedures to manage it and adds no business value to the customer. Therefore, it must be eliminated.

    The approach here is to take a simple approach. If you need it - build it now. Select only the most valuable features then deliver them quickly. Match the in-flow of tasks and the production capability. Specifically think about your team like you dimension your server hardware and don't try to run it at peak utilization. In fact, the highest efficiency is generally found when there organization is running at no more than 80% utilization since this provides much better latency.

    Motion

    Motion is one of the other wastes. It occurs when we need to go somewhere else to solve a problem - if the customer or an expert is not at hand when we need answers to questions or if the team is spread out over multiple offices, floors, buildings or even time-zones. (We have written about this in Refactoring the Physical Workspace http://community.ative.dk/blogs/ative/archive/2006/10/13/Refactoring-The-Physical-Workspace.aspx). It also happens during the many hand-offs that are so typical in waterfall development. One person writes the requirements, then an analyst translates it to something for the technical team, then another person takes over and interprets it to code it etc. The documents that are handed over at each step will never be adequate. The cure is not to write more detailed documents since this takes even longer and only provides a false sense of security. The best cure is simply to eliminate the hand-offs altogether and have developers and customers sit together to figure out what needs to be done, code it up as an acceptance test and implement the code to make it pass quickly.

    Defects

    This also leads to the last of the wastes: defects. As a rule of thumb, the longer a defect is allowed to stay in the system, the more it costs to fix it. First of all there is the risk of change, we might have to do a costly recall, repeat a lot of manual testing, revise the documentation etc. On the other hand, if we fix the bug immediately, or even better - if we don't allow it to come into the system in the first place we reduce the cost of fixing the defects to almost zero. The key here is to realize that software with defects is work-in-progress. This is the reason that test-driven development and automated regression test suites are among the key practices of lean software development teams. In fact, preventing defects is so essential that we will devote a whole article to it shortly. Stay tuned.

     

  • Lean Software Development

    Today the lean meme is in every business newspaper. It has been a long time coming. Some companies like Toyota have lived these principles since the 1950s when their lack of capital forced them to improve their production processes radically. In the West it appeared on the radar with the books by Womack & Jones published from 1990 and onwards. Now, the ideas have gained mainstream mindshare and are being applied in many fields. Applied to software development “lean” provides a great toolbox of agile methods to help radically improve development efficiency.

    Deliver High Customer Value Quickly at Low Cost

    The lean starting point is to optimize the production from a customer-centric perspective. The goal is to provide the customer with maximum value quickly and at a low cost. Therefore the first step is to define the value of the product (or features) from the customer’s perspective. Translated to Scrum this is the job of the Product Owner.

    Eliminate Waste

    The central point of lean is to eliminate "muda", or waste. Muda is defined as all the activities and steps in a production process that add no value to the customer. In software, examples are work-in-progress, defects, features that are not necessary, the bureaucratic hindrances in traditional software development organizations and all the stuff and over-generalizations that developers love to do (“we might need it later”) even when a much simpler solution will suffice.

    Since there is so much waste a typical lean transition begins by mapping the value stream for the complete production process from the customer's perspective. Then we reduce it to only the steps that add value to the customer. This is usually a very small subset.

    Introduce Single-Piece Flow

    Once we know the steps we organize the remaining steps into the best sequence and introducing “single-piece flow”: completing the production of a whole but small product in one continuous sequence with no delays.

    Use Pull Scheduling

    At the integration points between the single-piece flow “cells” the lean method of scheduling is using “pull”. When a downstream consumer is ready for processing a new part it requests it from its upstream supplier. This makes scheduling very simple. There is no inventory to keep or partially made items to track. The synchronization mechanism is simply to keep everyone working in the same “takt time”, producing new stuff at just the rate it is needed. Therefore no waiting occurs. This is a simple way to optimize the throughput without complex planning systems.

    Improve The Process Continuously
     
    After the initial radical reengineering of the production process we continuously focus on improving the process as we learn and innovate, so that the effort and time do the work keeps falling. This produces a virtuous cycle. In the words of Womack & Jones lean is a method to do more and more with less and less.

    Stop-the-Line Root Cause Problem Fixing

    A lean process stops when there is a problem and it does not restart until the root cause of the problem has been fixed.

    One of the legendary lean examples of this is when Toyota took control of the NUMMI car factory in the USA. The US tradition was to let defects slip and fix them in QA later. Instead, Toyota simply it told the workers to do good work and stop the assembly line when they could do their work properly.

    It took about month to produce the first vehicle.

    Since then, however their quality and productivity has been outstanding and they have earned an impressive number of awards.

    To teach people the stop-the-line culture Taiichi Ohno of Toyota used the principle of “five whys”. When we encounter a problem we ask why five times to unveil the true root cause of the problem rather than just treating the symptoms. In software you see that very good operations people have a natural tendency to do this, while most developers faced with the same problem tend to just advice to “restart the server” or some other fix that does not unveil the nature of the problem or prevent it from reoccurring. The stop-the-line practice improves quality - while the other just accepts low quality without fixing it. One of the central lessons of lean is that you have to improve quality to go faster so by continuously removing the root causes of problems we also improve the efficiency of the entire production system. In fact, from a customer perspective testing to find defects is a “muda” whereas testing to prevent defects from occurring is not. In software terms this makes the case for test-driven development and identifies test-later as an expensive, wasteful practice.

    In the coming months we will explore lean and its applications to software development in this blog. Until then - Happy New Year and keep the comments coming. We really appreciate your participation.


    Further Reading
     

    • Taiichi Ohno - one of the greatest industrial innovators of the 20th century, the father of the Toyota Production System. After spending his career relentlessly optimizing manufacturing at Toyota he wrote the book Toyota Production System: Beyond Large-scale Production that describes his work.
    • Womack & Jones - Their books are great and it is well worth to read them all to see a lot of the principles and case studies for lean thinking. Also, it is quite interesting to see that software development is now rediscovering some of the things that manufacturing learned much earlier - in the case of Toyota as early as in the 1950s and 1960s. Begin your studies with The Machine That Changed the World, a five-year study of the global auto industry from MIT and go on with the Lean Thinking and Lean Solutions. They give a fascinating perspective on manufacturing and plenty of examples of the lean principles and they applications. 
    • Mary and Tom Poppendieck - with a background in manufacturing and software they have translated the concepts of lean to software development. They have written two great books and Implementing Lean Software Development and Lean Software Development - an Agile Toolkit. Both books are well worth reading a present a both the principles and lot of cases in a friendly, colloquial manner. Mary Poppendieck is a frequent conference speaker. We saw her giving some very impressive lectures at Agile 2006 (www.agile2006.org) and there is plenty of opportunity to see her on the conference circuit. Highly recommended!

    Updated January 11, 2007: fixed grammar and typos.

    Posted jan 08 2007, 09:06 by Martin Jul with 9 comment(s)
    Filed under: ,
© Ative Consulting ApS