Foundational Agile Practices
In February 2001, a group of software developers met to discuss lightweight development methods. Out of these discussions they identified four key values for agile software development, which have since become known as the Agile Manifesto:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
Supplementing the manifesto, twelve principles were defined that elaborate further on what it means to be agile:
- Customer satisfaction by rapid delivery of useful software
- Welcome changing requirements, even late in development
- Working software is delivered frequently (weeks rather than months)
- Close, daily cooperation between business people and developers
- Projects are built around motivated individuals, who should be trusted
- Face-to-face conversation is the best form of communication (co-location)
- Working software is the principal measure of progress
- Sustainable development, able to maintain a constant pace
- Continuous attention to technical excellence and good design
- Simplicity—the art of maximizing the amount of work not done—is essential
- Self-organizing teams
- Regular adaptation to changing circumstances
Notice that nowhere in either the manifesto or the principles are specific practices like scrum or eXtreme Programming mentioned. Both the manifesto and the twelve principles are methodology-agnostic. The message is that it does not matter what specific practices you are using so long as they are consistent with the principles. In fact, principle 12 – regular adaptation to changing circumstances, or 'inspect and adapt' – is an exhortation not to rigidly follow any specific practice, but to constantly be experimenting with the goal of continually improving your performance.
Today, many organizations practice scrum – the de facto framework for doing iterations. But doing iterations alone does not necessarily produce deployable code as an output, unless backed up by two prerequisite sets of practices, namely Continuous Integration and Test Automation.
Take for example a communications product that resides at the core of a high-speed network. This product not only executes multiple routing protocols, and needs to be highly configurable and manageable, but also interacts with multiple external devices and products, against all of which it must be tested to ensure interoperability and tolerance to many types of adverse events in the network. In order to do this, thousands of test cases must be executed, including functional, performance, stress and interoperability test cases, which traditionally may have taken months to complete. In an ideal world, a full regression would be performed on every change submitted to the build system. At minimum, a very substantial subset of the regression suite needs to be done at least once before the sprint is declared done. A very high level of test automation is required to support this requirement.
But there is another requirement. Traditional construction has involved multiple developers working on private code branches. Changes from each developer are coded/unit tested on their own private branch and then checked into a trunk line from where the next build is produced. New builds must be ‘sanity tested’ for integrity, and then made available to the development team for verification of their changes before the system test team takes over. The process of getting a new build to system test may take several days, and many organizations working in this mode may typically make one new build per week. This is the ‘big batch’ approach to software construction which simply is not consistent with delivering production-ready code in 2-week iterations. Changes from developers must be submitted continuously, and these changes must be of very high quality to ensure minimal re-work, and minimal accumulation of an open bug backlog.
Whereas scrum can be taught and implemented in a fairly short time frame, building an effective CI system and automating large legacy test suites may require significant financial investment and time to put in place.
In summary, agile software development must comprise elements of:
- Iterative development e.g. Scrum
- Continuous Integration
- Test Automation
In traditional, so-called waterfall software development, the starting point for project planning is a product requirements document (a “PRD”). All of the requirements must be defined, reviewed and approved up-front before planning or development begins. Additionally, the development team may produce a software functional specification (“SFS”) that takes each requirement in the PRD and elaborates it into more detail, adding technical requirements and constraints. Both of these documents are then used as the basis of planning subsequent development activities required to deliver the content specified in the PRD. In order to derive a development plan and schedule, every task required to construct and test every feature must be estimated. Once development has been completed, there is a handoff of partially tested features to a test or “QA” organization, followed by several weeks or months of testing and bug-fixing. The length of this phase will be highly dependent on the quality of the code at the point of handoff. Finally, at some point, a decision to release the product to the market must be made, and this is where the usual tradeoffs between schedule, features and quality are fought over. The delivered software is usually accompanied by a set of ‘release notes’, listing all of the features that were actually delivered, plus all of the bugs that the team did not have time to fix.
Generic Waterfall Framework
Software development is referred to as an 'empirical process' by advocates of scrum. The output of almost every stage of software development is impossible to predict with precision, and each step produces defects that must be re-worked. This so-called 'plan-driven' paradigm has been demonstrated time and again not to be amenable to the level of precision expected from other forms of production such as manufacturing and construction. The throughput of a production line can be specified with a high level of precision, and associated quality can be defined at very high levels, such as 'six sigma' (fewer than 3.4 defects per million opportunities). It is difficult to imagine any large software project coming remotely close to this standard. Agile is a response to this problem.
We do not want to spend much time here listing the disadvantages of traditional waterfall development, however, we should note that the waterfall and its variants have delivered many successful software products over the past several decades, despite all of the known limitations and attendant budget overruns, quality problems, and stressed out development teams. Further, I have worked on many successful waterfall programs that did indeed deliver on-time with high quality. One of the key ingredients of success was building in sufficient instrumentation into the project and getting visibility into emerging risks sufficiently early that sensible tradeoffs could be made to ensure that major priorities could be met. Let us also note that there are many project activities beyond the boundaries of the development team that need to occur for the release of a commercially successful software product. Product development teams in large companies operate within a broader organizational ecosystem that includes multiple other functions which must work together to define, develop, market and support new products. Teams that want to adopt agile development practices must recognize that this cannot be accomplished in a vacuum, and must address how to achieve an effective integration with other inter-dependent functions. Furthermore, optimization of a single function (the software development team) alone will not be enough to deliver the maximum possible benefit to the customer, and a broader approach will be needed to fully realize the advantages of agile and lean.
Key Process Differences
- Requirements are not defined in detail up-front. Requirements are continuously elaborated into technical details throughout the project. The goal is not to nail down all requirements in detail up-front, but to expect and accommodate changes as the project evolves. From a business perspective this means that the cost of change is much lower than with waterfall.
- Release-level planning is based on relative sizing of requirements, together with knowledge of the team’s production capacity, or velocity – not on the time and effort for development tasks
- Detailed task-level estimating is only performed at the beginning of each iteration.
- There are no hand-offs between developers and testers, testing runs concurrently with design and coding.
- Each iteration is completed to an agreed definition of ‘done’, which always includes the requirement that all bugs are fixed within the iteration. The output of an iteration is production-ready code.
- Iterations are ‘time-boxed’. A 2-week iteration ends in 2 weeks whether all of the planned work is completed or not. This is required to maintain an accurate velocity assessment. The team’s velocity is used as the basis of release-level planning.
- The team meets daily to synchronize with each other and to highlight any issues that are impeding progress. This process maintains team alignment and drives accountability, while minimizing the need for written reporting and redundant communication. The Daily Standup should take place at the same time and place each day and last no more than 15 minutes. Three basic questions are addressed by each member of the team: What have I accomplished since yesterday? What do I plan to accomplish today? What impediments or roadblocks are getting in my way? The session is not intended to be a problem-solving one, and any items that require more in-depth discussion should be taken outside the meeting. In addition to the three basic questions, the team must also synchronize on their project data – remaining story points, open defect counts and so on – confirming that they are on track to deliver all committed user stories. If the team is co-located and has a dedicated meeting room, then this data can be tracked on a whiteboard. If the team is distributed, then the data can be tracked using a spreadsheet or web-based scrum tool and shared using one of the many collaboration technologies now available.
- At the end of each iteration, the team demonstrates the results of their work to the product owner and other stakeholders. This provides opportunity for feedback and any course corrections.
- At the end of each iteration a retrospective review is carried out by the team to identify factors impacting the team’s performance, and to establish opportunities for improvement. This process could be as simple as having everyone on the team answer a three part question: What should we stop doing, what should we start doing and what should we continue doing? This is how the team ‘inspects and adapts’ to continuously improve their processes.
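Velocity-based release planning, described in the bullets above, reduces to simple arithmetic: remaining backlog size divided by observed velocity. The sketch below is illustrative only; the function name and the numbers are assumptions, not anything prescribed by scrum:

```python
import math

def sprints_remaining(backlog_points: float, velocity: float) -> int:
    """Forecast how many full sprints are needed to burn down the
    remaining release backlog at the team's observed velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be a positive number of points")
    return math.ceil(backlog_points / velocity)

# A team averaging 30 points per 2-week sprint, with 200 points left:
print(sprints_remaining(200, 30))  # -> 7 sprints, i.e. about 14 weeks
```

Note that this is a forecast, not a commitment: because iterations are time-boxed, the velocity figure stays honest, and the forecast is re-run as velocity data accumulates.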
Sustaining an agile transformation requires the transition to a continuous improvement mindset. Absent this, agile adoption, or scrum (or any other change initiative like Lean, TQM, Six Sigma…), has a high probability of stalling, or even complete failure. Further, continuous improvement must be applied horizontally across the entire organization and not just within functional silos like the software development department. In scrum, the retrospective is intended to provide teams with the opportunity to reflect on their work and try new approaches in order to improve. This, in principle, sounds good, but in practice I have witnessed many retrospectives that were entirely superficial, where no serious attempt was made to dig into systemic issues that had the potential to deliver quantifiable improvements. Meaningful improvement requires data, and the old adage that you can't improve what you can't (or won't) measure is very true. Improvement also requires that teams are willing to recognize problems and confront them. Hiding or ignoring problems is the easy way out, but leads ultimately to failure. This is why I encourage teams to show up at retrospectives armed with data from the sprint just completed. I prefer to see a data-driven retrospective rather than hear comments about what team members feel went well or not well.
Key Role Differences
- Product Owner: The product owner owns and manages the product backlog. In addition to traditional product management responsibilities, product owners are an integral part of the team, and are involved in all aspects of the process from release and iteration planning, to daily stand-up meetings, acceptance of completed user stories, and participation in demos and retrospectives.
- Scrum Master: The scrum master is not the project manager in the traditional sense, but rather fulfills the role of scrum process expert, guiding and coaching the team in all aspects of scrum. The scrum master may lead the daily stand-ups, and may also take the lead in working to remove any impediments identified by the team. In principle any member of the team can assume this role, but basic facilitation skills are essential for success.
- Team: The team has all of the functions and expertise necessary to deliver a fully shippable increment.
The following diagram summarizes the elements of the scrum framework:
Elements of Scrum
There is a common misperception that scrum is an invitation to abandon process – perhaps through poor interpretation of the Agile Manifesto. An effective implementation requires a level of discipline that some may not fully appreciate, and which most teams will take time to adjust to.
Basic Scrum Framework
The generic scrum framework is quite simple and can be learned and put into practice very quickly by small, co-located teams. Minimal tooling is required for planning and tracking releases and iterations (scrumboards in the team standup room, or spreadsheets). A full-time product owner embedded within the team, and available for planning and attending daily stand-ups, means that requirements can be defined and translated into user stories without a lot of formality, and changes can be discussed and agreed quickly.
There is one universal framework but no single universal process that can be specified in detail. The situation for each team is unique in terms of product, technology and organization right down to the individuals on the team. Each team is tasked with continuously evaluating the results of their efforts and for identifying opportunities to improve.
Ideal Iteration Length – A survey
Recently I put the question of the rationale for a max sprint length of 30 days to one of my LinkedIn groups. Here are the responses:
- The idea is that anything over 30 days is too large to effectively break down and estimate properly and for everyone to keep that model in their head. It also keeps you focused on the quick evolution of the team to learn through regular small mistakes, instead of remembering what went wrong months ago.
- The idea is to fail fast and more than 1 month is not fast enough
- Short sprints and small user stories provide the ability to test early and deliver usable functionality incrementally in small batch sizes, with lower risk.
- Delivering working software in a sprint of over 30 days is called a project. If you keep in mind that the content of the sprint is frozen, and that a scrum team should be around 7±2 people, you freeze 7–9 person-months or more of work in advance without allowing the customer or product owner to have a say or see a working product in between. This is a huge investment with a high risk of error. It also prevents you from re-evaluating your way of working often and from reviewing your failures and learning from them continuously.
- At the beginning, the requirements for that Sprint have been frozen. So the question of Sprint length has also to do with the ability of the Product Owner and Scrum Master to keep these requirements untouched by external influences and to negotiate new requirements onto the Product Backlog for review on the next planning meeting. If you are getting a lot of ‘this just can’t wait’, then shorter Sprints are better.
- You lose the pressure if you start a sprint longer than 30 days. I’ve experienced a 10 days sprint as the best. Try a discussion about changes with the team after you’ve estimated all tasks.
- 10 days is the right amount. In addition, this helps manage user expectations for urgent changes or issues. It's an easier sell if I tell them this sprint is set (or even the next one). At the most they typically only have to wait 3 weeks for us to get started on something new for them.
- A shorter sprint duration reduces risk. I would add that shorter sprints force more frequent synchronization and convergence of the different work streams occurring within a project. If there is any incompatibility between, say, modules of code it would be discovered sooner (and presumably be fixed sooner) thus reducing the risk of having a much bigger incompatibility problem that might arise from longer sprints. It’s better to fail fast because big problems discovered later than sooner are a recipe for blowing the schedule.
- We have tried two-week sprints and three-week sprints, and the feeling was that three weeks was the perfect length. When we did two-week sprints it felt like we were always starting or ending, and the team's stress level was too high.
- This also comes down to how the human brain works. If you have a term paper due at the end of the semester, you tend to start working on it at the end of the semester. If you had an outline due three weeks into the semester, you’d start working on your Term Paper two weeks into the semester. Slicing into chunks (sprints) helps to prevent all the work piling up at the end. This goes straight to the Agile Principle of “promote sustainable development.”
- My Agile coach always recommends starting with one week sprints for new Scrum projects. When you are just starting, how you use Scrum is as much or more important than what is being built. With one week sprints you have a shorter time between learning cycles to adjust things like estimating, how many story points the team takes on and so on.
- There are three reasons. Two are explicit in Scrum; the other isn't mentioned, but is one of the foundations on which Scrum is based. The explicit ones are feedback and an enforced view of reality. The implicit one is removing delays.
- You want quick feedback. Longer than 30 days does not force you to get it. But feedback exists in many forms. There is the time from getting a request until delivering it. Time from starting a story until it is complete. This is where lessons are learned quickly. I personally don’t like to fail fast, I prefer to learn fast. One does not need to write code to discover the customer doesn’t want it, one needs to write a little where the customer is clear, show it to them, and enable the customer to learn a little bit more. But, if you are going to fail, do it quickly.
- By enforced view of reality, I mean that things will show up to make it difficult to deliver value in 30 days. For example, customers may not want to talk to you that often, build processes may be inefficient, testing and coding may be too separated from each other. By having a time-box in which your work is intended to get completed, those things that work against you will become more visible and painful. These are impediments to your work actually, and instead of avoiding them by making the sprint length longer, you actually need to shorten the sprint to expose these even more. It is also harder to ignore them – if you haven’t tested your code by sprint’s end – it’ll be obvious and irrefutable. Both of these reasons actually tend to call for even shorter than 30 day sprints. Most successful Scrum teams I know of use 1-2 week sprints. In fact, I’d go so far as saying teams with 3-4 week sprints are showing a “smell” of not being engaged enough in solving their problems (not always true, but often true).
- The third reason, which in many ways is the biggest one, is removal of delays. Since about 2004, I have been claiming that Scrum is a weak implementation of Lean principles. I say “weak” because Scrum does not deal with optimizing the whole and it leaves out a lot of lean-education and management. But one of the key tenets of Lean is to remove delays so value can be delivered quickly. This is critical because delays between steps of work literally create more work. The time-boxing of Scrum requires teams to complete their work quickly. This all reduces delays and therefore avoids incurring work that doesn’t need to happen.
- The combination of feedback, awareness and removal of delays drives us to have shorter feedback loops until the overhead of the sprint overcomes the value achieved by them. For most teams this will be 1-2 weeks. Some teams that discover they don’t need the discipline of the time box will abandon it completely and move to Kanban. I might add that a simple value stream analysis will show most that “the shorter Sprint the better”. Scrum contains no technique or method for optimizing end-to-end, and it should not. The retrospective might uncover such a problem, but I generally advise to use Lean thinking to address end-to-end optimization explicitly.
- 30 days is also a fairly typical 'management reporting interval'. A Sprint longer than 30 days means that management may not get a 'status update' for two months.
- With experienced teams and a well-defined product backlog, a 30 day sprint may be fine (not my preference). But when the teams are newly formed, new to Scrum or when the product backlog is very dynamic, it’s better, as someone pointed out, to fail earlier and adapt sooner.
- A two-week sprint is my preference. Just long enough to develop some rhythm and velocity, but not so long that you risk going down the wrong road for a month.
- 30 days matched traditional development teams that were new to scrum, or where older technology was not nimble enough for rapid development for all the mentioned reasons especially quick review and feedback. Even with a 30 day sprint cycle, I have usually obtained feedback in shorter cycles before being fully accustomed to scrum. Maybe as technology and teams get more progressive we will see shorter sprint cycles.
- All the above answers are great. I will add that by having frequent reviews, not too far apart, that show what was built (the increment), you are being transparent and providing visibility to the stakeholders, so everything that was mentioned (risk, done, value) is observed and demonstrable. Also, by repeating the cycle and having a time to inspect and adapt, you can become agile. If you hold reviews more than 30 days apart, people may not remember what happened and will not effectively adapt. By the way, 30 days is too much for us; my experience is that most teams use 2-week (10-day) sprints.
- Start with sprints as short as possible – one-week or two-week sprints. Do not over-promise to the product owners; under-promise at first, as you need to learn what you can deliver as shippable product. If one-week or two-week sprints are not enough to deliver shippable products, you can extend the length of a sprint to, for example, 3 weeks or 4 weeks.
- A 2 week sprint has effectively just 9 days to build and test deliverables as you also need to reserve time for backlog grooming, sprint planning, sprint review and retrospectives. When you start your first sprint just under-promise.
The Lean Roots of Agile
Agile software development has its roots in the lean manufacturing paradigm developed at Toyota – the Toyota Production System (TPS). Lean manufacturing centers on creating more value with less work. One central principle of lean manufacturing is to relentlessly work on identifying and eliminating all sources of waste from the manufacturing process, and to eliminate all obstacles that impede the smooth flow of production – usually done using a lean technique known as Value Stream Mapping. 'Waste' can take many forms, but it basically refers to anything that does not add value from the perspective of your customer. Taiichi Ohno, an executive at Toyota, identified seven categories of waste:
- Overproduction (building ahead of demand)
- Waiting (for the next process step)
- Inventories (in process, or finished goods)
- Unnecessary processing
- Unnecessary movement of people
- Unnecessary transport of goods
- Defects (and the re-work they create)
Central to the lean philosophy is making improvements in real time: identify a problem, measure it, analyze it, fix it and apply it immediately on the factory floor (or in the next iteration) – don’t wait to read about it in a report later.
It is fairly easy to come up with software examples for each of the above categories of waste (for example: reduce 'Waiting' by eliminating hand-offs, or reduce 'Overproduction' by cutting out all unnecessary features). Lean principles will thus sound very familiar to agile practitioners. In their book Lean Thinking, the authors Womack and Jones identified five principles of 'lean thinking':
- Value (Definition of): Specify value from the customer perspective.
- Value Stream: Analyze your processes and eliminate anything that does not add value for the customer.
- Flow: Make process steps flow continuously with minimal queues, delays or handoffs between process steps.
- Pull: Build only in response to specific requests from customers. In other words organizations should not push products or services to customers. Order from suppliers only the material or services required to supply specific customer requests.
- Perfection: Continuously and relentlessly pursue perfection by reducing waste, time, cost and defects.
There is no clear written description of Toyota's practices; however, there is a lot of written material about TPS – most notably The Machine That Changed the World, by Womack, Jones and Roos – now a management classic.
Several authors have boiled down the definition of Lean into Two Pillars:
- The practice of continuous improvement
- The power of respect for people
Continuous improvement means the relentless elimination of waste in all its forms, and the identification and removal of anything that disrupts the continuous flow of the process.
“Respect for people” is a pretty vague term, but at Toyota it means something very specific, including providing people with the training and tools of improvement, and motivating them to apply these tools every day. At Toyota they would say: “We build people before we build cars”.
To summarize TPS even further in a single sentence: Toyota practices continuous improvement through people.
Optimum Batch Size
From the late 1940s Toyota was experimenting with batch sizes, in particular with the die changes associated with stamping out steel parts for cars. During this time an important discovery was made: unlike their American competitors, who produced enormous lots on a large scale, Toyota found that it actually costs less per part to make small batches of stamped parts. There were two reasons for this: small batches eliminate the need to carry the huge inventories of parts required for mass production, and making small batches means that any defects are discovered much earlier. The implications of the latter were enormous – and have a direct correlation with agile software development – it made those responsible for stamping much more concerned with quality, and it eliminated re-work and the waste associated with defective parts.
For more on the history of lean production, I refer readers to The Machine That Changed the World, the essential reference on this topic. In The Principles of Product Development Flow, Donald Reinertsen sets out a dozen 'principles' that promote the merits of smaller batch sizes. In the context of software development, waterfall executes a software release in a single batch, whereas scrum breaks up the requirements for the release (the Release Backlog) into a set of smaller batches and implements these in a sequence of Sprints. The question we have is: what is the optimum size for a sprint?
Much research has been done on quantifying the optimum batch size in the context of manufacturing logistics. The calculation of ‘optimum’ batch size – usually called Economic Batch Quantity (EBQ) – is based on the relationship with demand, machine set-up cost, and inventory carrying cost per item.
EBQ = sqrt((2 × Demand × Setup Cost) / Inventory Holding Cost per Unit)
where the inventory holding cost is the overhead cost associated with each item, which generally increases with the size of the batch, while the setup cost per item generally decreases as a function of batch size. This is illustrated in the following chart:
Optimum Batch Size
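The EBQ formula can be checked directly with a small script; the demand, setup and holding figures below are purely illustrative, not taken from any real production line:

```python
import math

def ebq(demand: float, setup_cost: float, holding_cost_per_unit: float) -> float:
    """Economic Batch Quantity: the batch size that minimizes the sum of
    amortized setup cost and per-unit inventory holding cost."""
    return math.sqrt(2 * demand * setup_cost / holding_cost_per_unit)

# e.g. demand of 10,000 units, $50 setup cost per batch,
# $1 holding cost per unit:
print(ebq(10_000, 50, 1.0))  # -> 1000.0
```

Doubling the setup cost raises the optimum batch size by only √2, which is the formula's way of saying that expensive setups push you toward bigger batches, but sub-linearly.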
For manufacturing operations the impact of large batch sizes on cost and throughput may be fairly obvious: large batches increase work-in-process inventory levels, materials lead times, and finished goods stock levels. But more importantly, large batches drive more variability, and more variability means more re-work, more cost and more delays.
We will look at exactly what this means for software, but first note that what we really care about is maximizing throughput, i.e. how we maximize the number of User Stories (or Story Points) delivered per sprint – known in scrum as the velocity. The throughput chart is simply the inverse of the cost chart above, and will be an inverted U-curve.
Let’s separate the production costs of User Stories into direct costs – those that contribute directly to the definition, design, coding and testing of each story, and indirect costs – those that represent overhead, or non-value adding activities.
Direct Costs per User Story
Indirect Costs Per User Story
The direct costs are associated with the fundamental development tasks that must be carried out to deliver the set of User Stories in the iteration: design, coding and testing. Building 100 user stories in a batch, as opposed to building one at a time, yields obvious economies of scale (think of test automation as an example). However, as the size of an iteration goes up, so too does the amount of indirect cost, examples of which include dealing with more complex branch merges, fixing broken builds, scrubbing long lists of bugs, plus all of the data collection and status reporting that goes with this – more control and more project management overhead. Requirements re-work and bug-scrubbing could be considered as WIP, to make an analogy with manufacturing. The simple fact of having to deal with an iteration of larger scope and complexity drives up the amount of overhead encountered in a project. Thus the overall cost-per-story vs. iteration-size graph will be quite similar to that in Figure 1.
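The U-shaped cost curve described above can be made concrete with a toy model; every coefficient here is an assumption chosen only to exhibit the shape of the curve, not data from any real team:

```python
def cost_per_story(batch_size: int,
                   setup_cost: float = 40.0,
                   overhead_per_story: float = 0.5) -> float:
    """Toy model: a fixed per-iteration setup cost amortized across the
    batch (falls as the batch grows), plus indirect overhead per story
    (merges, bug scrubbing, reporting) that grows with batch size."""
    direct = setup_cost / batch_size            # economies of scale
    indirect = overhead_per_story * batch_size  # rises with batch size
    return direct + indirect

# Scanning batch sizes reveals the U-curve and its sweet spot;
# for this model the minimum falls near
# sqrt(setup_cost / overhead_per_story), about 9 stories:
for n in (2, 5, 9, 20, 40):
    print(n, round(cost_per_story(n), 2))
```

In this toy model a tiny iteration and a huge one cost the same per story; a team's real 'sweet spot' has to be found empirically, which is exactly the inspect-and-adapt argument made in the text.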
The larger the batch, the fewer opportunities there are to inspect and adapt, and the more we are exposed to WIP and re-work. It is up to each team to find their own sweet spot for the size of an iteration. In scrum, we talk about getting to an ideal ‘velocity’, or the optimum number of Story Points per Sprint. Teams should relentlessly pursue this objective by searching for and eliminating all sources of waste and delay in their development process.
References, Recommended Reading
- James P. Womack, Daniel T. Jones and Daniel Roos, 2007. The Machine That Changed The World: The Story of Lean Production – Toyota’s Secret Weapon in the Global Car Wars. Free Press.
- James P. Womack and Daniel T. Jones, 2003, Lean Thinking – Banish Waste and Create Wealth in Your Corporation, 2nd Edition. Free Press.
- George Koenigsaecker, 2013. Leading the Lean Enterprise Transformation, 2nd Edition, CRC Books.
- Philip Kirby, 2015. The Process Mind: New Thoughtware for Designing Your Business on Purpose, CRC Press.
- Donald G. Reinertsen, 2009. The Principles of Product Development Flow, Celeritas Publishing.