Monday, 29 December 2014

Queuing Theory


What is Queuing Theory and Why Is It Relevant?

Our goal in an Agile Project should be to create Business Value in the shortest possible time. In pursuing this goal we are attempting to move a backlog or "queue" of tasks through a process as quickly as possible. It is here that Queuing Theory becomes relevant.

The time taken from the moment a piece of work enters our work process until it exits (Done) is known as the "Lead Time". This is a valuable metric when using Kanban. Other valuable metrics are "Cycle Time" (the time taken to actually process the work item once work starts) and "Throughput" (the rate at which work flows through the system).
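As a rough illustration of how these three metrics relate, the sketch below computes them from timestamps for a handful of work items. The item data and the `flow_metrics` helper are hypothetical, invented for this example; a real Kanban tool would supply its own records and may define the measurement period differently.

```python
from datetime import datetime

# Hypothetical work items: (entered the process, work started, done).
ITEMS = [
    (datetime(2014, 12, 1), datetime(2014, 12, 3), datetime(2014, 12, 5)),
    (datetime(2014, 12, 1), datetime(2014, 12, 4), datetime(2014, 12, 8)),
    (datetime(2014, 12, 2), datetime(2014, 12, 8), datetime(2014, 12, 11)),
]


def flow_metrics(items):
    """Return (mean lead time in days, mean cycle time in days, items per day)."""
    # Lead time: from entering the process until Done.
    lead_times = [(done - entered).days for entered, _, done in items]
    # Cycle time: from the start of work until Done.
    cycle_times = [(done - started).days for _, started, done in items]
    # Throughput: items finished divided by the length of the observed period.
    first_entry = min(entered for entered, _, _ in items)
    last_done = max(done for _, _, done in items)
    period_days = (last_done - first_entry).days
    return (
        sum(lead_times) / len(items),
        sum(cycle_times) / len(items),
        len(items) / period_days,
    )


lead, cycle, throughput = flow_metrics(ITEMS)
print(f"mean lead time:  {lead:.1f} days")
print(f"mean cycle time: {cycle:.1f} days")
print(f"throughput:      {throughput:.2f} items/day")
```

In this made-up data the lead time is much longer than the cycle time, which means the items spent most of their time waiting in the queue before work started.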

Some Relevant Guidelines From Queuing Theory

To minimise the cycle time we can look for certain problems or warning signs. The following problems can increase cycle times:

•Variability in rate of arrival, or in service time (the time taken to respond to or process a task), will increase the cycle time and the queue. Reducing variability will improve flow. Variability increases cycle time.
•The arrival of large batches leads to queues and associated inefficiencies. Large batches increase cycle time.
•Very high utilization results in no spare capacity for dealing with sudden incidents or unusual batches. Consequently, when such incidents occur, there are problems associated with increased queues, task switching and associated inefficiencies. This greatly increases cycle time. High utilization increases cycle time.

To improve cycle times and reduce queues, target an even, stable rate of arrival. Restaurants, parking stations and night clubs achieve this by applying discounts or introducing offers at certain times of the day.

Other ways to help are to create smaller batches, create stable service times, and use parallelism where applicable: if a task can be split into two independent tasks, it can be shared between two people, roughly halving the cycle time.
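The effect of variability and utilization on waiting time can be sketched numerically. The example below uses Kingman's approximation for a single-server queue, which is a standard queuing-theory result but is not named in the articles referenced here; it is swapped in purely for illustration. The function name and all input values are assumptions.

```python
def kingman_wait(utilization, cv_arrival, cv_service, mean_service_time):
    """Approximate mean time waiting in the queue for a single server.

    Kingman's approximation:
        wait ~ (u / (1 - u)) * ((ca^2 + cs^2) / 2) * mean_service_time
    where u is utilization and ca, cs are the coefficients of variation
    (standard deviation / mean) of inter-arrival and service times.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    variability = (cv_arrival ** 2 + cv_service ** 2) / 2
    return utilization / (1 - utilization) * variability * mean_service_time


SERVICE_DAYS = 1.0  # hypothetical: a task takes one day to process on average

for rho in (0.5, 0.8, 0.9, 0.95):
    steady = kingman_wait(rho, 0.5, 0.5, SERVICE_DAYS)
    erratic = kingman_wait(rho, 1.5, 1.5, SERVICE_DAYS)
    print(f"utilization {rho:.0%}: wait {steady:6.2f} days (low variability), "
          f"{erratic:6.2f} days (high variability)")
```

Under these assumptions the wait grows slowly at moderate utilization and very steeply as utilization approaches 100%, and higher variability multiplies the wait at every utilization level.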

Other Guidelines include:


•Decreasing variability early in the process is better than later. The first bottlenecked station dictates the rate to the rest of the system.
•High utilization can lead to problems that are worse than just high cycle times. A system where every station is working all the time is close to collapse (also known as Goldratt's maxim). If everyone is already working all the time, the system has no time to recover when its queue grows due to upstream variability.
•Engage early. Start processing as soon as a batch starts to arrive.
•Monitor the queue. Respond if the queue grows and understand what caused it to grow - respond to the cause as well.

Applying Queuing Theory To Agile Software Development - An Example

As an example, we can apply queuing theory to a common problem in testing:
Testers in many projects do little for the first 75% of the sprint and are then overloaded in the last part of the sprint, working long hours (often all night or at weekends) to get the work done.
Queuing theory tells us that if you can't match servers to the load (increase the number of testers on demand), then you need to create a more stable arrival rate (the work associated with testing needs to be spread out more evenly).

Based on the guidelines above we need to:
1.Decrease variability. The load should not all arrive at the end of the sprint.
2.Decrease the size of individual units of work (Decrease Batch Size).
3.Engage with a unit of work early.
4.Monitor the queue and respond.

So, in terms of Queuing theory, our solution to the problem is for testers to:
1.Decrease Variability and engage earlier with tasks. Plan and write tests earlier, as part of requirements elaboration (based on acceptance criteria), and automate as you implement. Don't wait for the developer to complete development as this leads to a large, late batch arrival. Writing tests as part of requirements elaboration smooths out the work flow and addresses points 1 and 3 above.
2.Create Small Batch Sizes (work units). Plan for tasks to be small and testable. Don't plan for large user stories; break them into smaller chunks of functionality, ideally about 4 hours of work each, and use the INVEST principle.
3.Monitor Your Queue. Look for people that are sitting idle because they're waiting for something to test. Understand why they are sitting idle.
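The difference between a late batch and an even arrival rate can be sketched with a small day-by-day simulation. Everything here is hypothetical: the 10-day sprint, the 16 stories, the tester capacity of 2 stories per day and the `simulate_test_queue` helper are assumptions chosen only to make the pattern visible.

```python
def simulate_test_queue(arrivals_per_day, capacity_per_day):
    """Return (peak queue length, stories still untested at sprint end)."""
    queue = 0
    peak = 0
    for arriving in arrivals_per_day:
        queue += arriving
        peak = max(peak, queue)
        # Testers work through as much of the queue as their capacity allows.
        queue -= min(queue, capacity_per_day)
    return peak, queue


CAPACITY = 2  # hypothetical: stories the testers can finish per day

# Same 16 stories in both cases; only the arrival pattern differs.
late_batch = [0, 0, 0, 0, 0, 0, 0, 8, 8, 0]  # everything lands late in the sprint
smooth = [2, 2, 2, 2, 2, 2, 2, 2, 0, 0]      # work spread evenly

for name, arrivals in (("late batch", late_batch), ("smooth", smooth)):
    peak, unfinished = simulate_test_queue(arrivals, CAPACITY)
    print(f"{name}: peak queue {peak}, untested at sprint end {unfinished}")
```

With the same total workload and the same capacity, the late batch leaves a large queue and unfinished testing at the end of the sprint, while the smooth arrival pattern never builds a backlog.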

References:

Lean Tools and Queuing Theory: http://css.dzone.com/articles/lean-tools-queuing-theory

Wikipedia on Queuing Theory: http://en.wikipedia.org/wiki/Queueing_theory

The Theory Of Constraints


Overview

In conjunction with Queuing Theory, the Theory of Constraints (ToC) presents some additional ideas for making improvements in the efficiency of the Software Development process.

The Theory of Constraints was introduced into management by "The Goal", a book written by Eli Goldratt in 1984.

The basic idea is that you can always find a slowest point (the area with the slowest throughput of work) or the "limiting constraint" in a process by finding out where there is a buildup of unfinished work. This slowest point defines the limit of the amount of work that can be put through the system.

This slowest point is where you need to concentrate your efforts for improvement - because improvements made anywhere else will not improve the rate at which work can be moved through the workflow.

For example, suppose two people are cleaning dishes in a restaurant, one washing dishes and the other drying. If the number of wet washed dishes keeps building up, then the constraint for the process is the person drying. Increasing the rate at which dishes get washed won't increase the number of dishes you are able to put on tables, because those dishes aren't going to be dry. If you want to increase the rate at which you can put dishes on tables you need to increase the rate at which dishes get dried, not the rate at which they get washed.

Applying The Theory to Agile

In more complex processes such as software development, it is sometimes more difficult to see where the constraint is. So look for people who are waiting for work that is coming from someone else. Usually, the "someone else" is a constraint, so a method should be found to improve the efficiency there.

In Agile, the focus on resolving the constraint would be to provide that person with extra training or get other members of the team to assist in the work by "Swarming".

The ToC is important because it tells us where to focus our efforts. For example, if we need to improve throughput in a project, it would be a waste to increase the developers' capacity to code stories from 90 points to 100 points per sprint if the BAs could only supply 80 points of elaborated stories and the testers could only test 70 points of stories!

The "Limiting Constraint" is defined as the point at which the lowest throughput occurs: in this case the testers, who can only test 70 points of stories per sprint. Work that focusses on improving any other area will be wasted, as our delivery of tested code would still be limited to 70 points per sprint. The first area of concern should be the limiting constraint.

NOTE: Removing the limiting constraint (in this case the testing process, at 70 points) simply moves the constraint to a new bottleneck (in this case it would move to the BAs at 80 points). The process of identifying and removing constraints is thus ongoing.
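The points example above can be expressed as a minimal sketch: the limiting constraint is simply the stage with the lowest capacity. The `limiting_constraint` helper is hypothetical, and the improved tester capacity of 85 points is an assumed figure used only to show the constraint moving.

```python
def limiting_constraint(capacities):
    """Return (stage name, capacity) for the stage with the lowest throughput."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


# Points per sprint, taken from the example in the text.
capacities = {"BAs": 80, "developers": 90, "testers": 70}
print(limiting_constraint(capacities))  # the testers limit delivery

# Raising developer capacity changes nothing: delivery is still capped by the testers.
capacities["developers"] = 100
print(limiting_constraint(capacities))

# Assumed improvement: testers raised to 85 points, so the constraint moves to the BAs.
capacities["testers"] = 85
print(limiting_constraint(capacities))
```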

The Theory of Constraints can be applied to measures other than rate of work throughput. There may, for example, be a limiting constraint on work quality.

The ToC may have implications in prioritizing which areas to improve during a Retrospective. If an area identified for improvement does not represent a constraint to the work by any measure (volume, quality, etc) then this area may not be a priority.

Root Cause Analysis For Agile


Overview

Agile typically uses Root Cause Analysis techniques drawn from LEAN.

Root Cause Analysis using the 5 Whys

The 5 Whys is a technique developed by Sakichi Toyoda at Toyota as part of the lean methodologies that were developed there. It is an iterative question-asking technique. The primary goal of the technique is to determine the root cause of a defect or problem.

The technique starts with a statement of the problem, then the application of 5 "whys", with each answer triggering the next "why".

The classic example given to illustrate this technique is:

The problem: The vehicle will not start.

1.Why will the vehicle not start? - Because the battery is dead. (first why)
2.Why is the battery dead? - Because the alternator is not functioning. (second why)
3.Why is the alternator not functioning? - Because the alternator belt has broken. (third why)
4.Why is the belt broken? - Because the alternator belt is well beyond its useful service life but has not been replaced. (fourth why)
5.Why has the belt passed its useful service life but not been replaced? - Because the vehicle was not maintained according to the recommended service schedule. (fifth why, a root cause)

The use of 5 questions derives from an empirical observation that 5 is typically the number of iterations required to establish a Root Cause of a problem. In practice, the number may be greater or smaller than that.

Root Cause Analysis using Fishbone Diagrams



Note that a Root Cause may not be just one factor. In Complex Adaptive Systems it is common for the Root Cause to be an interaction of factors. The primary technique used to address this level of complexity is the fishbone (or Ishikawa) diagram although a tabular format listing causes can also be used. These tools allow for analysis to be branched in order to provide multiple root causes.

The Fishbone Diagram divides the causes into general categories (often equipment, people, process, materials, environment and management) and iteratively explores each one in an approach similar to the 5 Whys to understand all of the factors contributing to a problem.

Note: The shapes to create these diagrams are available in Visio, in Business->Business Process->Cause and Effect Diagrams.











(Figure: example fishbone diagram, from http://en.wikipedia.org/wiki/File:Ishikawa_Fishbone_Diagram.svg )


Mess Maps

Lean assumes some degree of linear flow. For very complex environments it may be necessary to go outside the Lean methodologies, and a Mess Map may be required.

Adaptive Stance: "Experiment, Inspect and Adapt"



Our linear thinking struggles with decision-making in an uncertain situation. Committing to an unknown strategy that will be guided by an undefined series of "experiment, inspect and adapt" cycles is a challenge to most leaders.

This problem is increasingly recognised in military circles. The battlefield is an extreme example of a Complex Adaptive System, with modern communication and networking enabling enemy elements to communicate instantly, sharing information and changing tactics. For a military leader, the consequences of failing to adapt quickly are extreme.

A guideline to achieving an "Adaptive Stance" mindset for military decision making is laid out in "Decision-Making" by Lieutenant Colonel Mick Say and Lieutenant Colonel Ben Pronk:


The Adaptive Stance is built within a framework of a number of key personal qualities:

  • Ambiguity tolerance. There are no simple solutions to complex problems, and attempts to remove ambiguity from a situation can be very dangerous. Every effort must therefore be made to resist the urge to over-simplify the complex. Again, one must accept that messiness and sense-making are key.
  • Self-reflection through ever-present consideration of the questions: ‘How would I know if I was wrong about this?’ and ‘How much would it matter?’ This characteristic encapsulates an ‘ingrained habit of thoughtful self-reflection about the effectiveness of one’s beliefs, actions and decisions’. It echoes the requirement to treat one’s own ideas dispassionately, helps to combat confirmation bias, and primes the practitioner to be constantly on the lookout for ‘Question Four’ moments.
  • Decriminalisation of being wrong, openness to learning and supporting others’ learning. If one accepts that it is virtually impossible to predict the outcomes of an interaction with a CAS, and the process of adaptation entails elements of trial and error, then it becomes completely naïve to expect ‘fail-safe business plans with defined outcomes’. Toleration of failure is an ‘essential aspect of experimental understanding’.


Lean - What To Cheer and What To Fear


Overview

Although it began in car manufacturing, Lean has become an increasingly important influence within the Agile Community. However, Lean's origin in manufacturing is a mixed blessing. Although the manufacturing production-line heritage provides us with some valuable continuous improvement concepts as well as language and patterns that are familiar to upper management, this heritage also leads to a tendency for management to see software engineering as an oversimplified pattern - a simple, well-understood process of repeated steps.

What To Fear

In reality, producing a new piece of software is rarely as neat as the production-line manufacture of a car.

The key difference is that the software development process is intended to design and create a new piece of software, while the car manufacturing process is intended to produce identical copies of a car that has already been designed, prototyped, tested and had the bugs shaken out.

Once we have developed our software, copying it is a simple file copy process. The car production line is an analogue to our file copy process, not our software development process.

Comparing software development to car design is probably more accurate than comparing it to the production-line "copying" process. Even the car design comparison is imperfect as the specifications for a car are defined by expert Mechanical Engineers using clear, unambiguous Engineering terms and CAD/CAM techniques, while the specifications for our software business logic are defined by business people, using ambiguous business terms, and are rarely as clearly elicited as the CAD/CAM diagrams for a car.

A Kanban board may be able to capture a perfect representation of the car production process, but it is unlikely that this could be done for software development. The flow for one software module could be quite different to the flow for the next. Attempting to turn a Kanban board into a complete description of your process is likely to be a mistake.

The challenge is to take from Lean the language and processes that are of value, while remembering that there is a difference between creating a fleet of identical cars on a production line, and creating a unique software solution based on an ambiguous, emerging and changing business specification.

What To Cheer: What Lean Brings to Agile

1) Lean/Agile fusion brought us ScrumBan. Scrum's Sprints provide the cadence and development practices, but Lean's Kanban provides the flow.

2) Lean provides terminology and patterns that Executives understand. Explaining Agile to executives in a way that they can buy into is sometimes challenging. We can use Toyota's Lean processes as an example to help explain how and why Agile concepts work.

3) Lean has processes for looking at the big picture. Agile's focus on improving the quality of the product's codebase and delivery process can potentially lead to delivering the wrong product in an efficient way. Lean extends Agile's focus to include areas like Systems Thinking and Vantage. Consequently, Lean has helped us focus on building the right product.

4) Lean provides techniques for continuous improvement. Lean provides us with ideas and techniques for improving the efficiency of our process. This is where most of the Lean concepts are being applied in IT today. Value Stream Maps provide one of the key methods for waste elimination and process improvement.

Value Stream Mapping and Beyond


Overview

Value Stream Mapping (VSM) is a LEAN technique, used to map out process steps in order to both illustrate and understand:
•where work is occurring
•where delays or waste are occurring
•how much work of different types is in the system.

A Value Stream Map also identifies queues, wait times, waste, cycle times and other useful pieces of data. A Value Stream Map helps design visual representations of the work (e.g. Kanban Boards on the walls) and provides candidates for targeted improvement efforts.

What is a Value Stream?

A Value Stream identifies all the steps needed in order to make and deliver a specific product. In redesigning a Value Stream, the ideal result is a continuous smooth flow of valuable new features into production with no delays or waste anywhere in the stream.

The value stream is not just the development process, it includes everybody from the customer to operations and support engineers, and may even include marketing, sales and any other Business Unit that impacts on the stream of work.

Criticisms

The word stream implies that the work passes through a smooth, unbroken, and defined sequence of steps. The reality in a Complex Adaptive System may be quite different:

•When building a software product, specialised steps may be required for some parts of the system, but not others.
•For complicated products it is common for paths to change, and it may be rare for any two software modules to follow exactly the same path.
•In a system that is undergoing continuous change (due to retrospectives or other improvement processes) any process map may be outdated before it is completed.

Defining the workflow too precisely may lead to a rigid, unvarying process being applied where a looser, flexible approach may be more appropriate in order to allow for emergent and changing needs.

However, defining the workflow too loosely may reduce the value of mapping the process. The correct balance will depend on the intended purpose of the mapping.

If the system does not seem to be mappable as a stream of value that follows a defined sequence of steps, then Process Mapping may be more valuable (see below).

When is VSM Useful?

Use VSM:
•When looking for waste, or for areas that offer opportunities for process improvement. (Ask how much work went into this step, and how much value came out?)
•When you need to track one isolated flow in a complex series of chains (particularly if you want to experiment with a process change in one isolated area, with a limited, mapped and understood scope for impact).
•When the process can be mapped as a relatively simple chain of events.

If the process can't be mapped as a relatively simple chain of events then Value Stream Mapping may not be the ideal tool (see below for criticism and alternatives).

How To Run A VSM Session

To prepare a Value Stream Map:

  1. Make sure the process is facilitated by an expert with a strong understanding of value stream maps.
  2. Get feedback from at least one representative from all of the areas within the value stream being mapped.
  3. Do not have your facilitator take notes and then prepare the map in isolation - involve the full team and create the initial Value Stream Map collaboratively.
  4. Create your initial value stream map using a pencil (you will need to make frequent corrections and changes) on a sheet of butcher's paper.
  5. Once you have the initial map, you can create an electronic version using Visio, or a similar tool.


Make sure you note the metrics needed to detect waste. For example, if you are seeking to improve cycle time, then for each step in the process note down how long the item spent at that station and how long it actually took to be processed. It is common to find that a manager acts as a "gate" in a process, required to make a go/no-go decision. Although the manager may be able to make the decision in a few minutes, most managers are busy and may take days to get to it. So although the processing time was only a few minutes, the time at the station was days. Moving the decision to a lower level may make a dramatic difference to the total cycle time.
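A minimal sketch of this bookkeeping follows. The step names and hour figures are hypothetical, and "flow efficiency" (processing time divided by total time at the stations) is a commonly used name for the ratio rather than a term taken from this article.

```python
# Hypothetical value stream: (step, hours at the station, hours of actual processing).
STEPS = [
    ("elaborate story", 16.0, 6.0),
    ("develop", 40.0, 24.0),
    ("manager go/no-go", 72.0, 0.25),
    ("test", 24.0, 8.0),
]


def waiting_report(steps):
    """Return (waiting hours per step, flow efficiency across the whole stream)."""
    # Waiting time is the time at the station that was not spent processing.
    waits = {name: at_station - processing for name, at_station, processing in steps}
    total_time = sum(at_station for _, at_station, _ in steps)
    total_processing = sum(processing for _, _, processing in steps)
    return waits, total_processing / total_time


waits, efficiency = waiting_report(STEPS)
worst = max(waits, key=waits.get)
print(f"longest wait: {worst} ({waits[worst]:.2f} hours)")
print(f"flow efficiency: {efficiency:.0%}")
```

In this made-up data the decision gate takes only a quarter of an hour to process but accounts for the longest wait, which makes it the first candidate for improvement.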

Once you have mapped the process and eliminated waste, your goal should be to further improve the process. Ask (from a customer perspective):

•What would the ideal flow of new features look like from a customer perspective?
•To make this happen, what would need to happen and in what order?
•What resources would need to be available to make that possible?
•How would roles and responsibilities need to be defined?
•What is the ideal flow?
•What is the best flow we can achieve?
•How can we improve to meet the ideal?

These questions may suggest possible areas of improvement.

Note: Visio supports Value Stream Mapping shapes. You can find them in Business->Business Process->Value Stream Map Shapes.

An Alternative - Process Mapping

If the system is complex and the path is variable, then it may not be possible to map the process as a simple sequence of steps. In that case, use an alternative to Value Stream Maps, such as Process Mapping.

Process Mapping comes from the Systems Thinking side of Agile and is closely aligned with the Vantage method, so it is very focussed on the point at which the customer interacts with the system.

In Process Mapping:

•Don’t ask the managers - they may know the ideal process, but they are often too removed from the work to know the special cases, alternate paths, or even perhaps the actual process.
•Don’t just send in a Business Analyst to talk to people - people are busy, and may even be wary of revealing the full horror of the process.
•Don’t talk to proxy customers or talk to a spokesperson for an area because, once again, you will only learn the ideal process.
•Don’t try to map the process in a room with butchers paper and sticky notes, bringing people in as required.
•Instead, go to where the work is occurring, with a small team made up mostly of peers of those who do the work.

Why peers? Because workers tend to tell their colleagues the truth about what’s really going on at work.

Study the work as a series of systems, viewed from the customer's point of view. Start at the point at which the customer first contacts the system and work out from there.

Map the system as a series of varying interactions between subsystems. Define the customer's needs and document the sub-system interactions from the customer's viewpoint, explaining how each sub-system contributes to those needs. To improve the system, ask how the work (mapped as a path through subsystems) could be redesigned to support the customer's varying needs. How could the sub-systems be redesigned? How could the interactions be redesigned?

When is a Process Map Useful?

If you have one or more of these problems:
•Multiple entry points to processes
•Independent but interacting processes
•Workflow path is extremely variable, with few or no steps guaranteed
then a Process Map may be of use, either in addition to or instead of a series of Value Stream Maps.

For more on Process Maps: http://nnphi.org/CMSuploads/MI-ProcessMappingOverview.pdf.