Saturday, 7 March 2020

Lead Time Driven Delivery - Metrics

In part 1 of this series we explored the basic idea of Lead Time Driven Delivery. The main idea is to minimise Wait Time, Disruption Time and Task Time, and thus Lead Time, for the work that is flowing through the organisation. In this blog post we are going to explore metrics and processes that will help you to:

  • Verify if process changes that you are implementing are minimising Lead Time
  • Identify how much Lead Time can be further removed

Before we explore the core idea, let me introduce you to Little's Law.

Little's Law

Imagine a busy but small web agency in Edinburgh building web products for its clients. This web agency employs UX Designers, Web Developers, Testers and Cloud Developers. They work as one team. The agency has made commitments to deliver a number of features for a very important client. This work is committed to the queue, which means it is "work in progress" (WIP). All work that is being done will need some time from all of the team members (UX, Web Devs, Testers, etc). As soon as the team takes the work off the queue the timer starts, and the timer stops when the team has finished working on the task and the work is done. This is called Cycle Time. Finally there is Lead Time: the Lead Time timer starts from the moment that the work is committed to the queue and stops when the work is done.

Remember Hot Feature A from the last blog post? It spent 1 week in the queue before it was finally picked up by UX; however, the Web Devs, Testers, etc were all busy with other work. So it slowly made its way from one person to another until it was completed 3 weeks later. Overall it took 1 month to complete, but it required only 12 hours' worth of work. This is one poorly managed web agency!
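To make the two timers concrete, here is a minimal Python sketch of how Cycle Time and Lead Time for Hot Feature A could be derived from three timestamps. The WorkItem class, its field names and the calendar dates are all hypothetical, chosen only to match the story above (1 week in the queue, then 3 weeks in progress); your ALM will expose these dates under its own names.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class WorkItem:
    """Hypothetical work item holding the three dates both timers need."""
    committed: date  # work is committed to the queue (Lead Time timer starts)
    started: date    # team takes the work off the queue (Cycle Time timer starts)
    done: date       # work is done (both timers stop)

    @property
    def lead_time(self) -> timedelta:
        # From commitment to done.
        return self.done - self.committed

    @property
    def cycle_time(self) -> timedelta:
        # From the moment the team picks the work up to done.
        return self.done - self.started

    @property
    def queue_time(self) -> timedelta:
        # Time spent waiting in the queue before anyone touched the work.
        return self.started - self.committed


# Illustrative dates only: 1 week in the queue, then 3 weeks until done.
hot_feature_a = WorkItem(
    committed=date(2020, 2, 3),
    started=date(2020, 2, 10),
    done=date(2020, 3, 2),
)

print("Queue time:", hot_feature_a.queue_time.days, "days")
print("Cycle Time:", hot_feature_a.cycle_time.days, "days")
print("Lead Time:", hot_feature_a.lead_time.days, "days")
```

The sketch counts calendar days. If you prefer business days, the same structure works, but the subtraction would need to skip weekends.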

The relationship in the above diagram can be described with Little’s Law:

Lead Time = WIP / Throughput

The web agency team completes on average 0.3 of a task per day and has on average 9 commitments in the backlog that they need to get through. That means a lead time of (9 / 0.3) = 30 days. To improve this delivery situation the team has two options:

1. Reduce the amount of committed work in the queue. If the team reduced the committed queue size from 9 to an arbitrary lower number such as 3, the lead time would go down to (3 / 0.3) = 10 days.

2. Improve the Throughput (that is, shorten Cycle Time).
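The arithmetic behind both options can be sketched in a few lines of Python. The function name lead_time_days is my own for illustration, and the doubled throughput of 0.6 tasks per day in the last line is a made-up number used only to show the effect of option 2.

```python
def lead_time_days(wip: float, throughput_per_day: float) -> float:
    """Little's Law: average Lead Time = WIP / Throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be greater than zero")
    return wip / throughput_per_day


# Current situation: 9 committed items, 0.3 tasks completed per day.
current = lead_time_days(wip=9, throughput_per_day=0.3)

# Option 1: shrink the committed queue from 9 to 3 items.
smaller_queue = lead_time_days(wip=3, throughput_per_day=0.3)

# Option 2: improve throughput (0.6 per day is a hypothetical figure).
faster_team = lead_time_days(wip=9, throughput_per_day=0.6)

print(f"Current lead time:   {current:.0f} days")
print(f"With smaller queue:  {smaller_queue:.0f} days")
print(f"With faster team:    {faster_team:.0f} days")
```

Keep in mind that Little's Law describes long-run averages, so the result is an expected Lead Time and not a promise for any single task.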

For more information around Little’s Law do check out this awesome article. This entire blog series focuses on improving Cycle Time and not reducing the WIP.

Core idea

The Accelerate book suggests that one of the important metrics that should be tracked is Lead Time. This totally makes sense, as this is what the customer experiences and it impacts recovery time, experimentation speed, etc. This has very much inspired this entire blog post series. The Accelerate book does recommend tracking additional metrics; if you are interested in knowing what they are then check out this summary / review of the Accelerate book.

As a software delivery practitioner I find that Lead Time is a start, however it is very high level and it does not provide much detail for me to make the improvements or scheduling decisions.

Lead Time Driven Delivery is supposed to help by exposing Wait, Disruption and Task Time. If you know how much Wait and Disruption there is in the system and where it is coming from then you can do something about it. By this point you might be wondering, how can I extract this information from my Application Lifecycle Management System (ALM)? Is it even possible to automatically get metrics for the Wait Time, Disruption Time and Task Time variables? The answer is that you will need to use both quantitative and qualitative techniques to extract the data.

Quantitative analysis

This is the easy part. If your team is storing dev data in some ALM system then you can get this data in many different ways. You just need to make sure:
  • Only actual development time against the work is logged. In this web agency they have terrible internet speed, they love meetings and the build server takes forever to run tests. So a developer takes 2 hours to do the actual work, but between all of the waiting, meetings and random requests the whole day passes (8 hours). In this instance the developer should log only 2 hours of actual dev time and not 8 hours.
  • All work that needs to be done is grouped in a logical way so that it is possible to identify wait states between tasks for the deliverable.

Now you can create two metrics: Lead Time Resistance and Lead Time Spent Idle.

Lead Time Resistance

Lead Time Resistance measures how difficult it is to get work done. In the above diagram Sam (UX) might take only 2 hours to do the design work (blue), however for the rest of the day he is disrupted (orange) and before he knows it two days have gone by. The Lead Time Resistance calculation takes the total actual time for the work, divides it by the total elapsed time for the work and subtracts the result from 1.

The feature in the diagram has taken 12 hours of actual time, but it took 5 days of total elapsed time. Lead Time Resistance for this is 1-[12 (hours) / [5 (days) * 8 (hours per day)]] = 70%. In other words, 70% was spent on Disruption and Wait Time, i.e. stuff getting in the way, creating resistance.
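As a rough sketch, the calculation above could be automated like this. The function name and the default of 8 working hours per day are assumptions for illustration; use whatever working-day length applies to your team.

```python
def lead_time_resistance(actual_hours: float,
                         elapsed_days: float,
                         hours_per_day: float = 8.0) -> float:
    """Share of elapsed working time that was NOT actual work (0.0 to 1.0)."""
    elapsed_hours = elapsed_days * hours_per_day
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be greater than zero")
    return 1 - actual_hours / elapsed_hours


# Feature from the example: 12 hours of actual work over 5 elapsed days.
resistance = lead_time_resistance(actual_hours=12, elapsed_days=5)
print(f"Lead Time Resistance: {resistance:.0%}")
```

A value close to 0% means almost all elapsed time was real work. A value close to 100% means the work was mostly blocked or disrupted.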

Lead Time Spent Idle

Lead Time Spent Idle measures how well the work was planned. Sam (UX) completed his work in 2 days. The work then waited for ~1.5 days before it was picked up by John (Dev). After John was done, it waited for another ~3.5 days before Dan (Test) picked it up. The total idle time with no activity is divided by the total elapsed time. The feature in the diagram has taken 20 (5 days * 4 weeks) business days to complete. Out of 20 business days it has spent 8.5 business days (1.5 days + 3.5 days + 3.5 days) in an idle state. This means 8.5/20 = 42.5% of the time was spent in an idle state.

Now let's bring it all together. The feature in the diagram has encountered 70% resistance and it has spent 42.5% of the time in the idle state. I don't know about you, but I find this really useful information. Now that this is known, the team can move on to the qualitative analysis and dig deeper into what they can do to improve this situation.
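The same calculation as a small Python sketch. The function name is hypothetical, and the three idle periods are simply the figures from the arithmetic above, expressed in business days.

```python
def lead_time_spent_idle(idle_periods_days: list[float],
                         elapsed_days: float) -> float:
    """Share of elapsed time in which nobody was working on the item."""
    if elapsed_days <= 0:
        raise ValueError("elapsed time must be greater than zero")
    return sum(idle_periods_days) / elapsed_days


# Gaps between hand-offs for the feature, in business days.
idle_periods = [1.5, 3.5, 3.5]
idle_share = lead_time_spent_idle(idle_periods, elapsed_days=20)
print(f"Lead Time Spent Idle: {idle_share:.1%}")
```

In practice the idle periods would come from the gaps between logged activity on grouped tasks, which is why the grouping requirement in the quantitative analysis matters.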

Qualitative analysis

This is the hard part. This is where the team needs to continuously question existing working practices and get creative about improvements. Quantitative analysis will expose a lot of variables, just take a look at this:

However, numbers alone will not tell you if Lead Time Resistance is large due to waiting around or due to a chronic disruption culture. Additionally, they will not tell you how much time is lost due to poor work distribution, poor designs, lack of standards / components, staff turnover, etc. This is where teams should keep a daily log of all of the Wait Time, Disruption Time and Task Time. They should then use this information during retrospectives to review their current workflow setup and figure out what the future workflow should look like to improve the Lead Time.

Side Note

In Lean there is a strong focus on waste elimination, just Google 8 wastes or check out my older blog post. The problem is that you need to look up what waste means in order to understand it. Then you need to translate it to knowledge work so that it is relevant. Personally I don’t think it is that relatable to knowledge work and once translated it does not stay in your mind for that long. Software practitioners have introduced the "Waste Snake". While I like the concept, the problem is still the same: "Waste" is a vague name. From my personal experience, I have seen teams use it for a while, showing mainly disruptions. I don't know why, but they have not focused on other more hidden wastes such as Wait and Task Time. It might sound less cool, but instead of a "Waste Snake" create a "Lead Time Wall" and just stick on to it anything that impacts Wait, Disruption and Task Time, along with the amount of time lost.

Your team should automate the following metrics:

  • Lead Time
  • Cycle Time
  • Work in progress (WIP)
  • Lead Time Resistance
  • Lead Time Spent Idle

These automatic metrics are useful as they will expose Wait Time and they will tell you if you are going in the right direction with your process changes. To actually figure out what needs to change to improve Lead Time, your team will need to conduct constant qualitative analysis where you manually review Disruption and Task Time.

Tuesday, 28 January 2020

Lead Time Driven Delivery - Basics

TL;DR Your organisation needs to minimise Wait, Disruption and Task Time so that work gets delivered quickly all the way to the customer. This means you need to minimise Lead Time.

I am very lucky as I have been an Agile and Lean practitioner for a while. I have spent time reading about different delivery methodologies such as XP, Scrum, Kanban, etc and I have had an opportunity to work in organisations where they have used these different methodologies. One thing that constantly stood out to me was how they all prescribe best practice with very shallow reasoning behind these prescriptions. It looks like they are based on the experience and environment where they were created. This complicates things, as approaches end up being open to interpretation on what is Agile and what is someone's subjective opinion on the matter. For example, if someone came up to you and said, I want to change our existing process from A to B, how would you test/verify that this new process is more "Agile"? Would you refer to the Agile Manifesto? Use your experience/training? As far as I know there is no testable/verifiable way to measure Agile, and this creates a communication and expectation problem with new and seasoned practitioners. If you promote someone to be a team lead or if you onboard a new member of staff you need to explain to them why you are doing something in a certain way. Saying to them "please read this Agile book and follow it" is not going to work. If you tell your new members of staff to just do "what we do", well that’s flawed because they don’t understand the reasoning behind the intents. Also, why should they follow it? This means that the moment you need to change how your company works you end up with an organisation that starts to make inconsistent decisions between different teams and departments, as no one really understands what behaviour and KPIs they are trying to minimise or maximise.

In this blog post (and eventually series) I am going to attempt to break down the core reasoning behind the Agile practice so that it is more verifiable. Hopefully this will mean that core Agile ideas can be explained quicker to people around you and that you and your team can confidently mature your own delivery practice. Let’s get started.

1. Anatomy of you sitting down and trying to do some work

Three factors that make up your work:

  • Wait Time - This is when you are waiting around for some knowledge that you don’t have, decisions that you can’t make and finally you are waiting around for someone else to complete some work before you can start yours.
  • Disruption Time - This is when you have to expedite or rework some work, deal with corporate interruptions or cope with mental health impact.
  • Task Time - Finally, this is the actual work that you are doing, pure sitting down and getting things done.

Imagine you are working on your own on your own start-up. You will have very little wait and disruption time. You are on your own, you can make all of the decisions. Also if you are lucky enough to work in a quiet environment you should experience very little or no disruptions. You get things done fast, your users are impressed with your company, new features just come out all the time. However, this changes the moment you hire your first employee in your start-up. The moment you do that, you create an organisation, which means you have created a system. In the system work no longer gets done by a single individual, it gets done by many individuals. You as the founder are unlikely to feel much impact by hiring this new person (apart from the knowledge transfer burden), but if you are not careful your new employee will have to wait for your decisions, knowledge and task allocation. Their Wait Time will grow as they wait for you and they will probably be disrupted by you. You will wonder why they are not as productive as you. It might be because they have not got enough autonomy to make decisions (maybe they don’t know your values so they don’t know what decisions to make on your behalf). They also might not be getting enough clarity about the desired outcomes. Most people are not founders, they are employees, and they struggle to do their best as they just don’t understand the reasoning framework and don’t get enough autonomy.

2. Anatomy of your Waterfall company trying to deliver some value to the customer

Imagine a company that does not follow any Agile process and instead has departments of people per discipline. So Web Devs are in one department, API Devs are in another department, you get the point. Each department will have its own backlog, which means everyone has their own Lead Time. On top of that, all individuals will experience disruptions (team meetings, urgent requests, you know the drill) and there will be many handovers from one department to another. Work will also end up travelling backwards due to misunderstandings. So if a customer has requested a "Hot Feature A" they will have to wait for a long time for this work to travel through this type of organisation (system). Actual Task Time for "Hot Feature A" might be 12 hours of work in total, however given all of the Wait Time (handovers and lead times) and disruptions it might take up to 1 month before it gets shipped. So there is a big difference between 1 month Lead Time and 12 hours Task Time. However, your customer will not care about the 12 hours of Task Time, they will just care that you took 1 month of Lead Time. Overall in this type of organisation Lead Time for most work will be very high, fewer projects will be shipped, projects will very rarely go out on time and individuals will feel frustrated as there will be a lot of firefighting.

3. Anatomy of your Agile company trying to deliver some value to the customer

Now imagine another company that understands the importance of Lead Time and works to remove as much Wait, Disruption and Task Time (more on Task Time later) from the overall delivery process as possible. They have decided to sit people together for a limited amount of time to deliver certain features and projects. They have done this as they want to remove handovers, the amount of project management that is required, competing agendas, waiting for decisions, knowledge and organisational dependencies. They work as a team on one story at a time and their main job is to push that one story through the system as fast as possible. Now the story that took 1 month to deliver will, in this new system, take 12 hours or even less. This is because you have removed all of the waiting around and disruptions (team leads and product owners act as defenders), and because this team is sitting together they can expose the unknowns faster, tame complexity, share their experience and share the burden of the work, so they can deliver the work faster.

4. Anatomy of Task Time

I have left this till last for a reason. The management team needs to improve the overall system before they look at improving Task Time. Why? It is much healthier to focus on fixing the overall organisation before looking at how to improve an individual's performance. In companies where Lead Time is high, good talent might become disengaged. As you fix the systemic problems, you might find that people who were not performing that well start to really surprise you and that Task Time reduces naturally.

The actual Task Time is made up of eight factors, which are dynamic:

  • Volume of work - This is just you sitting and typing, copy and pasting.
  • Unknowns - This is you identifying stuff that you did not consider when you were estimating the work.
  • Complexity - This is you figuring out an algorithm to solve a problem, the main thinking part.
  • Risk - This is how much testing you have to do given the risk level that is acceptable for the task at hand.
  • Skill - This is you improving your hard/soft transferable skills (programming, math, architecture, algorithm design, management, etc) or using your existing skills to get work done quicker.
  • Domain - This is you gaining new domain knowledge (HR, Logistics, Financial Trading, etc) or using your existing domain knowledge to get work done faster.
  • Attitude - This is how you perceive your work environment and tasks.
  • Aptitude - This is you having developed, or being predisposed towards, the skills needed for the work that you are doing.

I know this is obvious but I would like to stress one point. Most people will take different amounts of time to get a task done. Why? They are different people, with different mindsets, skills, domain knowledge, aptitude and attitude. All of these things impact overall Task Time.

It will not surprise anyone that experienced teams (high skill and domain knowledge), are more likely to identify unknowns, reduce volume of work through some automation, tame complexity and as a result deliver high quality work quickly. If your organisation wants to improve Task Time then it needs to ensure that people stick around.

What does it all mean?

In priority order, everyone in your organisation should be working hard to minimise Wait, Disruption and Task Time and thus minimise Lead Time. Organisations will never achieve perfect Lead Time, however they need to constantly work towards it. To me this is what DevOps, Slack, XP, Scrum, Kanban, Lean, etc is all about.

If you take anything away from this blog post, then it would simply be this, start to measure Lead Time for work that is traveling through your organisation and find ways to minimise it.

Thursday, 11 July 2019

Applied Software Delivery : Optimal Backlog

Backlog management should be simple and efficient, what are the key things you can do to make it so?

Give teams their own backlogs

Each cross-functional team of around 7 (±3) members must have its own backlog.

This is one of the biggest productivity improvements you can make. If teams have to work in a global backlog then team members have to scan the whole backlog, sort stories against each other, discuss and understand stories, allocate bugs, etc. This is highly inefficient and puts backlog management on the exponential waste curve. When a team has its own backlog, it is put on the linear waste curve.

Keep it short

Large backlogs are wasteful because refined stories will get dropped as business priorities change, and they create lots of unnecessary maintenance and conversations, which in turn creates confusion and misunderstandings.

You can prevent backlog waste by ensuring that each cross-functional team has their own backlog and that they have only 1-2 sprints or just-in-time worth of work refined overall.

Prepare for the story refinement meeting

You can further reduce wasted time by creating draft stories with clear INVEST acceptance criteria before the team refinement meeting. This will give the story refinement session overall focus and a strong starting point for a discussion. I don’t agree that all stories need to be created together with the whole team from scratch. Story refinement meetings should be used to get everyone to ask questions, think, create shared understanding and figure out “how” they are going to deliver “what” is specified in the user story.

“Organizing is only necessary when you have too many things. Think about it: when we organize a collection of books, it’s because when they’re not organized, we can’t find the books we want. But if we had, say, five books, we wouldn’t need to organize.” By Leo Babauta

Sunday, 2 September 2018

Applied Software Delivery : Full Stack Developer vs Partial Stack Developer

An Agile team consists of cross-functional team members: some of them work on the backend, others on the frontend, infrastructure, persistence, etc. Typically developers specialise in one of these areas. However, what happens if you identify a constraint in your team? What if a developer leaves? This means you have to hire or borrow a developer from another team. What is the alternative?

Full Stack

This developer is a Swiss army knife developer. Full stack developers have a large amount of shallow (and maybe in-depth) knowledge and can take on a large number of technologies, such as .NET, Angular, MSSQL, MongoDB, Azure Hosting, Selenium, etc. Finding these people is near impossible; why that is so is explored here by Andy Shora.

Partial Stack

A much more realistic alternative would be to move away from developers with a single speciality towards hybrid developers. These developers would not know the entire stack but they would have a primary (core) and a secondary skill. How does this compare?

Team with single skill

Name          Skill
John Austin   Angular
Vince Perk    BDD
Sarah Wood    .NET Developer
Ed Skim       Test Analyst
Martin Lee    Angular
Jason Dmit    .NET Developer

The single skill model creates fragile teams. If a single skill is not available then developers need to be borrowed from other teams or additional people need to be hired, and chances are that inventory will start to pile up. This slows down overall delivery.

Team with hybrid skills

Name          Primary Skill (Core)   Secondary Skill
John Austin   Angular                .NET Developer
Vince Perk    BDD                    .NET Developer
Sarah Wood    .NET Developer         Angular
Ed Skim       Test Analyst           BDD
Martin Lee    Angular                Test Analyst
Jason Dmit    .NET Developer         BDD

Teams with hybrid developers are less fragile. They can elevate constraints, resolve issues within the team and don’t have to wait for anyone, which means they are more empowered and productive.
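One simple way to see the difference in fragility is to count how many people can cover each skill in the two example teams. The sketch below is only an illustration: the helper names are my own, and "a skill covered by exactly one person" is just one possible definition of a fragile spot.

```python
from collections import Counter

# Skills per person, taken from the two example teams.
single_skill_team = {
    "John Austin": ["Angular"],
    "Vince Perk": ["BDD"],
    "Sarah Wood": [".NET Developer"],
    "Ed Skim": ["Test Analyst"],
    "Martin Lee": ["Angular"],
    "Jason Dmit": [".NET Developer"],
}

hybrid_team = {
    "John Austin": ["Angular", ".NET Developer"],
    "Vince Perk": ["BDD", ".NET Developer"],
    "Sarah Wood": [".NET Developer", "Angular"],
    "Ed Skim": ["Test Analyst", "BDD"],
    "Martin Lee": ["Angular", "Test Analyst"],
    "Jason Dmit": [".NET Developer", "BDD"],
}


def skill_coverage(team: dict[str, list[str]]) -> Counter:
    """Count how many team members can work on each skill."""
    return Counter(skill for skills in team.values() for skill in skills)


def fragile_skills(team: dict[str, list[str]]) -> list[str]:
    """Skills covered by exactly one person (assumed definition of fragile)."""
    return sorted(s for s, n in skill_coverage(team).items() if n == 1)


print("Single skill team, fragile skills:", fragile_skills(single_skill_team))
print("Hybrid team, fragile skills:", fragile_skills(hybrid_team))
```

In the single skill team, BDD and Test Analyst each depend on one person. In the hybrid team every skill is covered by at least two people, so one absence does not stop the work.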

To create teams with hybrid developers your company will need to cross-skill existing staff and start hiring people with relevant skills. All of this will take time and patience. However, I do believe that this will not just benefit the organisations, but also the developers.

Saturday, 25 August 2018

Applied Software Delivery : Hidden Impact Of Team Leaders

Upon reading software management books, one of the things that I don’t recall reading much about is the impact of team leaders on the overall team's performance. Most software management books talk about lean production, theory of constraints, etc. These concepts are important, however these books skip a very important factor in software engineering: people. Team leaders can make a software team fly or crawl.

No matter what you call them (team leaders, managers, scrum masters, supervisors, etc), these people have a huge impact. Good team leaders resolve issues, bring people together and, ironically, enable the team to be more self-organising. Team leaders are the ceiling of your team's performance.

When it comes to software development things are not easy, especially at scale. Your typical software team will be facing issues with the following:

  • Infrastructure 
    • Build servers 
    • Release pipelines 
    • Laptops 
    • Environments 
    • Access 
  • Timely, clear and prioritised: 
    • Functional Requirements 
    • Non functional requirements 
    • Business priorities 
  • Bugs 
  • Customer requests / feedback 
  • Ambiguous or bad decisions 
  • People 
    • Absence 
    • Behaviour 
    • Conflict 
    • Lack of staff 
  • … the list goes on

At scale there are hundreds of things that can go wrong during a sprint. Good team leaders work with their teams to implement permanent solutions to these problems. They assign work to the right people in the team, they provide the right level of support and they coach for optimal performance. This is not easy stuff, especially when you have to do it every day. It is unlikely that team leaders in your organisation will be able to do this out of the box, which means you will need to listen, train, mentor, coach, and generally invest in them.

Managers of all levels often overlook this and focus on the wrong things. Personally, I have fixated more on lean production concepts such as single piece flow and theory of constraints than on team leaders. Why? Well, process is easier to implement and change. People are hard. They are not code, they are not servers, they can’t be changed with a few lines of code and they can’t be reconfigured within a few seconds. It takes time and patience. In software engineering this is the hardest and the most important thing you can do: invest in your people, especially in your team leaders, before you fixate on agile concepts and technology.