Saturday 26 September 2020

Lead Time Driven Delivery - Part 5, Practical and closing thoughts

It is time to close this mini-series with some practical and personal ideas.

Work in Progress - Reducing lead time through queue control

If your team has a backlog, then applying Little's Law to the backlog itself will not make much sense. Little's Law can only be applied to a queue, and in a queuing system something needs to come in and then come out. If something stays there and never comes out, then Little's Law will not hold. From my experience a backlog is not a queue, as work enters the backlog and can stay there indefinitely. When someone enters a queue, they are committed and slowly or quickly make their way through it and eventually leave the queue. If you are using Sprints then you can apply Little's Law to the Sprint: the team commits to the work, work enters the Sprint and then leaves the Sprint. If you have a number of committed projects going through the system, then you can use Little's Law for that as well.

Little's Law is very significant to LTDD, as average Lead Time = Work in Progress / Throughput. This means the more things we commit to the queue, the higher average lead time grows. So it makes sense to pick a strategy where it is possible to quickly make space for new high-priority work. This way customers don't experience long lead times while software teams deliver their long-term commitments. This also tells us something very interesting: average delivery lead time can be brought under control if your work scheduling is respected. This stabilises average delivery lead time, which makes it more predictable and enables planning and forecasting. However, for an organisation to benefit from low lead time it needs to understand what work is important to it and give that work priority.

Understanding what work is important

Most of us have heard the famous thought experiment: "If a tree falls in a forest and no one is around to hear it, does it make a sound?". Suppose your software team delivers a brand-new "feature A" that customers don't care about (as they don't use it right now), but you don't fix their bugs, resolve their customer cases or answer their questions. Will they perceive your organisation to be responsive (low lead time) or not? In this scenario bugs, customer cases and questions are visible, that is what the customer cares about, and this new "feature A" is the tree that has fallen in the forest.

Prioritisation will depend on what is the most important thing for your company at that point in time. Maybe other customers have been told that "feature A" is coming and it is in the contract. Maybe "feature A" is amazing and most customers will upgrade to a new tier to get that feature, so revenue might play a critical role. Maybe your company believes in quality above everything else, meaning the company might be OK with delivering new functionality a little bit slower while it prioritises customer cases, bugs and answering questions. There are two personas here: internal stakeholders (investors and sponsors) and external stakeholders (customers and partners). Personally, I don't think internal and external stakeholders are at all mutually exclusive. However, your organisation needs to decide whose lead time needs to be minimised, internal or external. Of course, there are more levers than that and it is more nuanced, but priority needs to be established.

I think there is something that we can all learn from manufacturing here. Manufacturers have found that health and safety correlates with productivity; in fact Lockheed Martin stated that by focusing on health and safety they experienced a 24% productivity increase and a 20% reduction in factory costs. Think about that for a second. Does that mean we can focus on delivering a quality service and be even more productive? There is no trade-off? The answer seems to be yes. Now, when it comes to software products, some companies choose to sweep bugs under the rug, accept security risks and not deal with quality problems in their products in order to get more features out. This creates more escalations, customer cases, late-night calls, bugs, and an endless cycle of firefighting and hardship. Everyone has to work harder just to keep the lights on. What if software companies consistently prioritised "Operational Quality" over new features? Is it possible that software companies would get a similar ~24% productivity increase by working this way?

Minimising customer facing lead time

The more work there is in progress, the higher lead time grows. Let's say your team is working on an important strategic project that will increase company revenue, and you are about to commit to a deadline. Before you do, it is important to remember that if you commit without leaving any space for customer requests, bugs or small features, customers will have to wait until you have completed your important strategic project. The only way for you to minimise customer or contractual (security, SLA, GDPR) lead time is by factoring in some "operational slack" to address customer or contractual concerns. Also, please don't confuse your contingency with "operational slack". Project contingency deals with project-based risk (discovering unknowns, someone needing a few unexpected days off), while "operational slack" is space to deal with day-to-day operational concerns. This does mean that the important strategic project will experience a longer lead time overall, but not at the cost of customer-facing lead time.
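
As a back-of-the-envelope illustration, here is a tiny sketch of how a sprint's capacity could be split before committing to a deadline. Every number below (team size, hours, percentages) is a made-up assumption for illustration, not a recommendation:

    # Hypothetical capacity split for a two-week sprint. All numbers here are
    # illustrative assumptions, not recommendations.
    TEAM_HOURS = 5 * 6 * 10     # 5 people, 6 focused hours a day, 10 working days
    PROJECT_CONTINGENCY = 0.15  # project-based risk: unknowns, unexpected days off
    OPERATIONAL_SLACK = 0.20    # day-to-day concerns: customer cases, bugs, questions

    project_hours = TEAM_HOURS * (1 - PROJECT_CONTINGENCY - OPERATIONAL_SLACK)

    print(f"Total capacity:      {TEAM_HOURS} hours")
    print(f"Strategic project:   {project_hours:.0f} hours")
    print(f"Contingency reserve: {TEAM_HOURS * PROJECT_CONTINGENCY:.0f} hours")
    print(f"Operational slack:   {TEAM_HOURS * OPERATIONAL_SLACK:.0f} hours")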

Throughput - Ideas to reduce your cycle time

As you look to improve throughput you will want some explicit examples of techniques that can be used to actually achieve this. Here is a cheatsheet of what you can do per factor:

Wait Time

Minimise knowledge, decisions and work dependencies. Try the following:

  • Reducing handovers
  • Prioritised Backlog
  • Planning and Scheduling
  • Prioritised Product Portfolios
  • Project Management
  • Removing dependencies on others
  • Training and knowledge sharing
  • Empowering to make local decisions and create knowledge
  • Single piece flow
  • Self-service
  • Just-In-Time
  • Minimise supporting teams WIP

Disruption Time

Minimise expedite (reactive work), rework, interruptions and mental health impact. Try the following:

  • Building quality in to the development process and product
  • Prioritised Backlog
  • Planning and Scheduling
  • Prioritised Product Portfolios
  • Empowering to make local decisions and create knowledge
  • Aligning work to an individual and company objectives
  • Protecting team from disruptions and being proactive

Task Time

Minimise / Maximise volume of work, unknowns, complexity, experience, attitude, aptitude and risk. Try the following:

  • Training and knowledge sharing
  • Components re-use
  • Involve subject experts
  • Increase talent retention
  • Aligning work to an individual and company objectives
  • Maximise the team's strengths and minimise weaknesses
  • Continual learning and experimentation

Please remember that you can get some quick wins in throughput, however the really important breakthrough improvements will take time. These improvements will never really stop either: as you make some improvements, you will find new improvements that were hiding. Then you will make those and find new ones, and so on. It is important to stress that process improvements don't have to be expensive to implement and most likely you don't even need to build any additional software to make these improvements happen. From my experience, process changes mostly require a lot of thinking and communication.

I strongly recommend that you log all of the Wait, Disruption and Task Time during a Sprint so that your team can discuss it during the retrospective. Your team needs to take away just one improvement (ideally the one that will make the biggest impact) and actually make the change, that is the key. If after each retrospective the team actually reviews the issues and implements just one improvement, then after a while there will be no stopping this team.
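
To make that logging concrete, here is a minimal sketch of how a team might record Wait, Disruption and Task Time during a Sprint and summarise it for the retrospective. The event names and hours below are invented for illustration:

    from collections import defaultdict

    # Each entry: (category, hours, short description). In practice this could live in
    # a spreadsheet, a wiki page or a "Lead Time Wall"; the entries below are invented.
    sprint_log = [
        ("wait",       4.0, "Waiting for a staging environment from the platform team"),
        ("disruption", 1.5, "Expedited customer escalation"),
        ("task",       6.0, "Implementing the export feature"),
        ("wait",       2.0, "Pull request waiting for review"),
        ("disruption", 0.5, "Unplanned meeting"),
    ]

    totals = defaultdict(float)
    for category, hours, _ in sprint_log:
        totals[category] += hours

    total_hours = sum(totals.values())
    for category, hours in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{category:<10} {hours:5.1f}h ({hours / total_hours:.0%})")
    # The retrospective then picks the single biggest contributor and agrees one change.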

Constraints - Remove constraints, reduce wait time

I am a big fan of the theory of constraints and Eliyahu M. Goldratt's work. After years of use, I have realised that the theory of constraints model does not translate literally into knowledge work (this debate is outside of the scope of this blog post). However, I believe that there are a few useful mental shortcuts (heuristics) that can be applied to get the benefit of the theory of constraints in knowledge work.

Dependency constraint heuristic

If you would like to know if there is an operational constraint in the system, just listen to what people say: "We are constantly waiting for X", "How can X be so slow?", "I can never get hold of X", "They are just so busy, but we really need X", "They keep promising that they will get it done but it never happens", "Quality from X is never good enough", "X is constantly down, this really slows us down", etc. These constraints slow down the whole system as it can't perform at optimal levels. This means these individuals (it can also be technology, e.g. build servers) are not delegating enough (usually managers), not saying no to things enough (anyone who over-commits), or there are not enough people doing the work (hands-on people, i.e. not managers) to remove this constraint.

Change constraint heuristic

If you don't see enough change in the process, it normally means that the people who are supposed to be implementing the change are not treating it as a top priority (they are either over-committed, or can't delegate or prioritise). This creates a big problem with opportunity cost. While this individual is a constraint to a process change you cannot enjoy the benefits of the change, and you will not get to the desired destination sooner. Assuming that these individuals need to make the change, you need to implement a "change circuit breaker", where the individual prioritises the change over everything else for a short while (gracefully, without impacting customer lead time of course). If this does not happen then the opportunity cost will just keep growing.

Conclusion

Lead time is determined by the number of items in the queue and your team's throughput. It is possible to provide low lead time to your customers by leaving space in the queue for customer requests (Little's Law). However, your investors and sponsors will most likely also want you to focus on improving your throughput so that they get more for their investment. Throughput is made up of three factors: wait, disruption and task time. By eliminating wait and disruption and minimising task time you can finally increase throughput. Once you start to eliminate wait and disruption and minimise task time, it might force you to go beyond your existing agile framework methods. As you focus on results and not methods you might end up questioning your long-term beliefs about what actually makes your team and your organisation productive.

Monday 21 September 2020

Lead Time Driven Delivery - Part 0, Introduction

The Lead Time Driven Delivery (LTDD) approach has emerged from a personal need to improve software delivery teams' speed. LTDD is an extension of your Agile framework and it attempts to fix the following problems:

  • Agile frameworks tend to be collections of methods from industry practitioners. Most of these methods do not have any real evidence behind them that they actually work. Agile frameworks don't necessarily have a clear focus on what result they are trying to achieve, apart from the vague "delivering value to the customer", which is hard to measure.
  • Once organisations roll out an Agile framework, it is not happily ever after. Some organisations start to deliver software slower, some speed up. However, no matter what happens, the organisation's sponsors expect continuous improvement, so what's the next improvement? How do you know what you can and can't change? Are you bound to the Agile framework's methods?
  • New practitioners and managers starting in the industry should not need years of experience with (often arbitrary) methods to understand the main delivery concepts: why they are following some method, how it is applied and how they can make further improvements.
  • Certain scientific manufacturing management paradigms and models, such as the Theory of Constraints, the 8 Wastes, etc., don't translate well into knowledge work. In fact, some aspects are hurtful and damaging to knowledge work.
  • The software engineering department is not the only department in your company, so how do you integrate your Agile framework with Sales? Customer support? Implementations?

As a practitioner, if you have identified similar problems then you might be happy to know that you are not alone; maybe this short series will give you some ideas on how you can further improve your team and your organisation overall. LTDD is not a collection of specific delivery methods such as pair programming, sitting together, using story points, etc., as this is already covered in abundance. LTDD is a framework and a way of thinking. It frees you from the Agile method and it allows you and your organisation to choose the methods that minimise your organisation's lead time.

The name "Lead Time Driven Delivery" comes from the research book "ACCELERATE: Building and Scaling High Performing Technology Organisations". This book identified KPIs that seem to correlate with the profitability of organisations, and lead time is one of them. This is hardly surprising: our sponsors and customers don't care that you have taken 5 minutes to make a software change but have taken 3 months to ship this change to production; all your customers see is the 3 months of elapsed time and not the 5 minutes. So, if lead time makes your organisation respond to market changes faster and provide a better customer experience, then why is this not your number one KPI?

This short series will attempt to give you some tools to make a change. The really great thing is that it does not matter where you are starting from, and it does not matter how long it will take you to reduce that elapsed time from 3 months to 5 minutes. What matters is that you make a start and work with your peers through these problems; that collaboration is the real transformation.

Thursday 3 September 2020

Lead Time Driven Delivery - Part 4, Stabilise through embedded testing

Before you read this, please read the prerequisite Focus on results, not methods blog post as it briefly explains the basic scientific thinking that will be used here.

How do you know if a piece of process, software, hardware, concept or idea will behave in a correct way? Also, how do you know if this thing will meet the required quality, performance and reliability levels? Well, it is all about knowing how this thing will behave under certain conditions and more specifically it is about knowing when something will work and when it will fail.

To make this a bit more concrete, let's imagine that a customer with a lot of money went to two different software houses, one called Henry's Software and the other called Adam's Apps. The customer asked them to develop identical software. To keep things simple, we will focus on a specific requirement. Here is what these two companies have written down for that identical requirement.

Henry's Software: As a user, when I open the mobile app for the first time, I would like to be able to quickly and easily sign into my company's account.

Acceptance Criteria:
  • User enters information that he/she knows
  • User is directed to the relevant login screen
  • User enters username and password
  • User is successfully directed to the company account.

Adam's Apps: As a user, when I open the mobile app for the first time, I would like to be able to quickly and easily sign into my company's account.

Acceptance Criteria:
  • User knows either (1) the name of the company that he/she works for or (2) their corporate email address.
  • The story needs to deliver an experience that facilitates login with just one of these pieces of information; there is no need for both.
  • If the email address is used, a full valid email address needs to be provided before the company is looked up.
  • If the company name is used, at least 3 letters need to be entered before the company is looked up; this is done to slow down enumeration attacks.
  • If the email address or company name matches more than 1 company, then a list of companies is shown so that the user can select which one they will be given a login screen for.
  • Once the user clicks on the company, the user gets redirected to the company's configured authentication provider for login.
  • If the user cancels out of the login screen, they will get redirected back to the company selection screen.
  • After 6 attempts to provide company information or an email address, the user will be asked to wait for 30 seconds, then 1 minute, then 2 minutes, following [30 seconds * num of attempts], all the way to 24 attempts.
  • The company look-up approach must be discussed with senior customer support team member(s) to ensure that it will result in the least amount of customer support calls. Their feedback needs to be documented.
  • All web requests must take no more than 2 seconds.

Remember, it is exactly the same requirement. Henry's Software's requirement documentation is vague; it does not provide specific information that can be used to verify and test what was delivered. Adam's Apps, on the other hand, captures user behaviour, expectations, delivery options, prerequisites and system performance. Adam's Apps can establish specific test criteria for this feature and verify it when it is delivered.
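
For example, Adam's Apps could turn the company look-up criteria straight into automated checks. The sketch below uses a hypothetical should_look_up_company function (not anything taken from the requirement itself) purely to show that specific acceptance criteria translate directly into test criteria:

    import re

    EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def should_look_up_company(text: str) -> bool:
        """Hypothetical rule taken from the acceptance criteria: look up by a full,
        valid email address, or by company name once at least 3 letters have been
        entered (to slow down enumeration)."""
        if "@" in text:
            return bool(EMAIL_PATTERN.match(text))
        return len(text.strip()) >= 3

    # pytest-style test criteria derived directly from the written acceptance criteria.
    def test_company_name_needs_at_least_three_letters():
        assert should_look_up_company("Ad") is False
        assert should_look_up_company("Ada") is True

    def test_email_must_be_full_and_valid_before_look_up():
        assert should_look_up_company("sam@adams") is False
        assert should_look_up_company("sam@adamsapps.com") is True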

Henry's Software might say that Adam's Apps' documentation is heavy and too specific; their argument might be:

  1. The customer is always available, so requirements will iteratively emerge or will be discussed (see below).
  2. Conversation over documentation: the team knows what was discussed so we don't need to be explicit; also remember that the customer is always available, so it is possible to reconfirm.
  3. The team is trusted to make the right decision, so there is no need to be so specific.

Of course, we do want customers or business analysts (who represent the customer) to always be available, but they are not, due to holidays, meetings and competing corporate priorities. Individuals might have discussed this requirement with the customer; however, these individuals might leave, go on holiday, get sick or have to do other work, which means the story might be picked up by someone who has no context. This person will not be able to fill in the assumptions and in turn will make mistakes which will cause rework. If business analysts, or whoever writes the story, know the criteria, they should write them down and not treat the assumptions as known. Finally, number 3: trust does not mean that the team should not take the vague requirement and refine it to be testable, as at the end of the day they will still need to test it.

The main argument of this section is not about requirements documentation, it is about testing. Regardless of the format in which we write requirements down, I hope we can agree that when you know under what conditions something will work and fail, then it is possible to test a piece of process, feature, concept, idea or hardware against specific test criteria.

As software people, we all want to deliver great software experiences on time to our users. However, so many projects overrun, and things go wrong for millions of reasons. Normally something goes wrong early in the process and then it cascades issues downstream. These issues are normally systemic, i.e. they are bugs in your delivery process. They normally emerge because a process is: non-existent, not followed, opaque, regressed, out of date or poorly designed.

Volatility: liability to change rapidly and unpredictably, especially for the worse.

It can also be hard for managers to see where an issue has actually stemmed from, as people are looking at the whole thing and not the individual parts. One of the ways to increase predictability and transparency in your process is to break it down into component parts and then expose each component part to test criteria. Why would you want to do this? Remember, the reason why we are doing any of this is because we are trying to reduce lead time. Additionally, the further a faulty feature travels through your development lifecycle, the more costly it is to fix: it adds more lead time to the faulty feature, and it adds additional lead time to other features in the queue! This means we want to catch bugs as soon as possible and release work into the next stage only if it has passed all of the relevant test criteria. This way you will stabilise your delivery and reduce lead time across your entire development lifecycle.

You might be thinking: but we have documentation for all of this stuff. We know how things should work. We have a definition of done, a definition of ready, test plans and so on. Yes, but the problem is that normally this documentation sits outside of your development lifecycle. It is a separate document that you need to read and, let's face it, you probably don't read it often enough. You probably refresh yourself once in a while just before the auditors knock on your door. The real challenge is to embed this process documentation into your development lifecycle so that the system becomes self-checking, self-testing and self-auditing. To make what I am saying more concrete, here are a few examples of what you can do to enable this:

  • Embedded checklist - If you are using some sort of Application Lifecycle Management software, then consider embedding a version of your definition of done / ready into the Epic, Feature, Story or Task as a checklist or validation rules. When an individual fills in the Epic, they have to confirm that they have done X, Y and Z, or they can't complete the work until something is done (see the sketch after this list).
  • Public self-accountability - I don't know about you, but when I have to email a large group of people an update on a project, I 100% want to get my facts right. When we publicly report something, we tend to be more transparent, accountable and self-governing. Normally, we don't want to lose face. This means it can be a good idea to get teams to send out fortnightly project updates to senior stakeholders. There are many ways that this can be used.
  • KPIs - If you know how you and your peers are being evaluated then you will change your behaviour around that evaluation. In this case you and your team should be evaluated against Lead Time.
  • Automation - It should be no surprise that when processes are automated correctly, the reliability and speed of these processes drastically improve. Conceptually, what you want is your entire "Development Lifecycle As Code": if a process does not need human creativity it should be standardised and automated away (where appropriate).
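
As a rough illustration of the embedded checklist idea above, a work item could simply refuse to close until its definition-of-done items are confirmed. Most ALM tools have their own validation-rule mechanisms for this; the field names and checklist items below are invented:

    # A minimal sketch of an embedded definition-of-done checklist. Real ALM tools
    # usually provide validation rules / required fields for this; the checklist
    # items below are examples, not a recommended definition of done.
    DEFINITION_OF_DONE = [
        "acceptance criteria verified",
        "automated tests added",
        "documentation updated",
        "peer review completed",
    ]

    def can_complete(work_item: dict) -> tuple[bool, list[str]]:
        """Return whether the item may be closed, plus any unmet checklist items."""
        confirmed = set(work_item.get("checklist_confirmed", []))
        missing = [item for item in DEFINITION_OF_DONE if item not in confirmed]
        return (len(missing) == 0, missing)

    story = {"title": "Company look-up", "checklist_confirmed": ["peer review completed"]}
    ok, missing = can_complete(story)
    if not ok:
        print("Cannot complete the story, outstanding items:", missing)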

Testing in the development lifecycle is not just about writing down testable requirements like we did for Henry's Software and Adam's Apps above. It is about embedding quality controls throughout the process, and these quality controls might look nothing like you would expect. You might not even think of them as quality controls. Do you think of automation as a quality control? A routine email being sent out by an individual? Your weekly stakeholder update meeting? What about your morning stand-up? These are all forms of quality control designed to catch faults and problems in your process.

Wednesday 26 August 2020

Lead Time Driven Delivery - Part 3, Focus on results, not methods

Imagine you come up with a process change in your company, and you believe this process change is going to make things better. How do you verify that this process change is "Agile" compatible? Does the change just need to sound and look good? What is Agile? If you really think about it, it is not immediately clear.

Scientists have created a simple framework to test the plausibility of what is being said and proposed: all claims must be testable. Let's say I tell you that I can jump 3 metres high. The test is simple, either I can or I can't. Once the test is conducted you will quickly find out that it was a lie, I can't jump that high. If you have a COVID-19 cure and you are running clinical trials, you will measure the average recovery time of patients without your amazing drug (the control group) and compare it against the group that is taking the actual drug, and you will look for a very significant average improvement. Things are not that simple when it comes to business, sociology, psychology and many other fields that have a people element in them. For example, in the 20th century Freud's ideas were treated as scientific. Freud was able to come up with a theory about someone's behaviour and then fit it to the situation so that it seemed to make sense. If his theory worked, it was because it worked; if it did not work, it was because something was wrong with the patient or some other theory of his worked better. Either way he was right. Fortunately for us, Karl Popper was around to introduce falsificationism and bring some sense into the world. In oversimplified terms, he said that a theory must have clear test criteria for when it works and when it does not. If it does not have clear test criteria, then it is pseudoscience. Pseudoscience leaves things vague enough that they sound like they might be true, and it leaves things open to interpretation; just think of Astrology, Hypnosis, the Polygraph, Psychoanalysis... (the list is long, I strongly recommend you Google it). At this point you might be thinking: OK, I get it, but how is any of this relevant to software delivery?

Now let's come back to that Agile framework process change. How do you test for Agile framework compatibility? I can take a look at the Agile Manifesto tenets. However, some of the statements there are not testable, some of them are falsifiable, and the rest is a list of working practices. But it is just that, a list of working practices, i.e. methods of doing something and not results, and this makes it hard to test. I don't know if this is true for all Agile frameworks, however most of them seem to suffer from the same problem: they focus on the method (how something should be done) and not the result (when an action is performed it produces a consistent result):

"The increasing adoption of agile practices has also been criticized as being a management fad that simply describes existing good practices under new jargon, promotes a one size fits all mindset towards development strategies, and wrongly emphasizes method over results" - Wikipedia

It seems that Scrum, XP and other Agile frameworks are collections of working practices that seemed to achieve good results in the organisations they were rolled out in at the time. This can be a good thing, as companies can take a process off the shelf and just get started. In reality I have not worked for a company that has rolled out a pure form of an Agile framework; there are always variants. Personally, I have seen some really good results from rolling out Agile and Lean working practices (even with the variants). However, I have heard of companies that have never seen their Agile process change investment pay back dividends, or worse, it has ground their operations to a halt. One might say that in that case the issue is with the leadership of the organisation: leaders are not bought into the Agile framework and this is the cause of all of the problems. I don't know if that is true; at some point you have to take a step back and just wonder why so many Agile framework deployments fail all over the place. When you take paracetamol, it works (for most people); pharmaceutical companies do not say that patients who are not getting better are wrong and that their drug works in all circumstances. Surely the problem is that the solution being offered is wrong, or is wrong for that company? Industry experts seem to be prescribing the same solution to everyone and then, when there are no results, they point the finger at the leadership and never at the panacea. So why are we so fixated on the ceremonies, shrines and dogma of how we work, and not focused more on what actually produces consistent results?

Even if things don't go wrong during the Agile framework roll-out, once the roll-out is complete, then what? You want to instil a culture of continuous improvement, but you are out of book, there are no more agile tricks left up your sleeve. How do you achieve further improvement? If you start to implement changes, how do you know that they are compatible with your existing Agile framework? Are your internal Agile purists (years ago I was one of them) going to become dissatisfied that you are changing this pure roll-out? How do you know that your changes will improve the process and not make it worse? Also, what about the rest of the company, how are they going to get better and integrate with your department's Agile process?

It seems that there are three problems with Agile Frameworks:

  • They rarely get rolled out in pure form as organisations fail to fully embrace them
  • They don’t tend to consider the whole organisation, just a part of it
  • Eventually you run out of book and you are on your own. Your investors will expect you to improve your operations further, but further changes are hard to make as there is nothing to guide you

Companies should pick an Agile framework (after all, you don't need to reinvent the wheel) that suits their organisation, and once they hit a wall (and have genuinely rolled out as much of the framework as possible) they should refocus on the results. This might feel scary, because as humans we derive comfort from being told what to do by industry experts, from conformity and sometimes from superstitions. This is where scientific and critical thinking can help you cut through the noise and help you make your own decisions. Decisions that will help your company and your unique circumstances.

When you say we are going to change a process from A to B, how do you know if this change will enable faster delivery? A proposed process change should have some of the following characteristics:

  • Reduces the number of handovers required to get something done
  • Shifts work left by enabling people upstream to get work done themselves earlier in the process
  • Reduces wait time to get something done
  • Reduces amount of disruption
  • Reduces the constraints
  • Improves individuals' domain knowledge, morale or skill
  • Reduces unknowns, complexity or risk

Ultimately, if you are not sure, you can try the change and test it quantitatively by seeing whether cycle time and lead time have reduced or increased. The main thing is that you can verify or refute a process change, as the ultimate test is a reduced average lead time.
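
A simple way to run that quantitative check is to compare average lead time for work finished before and after the change. The lead times below are made-up numbers in days:

    from statistics import mean

    # Hypothetical lead times in days for items completed before and after the change.
    lead_times_before = [21, 30, 18, 45, 27, 33]
    lead_times_after = [14, 22, 19, 25, 16, 20]

    before, after = mean(lead_times_before), mean(lead_times_after)
    print(f"Average lead time before: {before:.1f} days")
    print(f"Average lead time after:  {after:.1f} days")
    print("Change verified" if after < before else "Change refuted, investigate or revert")

In practice you would want enough completed items on each side of the change for the comparison to mean something, rather than reacting to a handful of data points.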

Please take note: when it comes to results over methods, it is important to establish organisational values that create boundaries for how far you will take the results-over-methods approach. The last thing you want is behaviour in your organisation that does not align with your organisational moral compass because "anything goes" to get the results; this is where your company culture needs to step in.

Saturday 7 March 2020

Lead Time Driven Delivery - Part 2, Learning from data

In part 1 of this series we have explored the basic idea of Lead Time Driven Delivery. The main idea is to minimise Wait Time, Disruption Time, Task Time and thus Lead Time for the work that is flowing through the organisation. In this blog post we are going to explore metrics and processes that will help you to:

  • Verify if process changes that you are implementing are minimising Lead Time
  • Identify how much Lead Time can be further removed

Before we explore the core idea, let me introduce you to the Little's Law.

Little's Law

Imagine a busy but small web agency in Edinburgh building web products for its clients. This web agency employs UX Designers, Web Developers, Testers and Cloud Developers. They work as one team. This web agency has made commitments to deliver a number of features for a very important client; this work is committed to the queue, which means this work is "work in progress" (WIP). All work that is being done will need some time from all of the team members (UX, Web Devs, Testers, etc). As soon as the team takes the work off the queue, a timer starts, and the timer stops when the team has stopped working on the task and the work is done. This is called Cycle Time. Finally there is Lead Time: the Lead Time timer starts from the moment that the work is committed to the queue and stops when the work is done.

Remember the Hot Feature A from the last blog post? Well, it spent 1 week in the queue before finally being picked up by UX; however, the Web Devs, Testers, etc. were all busy working on other work. So it slowly made its way from one person to another until it was completed 3 weeks later. It has therefore taken 1 month to complete overall, but it required only 12 hours' worth of work. This is one poorly managed web agency!

The relationship in the above diagram can be described with Little's Law:

Lead Time = WIP / Throughput

The web agency team on average completes 0.3 tasks per day. The team on average has 9 commitments in the queue that they need to get through. That means (9 / 0.3) = 30 days lead time. To improve this delivery situation the team has two options:

1. Reduce the amount of committed work in the queue. If the team reduced the committed queue size from 9 to an arbitrary lower number such as 3, the lead time would drop to (3 / 0.3) = 10 days.

OR

2. The team needs to improve the Throughput (Cycle Time).
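
Here is the same arithmetic as a tiny sketch, so you can see how each lever moves average lead time. The baseline numbers are the example's own; the doubled throughput in option 2 is just an assumption for illustration:

    def average_lead_time(wip: int, throughput_per_day: float) -> float:
        """Little's Law: average lead time = work in progress / throughput."""
        return wip / throughput_per_day

    baseline = average_lead_time(wip=9, throughput_per_day=0.3)  # 30 days
    option_1 = average_lead_time(wip=3, throughput_per_day=0.3)  # 10 days (smaller committed queue)
    option_2 = average_lead_time(wip=9, throughput_per_day=0.6)  # 15 days (assumed doubled throughput)

    print(f"Baseline: {baseline:.0f} days, "
          f"smaller queue: {option_1:.0f} days, "
          f"faster throughput: {option_2:.0f} days")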

For more information around Little's Law, do check out this awesome article. This entire blog series focuses on improving Cycle Time rather than reducing WIP.

Core idea

The Accelerate book suggests that one of the important metrics that should be tracked is Lead Time. This totally makes sense, as this is what the customer experiences and it impacts recovery time, experimentation speed, etc. This has very much inspired this entire blog post series. The Accelerate book does recommend tracking other additional metrics; if you are interested in knowing what they are then check out this summary / review of the Accelerate book.

As a software delivery practitioner I find that Lead Time is a start, however it is very high level and it does not provide much detail for me to make improvements or scheduling decisions.

Lead Time Driven Delivery is supposed to help by exposing Wait, Disruption and Task Time. If you know how much Wait and Disruption there is in the system and where it is coming from, then you can do something about it. By this point you might be wondering: how can I extract this information from my Application Lifecycle Management (ALM) system? Is it even possible to automatically get metrics for the Wait Time, Disruption Time and Task Time variables? The answer is that you will need to use both quantitative and qualitative techniques to extract the data.

(Image: qualitative vs quantitative data infographic - https://www.slideshare.net/Intellspot/qualitative-vs-quantitative-data-infographic)

Quantitative analysis

This is the easy part. If your team is storing dev data in some ALM system, then you can get this data in many different ways. You just need to make sure:
  • Only actual development time against the work is logged. In this web agency they have terrible internet speed, they love meetings and the build server takes forever to run tests. So a developer has taken 2 hours to do the actual work, but between all of the waiting, meetings and random requests the whole day passes (8 hours). In this instance the developer should log only 2 hours of actual dev time and not 8 hours.
  • All work that needs to be done is grouped in a logical way so that it is possible to identify wait states between tasks for the deliverable.

Now you can create two metrics: Lead Time Resistance and Lead Time Spent Idle.


Lead Time Resistance

Lead Time Resistance measures how difficult it is to get work done. In the above diagram Sam (UX) might take only 2 hours to do the design work (blue), however for the rest of the day he is disrupted (orange) and before he knows it two days have gone by. The Lead Time Resistance calculation takes the total actual time for the work, divides it by the total elapsed time for the work, and subtracts the result from one.

The feature in the diagram has taken 12 hours of actual time, but it took 5 days of total elapsed time. The Lead Time Resistance for this is 1 - [12 (hours) / [5 (days) * 8 (hours per day)]] = 70%. In other words, 70% was spent on Disruption and Wait Time, i.e. stuff getting in the way, creating resistance.

Lead Time Spent Idle

Lead Time Spent Idle measures how well the work was planned. Sam (UX) completed his work in 2 days. The work then waited for ~1.5 days before it was picked up by John (Dev). After John was done, it waited for another ~3.5 days before Dan (Test) picked it up. The total idle time of no activity is divided by the total elapsed time. The feature in the diagram has taken 20 business days (5 days * 4 weeks) to complete. Out of those 20 business days it has spent 8.5 business days (1.5 days + 3.5 days + 3.5 days) in an idle state, which means 8.5 / 20 = 42.5%, i.e. 42.5% was spent in an idle state.
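
Both calculations can be scripted directly on top of your ALM data. Here is a minimal sketch using the example's numbers:

    def lead_time_resistance(actual_hours: float, elapsed_days: float, hours_per_day: float = 8) -> float:
        """Share of elapsed working time lost to Wait and Disruption."""
        return 1 - actual_hours / (elapsed_days * hours_per_day)

    def lead_time_spent_idle(idle_days: float, elapsed_days: float) -> float:
        """Share of elapsed time during which nobody was working on the item."""
        return idle_days / elapsed_days

    # The example's numbers: 12 hours of actual work over 5 elapsed working days,
    # and 8.5 idle days out of 20 business days overall.
    print(f"Lead Time Resistance: {lead_time_resistance(actual_hours=12, elapsed_days=5):.1%}")
    print(f"Lead Time Spent Idle: {lead_time_spent_idle(idle_days=8.5, elapsed_days=20):.1%}")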

Now let's bring it all together. The feature in the diagram has encountered 70% resistance and it has spent 42.5% of the time in an idle state. I don't know about you, but this is really useful information. Now that this is known, the team can move on to the qualitative analysis and do some deeper analysis of what they can do to improve this situation.

Qualitative analysis

This is the hard part. This is where the team needs to actually continuously question existing working practices and get creative about improvements. Quantitative analysis will expose a lot of variables, just take a look at this:


However, numbers alone will not tell you if Lead Time Resistance is large due to waiting around or due to a chronic disruption culture. Additionally, they will not tell you how much time is lost due to poor work distribution, poor designs, lack of standards / components, staff turnover, etc. This is where teams should keep a daily log of all of the Wait Time, Disruption Time and Task Time. They should then use this information during retrospectives to review their current workflow setup and figure out what the future workflow should look like to improve the Lead Time.

Side Note

In Lean there is a strong focus on waste elimination, just Google the 8 wastes or check out my older blog post. The problem is that you need to look up what each waste means in order to understand it. Then you need to translate it to knowledge work so that it is relevant. Personally, I don't think it is that relatable to knowledge work, and once translated it does not stay in your mind for long. Software practitioners have introduced the "Waste Snake"; while I like the concept, the problem is still the same: "waste" is a vague name. From my personal experience, I have seen teams use it for a while, mainly capturing disruptions. I don't know why, but they have not focused on other, more hidden wastes such as Wait and Task Time. It might sound less cool, but instead of a "Waste Snake" create a "Lead Time Wall" and just stick on to it anything that impacts Wait, Disruption and Task Time, along with the amount of time lost.

Conclusion

Your team should automate the following metrics:

  • Lead Time
  • Cycle Time
  • Work in progress (WIP)
  • Lead Time Resistance
  • Lead Time Spent Idle

These automated metrics are useful as they will expose Wait Time and they will tell you if you are going in the right direction with your process changes. To actually figure out what needs to change to improve Lead Time, your team will need to conduct constant qualitative analysis where you manually review Disruption and Task Time.

Tuesday 28 January 2020

Lead Time Driven Delivery - Part 1, Learning to see

TL;DR Your organisation needs to minimise Wait, Disruption and Task Time so that work gets delivered quickly all the way to the customer. This means you need to minimise Lead Time.

I am very lucky, as I have been an Agile and Lean practitioner for a while. I have spent time reading about different delivery methodologies such as XP, Scrum, Kanban, etc., and I have had the opportunity to work in organisations that used these different methodologies. One thing that constantly stood out to me was how they all prescribe best practice with very shallow reasoning behind these prescriptions. They look like they are based on the experience and environment in which they were created. This complicates things, as approaches end up being open to interpretation over what is Agile and what is someone's subjective opinion on the matter. For example, if someone came up to you and said, "I want to change our existing process from A to B", how would you test or verify that this new process is more "Agile"? Would you refer to the Agile Manifesto? Use your experience or training? As far as I know there is no testable, verifiable way to measure Agile, and this creates a communication and expectation problem with new and seasoned practitioners. If you promote someone to be a team lead, or if you onboard a new member of staff, you need to explain to them why you are doing something in a certain way. Saying to them "please read this Agile book and follow it" is not going to work. If you tell your new members of staff to just do "what we do", well, that's flawed because they don't understand the reasoning behind the intent. Also, why should they follow it? This means that the moment you need to change how your company works, you end up with an organisation that starts to make inconsistent decisions between different teams and departments, as no one really understands what behaviour and KPIs they are trying to minimise or maximise.

In this blog post (and eventually series) I am going to attempt to break down the core reasoning behind Agile practice so that it is more verifiable. Hopefully this will mean that the core Agile ideas can be explained more quickly to the people around you, and that you and your team can confidently mature your own delivery practice. Let's get started.

1. Anatomy of you sitting down and trying to do some work



Three factors that make up your work:

  • Wait Time - This is when you are waiting around for some knowledge that you don't have, decisions that you can't make, or for someone else to complete some work before you can start yours.
  • Disruption Time - This is when you have to expedite some work, rework something, or deal with corporate interruptions and the impact on your mental health.
  • Task Time - Finally, this is the actual work that you are doing, pure sitting down and getting things done.
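
To make the three factors above a little more tangible, here is a minimal sketch that models a single piece of work as the sum of Wait, Disruption and Task Time. The hours are invented:

    from dataclasses import dataclass

    @dataclass
    class WorkItem:
        wait_hours: float        # waiting for knowledge, decisions or other people's work
        disruption_hours: float  # expediting, rework, interruptions
        task_hours: float        # actually sitting down and doing the work

        @property
        def lead_time_hours(self) -> float:
            return self.wait_hours + self.disruption_hours + self.task_hours

    # Invented numbers: only a small slice of the elapsed time is real task time.
    feature = WorkItem(wait_hours=120, disruption_hours=28, task_hours=12)
    print(f"Lead time: {feature.lead_time_hours} hours, "
          f"of which task time is {feature.task_hours / feature.lead_time_hours:.0%}")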

Imagine you are working on your own start-up, by yourself. You will have very little wait and disruption time. You are on your own, you can make all of the decisions. Also, if you are lucky enough to work in a quiet environment, you should experience very little or no disruption. You get things done fast, your users are impressed with your company, new features just come out all the time. However, this changes the moment you hire your first employee. The moment you do that, you create an organisation, which means you have created a system. In a system, work no longer gets done by a single individual, it gets done by many individuals. You as the founder are unlikely to feel much impact from hiring this new person (apart from the knowledge transfer burden), but if you are not careful your new employee will have to wait for your decisions, knowledge and task allocation. Their Wait Time will grow as they wait for you, and they will probably be disrupted by you. You will wonder why they are not as productive as you. It might be because they have not got enough autonomy to make decisions (maybe they don't know your values, so they don't know what decisions to make on your behalf), or they might not be getting enough clarity about the desired outcomes. Most people are not founders, they are employees, and they struggle to do their best as they just don't understand the reasoning framework and don't get enough autonomy.

2. Anatomy of your Waterfall company trying to deliver some value to the customer



Imagine a company that does not follow any Agile process and instead has departments of people per discipline. So Web Devs are in one department, API Devs are in another department, you get the point. Each department will have its own backlog, which means everyone has their own Lead Time. On top of that, all individuals will experience disruptions (team meetings, urgent requests, you know the drill) and there will be many handovers from one department to another. Work will also end up travelling backwards due to misunderstandings. So if a customer has requested a "Hot Feature A", they will have to wait a long time for this work to travel through this type of organisation (system). The actual Task Time for "Hot Feature A" might be 12 hours of work in total, however given all of the Wait Time (handovers and lead times) and disruptions it might take up to 1 month before it gets shipped. So there is a big difference between the 1 month Lead Time and the 12 hours Task Time. However, your customer will not care about the 12 hours of Task Time, they will just care that you took 1 month of Lead Time. Overall, in this type of organisation, Lead Time for most work will be very high, fewer projects will be shipped, projects will very rarely go out on time and individuals will feel frustrated as there will be a lot of firefighting.

3. Anatomy of your Agile company trying to deliver some value to the customer



Now imagine another company that understands the importance of Lead Time and works to remove as much Wait, Disruption and Task Time (more on Task Time later) as possible from the overall delivery process. They have decided to sit people together for a limited amount of time to deliver certain features and projects. They have done this because they want to remove handovers, reduce the amount of project management required, competing agendas, waiting for decisions and knowledge, and organisational dependencies. They work as a team on one story at a time and their main job is to push that one story through the system as fast as possible. Now, that story that took 1 month to deliver will, in this new system, take 12 hours or even less. This is because you have removed all of the waiting around and disruptions (the team lead and product owners act as defenders), and because this team is sitting together they can actually expose the unknowns faster, tame complexity, share their experience and share the burden of the work, so they can deliver the work faster.

4. Anatomy of Task Time



I have left this till last for a reason. The management team needs to improve the overall system before they look at improving Task Time. Why? It is much healthier to focus on fixing the overall organisation before looking at how to improve an individual's performance. In companies where Lead Time is high, good talent might become disengaged. As you fix the systemic problems, you might find that people who were not performing that well start to really surprise you, and that Task Time reduces naturally.

The actual Task Time is made up of eight factors, which are dynamic:

  • Volume of work - This is just you sitting and typing, copying and pasting.
  • Unknowns - This is you identifying stuff that you did not consider when you were estimating the work.
  • Complexity - This is you figuring out an algorithm to solve a problem, the main thinking part.
  • Risk - This is how much testing you have to do given the risk level that is acceptable for the task at hand.
  • Skill - This is you improving your hard/soft transferable skills (programming, math, architecture, algorithm design, management, etc) or using your existing skills to get work done quicker.
  • Domain - This is you gaining new domain knowledge (HR, Logistics, Financial Trading, etc) or using your existing domain knowledge to get work done faster.
  • Attitude - This is how you perceive your work environment and tasks.
  • Aptitude - This is you having developed, or being predisposed to, skills relevant to the work that you are doing.

I know this is obvious, but I would like to stress one point. Most people will take a different amount of time to get a task done. Why? They are different people, with different mindsets, skills, domain knowledge, aptitudes and attitudes. All of these things impact the overall Task Time.

It will not surprise anyone that experienced teams (high skill and domain knowledge) are more likely to identify unknowns, reduce the volume of work through automation, tame complexity and, as a result, deliver high-quality work quickly. If your organisation wants to improve Task Time, then it needs to ensure that people stick around.

What does it all mean?

In priority order, everyone in your organisation should be working hard to minimise Wait, Disruption and Task Time, and thus minimise Lead Time. Organisations will never achieve a perfect Lead Time, however they need to constantly work towards it. To me this is what DevOps, Slack, XP, Scrum, Kanban, Lean, etc. are all about.

If you take anything away from this blog post, it would simply be this: start measuring Lead Time for the work that is travelling through your organisation and find ways to minimise it.