ZAN KAVTASKIN

Tuesday 21 December 2021

Decision Analysis - Pros and cons framework is not best for decision making

A pros and cons list (a for and against list) seems to be used universally. While it might be better than nothing, it has some flaws. Typically users resort to pros and cons when they are stuck and have to make a decision: should they choose A or B, or maybe pick between A, B, C and D?

What typically happens is that the user will write a pro for A, then that pro will be used as a con for B, pros for B will be used as cons for A, and so on. Users tend to look at the problem from a subjective vantage point, which narrows their perspective and, from experience, can invite emotive points onto the list. Emotions might have a place on the list given the right context, however they need to be framed correctly. Once the user has written everything down, the user can review the pros and cons, weigh each point, and then decide on the “right” outcome.

I do not know about you, but I was frequently underwhelmed by the effectiveness of this method. By the end of it I was not in a much better place than when I had started. I think this is because, at the core, pros and cons do not add any new perspective or value to the decision making. What you write down is probably what you already know, and you are writing it down because you are stuck and want to know more. Writing down what you already know is not going to tell you anything new, and that is the problem.

By now you probably want to hear the alternative approach to the pros and cons. Instead write down possible options / solutions and in what situation they would work best. I will now provide a hypothetical example:

Requirements: My existing laptop does not have enough storage space and it has become noticeably slower. Get the latest Mac that will meet my requirements.

MacBook Air, best choice when:

Want to fit it into my backpack as I will need to do work on it outside of my house
Want to keep price of the purchase as low as possible
Type of work that I will be doing does not require large screen

MacBook Pro 13”, best choice when:

Want to fit it into my backpack as I will need to do work on it outside of my house
Type of work that I will be doing does not require large screen
Touch bar aids productivity and is worth the extra cost

MacBook Pro 16”, best choice when:

Touch bar aids productivity and is worth the extra cost
Need a large screen for the “split view” with good resolution to avoid the need of getting a standalone monitor
Need extra compute to run intensive tasks such as naive brute force deep learning algorithms
Price is no longer a constraint

Mac, best choice when:

There is space for a desktop computer in the office room
Family members will be allowed to use it, this might be economical long term as will not need to buy as many computers
Time-sharing with the family members will not be a problem
Need a large screen for the “split view”
Need a high-quality camera for online meetings

This analysis is not complete. In a real scenario I would expand on the requirements, provide a price for each option and add technical specifications. However, this should give you a feel for this alternative method. The value I get out of this method is that it makes you think through the conditions under which each solution would work and what makes it stand out in a given context.

Saturday 26 September 2020

Lead Time Driven Delivery - Part 5, Practical and closing thoughts

It is time to close this mini-series with some practical and personal ideas.

Work in Progress - Reducing lead time through queue control

If your team has a backlog, then applying Little’s Law to the backlog itself will not make much sense. Little’s Law can be applied only to a queue, and in queue systems something needs to come in and then come out. If something stays there and never comes out, then Little’s Law will not hold. From my experience a backlog is not a queue, as work enters the backlog and can stay there indefinitely. When someone enters a queue, they are committed: slowly or quickly they make their way through it and eventually leave the queue. If you are using Sprints then you can apply Little’s Law to the Sprint: the team commits to the work, work enters the Sprint and then leaves the Sprint. If you have a number of committed projects going through the system, then you can use Little’s Law for those as well.

Little’s Law is very significant to LTDD, as average Lead Time = Work in Progress / Throughput. This means the more things we commit to the queue, the higher the average lead time grows. So, it makes sense to pick a strategy where it is possible to quickly make space for new high priority work. This way customers don’t experience long lead times while software teams deliver their long-term commitments. This also tells us something very interesting. Average delivery lead time can be brought under control if your work scheduling is respected. This stabilises average delivery lead time, which makes it more predictable and enables planning and forecasting. However, for an organisation to benefit from low lead time, it needs to understand what work is important to it and give that work priority.
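To make the formula concrete, here is a minimal Python sketch of Little’s Law. The function name and the sprint numbers are hypothetical, chosen only for illustration; the point is simply that with throughput held constant, every extra committed item pushes the average lead time up.

```python
def average_lead_time(work_in_progress: float, throughput: float) -> float:
    """Little's Law: average lead time = work in progress / throughput.

    Units are up to you, as long as they are consistent (for example
    items and items per week gives a lead time in weeks).
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return work_in_progress / throughput


# Hypothetical sprint: the team finishes 4 items per week.
committed = average_lead_time(work_in_progress=12, throughput=4)
overcommitted = average_lead_time(work_in_progress=20, throughput=4)

print(f"12 items in progress: {committed} weeks on average")
print(f"20 items in progress: {overcommitted} weeks on average")
```

In this made-up example, committing 8 more items without any change in throughput moves the average lead time from 3 weeks to 5 weeks.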

Understanding what work is important

Most of us have heard the famous thought experiment: “If a tree falls in a forest and no one is around to hear it, does it make a sound?”. If your software team delivers a brand-new “feature A” that customers don’t care about (as they don’t use it right now), but you don’t fix their bugs, resolve their customer cases or answer their questions, will they perceive your organisation to be responsive (low lead time) or not? In this scenario bugs, customer cases and questions are visible, that is what the customer cares about, and this new “feature A” is the tree that has fallen in the forest.

Prioritisation will depend on what is the most important thing for your company at that point in time. Maybe other customers have been told that “feature A” is coming and it is in the contract. Maybe “feature A” is amazing and most customers will upgrade to a new tier to get that feature. So, revenue might play a critical role. Maybe your company believes in quality above everything else. This means the company might be OK with delivering new functionality a little bit slower while it prioritises customer cases, bugs and answering questions. There are two personas here: internal stakeholders (investors and sponsors) and external stakeholders (customers and partners). Personally, I don’t think internal and external stakeholders are at all mutually exclusive. However, your organisation needs to decide which stakeholders’ lead time needs to be minimised, internal or external. Of course, there are more levers than that and it is more nuanced, but priority needs to be established.

I think that there is something that we can all learn from manufacturing here. Manufacturers have found that health and safety correlates with productivity. In fact, Lockheed Martin stated that by focusing on health and safety they have experienced a 24% productivity increase and a 20% reduction in factory costs. Think about that for a second. Does that mean we can focus on delivering quality service and be even more productive? There is no trade-off? The answer seems to be yes. Now, when it comes to software products, some companies choose to sweep bugs under the rug, accept security risks and not deal with quality problems in their products in order to get more features out. This creates more escalations, customer cases, late night calls, bugs, and an endless cycle of firefighting and hardship. Everyone has to work harder just to keep the lights on. What if software companies consistently prioritised "Operational Quality" over new features? Is it possible that software companies would get a ~24% productivity increase by working this way?

Minimising customer facing lead time

The more work there is in progress, the higher lead time grows. Let’s say your team is working on an important strategic project that will increase company revenue. You are about to commit to a deadline. Before you do, it is important to remember that if you commit without leaving any space for customer requests, bugs or small features, customers will have to wait until you have completed your important strategic project. The only way for you to minimise customer or contractual (security, SLA, GDPR) lead time is by factoring in some “operational slack” to address customer or contractual concerns. Also, please don’t confuse your contingency with “operational slack”. Project contingency deals with project-based risk (discovering unknowns, someone needing a few unexpected days off), while “operational slack” is space to deal with day-to-day operational concerns. This does mean that the important strategic project will experience a longer lead time overall, but not at the cost of customer facing lead time.

Throughput - Ideas to reduce your cycle time

As you look to improve throughput you will want some explicit examples on what techniques can be used to actually achieve this. Here is a cheatsheet on what you can do per each factor:

Wait Time

Minimise knowledge, decisions and work dependencies. Try the following:

Reducing handovers
Prioritised Backlog
Planning and Scheduling
Prioritised Product Portfolios
Project Management
Removing dependencies on others
Training and knowledge sharing
Empowering to make local decisions and create knowledge
Single piece flow
Self-service
Just-In-Time
Minimise supporting teams WIP

Disruption Time

Minimise expedite (reactive work), rework, interruptions and mental health impact. Try the following:

Building quality in to the development process and product
Prioritised Backlog
Planning and Scheduling
Prioritised Product Portfolios
Empowering to make local decisions and create knowledge
Aligning work to an individual and company objectives
Protecting team from disruptions and being proactive

Task Time

Minimise / Maximise volume of work, unknowns, complexity, experience, attitude, aptitude and risk. Try the following:

Training and knowledge sharing
Components re-use
Involve subject experts
Increase talent retention
Aligning work to an individual and company objectives
Maximise team’s strengths and minimise weaknesses
Continual learning and experimentation

Please remember that you can make some quick wins in throughput, however really important breakthrough improvements will take time. These improvements will never really stop either, as you make some improvements, you will find new improvements that were hiding. Then you will make those and find new ones, and so on. It is important to stress that process improvements don’t have to be expensive to implement and most likely you don’t even need to build any additional software to make these improvements happen. From my experience, process changes mostly require a lot of thinking and communication.

I strongly recommend that you log all of the Wait, Disruption and Task Time during a Sprint so that your team can discuss this during the retrospective. Your team needs to take just one improvement away (ideally the one that will make the biggest impact) and actually make the change, that is the key. If after each retrospective team actually reviews issues and implements just one improvement, then after a while there will be no stopping this team.
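As a rough illustration of what such a log could look like, here is a minimal Python sketch. The log format, the story identifiers and the hours are all hypothetical; in practice the data would come from whatever tool your team already uses. The sketch totals the hours per factor and points at the larger of wait and disruption time as a starting point for the retrospective discussion.

```python
from collections import defaultdict

# Hypothetical sprint log: (work item, factor, hours). The three factors
# are the ones used in this series: wait, disruption and task time.
SPRINT_LOG = [
    ("STORY-1", "task", 6.0),
    ("STORY-1", "wait", 10.0),
    ("STORY-2", "disruption", 3.0),
    ("STORY-2", "task", 4.0),
    ("STORY-3", "wait", 5.0),
]


def hours_by_factor(log):
    """Sum the logged hours for each factor."""
    totals = defaultdict(float)
    for _item, factor, hours in log:
        totals[factor] += hours
    return dict(totals)


def biggest_overhead(log):
    """Return the non-task factor with the most hours, or None if there is none."""
    overhead = {k: v for k, v in hours_by_factor(log).items() if k != "task"}
    if not overhead:
        return None
    return max(overhead, key=overhead.get)


print(hours_by_factor(SPRINT_LOG))
print("Discuss first:", biggest_overhead(SPRINT_LOG))
```

With this made-up data, wait time dominates, so the one improvement to take away from the retrospective would probably target a dependency or a handover.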

Constraints - Remove constraints, reduce wait time

I am a big fan of the theory of constraints and Eliyahu M. Goldratt’s work. After years of use, I have realised that the theory of constraints model does not translate literally into knowledge work (this debate is outside of the scope of this blog post). However, I believe that there are a few useful mental shortcuts (heuristics) that can be applied to get the benefit of the theory of constraints in knowledge work.

Dependency constraint heuristic

If you would like to know if there is an operational constraint in the system, just listen to what people say: “We are constantly waiting for X”, “How can X be so slow?”, “I can never get hold of X”, “They are just so busy, but we really need X”, “They keep promising that they will get it done but it never happens”, “Quality from X is never good enough”, “X is constantly down, this really slows us down”, etc. These constraints slow down the whole system, as it can’t perform at optimal levels. This means these individuals (a constraint can also be technology, e.g. build servers) are not delegating enough (usually managers), are not saying no to things enough (anyone who over commits), or there are not enough people to do the work (hands-on people, i.e. not managers) to remove this constraint.

Change constraint heuristic

If you don’t see enough change in the process, it normally means that the people who are supposed to be implementing the change are not treating it as a top priority (they are either over committed, can’t delegate or can’t prioritise). This creates a big problem with opportunity cost. While this individual is a constraint on a process change, you cannot enjoy the benefits of the change and you will not get to the desired destination sooner. Assuming that these individuals need to make the change, you need to implement a "change circuit breaker", where the individual prioritises the change over everything else for a short while (gracefully, without impacting the customer lead time of course). If this does not happen then the opportunity cost will just keep growing.

Conclusion

Lead time is determined by the number of items in the queue and your team’s throughput speed. It is possible to provide low lead time to your customers by leaving space in the queue for customer requests (Little’s Law). However, your investors and sponsors will most likely want you to also focus on getting your throughput improved so that they get more for their investment. Throughput is made up of three factors: wait, disruption and task time. By eliminating wait and disruption and minimising task time you can finally increase throughput speed. Once you start to eliminate wait and disruption and minimise task time, it might force you to go beyond your existing agile framework methods. As you focus on results and not methods, you might end up questioning your long-term beliefs about what actually makes your team and your organisation productive.

Monday 21 September 2020

Lead Time Driven Delivery - Part 0, Introduction

The Lead Time Driven Delivery (LTDD) approach has emerged from a personal need to improve software delivery teams’ speed. LTDD is an extension of your Agile framework and it attempts to fix the following problems:

Agile frameworks tend to be a collection of methods from industry practitioners. Most of these methods do not have any real evidence behind them that they actually work. Agile frameworks don’t necessarily have a clear focus on what result they are trying to achieve, apart from the vague “delivering value to customer”, which is hard to measure.
Once organisations roll out Agile framework, it is not happily ever after. Some organisations start to deliver software slower, some speed up. However, no matter what happens, organisation's sponsors expect continuous improvement, so what’s the next improvement? How do you know what you can and can’t change? Are you bound to the Agile framework methods?
New practitioners and managers starting in the industry should not need years of experience to learn (often arbitrary) methods to be able to understand the main delivery concepts of why they are following some method, how it is applied and how they can make further improvements.
Certain scientific manufacturing management paradigms and models such as Theory of Constraints, 8 Wastes, etc don’t translate well into knowledge work. In fact, some aspects are hurtful and damaging to the knowledge work.
Software engineering department is not the only department in your company, how do you integrate your Agile Framework with Sales? Customer support? Implementations?

As a practitioner if you have identified similar problems then you might be happy to know that you are not alone, maybe this short series will give you some ideas on how you can further improve your team and your organisation overall. LTDD is not a collection of specific delivery methods such as pair programming, sitting together, using story points etc, this is already covered in abundance. LTDD is a framework and a way of thinking, it frees you from the Agile method and it allows you and your organisation to choose the methods that minimise your organisation’s lead time.

The name "Lead time driven delivery" comes from a research book called “ACCELERATE Building and Scaling High Performing Technology Organisations”. This book identified KPIs that seem to correlate with the profitability of organisations, and lead time is one of them. This is hardly surprising: our sponsors and customers don’t care that you have taken 5 minutes to make a software change if you have taken 3 months to ship this change to production. All your customers see is 3 months of elapsed time and not those 5 minutes. So, if lead time makes your organisation respond to market changes faster and provide a better customer experience, then why is this not your number 1 KPI?

This short series will attempt to give you some tools to make a change, and a really great thing is that it does not matter where you are and it does not matter how long it will take you to reduce that elapsed time from 3 months to 5 minutes, what matters is that you make a start and work with your peers through these problems, that collaboration is the real transformation.

Thursday 3 September 2020

Lead Time Driven Delivery - Part 4, Stabilise through embedded testing

Before you read this, please read the prerequisite blog post “Focus on results, not methods”, as it briefly explains the basic scientific thinking that will be used here.

How do you know if a piece of process, software, hardware, concept or idea will behave in a correct way? Also, how do you know if this thing will meet the required quality, performance and reliability levels? Well, it is all about knowing how this thing will behave under certain conditions and more specifically it is about knowing when something will work and when it will fail.

To make this a bit more concrete let’s imagine that a customer with a lot of money went to two different software houses, one is called Henry’s Software and the other one called Adam’s Apps. Customer asked them to develop an identical software. To keep things simple, we will focus on a specific requirement. Here is what these two companies have written down for an identical requirement.

Henry’s Software: As a user when I open the mobile app for the first time, I would like to be able to quickly and easily sign into my company’s account.

Acceptance Criteria:

User enters information that he/she knows
User is directed to the relevant login screen
User puts in username and password
User successfully is directed to company account.

Adam’s Apps: As a user when I open the mobile app for the first time, I would like to be able to quickly and easily sign into my company’s account.

Acceptance Criteria:

User knows the name of the (1) company that he/she works for or their (2) corporate email address.
Story needs to deliver an experience that facilitates login with just one piece of information, there is no need for 2.
If email address is used, full valid email address needs to be provided before the company is looked up.
If company name is used at least 3 letters need to be entered before company is looked up, this is done to slow down the enumeration attack.
If email address or company name detects more than 1 company, then list of companies is given so that user can select which one they will be given a login screen for.
Once user clicks on the company, user gets redirected to the company’s configured authentication provider for login.
If user cancels out of the login screen, then they will get redirected back to the company selection screen.
After 6 attempts to provide company information or email user will be asked to wait for 30 seconds, then 1 minute then 2 minutes, following [30 seconds * num of attempts], all the way to 24 attempts.
Company look up approach must be discussed with senior customer support team member(s) to ensure that it will result in least amount of customer support calls. Their feedback needs to be documented.
All web requests must not take more than 2 seconds.
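To show how criteria this specific translate almost directly into a test, here is a minimal Python sketch of the two look-up rules from the list above (a full email address, or at least 3 letters of a company name). The function name and the deliberately simple email pattern are my own assumptions for illustration, they are not part of the requirement.

```python
import re

# Deliberately simple pattern for this sketch; production-grade email
# validation is stricter than this.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MIN_COMPANY_NAME_LETTERS = 3


def should_look_up_company(user_input: str) -> bool:
    """Decide whether the company look-up may be triggered for this input."""
    text = user_input.strip()
    if "@" in text:
        # Treated as an email address: it must be complete before look-up.
        return EMAIL_PATTERN.match(text) is not None
    # Treated as a company name: require at least 3 letters.
    letters = sum(1 for ch in text if ch.isalpha())
    return letters >= MIN_COMPANY_NAME_LETTERS


# Each acceptance criterion becomes a check that either passes or fails.
assert not should_look_up_company("ac")
assert should_look_up_company("acm")
assert not should_look_up_company("jane@")
assert should_look_up_company("jane@example.com")
```

Nothing comparable can be written for the first set of acceptance criteria, as there is no stated condition under which the look-up should or should not happen.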

Remember, it is exactly the same requirement. Henry’s Software requirement documentation is vague, it does not provide specific information that can be used to verify and test what was delivered. While Adam’s Apps does capture user behaviour, expectation, delivery options, prerequisites and system performance. Adam’s Apps can establish specific test criteria for this feature and verify it when it is delivered.

Henry’s Software might say that Adam’s Apps documentation is heavy and too specific, their argument might be:

The customer is always available, requirements will iteratively emerge or will be discussed (see below).
Conversation over documentation, team knows what was discussed so we don’t need to be explicit also remember that customer is always available, it is possible to reconfirm.
Team is trusted to make a right decision, there is no need to be so specific.

Of course, we do want customers or business analysts (who represent the customer) to be always available, but they are not, due to holidays, meetings and competing corporate priorities. Individuals might have discussed this requirement with the customer, however these individuals might leave, go on holiday, get sick or have to do other work, which means the story might be picked up by someone who has no context. This individual will not be able to fill in the assumptions and will in turn make mistakes, which will cause rework. If business analysts, or whoever writes the story, know the criteria, they should write them down and not treat assumptions as known. Finally, on the third argument, trust does not remove the need for the team to take the vague requirement and refine it to be testable. At the end of the day they will still need to test it.

The main argument of this section is not about requirements documentation, it is about testing. Regardless of the format of how we write requirements down, I hope we can agree that when you know under what conditions something will work and fail then it is possible to test the piece of process, feature, concept, idea or hardware under specific test criteria.

As software people we all want to deliver great software experiences on time to our users. However, so many projects overrun, things go wrong for millions of reasons. Normally something goes wrong early in the process and then it cascades issues downstream, these issues are normally systemic i.e. they are bugs in your delivery process. They normally emerge because process is: non existent, not followed, opaque, regressed, out of date or poorly designed.

Volatility : liability to change rapidly and unpredictably, especially for the worse.

It can also be hard for managers to see where an issue has actually stemmed from, as people are looking at the whole thing and not the individual parts. One of the ways to increase predictability and transparency in your process is to break it down into component parts and then expose each component part to test criteria. Why would you want to do this? Remember, the reason why we are doing any of this is because we are trying to reduce the lead time. Additionally, the further a faulty feature travels through your development lifecycle, the more costly it is to fix: it adds more lead time to the faulty feature, and it adds additional lead time to other features in the queue! This means we want to catch bugs as soon as possible and release work into the next stage only if it has passed all of the relevant test criteria. This way you will stabilise your delivery and reduce lead time across your entire development lifecycle.

You might be thinking, but we have documentation for all of this stuff. We know how things should work. We have definition of done, ready, test plans and so on. Yes, the problem is that normally this documentation is outside of your development lifecycle. It is a separate piece of document that you need to read and let’s face it, you probably don’t read it often enough. You probably refresh yourself once in a while just before the auditors knock on your door. The real challenge is to embed this process documentation into your development lifecycle so that system becomes self-checking, testing and auditing. To make what I am saying more concrete, here are few examples of what you can do to enable this:

Embedded checklist - If you are using some sort of Application Lifecycle Management software, then consider embedding a version of your definition of done / ready into the Epic, Feature, Story or Task as a checklist or validation rules. When an individual fills in the Epic they have to confirm that they have done X, Y and Z, or they can’t complete the work until something is done.
Public self-accountability - I don’t know about you, but when I have to email large group of people update on the project, I 100% want to get my facts right. When we publicly report something, we tend to be more transparent, accountable and self-governing. Normally, we don’t want to lose face. This means it can be a good idea to get teams to send out fortnightly project updates to senior stakeholders. There are many ways that this can be used.
KPIs - If you know how you and your peers are being evaluated then you will change your behaviour around that evaluation. In this case you and your team should be evaluated against Lead Time.
Automation - It should be no surprise that when processes are automated correctly, then the reliability and speed of these processes drastically improve. Conceptually what you want to do is have your entire "Development Lifecycle As Code". This means that if a process does not need human creativity it should be standardised and automated away (where appropriate).
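As a rough sketch of the embedded checklist idea, here is how a definition of done could be enforced as a validation rule in Python. The checklist items, the Story class and its methods are hypothetical; a real Application Lifecycle Management tool would have its own way of configuring such rules.

```python
from dataclasses import dataclass, field

# Hypothetical definition of done; replace with your team's own items.
DEFINITION_OF_DONE = (
    "acceptance criteria tested",
    "code reviewed",
    "documentation updated",
)


@dataclass
class Story:
    title: str
    checked: set = field(default_factory=set)

    def missing_items(self):
        """List the definition-of-done items not yet confirmed, in order."""
        return [item for item in DEFINITION_OF_DONE if item not in self.checked]

    def complete(self):
        """Allow completion only when every checklist item is confirmed."""
        missing = self.missing_items()
        if missing:
            raise ValueError(
                f"cannot complete {self.title!r}; missing: {', '.join(missing)}"
            )
        return True


story = Story("Company sign-in")
story.checked.add("code reviewed")
print(story.missing_items())
```

The design choice here is that the check lives inside the work item itself, so nobody has to remember to open a separate document before moving the work to the next stage.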

Testing in the development lifecycle is not just about writing down testable requirements like we did for Henry’s Software and Adam’s Apps above. It is about embedding quality controls throughout the process, and these quality controls might look nothing like you would expect. You might not even think of them as quality controls. Do you think of automation as a quality control? A routine email being sent out by an individual? Your weekly stakeholder update meeting? What about your morning stand-up? These are all forms of quality control designed to catch faults and problems in your process.

Wednesday 26 August 2020

Lead Time Driven Delivery - Part 3, Focus on results, not methods

Imagine you come up with a process change in your company, and you believe this process change is going to make things better. How do you verify that this process change is “Agile” compatible? Does change just need to sound and look good? What is Agile? If you really think about it, it is not immediately clear.

Scientists have created a simple framework to test the plausibility of what is being said and proposed. All claims must be testable. Let’s say I tell you that I can jump 3 meters high. The test is simple: either I can, or I can’t. Once the test is conducted you will quickly find out that it was a lie, I can’t jump that high. If you have a COVID-19 cure and you are running clinical trials, you will measure the average recovery time of patients without your amazing drug (control) and compare it against the group that is taking the actual drug, and you will look for a very significant average improvement.

Things are not that simple when it comes to business, sociology, psychology and many other fields that have a people element in them. For example, in the 20th century Freud’s ideas were treated as scientific. Freud was able to come up with a theory about someone’s behaviour and then fit it to the situation so that it seemed to make sense. If his theory worked it was because it worked, if it did not work it was because something was wrong with the patient or some other theory of his worked better. Either way he was right. Fortunately for us, Karl Popper was around to introduce falsificationism to bring some sense into the world. In oversimplified terms, he said that a theory must have clear test criteria for when it works and when it does not work. If it does not have clear test criteria then it is pseudoscience. Pseudoscience leaves things vague enough so that it sounds like they might be true, and it leaves things open to interpretation, just think of Astrology, Hypnosis, Polygraph, Psychoanalysis ... (the list is long, I strongly recommend you Google it).

At this point you might be thinking, OK, I get it, how is any of this relevant to software delivery?

Now let's come back to that Agile Framework process change. How do you test for Agile Framework compatibility? I can take a look at the Agile manifesto tenets. However, some of the statements there are not testable, some of them are falsifiable and the rest is a list of working practices, i.e. methods of doing something and not the results, which makes them hard to test. I don’t know if this is true for all Agile Frameworks, however most of them seem to suffer from the same problem: they focus on the method (how something should be done) and not the result (when an action is performed it produces a consistent result):

“The increasing adoption of agile practices has also been criticized as being a management fad that simply describes existing good practices under new jargon, promotes a one size fits all mindset towards development strategies, and wrongly emphasizes method over results” - Wiki

It seems that Scrum, XP and other Agile frameworks are collections of working practices that seemed to have achieved good results in the organisations that they were rolled out in at the time. This can be a good thing, as companies can take a process off the shelf and just get started. In reality I have not worked for a company that has rolled out a pure form of an Agile Framework, there are always variants. Personally, I have seen some really good results by rolling out Agile and Lean working practices (even with the variants). However, I have heard of companies that have never seen their Agile process change investment pay dividends or, worse, it has ground their operations to a halt. One might say that in that case the issue is with the leadership of the organisation: leaders are not bought into the Agile framework and this is the cause of all of the problems. I don’t know if that is true. At some point you have to take a step back and just wonder why so many Agile Framework deployments fail all over the place. When you take paracetamol, it works (for most people). Pharmaceutical companies do not say that patients who are not getting better are wrong and that their drug works in all circumstances. Surely the problem is that the solution that is being offered is wrong, or is wrong for that company? Industry experts seem to be prescribing the same solution to everyone and then, when there are no results, they point the finger at the leadership and never at the panacea. So why are we so fixated on the ceremonies, shrines and dogma of how we work, instead of focusing more on what actually produces consistent results?

Even if things don’t go wrong during the Agile Framework roll out, once the roll out is complete, then what? You want to instil a culture of continuous improvement, but you are out of book, there are no more agile tricks left up your sleeve. How do you achieve further improvement? If you start to implement changes, how do you know that they are compatible with your existing Agile framework? Are your internal Agile purists (years ago I was one of them) going to become dissatisfied that you are changing this pure roll out? How do you know that your changes will improve the process and not make it worse? Also, what about the rest of the company, how are they going to get better and integrate with your department’s Agile process?

It seems that there are three problems with Agile Frameworks:

They rarely get rolled out in pure form as organisations fail to fully embrace them
They don’t tend to consider the whole organisation, just a part of it
Eventually you run out of the book and you are on your own. Your investors will expect you to improve your operations further, further changes are hard to make as there is nothing to guide you

Companies should pick an Agile Framework (after all you don’t need to reinvent the wheel) that suits their organisation and once they hit a wall (and you have genuinely rolled out as much of the framework as possible) they should refocus on to the results. This might feel scary, this is because as humans we derive comfort from being told what to do by industry experts, conformity and sometimes superstitions. This is where that scientific and critical thinking can help you cut through the noise and help you to make your own decisions. Decisions that will help your company and your unique circumstances.

When you say we are going to change the process from A to B, how do you know if this change will enable faster delivery? A proposed process change should have some of the following characteristics:

Removes number of handovers required to get something done
Shifts left work by enabling people upstream to get work done themselves earlier in the process
Reduces wait time to get something done
Reduces amount of disruption
Reduces the constraints
Improves individuals’ domain knowledge, morale or skill
Reduces unknowns, complexity or risk

Ultimately, if you are not sure, you can try the change and test it quantitatively by seeing if cycle time and lead time have reduced or increased. The main thing is that you can verify or refute a process change, as the ultimate test is a reduced average lead time.
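A minimal Python sketch of that quantitative test could look like this. The dates and the helper names are hypothetical, and two work items per period is far too small a sample for a real decision; it only shows the shape of the comparison between the period before the change and the period after it.

```python
from datetime import date
from statistics import mean


def lead_time_days(committed: date, delivered: date) -> int:
    """Elapsed days between committing to a work item and delivering it."""
    return (delivered - committed).days


def average_lead_time_days(items):
    """Average lead time in days over (committed, delivered) date pairs."""
    return mean(lead_time_days(committed, delivered) for committed, delivered in items)


# Hypothetical work items: (date committed, date delivered).
before_change = [
    (date(2020, 6, 1), date(2020, 6, 15)),
    (date(2020, 6, 3), date(2020, 6, 13)),
]
after_change = [
    (date(2020, 7, 1), date(2020, 7, 9)),
    (date(2020, 7, 6), date(2020, 7, 12)),
]

difference = average_lead_time_days(after_change) - average_lead_time_days(before_change)
verdict = "reduced" if difference < 0 else "not reduced"
print(f"Average lead time changed by {difference} days: {verdict}")
```

A negative difference supports the process change and a zero or positive one refutes it, which is exactly what makes the claim testable.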

Please take note, when it comes to results over methods, it is important to establish organisational values that will create boundaries of how far you will take the results over methods approach. Last thing you want is behaviour in your organisation that does not align with your organisational moral compass as "anything goes" to get the results, this is where your company culture needs to step in.
