Agile Enterprise and Lean: 2014

Monday 24 March 2014

How is the next Jenkins looking? delivery pipeline and cloud!

Jenkins is one of the favorite tools for orchestrating deployment and releases tasks, along with Bamboo from Atlassian. Jenkins begun its story in Sun Microsystem back in 2005 at java.net as Hudson project. Its aim is to easily build the code written by programmers and integrate their works with revision control (SCM). Bamboo is also a nice tool to coordinate release and deployment and they both are based on the Continuous Integration premises of bulding multiple integration on a daily basis.

Even though it does help setup and coordinate some of the automated tasks related to Continuous Integration, it does not embrace entirely the Continuous Delivery vision to use the pull mechanism (customer driven) instead of push mechanism. Due to its nature as Continuous Integration tools, Jenkins is task driven, not flow driven.

Continuous Integration (CI) is a predecessor to Continuous Delivery (CD). The basic principle of CI is simple: “members of a team integrate their works frequently, at least once a day.” The CI method will enable multiple integrations per day at every commit. That is the reason why Jenkins, with its CI philosophy is task driven, because it imposes the frequent update from team members.
Continuous Delivery, on the other hand is an extension of Continuous Integration. Started in 2010, the idea of CD is to move forward and focusing on customers need. Therefore the release in Continuous Delivery is driven by customers, rather than driven by developers like Continuous Integration. When we speak of lean software development in its true form, CD provides the lean environment and continuous improvement.

Today, we have seen some new tools are coming to the market. Let me make a guess at how the next CD tool will look like:

Provide a visual representation of the delivery pipeline
Look sexy.
System Environment aware
Cloud and virtualization aware
Handle manual decisions and manual triggers
Consolidate reports and link them to the right release
Configuration management aware.
Open-Source.

VISUAL REPRESENTATION OF THE DELIVERY PIPELINE

The next generation CD tool must be able to represent the entire Delivery Pipeline: all the way from commit, integration testing stages, up to production.

(visual above from Nhan Ngo)

It will have to represent the different stages of the pipelines, and the tasks being run in each stage.

Jenkins does have a plugin for that. But like everything else with Jenkins, it looks rather ugly and is badly integrated with the other plugins.

LOOK SEXY.

The next generation CD tool must look awesome, no excuses!
It needs to appeal to business analysts, product managers, project managers, testers alike, not only to developers.

SYSTEM ENVIRONMENT AWARE

Jenkins only knows about slaves. But an environment is rarely comprising of only one server.
A good CD tool must understand the concept of the division between production and development environment and allow one to define on which node of the environment the task should run. Here is how an environment might look like:

On top of this, the tool should be able to scale up and down the number of environments, depending on how much code change is pushed through the SCM.

In other words: more environments when the developers are making a lot of changes, less environments when they are sleeping (hopefully at night).

I envision this as maintening a pool of environments:

when there is less commits happening, decomission the environments (and the servers)
when there is more commits happening, comission new servers from templates and setup the environment
make sure there is enough environments warmed up and ready to be used (i.e. 5 in standby, max 20 running at the same time)
allow one to define the roles of the servers in the environments and what task can be run in each server

Those environments will be used mostly for integration and automated testing: regression testing, functional testing, performance testing and so forth

CLOUD AND VIRTUALIZATION AWARE

From the paragraph above you can deduct that the environment will need a Cloud platform. Otherwise at least it needs virtualized resources available on demand.

The CD tool needs to be able to communicate with the Cloud provider's API, in order to commission and decommission servers.

HANDLE MANUAL DECISIONS

Jenkins does not have a concept of manual decision. In a way it makes sense. It's a tool for automation only.

However, Jenkins’ automation of all tasks leads to miscommunication. It is difficult to answer the question of how to track where the location of the build is in the delivery? And whether it has passed the manual gates: decisions, manual testing, exploratory testing, and demo?

The next CD tool must visually assist decision-making and provide full complete history of build-test-release.

CONSOLIDATE REPORTS

The next generation CD tool will be able to consolidate everything and every bits and pieces of information about a release

functional tests reports
performance tests reports
code coverage
commit history
change log
etc

Unfortunately, Jenkins does a poor job with that. One has to dig into every single job in order to know everything that has happened to a build.

BE CONFIGURATION MANAGEMENT AWARE

Tools like Puppet and Chef have come to the market, but unfortunately they don't play well with Jenkins. We must integrate the CD tool to the one that manage the infrastructure, environments and servers.
Wouldn't it be nice if it could understand Chef Recipes and Puppet Manifests ?
What if you could create a new environment from Puppet recipe, and injecting it into your Delivery Pipeline.

Eg: I want to create a new delivery stage for exploratory testing and commission an environment that is similar to the one in Production environment. That stage must be placed right after performance testing. I can use that Puppet repository to find the description of such environment that I want.

In an ideal world, one could copy an entire delivery stage, and create a new one, and do the same for environements.

OPEN-SOURCE

No organization wants to be locked in a specific vendor or consulting firm. A CD tool is way too strategic to take that kind of risk.
Open source is what Continuous Delivery experts and all people in Continuous Delivery community want.

THE PROMISING TOOLS

I will personally monitor Go from Thoughtwork and Legato from CloudSideKick.
Here is a little bit about them:

Go has the ability to automate the cycle of build-test-release and providing a single united platform for business people, development team and people in the operations. Go came from Thoughtwork, the same organization whose people coined the term Continuous Development in 2010. Thoughtworks is also the same organization who developed CruiseControl in 2000 that begin the testing automation in Continuous Integration.

Legato is also a CD tool with its cutting edge approach. It is based on the CloudSidekick experience of continuous integration and continuous delivery in their development environment. Therefore it has embodied the soul of automated build-test-release cycle, because CloudSideKick have been working using CI and CD for 15 years. Legato also has the same focus on 3 issues: pushing code through multi-stage pipelines, testing code automatically in virtual environments, and eventually publishing reliable code faster.

Therefore they both will offer some of the above characteristics of the new CD tools.

CONCLUSION

The next generation of CD tool will focus on the delivery pipeline, embrace the cloud and look nice.
In fact the CD tool must document the process itself. In a single view, one must be able to understand the process in place.

Wednesday 12 March 2014

Costs and risks benefits of Continuous Delivery in one picture

In this article I wanted to post a very simple diagram. I'm not the first one to illustrate that concept. And it's such a good way to explain the differences between traditional software delivery and Continuous Delivery, that I wanted to share it with you again:

Share it if you like it!

Why should your organization Automate and adopt Continuous Delivery ?

Waterfall projects consists in sequential phases. Notably: Development and Release phases.
While the first has often been carefully planned, the second is a lot more hazardous and unpredictable.

(scroll down to the bottom of the page if you are only interested in the big picture)

UNPREDICTABLE INTEGRATION AND RELEASE PHASE

Most of the time, things start to go wrong in the integration, deployment and release phase (in red below). Having waited months to put things together, technical difficultes arise:

integration issues
deployment issues
dependencies and compatibility issues

On top of this, there is potentially a lot of rework to be done, once the team starts receiving feedback from testers. It is always challenging to release a huge chunk of work all at once:

As we can see in the image above, the development phase (in blue) is planned and limited (in scope, cost or duration). The second phase (in red) is almost impossible to predict. It consists in the following activities:

manual deployment
manual testing
fixing integration issues
re-deploy
re-test
go back to 1 until good enough
finally release to production

This phase can be short (rare) or extend undefinitely. The history have seen integration phases take longer than the initial project duration.

THINGS GET BETTER WITH A LITTLE BIT OF AUTOMATION

What about automating some of the steps above ?
In theory everything can be automated:

As you can see things got better with automation:

the deployment, testing and releasing phase always takes the same time
most likely the process is much faster: it is possible to increase the amount of computing resources to go even faster
the process is repeatable and predictable

ADOPTING CONTINUOUS DELIVERY

What if your organization decided to release often, small changes, rather than going for a big-bang release ?

Here is what is happening:

the development steps (in blue) consist in small changes developers make to the source code or configuration
for every code or configuration change, a deployment and automated test is triggered
depending on the organization, the change might even get automatically deployed into Production

THE BIG PICTURE

To summarise all of the above, here is a complete diagram:

Friday 21 February 2014

Defect discovery and risk mitigation in Continuous Delivery

Barry Boehm, the inventor of spiral model of software process and Constructive Cost Model (Cocomo) in 1976, argued that defects are more expensive to fix when they are found in the later stage of software development. His concept is later developed into Cone of Uncertainty (CoE) in his next book “Software Engineering Economics” (1981). The cone of uncertainty concept is simply represented on the diagram below:

The basic premise of the concept is that uncertainty evolves and grows as project enter the next phase. Boehm’s spiral method and Cocomo is to anticipate the risk and uncertainty in software project. For many years, the software development has revolved around the Boehm premise of software development economics.

On the other hand, Laurent Bossavit, author of The Leprechauns of Software Engineering (2012), said differently. Bossavit argue that the “underlying evidence justifying Boehm’s curve…just isn’t up to any reasonable standard of ‘research.’” (see "What does it really cost to fix a software). Bossavit, who also the head of Agile Institute, noticed that Boehm misinterpreted the result of his own studies. Therefore we can question the validity of the concept. However we know that the the risk and uncertainty do exist in the software project. So the most important thing is to deal with the risk and uncertainty.

FOCUS ON IMPACT AND RISK MITIGATION

I suggest not to argue about the exact shape of the curve, neither on whether there is any scientific method involved, but rather to identify the impact of a change and to start identifying some work around and mitigations. Especially in case of software defects (also known as bugs). It is because we need to assess the impact and mitigate the risk. As in other fields, we also cannot avoid risk in software development. However, we must be prepare to encounter the risk. Therefore we have to be able to assess the risk and know the risk we face and mitigate the risk. The major risk in the software development is to be found on the transition stage from testing to deployment. In order to ensure successful deployment in the production server we have to minimize the risk in error. The code error will become defect in the deployment and lead to system failure in the production environment. We can minimize such failure by eliminating the defect from the code error.

DEFECT IMPACT MATRIX

Let us begin with assessing the risk correctly using the defect impact matrix below. At the diagram below I suggest a matrix representing the impact of a defect. Instead of representing it over time though (like Barry Boehm did), I use the delivery stages on the horizontal axis.

Indeed, defects can be detected at a different development and release stages. The impact differs depending on how early or late it’s been discovered. Defect is the outcome of the risks that are not properly identified and assessed, therefore when the risk occurs the organization does not prepare to monitor and mitigate risks.

The content of the matrix is to be adapted to your organization. In fact the one I propose below would describe an organization with rather average software delivery processes:

(Score A to F, A being the better score, F the worse. )

We could implement continuous delivery such as canary deployment or blue/green deployment method to reduce the lowest score, for the following score:

F(1) score could be mitigated by offering the new version of the product to only a subset of your user base (ie 5%). Also called Canary Deployment.
F(2) score could be mitigated using advanced production deployment strategy of the 4th or 5th level of Continuous Delivery Maturity such as: Canary deployment or Blue/Green Deployment

Let us talk a bit about two deployments we mention above: Canary deployment and Blue/Green deployment.

Canary deployment is the deployment where software is tested in production level by routing a subset of users to new functionality. This deployment is important to test how the changes affected the users in general, but the entire system is not affected. Since the canary system is only tested to some users of the system not the entire users.

Blue/Green deployment is the deployment method of creating two identical production environment, this method provide you capability to rapidly roll-back to another environment when anything goes wrong with one environment. Blue/Green deployment enable you to switch the production environment from one environment to the other environment by rerouting the application request. The following diagram explains the Blue/Green environment.

ADOPTING ADVANCED CONTINUOUS DELIVERY AND AUTOMATION PROCESS TO MITIGATE THE RISKS

In his book, Continuous Delivery, Jez Humble and his co-author David Farley suggested a Continuous Delivery maturity level. The maturity model ranges from 0 to 5 where 5 is achieved only by industry leaders such as NetFlix, Twitter, Github and a few more.

The matrix above could be representing an organization that has a well defined and reliable software development process. However organization can improve their maturity level by automating more steps and using advanced deployment and release processes, combined with Agile methodology. We see that in the production stage they can improve the quality of product by using automation process. Either using Canary Deployment or Blue/Green Deployment or both of them, we can correctly assess the risks and able to mitigate them hence improving the process.

We could imagine an organization going from CD level 3 to 5 in a matter of a year and reaching the following result:

As you can see, the organization is performing much better above. Discovering defects late, even in production, is not catastrophic and has almost an insignificant impact.

Therefore, it is better for companies to not hesitate to embrace those risks, keep up with fast pace innovation, while having a way to mitigate and remediate every single risk they can encounter. We cannot avoid risk, because risk is already there, consequently we have to be able to focus on the job and progress and be prepare to deal with the risks associated with software development.

REDUCE SOFTWARE RISK, INCREASE QUALITY AND REVENUE

Once again, the content of the matrix is subjective. I just want to point out the benefits of Continuous Delivery, Automation of processes and Agile methodology.

Combining the three of them together, will allow an organization to improve drastically, keep the customer happy, innovate, decrease risks and increase revenues through well managed technologies.

Wednesday 12 February 2014

Do we really need task estimation in Scrum

I came accross this question on the LinkedIn Scrum Practitioner group, asked by one of the Certified Scrum Master. (http://www.linkedin.com/groups?home=&gid=52030)

Why we have two level of estimation in scrum? What purpose do task estimates serve?

This topics pinch me as I've been pondering about that for a little while:

why do we do tasks estimation and is it a good idea ?

COMPARIN FOCUS FACTOR AND TASK ESTIMATION

I also came accross this blog post recently, from one of my acquantainces:http://www.agilebuddha.com/agile/how-to-do-effective-capacity-planning-on-the-scrum-team/

The author explains the process of calculating team capacity for a given Sprint using Focus Factor, instead of total number of hours available.

For 5 people working 8 hours a day for 2 weeks :

That means the total working hours will be 5x8x10 = 400 hours!

Estimated planning for this capacity will be a disaster. It will lead to team working over time, rushing towards the end, quality cuts and low team morale.

Now let us take into account the Focus Factor of 6 hours per day:

Traditionally, project managers used 6-6.5 hours as planned hours in a day for project execution. Focus factor is team ability to remain focussed on the sprint goals without any other distractions.

The idea here is to sum up all the hours and find out how big the user story is.

Based on the capacity calculated above, you find out how many tasks and user stories can fit into a given the Sprint. Except that such estimations are far from accurate. On top of this, using tasks estimation instead of user stories does not make it more reliable.

Now come the question: Is estimating tasks in hours a good idea at all?

PROS AND CONS FOR TASK ESTIMATIONS

Let's see what people on the Scrum LinkedIn Practitioner group have to say.

PROS:

So here is my top list of arguments for those in favor of tasks estimations:

Task level estimation is used in the daily burndown. It is a great way to show the team where they are each day. It may take more than a day to complete a user story and there isn't a way to see the progress on that story unless you track the task hours.
Tasking helps the team to plan the actual work and provides transparency.
Task estimation allows progress tracking, and supports decision making. Will the stories really fit in the iteration (i.e. a sanity check)?
When it comes to sprints, burning down story points may lead to very large steps where it appears nothing is done, then suddenly a large drop. This frightens people and makes them feel uneasy as generally people want to know if they are somewhat on track.
If the developers are highly experienced and mature as an agile team, they can get away with relative sizing using story points. But as this is seldom the case (most teams consisting of mixed abilities/experience levels), task breakdown is welcome.
From a team perspective, the main reason to have estimates is to be able to detect when things start going off track. It tells us that we need to look and possibly make a decision to go back on track

CONS:

And here is my top list of arguments against tasks estimations:

Dual estimation using points and time is overkill (most of the time). You have to keep in mind that both of them are estimates, thus a guess on how big or much time/effort will be taken up with them.
The major risk of doing task estimation is it leads to falsify direct relationship, that initiate people to start to see between story points and time (time sum of all tasks = story point value of story).
Do we already have a time measurement in scrum: the sprint timebox. In order to find how many story points can a sprint fit in, we have to find the answer by doing, not by guessing
In order to start a sprint you only need enough work for a few days in advance. You don't need to break down everything from the beginning
There's a reason many scrum masters forego the time estimations (on any level): they are imprecise. If your story point estimation is not useful enough to draw a reliable velocity, then definitely the time estimates won't provide more details.
As a very rough indicator I would say each person in the team should be completing 2-4 items a day and marking them as done. More than that is too much administration which will slow the team down, and completing items less than that is too big and does not facilitate good communication and getting stuff done.
The Scrum Guide never uses the word task. It does use the word estimate in combination with product backlog items. So yes, according to the Scrum Guide there is estimation of Sprint Backlog Items
Scrum actually does not define how or when estimation should be done, but merely state that there is an estimation.
The more time you spend on estimation, the more it highlights the dysfunction of pressure which results in technical debt.

From the above, we cannot be convinced estimating tasks is something developers should systematically do. Let's see if there are alternatives solutions to support story sizing and sprint forecasting.

ALTERNATIVES OPINIONS

Here are some ideas, presented by LinkedInders that can be taken into consideration:

What I favour doing is breaking tasks into similar size, approximately half-day to day-long chunks (day long is definitely the upper limit IMHO). Then I look at burn-down by counting the number of tasks yet to complete. I don't care whether they are 4 hour or 8 hour; I don't even want to try and estimate to that level. It all averages out.
If you have a lot of very small stories where each one takes a couple hours and there are not many tasks on each, then by all means burn down the stories. If however, you have only a few stories and lots of task, burn down the count of tasks.
Task estimates are about developers planning technical work. It is less than an out-facing estimate, and more a way to plan how to approach each task, what would "normal time spent", and when to go for help.
Story sizing is not really about creating estimates. Ideal days, story points and relative sizes do not directly translate into hours and days. These are more for creating an understanding for the PO and the rest of the organization how much work each piece of the backlog entails.
We live, eat and breath by the clock and estimating the time is so ingrained in us, that we naturally want to know and predict the future, I would say the majority of people and as I count myself here, I cannot resist estimating in some form or another, even if it is a 2 second gut feeling.

MY PERSONAL OPINION

Below is what I answered to my peers on the topic. This is what I learned from 3 years doing tasks estimates:

In my opinion, when we do task estimation, we tend to put too much power into the estimations. The maths are so easy that we start believing in them: 4 tasks of 2 hours + 3 tasks of 4 hours + 1 task of 8 hours. = 28 hours! Easy, right ? No… Most of the times it never happen , even when we try to use buffer techniques it still never works out

And the PO or SM to say:

so that fits into the sprint, right ? if we look at the hours...

The answer to the question is not. Let us answer the question regarding the team: do you believe you and the team are able to achieve all this into a sprint ? Put away the hours and tasks, let us use our gut feeling. We will know that the sprint length is the only valid time period.

Although I must say breaking down a story into tasks has been a useful exercise to brainstorm and think about the solution. It might not be something we want to do all together though. Another approach is for a couple of developers to spend a couple of hours doing solution design.

I've been doing tasks estimations for so long (3 years) and I've stopped believing in it. It is often a waste of developer's time, and leads to frustration:

is it 3 hours, is it 2.5 hours ? or maybe 3.5 hours ???

Ok, let's simplify, only 4 and 8 hours tasks are allowed. Even that does not feel right: we ask hours estimations and then start to round up and put constraints on what should be a very simple task.

One thing I understood: if you depend on task estimations, maybe your user stories are too big and need to be broken down.

In the end I recommend no more than 5 user stories sizes. Something like:

small
large
xlarge
unknown : might lead to a spike for investigation

DON'T ALWAYS DO TASK ESTIMATIONS, INSTEAD BREAK DOWN YOUR USER STORIES INTO SMALLER CHUNKS

Here are, my take on the topics above based on my experience with task estimation:

Do task break down as a team if it helps brainstorming and designing the solution
Do not systematically do task estimation (putting too much power into task estimations might actually lead to technical debt and lower quality)
Use small user stories, not epics: a user story is an entity that can be tested independently. For example, replace "As a user I want to manage products" with 3 stories: "as a user I want to be able to create products" + "as a user I want to be able to edit products" + " as a user I want to be able to delete products"

What about you, would you use task estimation ?

Thursday 30 January 2014

Cost of manual VS automated Releasing in Scrum (Agile)

Due to the nature of Scrum, which is an iterative and incremental process, it is required to complete the user stories by the end of the so-called iteration (also known as Sprint). In theory, at the end of a sprint, a shippable product is made available. It can be released to production. This means the software has passed successfully all the quality gates in place at that time.

However, in order to complete and effectively close the User stories, the features (or improvements or bug fixes) need to pass all the tests. On top of this, the team must know how to release that software to Production (or to the client).

In other words, and the end of the Sprint, the software can only be:

Potentially deployed
Potentially packaged
Potentially released
Fully tested against the acceptance criteria (whether it’s manual or automated testing, the latest being the recommended option).

In order to achieve releasable and working software, we can consider two options:

Finish the sprint and test/release after the Sprint (deviation from the Scrum process)
Test and release during the Sprint with the goal to complete those tasks by the end of the Sprint (typical Scrum process)

USING A TESTING AND RELEASE PHASE BETWEEN SPRINTS

The first option implicates that the features won’t be tested before the end of the sprint but rather after that. The team will not be able to close the user stories immediately. This is an alternative if the deployment and testing process is not automated and consumes some resources from the Scrum team.

During this transition phase, between two Sprints, the developers and testers will work together to discovering bugs, fixing them, deploying a new version and so forth (until acceptance criteria are met). This process is represented in the diagram below:

Note that the subsequent releases might take longer as the amount of non-regression tests increases (the system comprises of more and more features). Although the manual release process is well known, the testing might take more and more time:

We can identify some issues with this process:

The sprint cannot be closed, as the user stories have not been tested. Therefore the software is not shippable at the end of the work iteration.
Issues and bug will be discovered late after the code was written, increasing the time needed to fix it
The team is busy doing manual deployment and fixing bug rather than working on the next Sprint
The release process takes longer as many more features are added and need to be tested again over and over again.

A better solution is to automate both the automation of testing and the deployment process. This is a required practice of Continuous Delivery.

TESTING AND RELEASING DURING THE CURRENT SPRINT

Once the tests and deployment process have been fully automated, it is possible to continuously deploy and test as soon as the code gets changed. Developers can therefore focus on designing and implementing features. Stakeholders, product owners and testers will have access to the latest version in a Testing environment, soon after the code has been written.

That way, the deployment and testing are embedded in the sprint. See below:

The benefits of this solution are:

Bugs are discovered much earlier and easier to fix
People are busy working on value added work (designing, coding) rather than manually deploying
Repetitive tasks are automated, reducing the risk of errors
A shippable product is available at the end of sprint. It might or might not include all the new features, depending whether the team had done a good job at estimating and scoping the Sprint backlog.

The drawbacks are:

The automation of the testing and deployment process requires an investment in time and energy. As a result, team will have to acquire new skills for that purpose.
The amount of work achievable during each sprint will be less, taking into account the effort to automate the tasks
There is still no guarantee that all the features will be completed, tested and shippable by the end of the Sprint. We can only ensure we have the most efficient process to achieve the same.

CONCLUSION

Due to the nature of the Iterative and Incremental approach of most Agile process (including Scrum), it is necessary to automate the testing and deployment tasks, in order to deliver shippable product at the end of each iteration (Sprint).

Then only the business is able to predict what will be delivered at the end of the next iteration, while keeping the cost of releasing and testing low, even in the long run.

Wednesday 15 January 2014

Kanban, Scrum and Continuous Delivery: Where do they fit in?

Since the day it began, Scrum is well known for short term project delivery. The strength of Scrum is in its speed and flexibility, much like a sprint running. Within this philosophy, Scrum development process is even called Sprint.

On the other hand, the Continuous Delivery (CD) is known for its long distance running that keeps on producing the code on the pipeline through several “pit-stop” before it achieves the state of “releasable.” CD is well-known for its high-productivity.

The continuous delivery, as the name implies, produces the codes continuously, regardless the code will be implemented in the software or not. Meanwhile Scrum only produces the code which listed on the backlog (Scrum initial documentation).

Let us see how they fit in together and deliver the product that focus on customer as Scrum emphasizes, with the Continuous Delivery high productivity.

MAKE SCRUM WORK FOR DEVOPS AND OTHER CD TASKS

The idea of this post came after reading the following article:

"How does a Scrum team account for infrastructure tasks in the planning meeting?"http://programmers.stackexchange.com/questions/61529/how-does-a-scrum-team-account-for-infrastructure-tasks-in-the-planning-meeting

WHERE IS THE BUSINESS VALUE ?

The link above talks about Infrastructure and how Infrastructure tasks can fit into Scrum. The truth is that they probably don't.

Scrum focus on deliver value to user. Therefore, any work that does not deliver directly any value to the user, but is nonetheless key to a successful delivery of working software, might not fit in Scrum.

This is due to the Scrum backlog that requires a user story to be told and the development will run based on that backlog.

SPRINT COMMITMENTS TO AUTOMATION OF PROCESSES

In order to understand the issue, we need look back at the ideas of Sprints and commitments. It is a tool to allow the business to prioritize the user stories. And they get to know what will be deployed to Production in the next iteration. In other word, they know what value will be delivered to the business and the users.

We can question though the purpose of committing to sprints for works that do not deliver direct value to the business or the users.
For example, automating the deployment of the component X is necessary but it is hard to assign a business value to the component, since component X is the value itself.  So the only advantage to using Sprints, in this case, is to keep the illusion that we are keeping track of the progress.

SPRINTS ARE NOT FLEXIBLE ENOUGH WHEN THE SPRINT IS RUNNING

Another issue can arise: sprints are not flexible enough. As a matter of fact, they are not meant to be flexible. We only develop user stories that are in the Sprint, without add anything, nor remove anything.

However, sometimes, something goes wrong. For instance the deployment is broken, the testing framework has become too slow or a new process needs to be automated.

WHY PLANNING THE UNPLANNABLE ?

The issues listed above cannot always be planned during a planning session. There might be unplanned situations arise, for example:

We are not sure yet that we are going to introduce a new component during the Sprint, and the deployment of this component needs to be automated
Developer adds 100 new automated tests and suddenly the test suite is running too slow. It takes an hour to get the test reports which becomes a major bottleneck for the team (which relies 100% on the tests before releasing)
An automated deployment that use to work stops working. It seems a new version has broken the build, but nobody saw it coming

In some cases, we don't know yet what is going to come in the next 2 or 3 weeks at an automation and delivery level. Most often, it is not worth delaying any further and waiting for the next Sprint planning.
The Continuous Delivery mechanism works like a car factory. If any of the operation of the assembly chain were to fail, the entire production machine is stalled. Why should you keep producing cars when you cannot put all the screws together?

In the TQM (Total Quality Management) which adopted as Lean Management in America, when the assembly chain noticed an error, there is only one solution: stop the production, fix the error and then resume your work. This is called Poke Yoke, which means error-proofing in Japanese language. The idea of this method is to stop the production when error is noticed, quickly fix the error before the error becomes the defect and defect lead to failure. Using this method, we are able to reduce the severity of error and reducing the amount of cost spent to correct the codes.

In software development, the implementation of Poke Yoke is: stop coding, fix the deployment and resume your work.

This Poke Yoke requires all code to be inspected before delivery to the next stage (pipeline) of Continuous Development. Normally the CD pipeline consists of: Coding, Unit Testing, Integration, Acceptance Testing and Deployment.

Note that it's not recommended to let developers write too much code when that same code cannot be tested. Nobody wants to have to do big-bang commit and testing once everybody resumes their work.

HOW DOES ONE PHRASES A "TECHNICAL USER STORY"

Another time when developers struggle is the Sprint planning and phrasing technical user stories.
Some bad examples:

As a tester I want to make the test run faster so that I get faster feedback
As a sysadmin I want to refactor the script that automates the configuration of our web servers

User stories are great because they start with "as a user" and force you to think about what you are going to build for the user.

However, when user stories do not serve directly the user, you will have hard time writing it. This is the time to stop writing that kind of user stories and find solution to “technical user stories.”

POSSIBLE SOLUTIONS

"HIDE" THE WORK IN USER STORIES

The first option is to "hide" CI/Infrastructure tasks into other user stories.

Say we need to need to automate the deployment of the component X and Y. We can add a few more tasks in User Story X, and maybe a few more in the story Y.

Unfortunately, that approach does not always work:

you are wasting your time doing this
your story Y is not independent and is waiting for story X

Since component X and Y is not a user, which we cannot make a story out of component. Therefore for some cases, this might work, but not for others.

A CONTINUOUS TECHNICAL USER STORY

For another project we had a continuous technical user story. The challenge is to explain the Product Owner why your team needs to spend 10 points on a technical story that does not provide any business value.

Unless the Product Owner used to be a software engineer, you will have a hard time justifying it.

A CONTINUOUS DELIVERY SUPPORT TEAM

In a recent Enterprise size project, we ended up having a "project support team" for any Continuous Delivery work. The CI/Infrastructure/Data Migration effort was so huge; so that we had to separate it out from the main development stream.

In fact we were running a data and infrastructure migration project as well as building an asset search website.

That Continuous Delivery team was autonomous and would reprioritize almost daily, depending on the new issues and the long terms tasks. Their goal was to make it possible to test and release the software at all time. They understood the priorities: serve first, maintain second, and improve later. They need to justify the value of works they have been doing by some management tools.

KANBAN

Kanban is a Japanese term for “Sign-Board.” It is a visual management tools that started in the 1970’s world-famous Toyota Production System (TPS). Its main purpose is to view, track and monitor the process in the production line, This production line in CD is called pipeline. In Agile world, Scrum adopted Kanban as one of their artifacts: Taskboard.

In Kanban board, it divides the status of Work In Progress according to the stage of the work. Starting from the Planning to Execution. In Scrum, the stages are User Requirement – To Do – In Progress – To Verify – Done.

The Kanban board will show exactly the job to do and the workflow instantaneously. A Kanban board is useful for more adaptability when urgent situation arises. In a way, the CI/Infrastructure is there to support the developers. One has to be ready to jump in and take responsibility. That means to make the situation as his priority to fix and improve (like support engineers do for other business units).

Kanban helps limiting the Work In Progress and allows one to prioritize the work nonetheless. Thus, when the situations arise, we can add the situation into the Kanban board so every team member can see the situation, and anyone can take action upon the situation immediately.

SCRUM IS NOT SUFFICIENT BUT ...

Scrum seems easy to adopt for pure software development tasks that are serving users. However, when we are facing a complex situation, we propose to implement Kanban as solution to deal with such situations, such as the following:

you are running a very technical project like Data migration or Infrastructure transformation
you have automation tasks that are required but the tasks do not serve the user directly
you have very technical tasks and are having a hard time phrasing user stories

Depending on the size of the project or the team, Kanban might be of a great help. In any case don't try to use user stories when you are not building a feature, but rather supporting the delivery of the feature.