Sunday, August 14, 2011

Technical Debt: Consider Operational Requirements

Technical debt

The term “technical debt” has been popular for a while and has been used to refer to varying scenarios, not just in software development but across the IT industry. It was originally coined in 1992 by Ward Cunningham to help us recognize that quick and dirty work can set us up for increased future effort. One of the most overlooked forms of technical debt is created when operational requirements are excluded from the product development life cycle (PDLC), particularly in the project planning stage.

In many companies, operational support is not considered part of the PDLC. Operational and project teams are often completely decoupled at the lower and middle levels of the IT department, only dovetailing because both divisions report to the CIO or VP of IT.

The disconnect

This disconnect is created when communication is limited between IT operations and the rest of the business, including development. When the business defines objectives and timelines for a project with little or no interaction with the operations team, the operational requirements needed to support various projects are captured in an abbreviated form or left out entirely.

The project team might not even know that operational requirements exist until the code is almost ready to go live. It isn’t unusual for operational needs to be unearthed after code is deployed. Sometimes, these needs only come to light when issues are encountered because the existing infrastructure does not have the capacity to support the functionality or customer load created by the newly deployed code.

Case in point

I recently participated in an emotionally-charged meeting in which operations, development and business stakeholders discussed the root cause of the failure of a client-facing system. The project team was angry with the operations team for not supporting their code in a highly-available (HA) environment. The operations team was defensive, stating that they didn’t have the time or resources to address their backlog, including building an HA environment.

The project team had specified an HA environment in their runbook as part of their handoff to IT operations but the business didn’t prioritize the work against existing operational work. The HA system was acknowledged by the operations team as a requirement necessary to adequately support the deployed code, prioritized it and assigned the creation of an HA system to an operations resource, but the prioritization and operational timelines were not aligned with project milestones.

The HA system’s completion was not considered release-gating by the project team, so the code was deployed on an inadequate system and the project was declared complete. The customer-facing system failed and recriminations flew.

Unmet expectations

The project team expected their deployed code to be considered part of the production system, with accompanying up-time SLAs. The IT operations group was working on the HA environment, but didn’t have dedicated resources on the project, so its completion was a long, drawn-out process. The same people working on HA were also doing regular maintenance tasks, working on systems for other projects, and putting out fires.

The key point of failure in this conflict was in not consciously determining the value of the various requirements in the project. Although the work was in the project’s prioritized backlog, a good process wasn’t in place to prioritize the project work against other operational work or to align timelines.

Miscommunication

Miscommunication started right at the beginning of the project, when people from the operations team were not included in the planning process. It was exacerbated when the code was permitted to launch into a non-HA environment. It really snowballed when the project team went on to the next project without fully completing all requirements in that iteration of the product development life cycle. The operations team pushed code live without ensuring that all project requirements were completed, and then repeated the pattern over multiple releases.

After just a few project cycles, it becomes clear that the decoupling of operational requirements from business requirements within consecutive projects creates a backlog for operations that can’t be surmounted without drastic remediation efforts.

Finding a solution

Fortunately, this problem has a simple solution. It’s not easy, because it requires being realistic about the true time and resource demands of a PDLC. Paying attention to operational requirements can initially slow project completion rate or require the addition of people to your operations team.

Pay attention to three key points to prevent a buildup of technical debt as a side-effect of project work:

  1. Make operational requirements release-gating.
  2. Don’t launch on temporary environments.
  3. Bring operations people into the project early and often. When possible, make them an integral part of the project team.

Projects are not complete unless the systems on which they live are complete.

Jen Stone Browne
www.jenstonebrowne.com

This post is a re-work of an idea I originally created for http://leantechnologytransformation.blogspot.com. Check out that blog for great ideas by Patrick Phillips, a skilled Lean and Agile practitioner and coach.

For more on technical debt, see what Israel Gat has to say at http://theagileexecutive.com/tag/technical-debt/.

No comments:

Post a Comment