
Guo Xiang (Harvey) Ng for AWS Community Builders

Originally published at aws.plainenglish.io


Breaking the Waterfall: Using Azure DevOps Boards for Agile AWS Infrastructure Delivery

Introduction

After a fair amount of AWS cloud infrastructure delivery work, I’ve noticed, and discussed with peers, a persistent challenge: while software development teams have widely embraced agile methodologies, day 1 infrastructure delivery teams often remain trapped in waterfall planning and project status reporting. Cloud operations teams, meanwhile, often operate in a reactive mode, responding to tickets and service requests with limited capacity for proactive planning or improvements.

The traditional approach to new infrastructure projects typically follows a sequential model: design the architecture, build the foundation, implement the components, and then operate the result. In today’s public cloud-native world, this approach has serious limitations. “Cloud infrastructure” needs to adapt continuously as new workloads are deployed, requirements evolve, and operational insights demand changes.

For tech consultancies, system integrators and small, lean cloud operations teams, many IT scopes can fall under a single squad rather than established siloed teams: security, networking, observability, DevSecOps, database administration, and infrastructure work such as systems administration, configuration management and container platform operations.

This article shares a simplified, generalized agile methodology that uses Azure DevOps Boards, specifically the Basic process template, to manage AWS infrastructure delivery. Whether you’re an infrastructure project manager used to waterfall and looking to adopt agile practices, or a cloud engineer wanting to introduce self-organization, I hope it gives you a starting point for managing the team and delivering results more efficiently and responsively.

Why Public Cloud Infrastructure Projects Need Agile Approaches

Infrastructure delivery has fundamentally changed with cloud adoption. Consider these shifts:

  1. Continuous Evolution: Infrastructure is no longer “built once and forgotten.” AWS environments continuously evolve as new services become available and as business/non-functional requirements change.

  2. Infrastructure as Code: With tools like Terraform, CloudFormation, and CDK, infrastructure changes can be versioned, tested, and deployed using the same agile principles as application code.

  3. Mixed Work Types: As mentioned earlier, cloud delivery/operations teams may need to handle planned infrastructure development, operational support, security improvements and platform feature requests simultaneously, across many IT domains.

  4. Cross-Team Dependencies: Cloud infrastructure/operations teams have complex dependencies on multiple application teams, security teams and third parties, all with different timelines.

These factors make the traditional waterfall approach increasingly ineffective. When infrastructure projects follow rigid phases, they struggle to adapt to changing requirements, can’t easily incorporate operational learnings, and create bottlenecks for application teams.

For me, the most relevant point is point 3, managing mixed work types. For a day 1 tech consultancy team, traditional project management approaches often fail to account for this mix: they focus only on planned development work and treat everything else as a distraction or as incomplete planned development work.

The approach here explicitly acknowledges all of these work types and uses Azure DevOps boards to visualize them, so the team can make informed decisions while management gets an accurate view and history of the work on the ground when required.

Azure DevOps Boards

While Azure DevOps has many features, including integration with version control systems, pipelines and artifact repositories, we want to focus on its project management and agile tooling: the Kanban Boards feature. Let’s set up a beginner-friendly board, and I’ll talk through my thought process on how to use it.

Create a new project and select the ‘Basic’ Work Item Process Template.

[Image: selecting the ‘Basic’ process when creating a new project]

There are currently four default processes (Basic, Agile, Scrum and CMMI). Basic is the most lightweight and is currently in selective preview:

[Image: the default process options]

The simplicity of the Basic process template (with just Epics, Issues, and Tasks) makes it accessible for infrastructure-focused teams who do not wish to follow Scrum or over-complicate the agile process, but need better tracking than spreadsheets or emails provide.

Epic-Level Organization

You can use functional or technical domain areas as epics, such as “Network Security”, “Infra Compliance/Hardening” or “Architecture”, but this creates never-ending buckets of work, which prevents meaningful tracking at the epic level. (Go ahead if the team is small and tracking at the issue level is good enough to get started.)

Otherwise, organize epics as time-bound, completable milestones. This approach maintains the essential agile principle that epics should eventually reach a “done” state.

Consider the following sample epic categories for either capturing AWS infrastructure milestones or value stream/capability additions:

  • “MVP Landing Zone Implementation” / “Establish Secure Multi-Account Foundation”

  • “Establish Infrastructure Provisioning pipelines and tooling”

  • “Production Network Deployment” / Epics that encompass the deployment of all resources for specific network zones (accounts/VPCs)

  • “Security Controls Baseline”

  • “Set Up Container Orchestration Platform” / “Set Up Kubernetes / Nomad Cluster Baseline”

  • “System Monitoring Baseline with CloudWatch/Dynatrace”

  • “Creation of baseline EC2 golden images” / Other ways to establish baseline server capabilities (e.g. remote access, vulnerability scanning, patching)

  • Consider a dedicated “Ongoing Operational Support” epic to capture smaller operational tasks, bug fixes, and workload support activities that don’t naturally fit under specific capability epics (just so those issues can have a parent)
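For teams that would rather script epic creation than click through the UI, the structure above can be sketched as JSON Patch documents for the Azure DevOps work item REST API. This is a minimal sketch rather than a definitive implementation: the organization name `example-org`, the work item ID `101`, the issue title and the helper `build_work_item_patch` are all hypothetical, and the code only builds the payloads without sending them.

```python
import json


def build_work_item_patch(title, tags=None, parent_url=None):
    """Build a JSON Patch document for creating one work item.

    The field reference names (System.Title, System.Tags) and the parent
    link type (System.LinkTypes.Hierarchy-Reverse) come from the Azure
    DevOps work item tracking REST API; verify them against your own
    organization before relying on this sketch.
    """
    patch = [{"op": "add", "path": "/fields/System.Title", "value": title}]
    if tags:
        # Azure DevOps stores tags as one semicolon-separated string.
        patch.append(
            {"op": "add", "path": "/fields/System.Tags", "value": "; ".join(tags)}
        )
    if parent_url:
        # A Hierarchy-Reverse relation points from the child to its parent.
        patch.append(
            {
                "op": "add",
                "path": "/relations/-",
                "value": {
                    "rel": "System.LinkTypes.Hierarchy-Reverse",
                    "url": parent_url,
                },
            }
        )
    return patch


# Hypothetical values for illustration only.
epic_patch = build_work_item_patch("MVP Landing Zone Implementation")
epic_url = "https://dev.azure.com/example-org/_apis/wit/workItems/101"
issue_patch = build_work_item_patch(
    "Create log archive account",
    tags=["security", "new-build"],
    parent_url=epic_url,
)

print(json.dumps(issue_patch, indent=2))
```

Each payload would be sent as a POST to the work item creation endpoint for the relevant type (Epic or Issue) with the `application/json-patch+json` content type. Authentication and error handling are left out here.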

Issue Classification and Management

Under each epic, issues represent the actual units of work. In the Basic process template, the “Issue” work item type is versatile enough to handle various kinds of infrastructure work.

Issues can be created in several ways:

  1. Team Self-Creation: Good engineers identify and create issues proactively

  2. Tech Lead Assignment: The tech lead/PM can assign an entire epic to good engineers who take ownership of it, or assign work at the issue level (please don’t micromanage by assigning tasks :X )

  3. Service Requests: Application teams can submit requests through integrated channels (*see below)

  4. Monitoring Alerts: Automated creation from monitoring systems for operational issues

The approach you choose should align with your team’s maturity level, culture and ability. Teams with experienced engineers who understand the broader architectural context can benefit from more self-management.

*Azure DevOps doesn’t natively function as an ITSM tool but it can be integrated with platforms like ServiceNow or Jira to automatically create work items from service requests. Other Microsoft native options include Teams, Forms and Outlook Email.
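As one illustration of the monitoring-alert path, the sketch below turns a CloudWatch alarm notification (as delivered through SNS) into the fields for a new Issue. It assumes the usual alarm notification keys such as `AlarmName`, `NewStateValue` and `NewStateReason`; the function name, the tag choice and the sample alarm are hypothetical, and wiring the result into Azure DevOps is left out.

```python
import json


def alarm_to_issue(sns_message):
    """Map a CloudWatch alarm notification (JSON string) to issue fields.

    Returns None for non-ALARM states so that OK / INSUFFICIENT_DATA
    transitions do not create work items.
    """
    alarm = json.loads(sns_message)
    if alarm.get("NewStateValue") != "ALARM":
        return None
    region = alarm.get("Region", "unknown region")
    return {
        "title": f"[Alert] {alarm['AlarmName']} ({region})",
        "description": alarm.get("NewStateReason", ""),
        # Tag choice is an assumption: alerts are filed as monitoring bug-fixes.
        "tags": ["monitoring", "bug-fix"],
    }


# Hypothetical sample notification, trimmed to the keys used above.
sample = json.dumps(
    {
        "AlarmName": "prod-nat-gateway-errors",
        "NewStateValue": "ALARM",
        "NewStateReason": "Threshold Crossed: 1 datapoint was greater than the threshold.",
        "Region": "Asia Pacific (Singapore)",
    }
)

issue = alarm_to_issue(sample)
print(issue["title"])
```

Filtering on the alarm state keeps recovery notifications off the board. Whether repeated alarms should update an existing issue instead of opening a new one is a design decision this sketch does not address.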

Organizing Work Through Tagging

It is worth exploring a simple dual-layer tagging system that enables filtering and organization by (1) technical domain and (2) work type.

Consider the non-exhaustive example list below.

Technical Domain Tags:

  • networking: VPC, routing, firewalls, Certificates, IoT

  • security: IAM, security groups, compliance controls

  • compute: EC2, container platforms, serverless

  • storage: S3, EBS

  • monitoring: CloudWatch, metrics, alerting

  • sysad: Patch Management, Windows Active Directory, Backup Management

  • middleware: Configuration

  • logging: CloudWatch, log transformation, syslog, Splunk

Work Type Tags:

  • new-build: Net new infrastructure builds

  • enhancement: Improvements to existing builds

  • support-task: Tasks supporting application teams

  • bug-fix: Issues that need remediation

  • compliance: Regulatory or security compliance work

  • cost-optimization: Cost reduction efforts
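A tagging convention only helps if it is applied consistently, so a small check can be useful before work items are created or during board hygiene reviews. The sketch below uses the two tag lists above; the rule it enforces (at least one domain tag and exactly one work type tag) and the function name are assumptions for illustration, not something Azure DevOps enforces.

```python
DOMAIN_TAGS = {
    "networking", "security", "compute", "storage",
    "monitoring", "sysad", "middleware", "logging",
}
WORK_TYPE_TAGS = {
    "new-build", "enhancement", "support-task",
    "bug-fix", "compliance", "cost-optimization",
}


def validate_tags(tags):
    """Return a list of problems; an empty list means the tags are valid.

    Assumed rule: at least one technical domain tag and exactly one
    work type tag, with no tags outside the two agreed lists.
    """
    problems = []
    domains = [t for t in tags if t in DOMAIN_TAGS]
    work_types = [t for t in tags if t in WORK_TYPE_TAGS]
    unknown = [t for t in tags if t not in DOMAIN_TAGS | WORK_TYPE_TAGS]
    if not domains:
        problems.append("missing technical domain tag")
    if len(work_types) != 1:
        problems.append("expected exactly one work type tag")
    if unknown:
        problems.append("unknown tags: " + ", ".join(unknown))
    return problems


print(validate_tags(["networking", "new-build"]))  # valid combination
print(validate_tags(["new-build"]))                # no domain tag
```

Allowing several domain tags but only one work type keeps reports by work type from double-counting the same issue.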

Kanban Board Configuration

Let’s configure the Azure DevOps board with columns that represent our workflow:

  • To Do: Identified work yet to be assigned/picked up

  • Doing: Currently being worked on

  • (Optional) Blocked/Pending: Cannot be completed due to factors beyond the team’s control

  • (Optional) Review/Validation: Initial implementation is complete but requires review/testing/validation to confirm it meets the Definition of Done (DoD)

  • (Optional) To be propagated/replicated: A generic solution or mechanism has been implemented and tested, but still needs to be replicated across multiple workloads/accounts/automations/etc.

  • Done: Completed work

For infrastructure teams, the Review/Validation column is particularly important, as it represents the time between technical completion and confirmation that the infrastructure is working as intended in the actual environment.

Sample Board Illustration

Here’s a sample board for illustration purposes for the completely uninitiated. It doesn’t include the optional workflow states mentioned earlier.

[Image: sample board with the default columns]

To add more workflow columns, navigate to:

Settings > Columns > Add column

It can then look like this if you choose to add additional columns:

[Image: sample board with additional columns]

*If you would like a script that generates the 5 epics, 10 issues and 10 tasks above in one go, here is a link to the PS file:

https://github.com/guoxiangng/azure-devops-board-templates

Using Swimlanes for Work Types

Azure DevOps boards support swimlanes, which we use to separate different types of work visually. This gives the team immediate visibility into how much effort is going into each category:

  • New Capability: Brand new infrastructure components

  • Enhancement: Improvements to existing infrastructure

  • Operational: Work related to running the platform

  • Urgent/Unplanned: Critical fixes or urgent requests

With this visualization, it becomes immediately apparent if one type of work is consuming too much capacity or if critical work is not progressing quickly enough.

This approach to swimlanes complements our epic structure by allowing us to see work types across all epics. While an epic might focus on a specific capability (like “Implement Secure Container Platform”), the swimlanes show us how that epic’s work is distributed across new features, enhancements, and operational tasks.
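The same capacity check can be sketched in code for teams that export or query their board data. The example below is a minimal illustration under stated assumptions: each issue is a plain dictionary with a hypothetical `swimlane` key, the sample issues are made up, and the 40% limit is an arbitrary threshold rather than a recommendation.

```python
from collections import Counter


def swimlane_share(issues):
    """Return the fraction of issues in each swimlane."""
    counts = Counter(issue["swimlane"] for issue in issues)
    total = sum(counts.values())
    return {lane: count / total for lane, count in counts.items()}


def overloaded_lanes(issues, limit=0.4):
    """List swimlanes holding more than `limit` of the issues."""
    shares = swimlane_share(issues)
    return sorted(lane for lane, share in shares.items() if share > limit)


# Hypothetical in-progress issues for illustration only.
in_progress = [
    {"title": "Build transit gateway", "swimlane": "New Capability"},
    {"title": "Tune alarm thresholds", "swimlane": "Enhancement"},
    {"title": "Fix broken DNS forwarder", "swimlane": "Urgent/Unplanned"},
    {"title": "Restore expired certificate", "swimlane": "Urgent/Unplanned"},
    {"title": "Unblock app team deployment", "swimlane": "Urgent/Unplanned"},
]

print(overloaded_lanes(in_progress))
```

Counting issues is a crude proxy for effort. If the team estimates work, the same idea could be applied to estimates instead of counts.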

Here’s how you can add swimlanes on an Azure DevOps board:

[Image: adding swimlanes on the board]

Common Challenges and Key Lessons Learned

These are some challenges from my experience and research (online and asking around):

  1. Estimation Difficulty: Infrastructure tasks vary significantly in complexity, which makes it hard to categorize issues/tasks for work estimation.

  2. Operational Interruptions: Unpredictable operational issues and management escalations can still disrupt planned work or bypass the agile process.

  3. Cross-Team Coordination: Coordinating dependencies with other teams requires additional communication channels (and infrastructure blockers can take a long time to resolve if other teams work in waterfall and have a planning gap)

Key lessons learned:

  1. Start Simple: Begin with basic board usage and gradually introduce more agile practices as the team adapts.

  2. Make All Work Visible: Ensure operational and support work appears on the board alongside planned development.

  3. Adapt Planning: Adjust capacity allocations based on emerging needs rather than sticking rigidly to initial plans (e.g. decide whether to have sprints and standups based on team maturity, set sprint duration based on agreed service SLAs, and shape sprint planning to protect the team from management/app team expectations).

  4. Experiment and Optimize the Process: With the board metrics at hand and queryable, define ways to measure success and use them to improve the work process.
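As one possible way to measure success, cycle time per work type can be computed from exported work item data. This is a minimal sketch under stated assumptions: the `started` and `finished` keys, the sample items and the function name are hypothetical, and how those dates are extracted from Azure DevOps is not shown.

```python
from datetime import date
from statistics import median


def median_cycle_days(items, work_type):
    """Median days from start to finish for completed items with a tag.

    Items that are not finished yet are ignored. Returns None when no
    completed item carries the given work type tag.
    """
    durations = [
        (item["finished"] - item["started"]).days
        for item in items
        if work_type in item["tags"] and item.get("finished")
    ]
    return median(durations) if durations else None


# Hypothetical work items for illustration only.
items = [
    {"tags": ["networking", "new-build"],
     "started": date(2025, 1, 6), "finished": date(2025, 1, 10)},
    {"tags": ["security", "new-build"],
     "started": date(2025, 1, 6), "finished": date(2025, 1, 16)},
    {"tags": ["compute", "bug-fix"],
     "started": date(2025, 1, 8), "finished": date(2025, 1, 9)},
    {"tags": ["monitoring", "new-build"],
     "started": date(2025, 1, 13), "finished": None},
]

for work_type in ("new-build", "bug-fix", "compliance"):
    print(work_type, median_cycle_days(items, work_type))
```

The median is used rather than the mean so that a single long-running blocker does not dominate the figure. Tracking it per work type tag shows whether unplanned work is resolved faster or slower than planned builds.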

Concluding Remarks

With all this said, remember that the best methodology is the one that fits your team’s culture, capabilities, and constraints. Use this approach as a starting point, but always adapt it to your specific context and improve on processes over time.

Lastly, once again, this article is for the uninitiated… and may not be relevant for seasoned operations teams with their own ITSM tooling, or for well-oiled co-located teams with clear role delineations.

If you made it to the end, I hope you got something out of the read. Azure DevOps Boards and similar project management software provide many more features. Have fun exploring!
