Successful DevOps automation requires dynamic documentation and keeping humans in the loop
Technical teams are drawn to the power and promise of automating everything. And it’s clear why: there’s too much manual toil in DevOps processes today. Updating documentation, pulling together data from various tools, changing cloud configuration, fulfilling requests: the list of required tasks goes on with no end in sight. The explosion of best-of-breed SaaS tools and the continuous change of cloud infrastructure mean the manual nature of any given process can eat up precious human resources that are better spent on initiatives that deliver value to customers. The simplicity that the cloud was meant to deliver has instead brought with it an enormous amount of complexity, catapulting the desire for automation to an all-time high.
While there’s undoubtedly too much manual toil, the complexity of today’s DevOps processes gives most organizations pause. How do you even begin the journey to meaningful automation? What can and should be automated? Luckily, the path forward is not as overwhelming as it may initially seem. Once we approach automation from a new perspective — as a journey rather than an all-or-nothing solution — the process of implementing automation becomes far less daunting.
At its core, this perspective recognizes that humans and machines can not only play nice together, but thrive together. Some functions, such as HR, have found ways to automate huge pieces of their processes because many of their tasks are deterministic. In the world of DevOps, however, processes are more prone to change and have many more unpredictable pieces. So the goal should be to use the unique skills of both: humans apply intuition and context, while machines take on repetitive, predictable (but time-consuming) tasks like logging in to systems or pulling data.
All-or-nothing automation is not the solution. Instead, automation should be done incrementally, starting by codifying processes and then working up to more advanced, fully automated scenarios (when appropriate). Automation that has long-term impact for your team and organization is a step-by-step process, with benefits all along the way.
The path to meaningful automation starts with documentation. We can’t automate what we don’t know. First, teams need to codify institutional knowledge. The danger of not documenting institutional knowledge is that it lives only in the heads of people throughout the organization and can walk out the door with them. It also means that processes are done differently by different people. Automation requires consistency, so we start by developing consistent processes.
The type of documentation you’re using, dynamic vs. static, will make a substantive difference here. Static documentation is hard-coded — often living in wikis where it quickly goes out of date, making it difficult to find the current source of truth. Dynamic documentation, however, consists of rich, real-time information, functioning as a live, managed-as-code, data-centric, and integrated center of knowledge. Dynamic documentation sets the foundation for teams to begin their automation journey.
Transposit runbooks are an example of dynamic documentation that enables teams to codify the free-form processes of the real world. Runbooks may first look like a checklist or a series of steps. One of the great advantages of dynamic rather than static documentation is that it automatically records both human and machine data, giving teams a clear sense of exactly how docs are being used, and by whom. This ensures not only accountability but also a continuous feedback loop. Teams can easily analyze processes, see what’s working and what isn’t, and adjust accordingly. At this phase, it will also become clear which steps are predictable and repetitive, and those steps are the ripest for automation.
The complexity and variety of DevOps processes mean no two will look the same. Some entire processes could be fully automated, but most will have a mixture of humans and machines doing the job together. This is what we call human-in-the-loop automation.
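To make that feedback loop concrete, here is a minimal Python sketch of how a log of runbook executions might surface automation candidates. The `StepRecord` shape, the sample step names, and the `automation_candidates` helper are hypothetical illustrations, not Transposit's data model. The heuristic is deliberately simple: flag any step that was performed in every run and always produced the same outcome.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StepRecord:
    run_id: str   # which execution of the runbook this record belongs to
    step: str     # name of the runbook step
    actor: str    # who (or what) performed the step
    outcome: str  # what the step produced


def automation_candidates(records):
    """Return steps performed in every run with an identical outcome."""
    all_runs = {r.run_id for r in records}
    runs_per_step = defaultdict(set)
    outcomes_per_step = defaultdict(set)
    for r in records:
        runs_per_step[r.step].add(r.run_id)
        outcomes_per_step[r.step].add(r.outcome)
    return sorted(
        step
        for step in runs_per_step
        if runs_per_step[step] == all_runs and len(outcomes_per_step[step]) == 1
    )


# Hypothetical activity log from two incident runs.
log = [
    StepRecord("run-1", "create Slack channel", "alice", "created"),
    StepRecord("run-1", "decide next action", "alice", "scaled service"),
    StepRecord("run-2", "create Slack channel", "bob", "created"),
    StepRecord("run-2", "decide next action", "bob", "rolled back deploy"),
]
candidates = automation_candidates(log)
print(candidates)
```

In this sample, only the channel-creation step qualifies, because the decision step had a different outcome each time. A real analysis would likely weigh more signals, such as how long each step takes.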
Runbooks can connect to any API, meaning you can inject automation into any step in the process. Say, for instance, you have created an incident management runbook. You see that every time a PagerDuty alert comes in that meets certain criteria (e.g., web 500s are over threshold), your team creates a Slack channel, invites the right people, does an AWS service check, and creates a Zoom bridge and Jira ticket. Those steps are done the same way every time, so they’re great candidates for automation. This alone may save your team upwards of 15 minutes, already reducing toil and reaping the rewards of automation.
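As a rough illustration of that kickoff sequence, the Python sketch below runs a fixed list of steps whenever an alert crosses a threshold. Everything here is an assumption made for the example: the `Alert` and `IncidentContext` types, the 5% threshold, and the step names. The steps only record that they ran; a real runbook would call each tool's API at that point.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    http_500_rate: float  # fraction of requests returning HTTP 500


@dataclass
class IncidentContext:
    alert: Alert
    completed: list = field(default_factory=list)


# Hypothetical threshold; a real team would tune this value.
HTTP_500_THRESHOLD = 0.05

# Placeholder steps mirroring the sequence described in the text.
KICKOFF_STEPS = [
    "create Slack channel",
    "invite responders",
    "run AWS service check",
    "create Zoom bridge",
    "create Jira ticket",
]


def meets_criteria(alert):
    """True when the alert's 500 rate is over the threshold."""
    return alert.http_500_rate > HTTP_500_THRESHOLD


def kick_off_incident(alert):
    """Run every kickoff step for a qualifying alert; otherwise return None."""
    if not meets_criteria(alert):
        return None
    ctx = IncidentContext(alert)
    for step in KICKOFF_STEPS:
        # A real implementation would call the tool's API here.
        ctx.completed.append(step)
    return ctx


ctx = kick_off_incident(Alert(service="web", http_500_rate=0.12))
print(ctx.completed)
```

Keeping the steps in a plain list makes it easy to see, and to change, exactly what happens before a human gets involved.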
However, after that initial stage, it’s likely that your team will want to use judgment and context to decide the next steps. This is where humans take over: pulling data, talking things through, analyzing the situation. It’s a human checkpoint in the automation process. Now, for example, your team may decide to scale an ECS service. Right from a runbook, teams can take that action with one click, eliminating context-switching and letting them act from where they’re already working. This is the sweet spot between humans and machines.
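One way to picture a human checkpoint in code is a step list in which some steps are gated on approval. The Python sketch below is a hypothetical model, not how any particular runbook product works: `Step`, `run`, and the `approve` callback are invented names, and the actions are stand-in lambdas rather than real cloud calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], str]
    needs_human: bool = False  # gate this step on a person's decision


def run(steps, approve):
    """Execute steps in order, asking `approve` before any gated step."""
    results = []
    for step in steps:
        if step.needs_human and not approve(step):
            results.append((step.name, "skipped by responder"))
            continue
        results.append((step.name, step.action()))
    return results


steps = [
    Step("gather service metrics", lambda: "metrics attached"),
    Step("scale service", lambda: "desired count raised", needs_human=True),
]

# In practice the approval would be a button click; a lambda stands in here.
approved = run(steps, approve=lambda step: True)
declined = run(steps, approve=lambda step: False)
print(approved)
print(declined)
```

The machine does the data gathering unprompted, while the consequential action waits for a person to say yes.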
Many processes will never be fully automated. However, some are ripe for it. Now that you’ve codified your processes, you have the high-level view needed to identify which processes are performed the same way every time. For example, some teams may choose to automate service requests that don’t need an approver, so they let the machine take a user’s credentials and simply grant access, without ever bothering someone on the ops team.
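A minimal sketch of that routing decision in Python might look like the following. The resource names and the `NO_APPROVAL_NEEDED` policy set are hypothetical; the point is only that requests on a pre-approved list are granted automatically and everything else still goes to a person.

```python
from dataclasses import dataclass

# Hypothetical policy: resources that can be granted without a human approver.
NO_APPROVAL_NEEDED = {"staging-logs-readonly", "docs-wiki"}


@dataclass
class AccessRequest:
    user: str
    resource: str


def route(request):
    """Grant pre-approved resources automatically; queue the rest for review."""
    if request.resource in NO_APPROVAL_NEEDED:
        return "granted"
    return "needs approval"


print(route(AccessRequest("dana", "docs-wiki")))
print(route(AccessRequest("dana", "prod-database-admin")))
```

Defaulting to "needs approval" for anything not explicitly listed keeps the automation conservative as new resources appear.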
You may start to automate infrastructure-as-code practices, enabling self-service infrastructure. Runbooks can provide the guardrails to ensure changes are safe and consistent, enabling machines to take input from the user, get data from another system, and dynamically generate code, while ensuring that the state of infrastructure is as expected before making any modification.
Ultimately, automation is a unique journey for every team. Incremental automation builds trust in processes, provides clarity about what to automate, and ensures automation is delivering value in a meaningful way.
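The guardrail idea can be sketched in a few lines of Python. This example assumes a hypothetical `plan_scale_change` helper, a dictionary standing in for state fetched from another system, and an invented `desired_count` setting; it is not tied to any specific infrastructure tool. It refuses to generate a change when the live state has drifted from what the user expected or when the request falls outside allowed bounds.

```python
def plan_scale_change(current_state, expected_count, requested_count, max_count=10):
    """Return a config snippet if the guardrails pass; raise ValueError otherwise."""
    actual = current_state["desired_count"]
    # Guardrail 1: the live state must match what the requester expected.
    if actual != expected_count:
        raise ValueError(f"state drift: expected {expected_count}, found {actual}")
    # Guardrail 2: the requested value must stay within allowed bounds.
    if not 1 <= requested_count <= max_count:
        raise ValueError(f"requested count {requested_count} is outside 1..{max_count}")
    # Only now is the change generated from the user's input.
    return f"desired_count = {requested_count}\n"


# Hypothetical state, as if fetched from another system.
snippet = plan_scale_change({"desired_count": 2}, expected_count=2, requested_count=4)
print(snippet)
```

Checking state first means a self-service request fails loudly instead of overwriting a change someone else made in the meantime.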