While many organizations see automation as a way to resolve incidents faster, most still rely on manual and chaotic processes. If organizations want to incorporate automation into their practices, why are so few doing so? Our Founder and CTO Tina Huang spoke this week at DevOps Enterprise Summit, sharing two major challenges to automation.
First, there's a lot of fear around automation. How do I run the script? How do I debug it if it breaks? Who will I reach out to if something goes wrong, and what if that person has left the company? Out-of-date, inaccessible documentation and undocumented institutional knowledge are the culprits. The second challenge is ambiguity: you can't automate a process you don't fully understand. And the truth is that full automation without any human involvement is rarely feasible. Human intuition and judgement combined with automation, what we call "human-in-the-loop automation," is actually the goal we should be aiming for.
With these challenges in mind, what are the first steps your organization should be taking to develop a human-in-the-loop automation strategy for DevOps? Here are five simple steps.
You can't automate what you don't know. Begin by consolidating documentation and making sure every step is clearly defined via checklists. Next, take documentation out of wikis (where scripts go stale and are hard to find), and create workflows in Transposit so documentation is easily searchable and simple to iterate on over time. This provides your team with a single knowledge base and source of truth as a baseline for workflow automation.
To benefit the most from automation, you need to know which automatable tasks are taking up the most engineering hours and causing the most friction. Focus on automating tasks that are manual, repetitive, and practical to automate, and whose volume grows as you scale. The right targets are different for every organization. Some may save the most time and reduce bottlenecks by automating updates to internal and external stakeholders, others by automating the retrieval of data from various services at the time of an incident.
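One rough way to find those tasks is to estimate the hands-on hours each one costs per month and sort by that number. The sketch below is only an illustration: the `Task` fields, the task names, and the time estimates are all made-up assumptions, and your own inventory will look different.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    minutes_per_run: float  # hands-on engineer time each time the task is done
    runs_per_month: int     # how often the task recurs
    manual: bool            # True if a person performs every step today


def monthly_hours(task: Task) -> float:
    """Engineer hours this task consumes in a month."""
    return task.minutes_per_run * task.runs_per_month / 60


def rank_candidates(tasks: List[Task]) -> List[Task]:
    """Keep only manual tasks and sort them by monthly cost, highest first."""
    manual_tasks = [t for t in tasks if t.manual]
    return sorted(manual_tasks, key=monthly_hours, reverse=True)


# Hypothetical inventory; replace with numbers from your own team.
tasks = [
    Task("Update stakeholders during an incident", 15, 40, True),
    Task("Pull service data at incident start", 10, 90, True),
    Task("Rotate on-call calendar", 30, 1, True),
]

ranked = rank_candidates(tasks)
for task in ranked:
    print(f"{task.name}: {monthly_hours(task):.1f} h/month")
```

Hours alone will not capture friction, so treat the ranking as a starting point for discussion rather than a final answer.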
Not everything can or should be automated. A machine cannot diagnose what is wrong simply based on an alert or choose the nuanced language needed for a status update; for that, you need human judgement. Transposit's interactive runbooks provide workflow automation to amplify the unique skills of both humans and machines, guiding teams to faster resolution and reduced recurrence through enhanced daily operations. Bring humans into the loop of automation with one-touch direct action commands that execute automated workflows for common remediation actions.
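To make the idea concrete, here is a minimal sketch of a human-in-the-loop gate: the machine prepares the remediation, and a person decides whether it runs. This is not Transposit's API. The function names, the `checkout-service` example, and the console prompt are all assumptions chosen for illustration.

```python
from typing import Callable


def run_with_approval(
    description: str,
    action: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run `action` only if the human-supplied `approve` callback says yes."""
    if not approve(description):
        return f"skipped: {description}"
    return action()


def restart_service() -> str:
    # Hypothetical remediation; a real one would call your deploy tooling.
    return "restarted: checkout-service"


def console_approver(description: str) -> bool:
    """Ask the on-call engineer at the terminal. Defaults to 'no'."""
    answer = input(f"Run '{description}'? [y/N] ")
    return answer.strip().lower() == "y"


# Demo with canned decisions so the example runs without a terminal prompt.
# In an interactive session, pass console_approver instead.
print(run_with_approval("Restart checkout-service", restart_service, lambda _: True))
print(run_with_approval("Restart checkout-service", restart_service, lambda _: False))
```

Passing the approver in as a callback keeps the decision with a person while letting the same workflow be triggered from a terminal, a chat button, or a test.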
Random scripts in runbooks or minor automations based on webhooks can be a stop-gap measure, but sustainable automation requires powerful integrations that can change as API mechanics and the toolchain evolve. Make sure your team has invested in the right integrations so that your automation is not brittle. Transposit's powerful integration platform serves as a "universal translator" for APIs, abstracting away the details of specific API mechanics so engineers can focus on what data they'd like to retrieve instead of how to retrieve it.
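The general pattern behind that kind of abstraction is an adapter layer: each tool gets a small wrapper that translates its payload into one shared shape, so the workflow only asks for "open incidents." The sketch below is a generic illustration, not how Transposit is implemented. Both tools, their payload formats, and every class name here are hypothetical.

```python
from typing import Callable, Dict, List, Protocol

Incident = Dict[str, str]  # shared shape: {"id": ..., "title": ...}


class IncidentSource(Protocol):
    def open_incidents(self) -> List[Incident]:
        ...


class PagerToolAdapter:
    """Wraps a hypothetical paging tool that returns {'incidents': [...]}."""

    def __init__(self, fetch: Callable[[], dict]) -> None:
        self._fetch = fetch  # in real use, this would make the API call

    def open_incidents(self) -> List[Incident]:
        payload = self._fetch()
        return [
            {"id": str(item["id"]), "title": item["summary"]}
            for item in payload["incidents"]
            if item["status"] == "triggered"
        ]


class TicketToolAdapter:
    """Wraps a hypothetical ticketing tool that returns a flat list."""

    def __init__(self, fetch: Callable[[], list]) -> None:
        self._fetch = fetch

    def open_incidents(self) -> List[Incident]:
        return [
            {"id": ticket["key"], "title": ticket["fields"]["title"]}
            for ticket in self._fetch()
            if not ticket["fields"]["resolved"]
        ]


def all_open_incidents(sources: List[IncidentSource]) -> List[Incident]:
    """Workflow code: asks for data without knowing any tool's API details."""
    incidents: List[Incident] = []
    for source in sources:
        incidents.extend(source.open_incidents())
    return incidents


# Canned responses stand in for real API calls.
pager = PagerToolAdapter(lambda: {"incidents": [
    {"id": 101, "summary": "API latency high", "status": "triggered"},
    {"id": 102, "summary": "Disk full", "status": "resolved"},
]})
tickets = TicketToolAdapter(lambda: [
    {"key": "OPS-7", "fields": {"title": "Login errors", "resolved": False}},
])

open_now = all_open_incidents([pager, tickets])
print(open_now)
```

When a tool changes its API, only its adapter needs to change; `all_open_incidents` and every workflow built on it stay the same.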
Use automation to keep a complete, auditable record that can feed additional automation and continuous improvement strategies. Post-mortems should bring to light ways to enhance both development and automation processes. Transposit keeps a full incident timeline not just after an alert but across operations and incidents so that your team has the full context of what went wrong, while escaping from the manual, time-consuming, error-prone work of incident reconstruction. The feedback loop of continuous learning between operations and development has never been easier.
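As a rough picture of what an auditable record can look like, the sketch below appends a timestamped event for every human or automated action and exports the result as JSON for a post-mortem. It is a simplified assumption for illustration, not Transposit's timeline format; the `Timeline` class and its field names are invented here.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List


class Timeline:
    """Append-only log of who did what, and when, during operations."""

    def __init__(self) -> None:
        self._events: List[Dict[str, str]] = []

    def record(self, actor: str, action: str, detail: str = "") -> Dict[str, str]:
        """Add one event stamped with the current UTC time."""
        event = {
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "detail": detail,
        }
        self._events.append(event)
        return event

    def actions_by(self, actor: str) -> List[str]:
        """List the actions one actor took, in the order they happened."""
        return [e["action"] for e in self._events if e["actor"] == actor]

    def export(self) -> str:
        """Serialize the whole timeline for a post-mortem or another tool."""
        return json.dumps(self._events, indent=2)


timeline = Timeline()
timeline.record("alerting", "alert_fired", "p95 latency above threshold")
timeline.record("maria", "acknowledged")
timeline.record("automation", "restarted_service", "checkout-service")
timeline.record("maria", "posted_status_update")

print(timeline.export())
```

Because events are captured as they happen, nobody has to reconstruct the incident from chat logs afterward, and the exported data can feed further automation.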
If you're ready to get started with automation, download the 5 Simple Steps to DevOps Automation 1-sheeter to share this information with your team. Automation does not have to be complicated. In fact, it can be done in five simple steps.