Baker Approved: Incident Response Automation

Baking competition favorites like The Great British Bake Off, Sugar Rush, and Nailed It! remind us why capturing all incident-related data is essential to successful incident management.

Baking tools, holly, and gingerbread cookies dusted in flour


While tasty treats are welcome at any time of year, the holiday season really brings baking to the fore. As many of us dive headfirst into baking up a storm, others may choose to watch experts from the sidelines, often in the form of baking competition shows. What I’ve realized while salivating on the couch and rooting for my favorite baker(s) is that these competitions can also boost our understanding of the current incident management landscape, or at least help you explain what you do at your next family get-together.

Recipes : Bakers :: Runbooks : On-call Engineers

An essential part of cooking and baking, a recipe is a set of instructions for preparing a particular dish, including a list of the ingredients required. Runbooks serve a similar function for engineers: they are sets of troubleshooting steps and tips that are valuable when responding to an incident or performing other operational tasks. Even the most skilled chefs use recipes to reference key steps and keep track of the multitude of ingredients and measurements needed to ensure a high-quality output. Similarly, no matter how skilled an engineer you are, runbooks help capture important information and steps that ensure incidents can be responded to quickly and efficiently.

The Great British Bake Off

The famed show seeks to find the best baker in the land through multiple weeks of competition, with three challenges each week based on a unified theme. These challenges (the Signature Bake, the Technical Bake, and the Showstopper) test each contender in a variety of ways.

It’s the technical challenge, which requires contestants to produce a specific finished product from very limited instructions, that illustrates the importance of an accurate and actionable runbook. The recipes these highly skilled bakers are given during the technical challenge are pared down significantly. They provide minimal direction and require incredible technical knowledge, experience, and familiarity with the dish to execute at a quality level. Many contestants are forced to make guesses or test a variety of methods in their quest to reproduce these dishes, and often lose valuable time in the process, much like on-call engineers without a runbook to guide their incident response.

During an incident, even the most skilled engineers need guidance when working with code or a system that they are unfamiliar with or have not been exposed to fully. A runbook (one that is actionable, accessible, accurate, authoritative, and adaptable) coupled with a platform like Transposit's, which can capture and incorporate human actions and data from past incidents, will make a significant difference in a team’s ability to decrease MTTR and keep incidents at bay.

Sugar Rush

Whether you are a baker in a competition or an engineer responding to an incident, the clock is an additional component you have to keep an eye on. In Sugar Rush, four teams of two bakers compete in three rounds of challenges. After each round, one team is eliminated. Here’s the twist: the faster you complete your first challenge, the more time you have banked to complete the challenges that come later, provided you are able to advance!

When it comes to responding to an incident, engineers are also racing against the clock. Every minute of downtime can mean lost productivity or revenue. It also takes time away from your on-call team’s other responsibilities, both professional and personal. As on the show, working quickly must not compromise the quality of the work.

As we innovate in all areas of life, devices have been created to help automate or simplify the standard steps and processes we continuously undertake. In baking, automatic mixers resemble build tools, electronic thermometers work like monitoring tools, and food processors handle repetitive tasks such as chopping or blending, much like a script automates a task that would otherwise be executed one by one by a human. Each of these helps to free up time to focus on more specialized or technically sophisticated tasks. Automated workflows seek to do the same for engineers responding to an incident. Transposit enables teams to connect all the systems in their environment through interactive runbooks that automate routine tasks, like creating a Jira ticket, updating a status page, or surfacing specific graphs or metrics, when a particular trigger or series of events occurs. It also streamlines your communications and actions in one place, removing the back and forth between tools and making it easier to keep stakeholders informed. If only our kitchen appliances could do that!
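For readers who think in code rather than cake, here is a rough sketch of the idea of a trigger-driven runbook. It is not Transposit's API or implementation: the `Runbook` class, the trigger name `high_error_rate`, and the three step functions are hypothetical stand-ins that only return a description of what a real integration might do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins for real integrations (ticketing, status page, metrics).
# Each one only returns a description of the action; none calls a real service.
def create_ticket(alert: dict) -> str:
    return f"ticket created for {alert['service']}"

def update_status_page(alert: dict) -> str:
    return f"status page set to 'investigating' for {alert['service']}"

def surface_graph(alert: dict) -> str:
    return f"latency graph linked for {alert['service']}"

Step = Callable[[dict], str]

@dataclass
class Runbook:
    """Maps a trigger name to the ordered routine steps it should run."""
    steps_by_trigger: Dict[str, List[Step]] = field(default_factory=dict)

    def on(self, trigger: str, *steps: Step) -> None:
        # Register steps for a trigger, preserving the order they were given.
        self.steps_by_trigger.setdefault(trigger, []).extend(steps)

    def handle(self, alert: dict) -> List[str]:
        # Run every step registered for this alert's trigger, in order,
        # and return a log of what was done. Unknown triggers run nothing.
        steps = self.steps_by_trigger.get(alert["trigger"], [])
        return [step(alert) for step in steps]

runbook = Runbook()
runbook.on("high_error_rate", create_ticket, update_status_page, surface_graph)
log = runbook.handle({"trigger": "high_error_rate", "service": "checkout"})
```

Returning a log from `handle` is a deliberate choice in this sketch: the record of which steps ran is itself incident data worth keeping, which is the theme of the next section.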

Nailed It!

If The Great British Bake Off highlights the importance of documenting routine steps and ingredients, Nailed It! shows the value of capturing human data and actions alongside them. This baking show features three amateur bakers competing to replicate complicated cakes and confectionery. While the recipes they receive may be sufficient for a highly skilled or professional baker with a plethora of institutional knowledge, these contestants require much more information than a standard recipe provides to be able to reproduce these complex dishes effectively.

When new or junior engineers join a team, their lack of familiarity with the systems or code can make it difficult for them to be on call or to resolve incidents as quickly as more senior team members can. Much of this is due to a lack of access to the institutional knowledge that comes from working with a system for a longer period of time. This is where Transposit comes into play. Our platform gathers a completely new data set across human and machine interactions, such as automated incident timelines. With this extra information and data, teams are able to address the contributing factors that often cause incidents to recur. It also allows for continuous improvement of runbooks and workflows, making them robust enough to help engineers respond quickly to an alert, no matter how new to the team they may be.

A healthy serving of gratitude pie

Whether you are the one baking something delicious, you prefer to serve in the ever-important role of ‘taste tester’, or you are behind a screen writing code and responding to incidents, we are grateful to those who put in the hard work so that we can all benefit from a job well done.
