Report, Record, and Learn: Automate and Improve Post-Incident Insight with Transposit

Save two hours with automatic post-incident reports and incident timelines

Jessica Abelson, Director of Product Marketing
Mar 30th, 2022

The Report, Record, and Learn stage is the fifth and final incident management step, following the Intake, Classify, Engage, and Remediate stages.

Incident remediation is not over when all the tickets are closed and things are back to normal. There is a lot that can be learned, both for preventing future incidents and for continually improving incident response. It’s an extremely valuable iterative process that allows teams to get to the root causes of issues, whether they lie in the underlying infrastructure or in team collaboration.

Unfortunately, capturing and acting on these insights takes effort from DevOps teams, the incident manager, and/or the team’s problem manager. Depending on the status of other incidents or important tasks for these team members, this effort can fall lower on the priority list. This short-term focus can cause more problems or slow down remediation in the future, because everyone is so caught up in fighting fires and closing tickets that they miss the opportunity to solve systemic problems.

Transposit puts these insights at the fingertips of teams without any extra effort on the part of the team, making it much easier to tap into and learn from post-event insights. The Transposit platform is able to automate the process of pulling and creating incident reports to provide a holistic picture of the event and its process. This insight can easily be applied to improving future incident management processes, and even stopping potential incidents before they start, all of which make an organization’s entire system more reliable.

The status quo

Typically, the incident manager and/or the problem manager must manually go back through each individual tool that was used to gather information on what happened. This might take the form of harvesting screenshots, reviewing the report logs or timelines produced by each disparate platform, manually cobbling together a timeline of events, and/or copying and pasting comments into a centralized hub. This process itself can be time-consuming, depending on how many different tools and platforms were involved.

Generally, once this process is complete, it is up to the problem manager to create tickets for any systemic or procedural changes that need to be made. It is also at this point that all players and relevant stakeholders must be invited to a retrospective meeting about the incident. These steps are the first to be deprioritized if another incident arises in the interim. Even if there are major items that need to be addressed, the likelihood of actually using this information drops as time passes and memories of the incident fade.

How Transposit solves these problems

By offering customizable, automatically-generated reports and incident timelines, Transposit immediately eliminates the biggest roadblock to post-incident analysis. Instant access to these resources provides incident and problem managers with insight into both human and machine actions throughout the remediation process.

This postmortem report automatically harvests event timelines, severity, impacted services, links, and other data from every platform that was used in the remediation. It is exportable and automatically syncs to platforms like Confluence or Google Docs. This makes it easy to share the information with stakeholders throughout the organization, and provides a thorough record for benchmarking and posterity.

Aside from the automation that streamlines the process, the insights that are readily available also make the job of problem managers easier. They are given instant access to insights about everything that took place in the incident, both in terms of activities in the system and regarding human-driven remediation behavior and processes. Armed with this information, the problem managers are able to create and assign activities related to the problems that arose in the process, and fine-tune the processes themselves. Involving Transposit in the post-incident process ultimately turns the failure that caused this incident, and any shortcomings in its remediation, into opportunities for future success and improvement.

How it works

Transposit helps teams learn and improve with automatic, exportable postmortems as well as a full incident timeline.

Customizable, exportable postmortems

In the runbook builder, add one of these actions, depending on whether you want to export to Confluence or Google Docs:

  • Generate Confluence Postmortem
  • Generate Google Docs Postmortem

Where should I add the actions?

  • Runbook body: Create a “Post Incident” section and add one of the actions above, along with other actions that close out the incident (e.g., close the Jira ticket, archive the Slack channel).
  • When runbook state is done: In this section, the postmortem is generated automatically once you’ve closed out the incident. Marking the incident as done can also save the incident commander time in wrapping up: it can automatically close tickets in Jira or your other ticketing system, notify key stakeholders through email, Slack, Statuspage, or another tool, and finally archive Slack channels.

Automatic incident timelines

The Transposit Activity automatically captures every human and machine action during the incident, from actions taken across the stack to Slack conversations.

This long-lived history of automated and human actions ensures you have the thorough audit trail needed to perform retrospectives, investigate root cause(s), and drive continuous improvement. This audit trail is also important for future incidents, giving operators a historical record of how something was solved previously, should a similar situation occur.
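To make the idea of a unified timeline concrete, here is a minimal Python sketch of merging events from several tools into one chronological record. It is an illustration only and does not reflect how Transposit is implemented; the Event type, the build_timeline and render helpers, and the sample tool names and messages are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import chain


@dataclass(frozen=True)
class Event:
    """One entry in an incident timeline (hypothetical shape)."""
    timestamp: datetime
    source: str   # tool that recorded the event, e.g. "slack"
    actor: str    # "human" or "machine"
    message: str


def build_timeline(*streams):
    """Combine per-tool event lists into one list ordered by timestamp."""
    return sorted(chain.from_iterable(streams), key=lambda e: e.timestamp)


def render(timeline):
    """Format the timeline as plain text, one line per event."""
    return "\n".join(
        f"{e.timestamp.isoformat()} [{e.source}/{e.actor}] {e.message}"
        for e in timeline
    )


def at(hour, minute):
    """Build a UTC timestamp on the (made-up) day of the sample incident."""
    return datetime(2022, 3, 30, hour, minute, tzinfo=timezone.utc)


# Made-up sample data standing in for what each tool recorded.
alerts = [Event(at(14, 0), "monitoring", "machine", "High error rate alert fired")]
chat = [
    Event(at(14, 5), "slack", "human", "Investigating the checkout service"),
    Event(at(14, 20), "slack", "human", "Rollback confirmed, errors dropping"),
]
tickets = [
    Event(at(14, 2), "jira", "machine", "Incident ticket created"),
    Event(at(14, 25), "jira", "human", "Ticket closed"),
]

timeline = build_timeline(alerts, chat, tickets)
print(render(timeline))
```

Sorting on a shared timestamp is the simplest way to interleave human and machine actions; a real system would also need to normalize time zones and handle clock skew between tools.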

Running incident management with Transposit

The Report, Record, and Learn stage is the fifth and final incident management step. See how Transposit can help your team automate and streamline the first four stages: Intake, Classify, Engage, and Remediate.
