Maintenance, scalability, core competency, extensibility...here's what to consider when facing the "build vs buy" decision.
Here's a story we hear often: Sarah has been working on the SRE team for three years, and her team has recently rolled out platforms with shared responsibility across platform engineering. The open-source tools meant to make life easier are now highly available, critical services that consume much of the resilience work the team had planned. Terraform, Kubernetes, Prometheus, Grafana, Vault, etc. are now her core competency.
Now the executives are asking her to improve incident management and eliminate incidents with extensive customer impact. She has to make a critical decision: build and operate another service, or implement a SaaS solution?
While there's no one-size-fits-all answer, your unique environment, resources, and requirements will guide your choice. In this post, we'll unravel the nuances that shape this decision and weigh the advantages and disadvantages of each route.
Customization, unique requirements, and verifying a need
Some organizations have unique workflows, compliance needs, or security concerns that off-the-shelf solutions might not meet. Building in-house lets your team tailor the solution to your specific needs, giving you complete control over features, functionality, and integrations so that it aligns with your team's workflow and requirements.
Deploying open-source tools or crafting an in-house prototype can also serve as a strategic litmus test, helping you discern the critical features and functionalities that genuinely resonate with your incident management needs. This approach may facilitate a more persuasive case to secure budgetary approval for a more comprehensive SaaS solution.
However, you should be prepared for the long-term investment this decision entails. First, the journey continues after the platform is built. Construction is just the beginning of a commitment to maintenance, updates, and security hardening. The need for round-the-clock operational support cannot be overstated.
Second, as time marches forward, so do your organization's needs. Your bespoke platform might elegantly address current requirements but could falter when confronted with the inevitable demand for new features, integrations, and scalability. You'll need to allocate resources for building new features and functionality, squashing bugs, and updating integrations that break when APIs change. The resulting cycles of planning, designing, building, testing, and operating can be time-consuming and resource-intensive.
Lastly, before building in-house, you'll want to evaluate the capabilities of your development team. The leap from scripting a solution to crafting a full-fledged automation platform that flawlessly orchestrates diverse workflows across your environment demands a specialized skill set. The risk of knowledge loss from staff turnover (especially if sudden) further underscores this complexity.
Cost: upfront costs vs. long-term investments
Depending on the availability of your in-house developers, building from scratch or customizing open-source platforms could yield cost savings over purchasing a commercial solution.
Building in-house demands a significant upfront investment, and it's easy to assume the long-term costs that follow will be low. But here's our advice: do the math. Our State of DevOps Automation Report 2022 found that 39% of organizations have at least one full-time engineer dedicated to building automation in-house, with 26% having two or more. And that is before counting the cost of operating the platform. Say each engineer earns $225K per year: that is a substantial cost to set aside every year for this work, and with two engineers it doubles to $450K.
Aside from the cost of employees, organizations must also account for the ongoing costs of maintaining servers and infrastructure.
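To make "do the math" concrete, here is a minimal sketch in Python of a multi-year cost comparison. Only the $225K salary figure comes from the discussion above. The engineer count, infrastructure cost, SaaS subscription price, onboarding fee, and three-year horizon are hypothetical placeholders, so swap in your own numbers before drawing any conclusions.

```python
from dataclasses import dataclass


@dataclass
class BuildCosts:
    """Yearly cost of running an in-house platform."""
    engineers: float                # full-time engineers dedicated to the platform
    salary_per_engineer: float      # yearly salary per engineer
    infrastructure_per_year: float  # servers, hosting, etc. (hypothetical figure)

    def yearly(self) -> float:
        return self.engineers * self.salary_per_engineer + self.infrastructure_per_year


def cumulative_cost(yearly_cost: float, years: int, upfront: float = 0.0) -> float:
    """Total spend after `years`, including any one-time upfront cost."""
    return upfront + yearly_cost * years


# $225K is the salary used in the text; everything else is an assumption.
build = BuildCosts(engineers=2, salary_per_engineer=225_000,
                   infrastructure_per_year=30_000)
saas_yearly = 120_000       # hypothetical subscription price
saas_onboarding = 10_000    # hypothetical one-time setup cost

for year in range(1, 4):
    build_total = cumulative_cost(build.yearly(), year)
    saas_total = cumulative_cost(saas_yearly, year, upfront=saas_onboarding)
    print(f"Year {year}: build ${build_total:,.0f} vs. buy ${saas_total:,.0f}")
```

The sketch deliberately leaves out harder-to-quantify items such as on-call load, hiring, and knowledge loss from turnover; those would only add to the build side of the ledger.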
Expertise in feature development and user experience: One of the most significant benefits of a commercial incident management solution lies in the singular focus of the dedicated company behind it. You're buying from a company that has hired every person and written every line of code with one mission in mind: make an amazing incident management product. They've devoted time, money, research, and resources to delivering the features and functionalities most teams need to improve their incident management. While some companies have built extremely comprehensive internal solutions, they will rarely provide the same user experience and usability as an off-the-shelf solution.
Best practices: Battle-tested commercial solutions encapsulate industry gold standards and provide an opinionated approach to incident management born from user feedback and market trends. This is an especially enticing aspect for teams just starting to standardize and automate incident management. A SaaS solution can guide your team step-by-step through the journey to drive operational maturity (and should still leave room for your team to add your own best practices and specific workflows).
Core competency: Building a robust automation platform that can connect to any system, tool, or API is a far cry from building bash scripts or 1-to-1 integrations that provide a single automation. Understanding the underlying infrastructure requires developers with years of expertise. (Learn more about the platform requirements for a powerful integration and automation engine in our eBook: Not All Integration Platforms Are Built the Same.)
When adopting a SaaS solution, you're investing in both the platform and the expertise of the people building it. You don't need to worry about hiring this expertise in-house or what to do if those experts leave your organization.
Maintenance freedom and seamless scalability: By embracing an off-the-shelf solution, you entrust the laborious upkeep and enhancement tasks to a third party. This emancipates your internal engineering and operations teams, enabling them to concentrate on initiatives that drive core business value.
An in-house platform rarely attains the same level of availability as a SaaS solution. A vendor's service level agreements (SLAs) back that availability, and the ability to scale, with a contractual commitment.
We've talked to thousands of organizations of different sizes and with different toolsets, needs, and skillsets, and the general advice we give is: buy for industry standards and build for the gaps. Only a few years ago, SaaS solutions of today's quality were not available, which made building in-house the go-to choice. But today, teams can rely on best-in-class tooling to support their incident management needs.
However, we can't forget the second part of our advice: "build for the gaps." There will always be requirements outside of what a SaaS platform can provide. When choosing a SaaS incident management platform, it's important to choose one that is extensible, either through code, so you can tailor workflows to your needs, or through AI, so your team can create new automations quickly and easily without waiting on the vendor to build them for you.
Ready to try a SaaS incident management solution? Here’s why Transposit is better than the alternatives: