Building Better Products with Continuous Testing

The why, where, and how of testing throughout the SDLC

Paul Jones
Apr 14th, 2022

Throughout my career as a software developer, I’ve noticed a common pattern among new developers: a reluctance to spend time writing tests. They would rather jump straight into the next feature. Usually, this is because delivering features counts toward our perceived productivity, while tests do not directly add anything to the product. The belief is that our job is to write code that users use, not tests that our end users will never even know exist.

This belief is sticky because it is a half-truth. As software developers, we live and die by shipping new features to our customers. However, as I have grown in my career, my understanding of the role of a software developer has evolved. A software developer’s job is to ship features to customers rapidly and in a stable manner. Those features we yearn to ship do us, and our product, no good if they are dead on arrival. The stability of one’s product is essential to its success.

It’s important to note that most application errors result from a code change. So by that logic, if we don’t ship changes to our code, it will remain stable. This, however, is the antithesis of the DevOps culture we strive for: we want to be continuously integrating and continuously deploying. So how do we rapidly iterate on our product in a stable manner? The answer is through testing.

Why do we test?

There are many reasons to test our code. I’ve bucketed them into 3 categories:

1. To establish trust in our code

If we don’t run our code before we release it, how will we know it does what we think it does? How do we even know it will work? Writing a new feature and shipping it to production without checking it is simply acting as an agent of chaos. This isn’t only bad for us, the developers, but also for our customers. If we regularly ship broken code, they will lose faith in our platform, and ultimately leave it. A product without customers usually isn’t long for this world.

2. To protect ourselves from future us

I know that I’ve written a new feature, kicked the tires in dev, and shipped it. It worked as intended! A month later, our product team wants to change what it does. New tickets are written, and I go back to the code I previously wrote. Unfortunately, after a month has elapsed, previous me and current me aren’t quite on the same wavelength. Code is changed, the new functionality is tested, and we ship it. Job done! Then the alarms go off. Turns out some of the methods that got changed broke other parts of the codebase. Tests can catch this and save us from ourselves.

3. To write better code

This isn’t a blog post about TDD (test-driven development), though maybe one will follow, but writing your tests first leads to better code. It forces you to think about how your code is invoked and what it needs to do. This results in cleaner and simpler code, which is the goal of a top-notch software developer. To keep me from getting even more pedantic, trust me on this one.

Now that we know why we are testing our code, that raises the next question:

Where and how do we test our code?

Relying on manual testing alone will lead to failure. I’m not saying you shouldn’t run the code you’ve written and check that it does what you expect; I encourage everyone to do exactly that. What I’m saying is that you cannot build an effective, scalable process around manual testing. So as devs, we are responsible for building the system that gives us confidence in our code. That means all of our testing needs to be automated! If it runs ‘automagically’, I don’t have to think about it.

Unit tests

The easiest way to automate testing is to set up a unit test suite in your application. These are tests designed to specifically test the functionality of the code you are about to write (remember from a few paragraphs ago? Testing first equals better code!). At this point, most languages either ship with a test framework in the standard library or have a well-established third-party one. Go has the built-in testing package, Python has unittest in the standard library plus the popular pytest, Ruby has RSpec, JavaScript has Jest, etc. If I didn’t list the language/framework that you are working in, a quick search will almost certainly help you get started with what you need! Once you’ve written your tests, most editors can be configured to run them when you save a file. This means that every time you make a change, your tests will make sure you didn’t break any of the functionality you’re testing for. This is amazing! It protects you from breaking the things you’ve tested for, making refactoring a breeze. There are few things a software dev likes more than a good refactor, and now it can be done safely.
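
To make that concrete, here is a minimal sketch of what a small unit test file might look like in Python. The function `average_rating` and its rules are hypothetical, invented purely for illustration. The tests are plain functions with `assert` statements, which is the style pytest picks up when function names start with `test_`.

```python
# Hypothetical function under test; the name and rules are illustrative only.
def average_rating(ratings: list[int]) -> float:
    """Return the mean of 1-5 star ratings, or 0.0 when there are none."""
    if not ratings:
        return 0.0
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return sum(ratings) / len(ratings)


# pytest-style tests: plain functions whose names start with test_.
def test_average_of_several_ratings():
    assert average_rating([5, 3, 4]) == 4.0


def test_no_ratings_returns_zero():
    assert average_rating([]) == 0.0


def test_out_of_range_rating_is_rejected():
    try:
        average_rating([6])
    except ValueError:
        pass  # this is the behavior we want
    else:
        raise AssertionError("expected a ValueError for a 6-star rating")
```

Saved as something like `test_ratings.py`, running `pytest` in that directory should discover and run all three tests. In a real project the function would live in its own module and the test file would import it.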

Integration & smoke tests

With our unit tests set up, it’s now straightforward to run our test suite in a continuous integration pipeline. Part of the pull request review process should be having all the tests run and report back with any failures. This way, when you pull other people’s changes back into your code, their tests run too and notify you of any breaking changes.

At this point, we’ve run the full test suite and are ready to merge our code. It’s time to bake a Docker image and kick it into our deployment pipeline. Early on at startups, I’ve found this to be sufficient testing to deploy. If your unit test suite gives you high enough confidence in your code, and you have an easy way to roll back your deployment quickly if something goes wrong, deploy it! An integral part of this process, though, is having monitoring in place to alert you if your code fails, and having the developer who made the change sanity-check it in production once it has shipped, so you know if things are broken.

If the unit test suite isn’t where you need it to be yet, or your product is more mature and has an active user base, it’s time to write some smoke tests. These tests run as part of your deployment pipeline and check a core set of functionality that must work in order for you to deploy. The key thing here is not to test everything. That would dramatically increase the time it takes to deploy code, creating deployment queues and slower feedback to the developer.

Knowing what to test

In a previous job, I worked on a website that was focused on reviewing one’s college professors. We really cared about 3 pieces of functionality in order to ship code:

  1. Does the homepage load? If we couldn’t load the homepage, or it didn’t contain the expected text, we needed to stop the pipeline. In general, it’s never good if your homepage breaks. This is really an important smoke test for everyone to have in place.
  2. Can you search for a professor, and get to that professor’s page? This was the core functionality of what our site provided. This one flow succinctly tested the code in the search path, as well as the code in the render professor path. If any of that broke, our site was fundamentally broken, and we needed to stop the deploy.
  3. Do ads render? We all have to get paid. If you’re not paying for a service, you’re the product. Ads were how the business made money and I earned my paycheck. If we made a code change that prevented ads from loading, we were actively losing money. Annually we would run a site sponsorship. If we broke the sponsor’s ads, we could put that recurring source of revenue in jeopardy. If ads didn’t load, we needed to stop the deployment.

Think about these things for your application. What are the core features you provide? If it’s very user-specific, perhaps you should test your log-in flow. Is searching and rendering data your core competency? Test this! It’s worth testing these things as part of your deployment pipelines to prevent a major outage from a code change.
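
A smoke test along these lines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paths and marker text below are hypothetical, not what we actually ran, and `fetch_page` uses the standard library’s `urllib` rather than a browser, so it checks the returned HTML but would not catch anything rendered by JavaScript (ads often are).

```python
import urllib.error
import urllib.request
from typing import Callable


def fetch_page(url: str, timeout: float = 5.0) -> tuple[int, str]:
    """Fetch a URL and return (status code, body). Status 0 means no response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""  # the server answered, but with a 4xx/5xx
    except OSError:
        return 0, ""  # DNS failure, refused connection, timeout, ...


def run_smoke_tests(
    base_url: str,
    checks: dict[str, str],
    fetch: Callable[[str], tuple[int, str]] = fetch_page,
) -> list[str]:
    """Return the paths whose page did not load or lacked the expected text."""
    failures = []
    for path, expected_text in checks.items():
        status, body = fetch(base_url + path)
        if status != 200 or expected_text not in body:
            failures.append(path)
    return failures


# Hypothetical critical paths and the text each page must contain.
CHECKS = {
    "/": "Rate your professors",
    "/search?q=smith": "Search results",
}
```

In a pipeline, a small wrapper could call `run_smoke_tests` against the freshly deployed environment and exit with a non-zero status when the returned list is non-empty, which most CI systems treat as a failed step. Passing `fetch` in as a parameter keeps the checking logic itself unit-testable without a network.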

Writing the tests

There are lots of different ways to write these tests. At the time, we used both TestProject and Playwright. TestProject is nice because it allows non-technical people to record clicks through the site and save that sequence as a test. Playwright requires a more technical person to write the code for the test. Ultimately, I found Playwright to be more repeatable and less likely to create sporadic failures as UI elements change. However, we were also in the midst of a redesign at that point, which probably made TestProject more brittle than it would be on a stable product.

Deployed! Done and dusted, right?

Not quite! We are deployed, but we should still test and monitor our code. The first thing we should do is navigate to our site and test the feature we just deployed. Does it work as expected? If appropriate, include your product or design teams in the process. Now is the ideal time for them to spot things they would like to change in a subsequent pull request, since it is still fresh in your mind.

Monitoring deserves its own post, but I like to think of it as an extension of testing. Good monitoring lets us know if and when our code stops behaving as expected. This may be due to circumstances outside of the code we shipped, but the requests are still flowing through our code. It can help us identify places to make our code more resilient in the future.

Conclusion

By testing throughout the entire development lifecycle, we’ve dramatically elevated our confidence in the code we ship. We know through unit testing that the code meets the product requirements. We know through integration and smoke testing that we aren’t breaking the important paths through the code. Finally, through monitoring, we will know if something changes and breaks our application. The added benefit of having this in place is the increased speed at which you can safely and confidently make changes to your app. You know through testing that your change is valid, and conversely, that if you broke something it would not deploy. That mental safety makes all the difference when deploying code.
