Why write unit tests? 🧪
Today’s post is a bit longer and discusses unit testing and why you would want to do it.
It is a lightly edited extract of a book on unit testing that I started about 10 years ago but never finished. Still, it has aged reasonably well. I hope it can help people new to unit testing, but also give some food for thought even for those who have done it for a long time.
Let us start with a brief introduction to the topic of unit testing.
What is unit testing? #
Unit testing is the practice of taking an individual unit of the source code, isolating it from the rest of the code, and verifying that it works as intended. This practice can provide confidence that each individual unit works as expected before integrating multiple units to perform a bigger task.
A “unit” refers to the smallest piece of code that can be logically isolated in a system. Often a unit is a single function, but it can also be a group of related functions, or even a whole class.
Unit tests are narrow in scope. They should be small, fast, and easy to understand. A unit test is written by, and primarily intended for, software developers. Testers and users simply benefit from them by having fewer bugs. Unit tests are themselves functions typically written in the same language as the unit under test. A unit testing framework is used to execute each unit test and provide a summary of their results.
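To make this concrete, here is a minimal sketch of a unit and its unit tests, using Python's standard `unittest` framework. The function `word_count` and the test names are made up for illustration; they are not taken from any real project.

```python
import unittest


def word_count(text: str) -> int:
    """Hypothetical unit under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    """Each test method is one small, fast, readable unit test."""

    def test_counts_words_separated_by_spaces(self):
        self.assertEqual(word_count("unit tests are small"), 4)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)
```

If this were saved in a file, running `python -m unittest` on it would let the framework find and execute both tests and print a summary of the results.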
The practice was popularized by Kent Beck at the turn of the millennium. From there, it gained much momentum during the following decades and has become a standard tool in the toolbox of many developers.
So what’s the problem? #
While unit testing today is widely recognized as “a good thing”, not everything is perfect. Many developers find it hard or boring to write unit tests. Some feel that unit testing is taking valuable time that should be spent on writing the actual code. Yet others believe it is not the task of the developer to write tests at all.
Furthermore, it is not only a question of writing unit tests or not. Not all unit tests are equal. Some unit tests are good, some are bad. Good unit tests can help by finding bugs, catching regressions, communicating intent, and guiding the design of the unit under test. Bad unit tests are hard to read, require much maintenance, and provide a false sense of security by not actually verifying what they seem to.
The rest of this post discusses how to think about unit tests in order to write better ones.
Why unit test? #
Perhaps the first question we need to answer is “why should I care”? Why should I spend my valuable time on writing unit tests? What good will they give me? Here follow a few reasons you may consider unit testing.
Pro tip: The answer “because my boss says I have to” is not the answer we’re looking for.
To understand #
One reason to unit test is to better understand the problem you are solving.
American inventor Charles Kettering said “a problem well stated is a problem half solved”. That is very true for unit testing. If you can write a test which verifies that something works, it is usually rather simple to actually implement it.
In fact, one could go as far as to say: if you cannot express the wanted functionality with a unit test, you are not ready to implement it.1
Another aspect of understanding is to understand existing code. A unit test can often help explain not only what a piece of code does, but also why it does it. This leads us to the second reason.
To explain #
Unit tests are a great tool for explaining your code. How is it intended to be used? What is expected to work and what is not? What do the parameters mean? Tests can put the unit under test in perspective and give more information than the code by itself. You might even realize that you need that explanation yourself, when looking at your own code a few months later.
A great way to look at tests is to see them as telling a story about the code. In fact, Kent Beck, who is also the father of the xUnit family of test frameworks, said:
Writing tests really comes down to telling a story about the code. Having that mindset helps you work out many other problems during testing.
Finally, keep in mind that while computers understand any valid code, humans do not.
To drive design #
By writing a test before the code it is supposed to test, you are forced to think about the new functionality from a usage perspective rather than from an implementation perspective. This is the basic premise behind Test-Driven Development (TDD). While this post won’t discuss TDD very much, this idea is so powerful that it should be mentioned.
It helps you get code which has a natural interface (explicit or not), just feels “easy to use”, and fits nicely with the rest of the design. In fact, it can be argued that testable code and good software design very much go hand in hand.2
Another angle on this is by Kent Beck (again), saying:
Tests should be coupled to the behavior of code and decoupled from the structure of code.
To feel safe #
The word “test” implies that we want to verify something, and obviously we do. Using unit tests, a developer can gain confidence that her code is working. Not sure whether that if statement which rarely gets triggered actually works? Write a unit test to find out!
The above guideline is similar to a maxim in the unit testing community which says “test until fear turns to boredom”. Expressed differently, write unit tests until you feel that you are wasting your time.
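As a small sketch of that idea, assume a made-up function `shipping_cost` with a branch that is rarely taken in everyday use. The tests are written as plain functions with `assert`, the style that a runner such as pytest can discover; the names and the threshold are invented for illustration.

```python
def shipping_cost(order_total: float) -> float:
    """Hypothetical unit under test with a rarely taken branch."""
    if order_total >= 100:  # rarely triggered: large orders ship for free
        return 0.0
    return 5.0


def test_regular_order_pays_flat_fee():
    assert shipping_cost(20.0) == 5.0


def test_order_at_threshold_ships_free():
    # Exercises the branch that rarely runs in day-to-day use,
    # including the boundary value itself.
    assert shipping_cost(100.0) == 0.0
```

Once the second test passes, the fear about that branch has hopefully turned into something closer to boredom.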
Finally, most programs will represent a successfully executed test with the color green, and a failed one with red. Over time, you will learn to love the feeling of “all green”. It gives you a good feeling and you feel calm.
To prevent future bugs #
While similar to “to feel safe”, writing tests to prevent future bugs is a bit different. Adding tests to feel safe deals with exploring the unknown. Adding tests to prevent future bugs focuses on preserving what is known.
Perhaps there is a part of the code which you know is tricky. Some part of the code which is non-obvious, or even counter-intuitive. Add a test to avoid future developers “fixing” the code in ignorance.
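Here is a sketch of such a test. It assumes a made-up function `round_to_whole_units` that deliberately relies on the fact that Python's built-in `round` rounds halves to the nearest even number, something many readers would not expect.

```python
def round_to_whole_units(amount: float) -> int:
    """Hypothetical unit under test.

    Deliberately uses Python's built-in round(), which rounds halves
    to the nearest even integer ("banker's rounding").
    """
    return round(amount)


def test_halves_are_rounded_to_the_nearest_even_number():
    # Looks like a bug, but is intended. This test documents the
    # decision so that nobody "fixes" it to always round halves up.
    assert round_to_whole_units(2.5) == 2
    assert round_to_whole_units(3.5) == 4
```

The test name and the comment carry the explanation, so a future developer who changes the rounding gets a failing test that tells them why the code looked the way it did.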
To save time on testing #
If we do not write tests that run automatically, we have to test manually. That takes more time, is tedious, repetitive, and thereby error-prone. In reality, it means that we test less often, less thoroughly, or perhaps ignore it completely and let our users be our testers. Therefore, we want to make our unit tests run automatically.
With a comprehensive suite of tests that cover your code, you can also feel much safer when working with the code. Especially when refactoring, that is when cleaning up the code without changing the functionality.
Another thing we want to avoid is regressions – bugs that we’ve already fixed that slip back in again. By writing a unit test every time you fix a bug, you make it very unlikely that the bug will reappear.
Over time, you build a valuable regression suite. In fact, theoretically you could follow only this rule and end up with a high quality product. It could be seen as a backwards kind of way of doing test-driven development.
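A regression test of this kind might look like the sketch below. The function `average` and the bug it once had are invented for illustration: imagine an earlier version that crashed on an empty list.

```python
def average(values: list[float]) -> float:
    """Hypothetical unit under test.

    An earlier (imagined) version divided by len(values) unconditionally
    and raised ZeroDivisionError for an empty list.
    """
    if not values:  # the bug fix
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list_is_zero():
    # Regression test added together with the fix. If the guard above
    # is ever removed, this test fails immediately.
    assert average([]) == 0.0


def test_average_of_several_values():
    assert average([1.0, 2.0, 3.0]) == 2.0
```

Writing the failing test first, and only then the fix, also proves that the test really catches the bug.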
To get faster feedback #
While writing code, you often go through cycles of writing code and then running it to see if it works as you intended. In many environments, these cycles tend to be rather long and include things such as packaging the application, launching an application server, starting the application, setting up test data, and then navigating to the feature to be tested. As those who have experienced this can tell, this is rather time consuming.
It is worth noting that slow feedback loops really kill productivity. Not only do you have to wait longer in the first place; if you have to wait more than a few seconds, it is quite likely that you open a web browser and slack off instead.
With unit tests, this workflow can be drastically improved. Instead of launching the full application to run the code you just wrote, you simply run one or more unit tests which exercise that code instead. Since the tests run in a matter of milliseconds, you can go through “make a change and run the tests” cycles very quickly.
There are even tools which detect the changes you make and automatically run the unit tests which cover the changed area.
To make work more fun #
Having a suite of unit tests that can prove in a second that your code is working is very satisfying. It also helps you pick up after an interruption because you can see which test is failing and needs fixing.
When trying to pinpoint a tricky bug, unit tests can also help because you can construct any scenario. That might be tricky to do through the user interface of the program.
To help where other types of tests struggle #
Finally, unit tests have a special role to play in the testing ecosystem. While there are other types of tests which are also valuable, none of them can quite take the place of unit testing.
All other types of tests are some form of integration test. An integration test connects multiple units and verifies that they work together as expected. This can be anything from a few classes to the full system.
What is great about integration tests is that they look at how things actually are – they do not make assumptions. In contrast, unit tests often assume lots of things about the environment in which the unit under test runs.
However, integration tests have a couple of drawbacks as well.
- They require a “real” environment to run in. They usually require resources like database instances and hardware to be allocated for them.
- It is hard to cover your whole system with them because the number of possible execution paths through your system is so large.
- They are larger and take longer to execute which gives a longer feedback loop.
- They typically have to test through some high-level entry point, which makes it hard to pinpoint specific functionality (especially error cases).
One test can fulfill multiple purposes #
This section has listed several reasons why you might want to write unit tests. Keep in mind that these are not categories or different types of test. They are reasons to write tests.
In many cases a single test may serve multiple purposes. A single test may help you drive design, explain the intended use for others, and get fast feedback.
Other times, it may serve different purposes over time. It may have started out as an exploratory test to help you understand the functionality to implement, but then be repurposed as a more focused test to prevent future bugs.
How much should I test? #
Now that we know why we unit test, we can ask ourselves: how much should I test?
“100% coverage” #
When it comes to measuring unit testing, the measure “code coverage” often comes up. While there are some different types of coverage, they generally try to measure how large a part of the program is executed through tests.
Commonly, teams agree on something like “at least X % of the code shall be covered by tests”, where X is a number like 100, 80 or 50. Some teams use “happy path” testing, where they test only the code when everything goes as expected, ignoring error handling.
The biggest problem with this approach can be expressed through Goodhart’s law, formulated by British economist Charles Goodhart:
Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
In our context, that means that once you start using code coverage as an indicator of test quality, people will find ways to increase code coverage that do not necessarily give you more valuable tests.
In particular, it should be noted that the fact that a unit test covers some code does not mean that it verifies its functionality. A simple example would be a unit test which performs some action without asserting any result afterwards.
With that said, a team which realizes this can of course still use code coverage as a way to find out where they need to put more effort into their testing.
A means to an end #
What is important to remember is that unit tests should be a help, not a hindrance. If your tests do not help you, something is wrong.
If you look at the unit tests you’ve written the last year, how many of them have actually helped you? How many of them have caught a regression? How many helped a new programmer understand the code it’s testing? How many would still work if you refactored the code under test? We could go on and on. Unfortunately, in all likelihood, many of the tests are a waste of time and should be removed!
Therefore it is important to know the cost of unit testing. Unit testing is a means to an end (better quality), not an end in itself. If we could write better code without unit tests, we would!
Return on investment #
To put it another way, we can borrow some terminology from economics: a test should have a positive Return On Investment (ROI). That means the following.
The value you get out of a test should be higher than the cost to write and maintain it.
If this is not the case, the test needs to be fixed or deleted.
Keeping a bad test just because “it’s already written” is a dangerous road to take. Tests still cost money to maintain. If it is in a part of the code that isn’t changed anymore, fine, keep it. If it tests code that is under development, then do something about it.
Furthermore, you get diminishing returns on each test you write. The first test you add will tell you if the code works at all. The second test will test some aspect that the first test missed, and so on. The more tests you add, the smaller the chance that there is still a bug lurking in the untested code, simply because there is less untested code.
If you find yourself with an unhelpful test on your hands, you should take one of the following steps, in this order.
- Make sure that the problem lies with the test, and not your understanding of it.
- If you do understand it, and realize that it is a badly written test – fix it!
- As a last resort, if fixing the test is not worth the cost, the test should be deleted!
Something is better than nothing #
With the above being said, it is still better to have some tests than no tests. The fact that other parts of the code are not tested should not stop you from adding tests where you think they will be valuable.
To quote software developer and author Martin Fowler:
Imperfect tests, run frequently, are much better than perfect tests that are never written at all.
Finally, it is very hard to know which of a group of possible tests will actually be helpful. Therefore, it is often a good idea to err on the safe side. Write a few more tests than you probably need, rather than a few too few.
What should I test? #
Knowing why we test and having realized that we probably don’t want to test everything – what should we test?
Test what is important #
We get back to the unit testing maxim “test until fear turns to boredom”. That means you should test only what you are afraid is going to break. Obviously, you need to be realistic when you decide what might break. Too much hubris and you will decide that nothing might break – after all, you wrote the code!
A variant on this is to identify parts of the system which would cause you a lot of trouble if they failed. If a certain type of failure would cost you a lot of money or incur irreversible data loss, you likely want to test it more thoroughly.
Test for common mistakes #
Put simply, you should write the tests that will help you. If you don’t typically make certain types of mistakes, don’t write tests for them. Instead, figure out what type of mistakes you often do make, and write tests for those. If you’re working on a team, figure out which type of mistakes you make as a team, and write tests for those.
Don’t test what’s already tested #
Also, don’t test what’s already been tested. Let’s say we have a class Entry which is a simple value class with little logic. It is used by class Map which we’ve covered extensively by unit tests. In this situation, there is little need to test class Entry separately. It is already covered by Map’s tests. Besides wasting time on testing something which is already tested, you also make future refactoring harder since more tests will have to be changed.
Test that which is suitable for unit testing #
When writing unit tests, you will find that there are two primary characteristics that determine whether a piece of code is suitable for unit testing: complexity and dependencies.
First, for unit testing to be valuable the unit under test must have some non-trivial logical complexity. Simple getter methods are good examples of something that is trivial enough not to warrant a separate test. The cost of writing unit tests for every getter in your system will most likely be far higher than the cost of dealing with the very occasional bug in one of them.
Secondly, unit testing is much easier for units with few dependencies or side effects. It should be isolated from other units, but also from external resources such as databases. The more dependencies you have, the harder it becomes to isolate the unit under test. Before you know it, the code to set up the unit test is more complex than the code you are testing. Also, the more you need to fake3 in order to test a unit, the less those tests will tell you about the real world behavior of that unit.
For code which has many dependencies you might be better off with integration tests where you don’t need to spend time isolating the unit under test.
To summarize, unit testing is more suitable the more complexity and the fewer dependencies your code has. This means that to make your code testable you should keep logical complexity and dependencies separate from each other. A good way to achieve that is writing code in the style of functional foundations.
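One possible way to picture that separation is sketched below. This is only an illustration under my own assumptions: the function names and the “name,amount” line format are invented. The logic lives in a function with no dependencies, and the file system access sits in a thin wrapper with no logic of its own.

```python
from pathlib import Path


def total_from_lines(lines):
    """Pure logic: sum the amounts in 'name,amount' lines, skipping blanks."""
    total = 0.0
    for line in lines:
        if not line.strip():
            continue  # ignore empty lines
        _name, amount = line.split(",")
        total += float(amount)
    return total


def total_from_file(path):
    """Thin wrapper holding the dependency (the file system), but no logic."""
    return total_from_lines(Path(path).read_text().splitlines())


def test_total_skips_blank_lines():
    # No files, fakes, or setup are needed to test the logic.
    assert total_from_lines(["coffee,3.5", "", "tea,2.5"]) == 6.0
```

The complex part is now trivial to unit test, while the wrapper is simple enough that an occasional integration test is likely sufficient.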
Conclusion #
This post has talked about what unit testing is, why you may want to do it, as well as how much and what to test. I hope it has given you some food for thought and perhaps some ideas for how to improve your unit testing.
There is of course much more to say about writing good unit tests. What properties should they have?4 In what style should I write them? What tools should I use? All of that is unfortunately out of scope for this article. (Perhaps I’ll find some time to blog on that too sometime?)
Now, these were my thoughts. What are your thoughts on unit testing?
Featured comments #
glyn: I really enjoyed your post and thought it deserved a response.
https://underlap.org/why-write-unit-tests
Updates #
- 2024-04-24: Added the sections “To prevent future bugs” and “One test can fulfill multiple purposes” inspired by discussion on Mastodon.
1. “If you can’t explain it, you don’t understand it” applies very much to unit testing. Write a test to prove that you understand!
2. There is a deep connection between testable and reusable code.
3. Martin Fowler has a good overview of the different types of “test doubles” that are used to fake dependencies during testing.
4. Kent Beck writes about test desiderata – properties he expects unit tests to have.