Testable code is reusable code ♻️

There is a link between pure functions, testability and reusability that I have been thinking about for a while.¹

In Functional foundations, I argued for the value of pure functions – functions which always return the same result for the same arguments, and do not have any side effects. The advantages of pure functions include that they are easier to reason about (especially in a concurrent environment), easier to reuse and compose with other functions, and easier to test!
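To make the distinction concrete, here is a minimal sketch. The `User` shape and both function names are made up for illustration; only the field names are borrowed from the example later in this post.

```typescript
// Hypothetical user shape, used only for this illustration.
type User = { id: number; lastUpdated: number; exports: number[] };

// Impure: reads hidden state (the clock) and mutates its argument.
function markExportedImpure(user: User): void {
    user.exports.push(Date.now());
}

// Pure: everything it needs comes in as arguments, and it returns a new value
// instead of changing the one it was given.
function markExported(user: User, now: number): User {
    return { ...user, exports: [...user.exports, now] };
}

const user: User = { id: 1, lastUpdated: 1000, exports: [] };
const exported = markExported(user, 2000);

console.log(exported.exports.length); // 1 (the new object holds the export timestamp)
console.log(user.exports.length);     // 0 (the original user is untouched)
```

Calling `markExported` twice with the same arguments gives the same result, so a test needs no clock control and no cleanup.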

At the same time, writing perfectly pure and side-effect-free code can be hard. The good thing is that sometimes, even getting a function partially pure can be very valuable!

An impure example

Let’s say, for example, we have an exportUser function which reads a user from the database, converts it to an export format, and syncs that data with a file on disk.

function exportUser(userId: number) {
    // Read user from database
    const user = database.findUserById(userId);

    // Sync export to file
    // (build the path from userId, since user may not exist)
    const filePath = `${userId}.json`;
    let update = false;
    if (!user) {
        // User does not exist, remove file if it exists
        if (fs.existsSync(filePath)) {
            fs.unlinkSync(filePath);
        }
    } else if (fs.existsSync(filePath)) {
        // User exists, check if an update is needed
        const fileContent = fs.readFileSync(filePath, 'utf8');
        const existingUser = JSON.parse(fileContent);
        const previousExports = existingUser.exports
        const lastExported = previousExports[previousExports.length - 1];
        if (lastExported < user.lastUpdated) {
            // File is out of date, update
            update = true
        }
    } else {
        // File does not exist, write user to file
        update = true
    }
    if (update) {
        user.exports.push(Date.now());
        const userJson = convertToExportJsonFormat(user);
        fs.writeFileSync(filePath, userJson, 'utf8');
    }
}

This function is a bit tricky to follow. (You, dear reader, would obviously never write such convoluted code, so let’s imagine it is part of a legacy system. 😉)

It is also tricky to test. Testing it requires us to deal with both the database and the filesystem. Of these two, the database will likely be the hardest. While it is not optimal to include filesystem operations in unit tests, they tend to be reasonably fast and stable. However, being dependent on the database means we will have to either start an actual database for the test or fake one. Both of these options have their drawbacks.

Setting up a real database (effectively doing integration testing rather than unit testing) produces trustworthy results as we run our code against a production-like database, but adds complexity to the test setup and may make the tests much slower. Faking the database, using a mocking framework for example, may keep the tests fast but encodes a lot of assumptions about how the database will act. If any of these assumptions are wrong, the test will be misleading.
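To illustrate how a fake can encode assumptions, here is a small sketch using a hand-written fake rather than a mocking framework. The `UserDatabase` interface, `FakeUserDatabase` class and `needsExport` function are all hypothetical, and passing the database in as an argument is just a convenient way to swap it for this illustration.

```typescript
type User = { id: number; lastUpdated: number; exports: number[] };

// Hypothetical interface covering the one database call the export needs.
interface UserDatabase {
    findUserById(userId: number): User | undefined;
}

// Hand-written fake backed by a Map.
// Assumption baked in: a missing user yields `undefined`,
// never `null` and never a thrown error.
class FakeUserDatabase implements UserDatabase {
    private users = new Map<number, User>();

    insertUser(user: User): void {
        this.users.set(user.id, user);
    }

    findUserById(userId: number): User | undefined {
        return this.users.get(userId);
    }
}

// Stand-in for the code under test (not the exportUser function itself):
// decides whether a user changed since the last export.
function needsExport(db: UserDatabase, userId: number, lastExported: number): boolean {
    const user = db.findUserById(userId);
    if (user === undefined) {
        return false;
    }
    return lastExported < user.lastUpdated;
}

const fakeDb = new FakeUserDatabase();
fakeDb.insertUser({ id: 1, lastUpdated: 1000, exports: [] });

console.log(needsExport(fakeDb, 1, 500)); // true
console.log(needsExport(fakeDb, 2, 500)); // false
```

Both checks pass against the fake. If the real database client returned `null` for a missing user, the `=== undefined` check would let it through and the next line would throw, yet the test would stay green.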

Let’s look at an example test where we use a real database and file system operations.

describe('exportUser', () => {  
  
    beforeAll(() => { /* Start and connect to database. */});  
    afterAll(() => { /* Disconnect from database. */ });  
    afterEach(() => { /* Reset database state. Remove test files. */ });  
  
    test("should update if existing file is out-of-date", () => {  
        // Insert test user data  
        const testUser = {
            id: 1,
            lastUpdated: Date.now() - 10000,
            // exportUser pushes to user.exports, so the field must exist
            exports: [] as number[]
        };
        database.insertUser(testUser);  
  
        // Create an "out-of-date" user file  
        const filePath = `${testUser.id}.json`  
        const outOfDate = { ...testUser, exports: [Date.now() - 20000] }
        fs.writeFileSync(filePath, JSON.stringify(outOfDate), 'utf8');  
  
        // Perform the export operation  
        exportUser(testUser.id);  
  
        // Read the file back to verify it was updated  
        const fileContent = fs.readFileSync(filePath, 'utf8')
        const updatedContent = JSON.parse(fileContent);  
        const previousUpdates = updatedContent.exports
        const latestUpdate = previousUpdates[previousUpdates.length - 1]
        expect(latestUpdate).toBeGreaterThan(testUser.lastUpdated);  
    });  
  
    // More tests...  
});

Getting rid of one side effect

What can we do to improve the situation? It is clear from both the function and the test that the complicated part is the file system operations. The database call is as simple as it can be (at least implementation-wise) and the JSON conversion has already been extracted to a separate function. However, the file system logic is not trivial. It is also coupled to the data being written, making it hard to extract.

What we can easily do is remove the database access from the exportUser function.

// Accept user as argument instead of reading it from the database
function exportUser(user: User) {
	// No more call to database.findUserById()
	
	// The rest of the implementation is the same
}

// Somewhere else
const user = database.findUserById(userId)
exportUser(user)

This makes our tests of exportUser much simpler. We no longer have to set up or fake a database to test the function. We can write unit(ish) tests that focus on verifying the user export and its tricky interaction with the file system.
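As a rough sketch, the out-of-date test from earlier could then look something like this. The `User` type, the JSON-based stand-in for convertToExportJsonFormat, and the extra `dir` parameter (added here only so the check can run in a temporary directory) are assumptions for illustration, not part of the original function.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

type User = { id: number; lastUpdated: number; exports: number[] };

// Stand-in for convertToExportJsonFormat, assumed to return a JSON string.
function convertToExportJsonFormat(user: User): string {
    return JSON.stringify(user);
}

// The refactored function: the user is passed in, so there is no database call.
// `dir` is a hypothetical addition so the check below can use a temp directory.
function exportUser(user: User, dir: string = "."): void {
    const filePath = path.join(dir, `${user.id}.json`);
    let update = true; // No file yet: always write
    if (fs.existsSync(filePath)) {
        // File exists, only update if it is out of date
        const existingUser = JSON.parse(fs.readFileSync(filePath, "utf8"));
        const previousExports: number[] = existingUser.exports;
        const lastExported = previousExports[previousExports.length - 1];
        update = lastExported < user.lastUpdated;
    }
    if (update) {
        user.exports.push(Date.now());
        fs.writeFileSync(filePath, convertToExportJsonFormat(user), "utf8");
    }
}

// The check itself: no database setup, just a user object and a temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "export-user-"));
const now = Date.now();
const testUser: User = { id: 1, lastUpdated: now - 10000, exports: [] };

// Create an "out-of-date" user file
const filePath = path.join(dir, `${testUser.id}.json`);
const outOfDate = { ...testUser, exports: [now - 20000] };
fs.writeFileSync(filePath, JSON.stringify(outOfDate), "utf8");

exportUser(testUser, dir);

// Read the file back to verify it was updated
const written = JSON.parse(fs.readFileSync(filePath, "utf8"));
const latestExport: number = written.exports[written.exports.length - 1];
console.log(latestExport > testUser.lastUpdated); // true

// Remove the temp directory again
fs.rmSync(dir, { recursive: true, force: true });
```

The beforeAll and afterAll hooks for the database are gone; the only setup left is the file the test is actually about.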

Testing is reuse

An interesting aspect is that the original exportUser is a bit of a “one trick pony”. It will always read from the database, convert the data to another format, and write it to disk. Nothing else. That makes it hard to test, but also to reuse.

If we have a similar use case where we want to export a user to disk, but that user came from a source other than the database, we would be out of luck. That does not fit the original function. However, if we have the modified function which takes a user as argument, then it would be easy to reuse as it does not care where the user came from. By moving the side effect of the database read out of the function, we can make it both more reusable and more testable.

Note that we got value from removing one side effect from the function (the database call) even though we did not get all of the side effects out (the file system interaction is still there).

The full implications of this change are deep. Not only does it show that code with fewer side effects is easier to test and reuse. It suggests that testable code is reusable code.² Testability and reusability go hand in hand. When you test a function, you run the code in another context than it was built for (which would be the actual production use case).³ If your code is not reusable you will feel that pain in your tests.

Put simply, testing your code is reusing it.

¹ My early thoughts on the subject were captured in my blog post How unit testing changes your design.
² The talk The deep synergy between testability and good design by Michael Feathers goes deeper into this kind of thinking.
³ Unless you do test-driven development, in which case the test was technically the first use case and the production use came after. 😉