Extract the logic (what’s left is glue)
Code that is friendly to unit testing has enough complexity to be worth testing and few dependencies. I would therefore like to talk about a technique for massaging your code in that direction.
Let’s say we have an important class with both high complexity and many dependencies. It is really in need of unit testing because of its complexity and importance, but is hard to isolate properly. Trying to unit test such a class typically leads to long tests that are hard to read and maintain, and that need a lot of complex set-up.
This is my “go to” method for handling this kind of situation.
- Extract the logic – move the pure, (mostly) dependency free code which performs the actual functionality of the system out into separate units. Unit test these new units as much as you want.
- What’s left is “glue” – the original code is now a “web” of dependencies that wire various units of logic together to perform what the user wants. Use integration or system tests to test this wiring.
- Improve the tests – write, modify, or delete tests as needed.
The result is code that follows a pattern that I call Logic+Glue.
Now, we will go through these steps in some more detail.
Extract logic
Of course, “extracting the logic” usually isn’t as simple as it sounds. To help, here is a general outline of the process I use to do it.
- Inline current methods into one large method. Do this within reason. If a method is recursive, or already dependency free, it might be better left alone.
- Extract local variables for all method calls. Each method return value should be assigned to a new variable. We do this in order to make it clear what data is actually used in our code.
- Treat any non-locally created data as dangerous. This includes constants, static fields, instance fields, parameters, as well as any object returned by a method call to such an object. These “dangerous” variable assignments are what will become the “glue”.
- Try to make all local variables final. The following steps will be much easier if variables are not reassigned. For the same reason, the method should have as few return points as possible.
- Extract blocks of logic into methods. Look at the code – whatever is not “dangerous” variable assignments is the actual logic. Look for `for` loops, `if` statements, blocks of non-dangerous local variable assignments, and so on. Such a block should not contain any of the “dangerous” variable declarations (but may use the variables!). Generally, we want as large blocks of code as possible that use as few variables as possible. Also, the more primitive (as in, using primitive data types) the interface is, the better. Use the “Extract Method” refactoring feature of your IDE to “preview” potential methods. Experiment, extract, inline, and extract again, until satisfied.
- Move methods of logic into new classes. Look at the newly created methods. They should now hopefully contain “meaty” logic but few or no references to class variables. If you realize, by looking at such a method that it would naturally fit in an existing class, move it there. Especially, see if it could fit in one of the parameter or instance variable types. Otherwise, create new classes and move a method, or a group of related methods, into it. Try to turn any static methods into non-static methods on the new class.
- Refactor and clean up. While doing the above steps, the code may get a bit messy. Now is the time to clean up the code you’ve extracted.
Feel free not to follow these steps if you have a better way to do it; they are simply meant to help you get started.
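To make the first few steps more concrete, here is a minimal sketch of what a class might look like after inlining, extracting local variables, and making them final, but before any logic has been moved out. The late-fee domain and every name in it (`InvoiceStore`, `LateFeeService`, the constants, `FlattenedMethodDemo`) are hypothetical and chosen only for illustration; this is one way the intermediate state could look, not a prescribed result.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical dependency, declared only so that the sketch compiles.
interface InvoiceStore {
    long amountCents(String invoiceId);
    LocalDate dueDate(String invoiceId);
}

// The class after inlining its private methods and extracting local
// variables, but before any logic has been moved out.
final class LateFeeService {
    private static final int GRACE_DAYS = 3;
    private static final int DAILY_RATE_BASIS_POINTS = 50; // 0.5 % per day

    private final InvoiceStore store;
    private final Clock clock;

    LateFeeService(InvoiceStore store, Clock clock) {
        this.store = store;
        this.clock = clock;
    }

    long lateFeeCents(String invoiceId) {
        // "Dangerous" assignments: values that come from fields, constants,
        // parameters, or calls on such objects. These will remain as glue.
        final long amountCents = store.amountCents(invoiceId);
        final LocalDate dueDate = store.dueDate(invoiceId);
        final LocalDate today = LocalDate.now(clock);
        final int graceDays = GRACE_DAYS;
        final int dailyRateBasisPoints = DAILY_RATE_BASIS_POINTS;

        // Logic: uses the variables above but declares no dangerous ones.
        // This block is the candidate for "Extract Method".
        final long daysOverdue = ChronoUnit.DAYS.between(dueDate, today);
        final long chargeableDays = Math.max(0, daysOverdue - graceDays);
        final long uncappedFee = amountCents * dailyRateBasisPoints * chargeableDays / 10_000;
        final long cappedFee = Math.min(uncappedFee, amountCents / 2);
        return cappedFee;
    }
}

// Small demo with an in-memory store and a fixed clock.
final class FlattenedMethodDemo {
    public static void main(String[] args) {
        final InvoiceStore store = new InvoiceStore() {
            public long amountCents(String invoiceId) { return 10_000; }
            public LocalDate dueDate(String invoiceId) { return LocalDate.of(2024, 1, 1); }
        };
        final Clock clock = Clock.fixed(Instant.parse("2024-01-11T00:00:00Z"), ZoneOffset.UTC);
        // Ten days overdue, three grace days, so seven chargeable days.
        System.out.println(new LateFeeService(store, clock).lateFeeCents("INV-1")); // prints 350
    }
}
```

The second half of `lateFeeCents` only works on local values, which is what makes it a good candidate for extraction. Because every variable is final and there is a single return point, the IDE can extract that block without extra parameters or output variables.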
What’s left is glue
Whatever code is left in the class you started with becomes the glue. Ideally, this is now nothing but a bunch of the “dangerous” variable declarations and some calls to your newly extracted methods.
- Refactor and clean up the glue code. Feel free to inline some of the variables or extract methods if it improves readability. The glue initially has the same interface as the old complex class. You can change this if you see room for improvements.
One way to verify how well we managed to extract the logic is by looking at the import statements of the glue class compared to the logic classes. Most of the import statements should be in the glue code, often from multiple different packages, whereas the logic classes should have rather short and homogeneous import sections.
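As a sketch of what the end result could look like, assume a hypothetical late-fee calculation that has been split into a logic class and a glue class. All names here (`LateFeeCalculator`, `LateFeeService`, `InvoiceStore`, `LogicGlueDemo`) are made up for illustration, and everything sits in one file only so that the example compiles on its own.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Logic: no fields, no dependencies, primitive parameters only.
// In its own file this class would need no import statements at all.
final class LateFeeCalculator {
    long lateFeeCents(long amountCents, long daysOverdue, int graceDays, int dailyRateBasisPoints) {
        final long chargeableDays = Math.max(0, daysOverdue - graceDays);
        final long uncappedFee = amountCents * dailyRateBasisPoints * chargeableDays / 10_000;
        // Cap the fee at half of the invoice amount.
        return Math.min(uncappedFee, amountCents / 2);
    }
}

// Hypothetical dependency, declared only so that the sketch compiles.
interface InvoiceStore {
    long amountCents(String invoiceId);
    LocalDate dueDate(String invoiceId);
}

// Glue: "dangerous" assignments plus a single call into the logic.
// In its own file this class would carry the java.time imports and the
// import for InvoiceStore.
final class LateFeeService {
    private static final int GRACE_DAYS = 3;
    private static final int DAILY_RATE_BASIS_POINTS = 50; // 0.5 % per day

    private final InvoiceStore store;
    private final Clock clock;
    private final LateFeeCalculator calculator;

    LateFeeService(InvoiceStore store, Clock clock, LateFeeCalculator calculator) {
        this.store = store;
        this.clock = clock;
        this.calculator = calculator;
    }

    long lateFeeCents(String invoiceId) {
        final long amountCents = store.amountCents(invoiceId);
        final LocalDate dueDate = store.dueDate(invoiceId);
        final LocalDate today = LocalDate.now(clock);
        // Converted here so that the logic can take a primitive day count.
        final long daysOverdue = ChronoUnit.DAYS.between(dueDate, today);
        return calculator.lateFeeCents(amountCents, daysOverdue, GRACE_DAYS, DAILY_RATE_BASIS_POINTS);
    }
}

// Wires the glue to an in-memory store and a fixed clock.
final class LogicGlueDemo {
    public static void main(String[] args) {
        final InvoiceStore store = new InvoiceStore() {
            public long amountCents(String invoiceId) { return 10_000; }
            public LocalDate dueDate(String invoiceId) { return LocalDate.of(2024, 1, 1); }
        };
        final Clock clock = Clock.fixed(Instant.parse("2024-01-11T00:00:00Z"), ZoneOffset.UTC);
        final LateFeeService service = new LateFeeService(store, clock, new LateFeeCalculator());
        System.out.println(service.lateFeeCents("INV-1")); // prints 350
    }
}
```

Whether the day-count conversion belongs in the glue or in the logic is a judgment call. Here it stays in the glue to keep the interface of the logic class primitive.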
Improve the tests
Finally, it is time to look at the tests. You have two primary strategies, which can be combined if desired.
- Write new tests. If you did this refactoring because the original tests were bad or nonexistent, go ahead and write better ones for the newly extracted logic.
- Keep the old tests. If the old tests worked well, you can choose to keep them since the glue is acting as a protecting layer between the logic and callers. They then become slightly higher-level unit tests where the “unit” now is a group of classes rather than a single class.
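New tests for extracted logic tend to be short, because there is nothing to mock or set up. The following is a minimal sketch, assuming a hypothetical `LateFeeCalculator` logic class (declared in the block so that it compiles on its own). It uses plain Java checks rather than a test framework so that it runs without extra libraries; in a real project you would likely write the same checks with your usual test framework.

```java
// Hypothetical extracted logic class: no dependencies, primitive interface.
final class LateFeeCalculator {
    long lateFeeCents(long amountCents, long daysOverdue, int graceDays, int dailyRateBasisPoints) {
        final long chargeableDays = Math.max(0, daysOverdue - graceDays);
        final long uncappedFee = amountCents * dailyRateBasisPoints * chargeableDays / 10_000;
        // Cap the fee at half of the invoice amount.
        return Math.min(uncappedFee, amountCents / 2);
    }
}

// Each check is one line: plain values in, plain value out.
final class LateFeeCalculatorTest {
    private static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    public static void main(String[] args) {
        final LateFeeCalculator calculator = new LateFeeCalculator();

        check(calculator.lateFeeCents(10_000, 2, 3, 50) == 0,
                "no fee within the grace period");
        check(calculator.lateFeeCents(10_000, 10, 3, 50) == 350,
                "fee for seven chargeable days");
        check(calculator.lateFeeCents(10_000, 400, 3, 50) == 5_000,
                "fee is capped at half the amount");

        System.out.println("all checks passed");
    }
}
```

The wiring in the glue class is not covered by these checks. That is the part that the old tests, or an integration test, would exercise.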
By doing this kind of extraction, you’ve moved your code base one step closer to being testable. (Most likely, it is now more in line with fundamental object-oriented principles as well.)
Comments
David Hudspeth at : Could you provide an example of steps 1 and 2? I am confused how you have any methods left after you inline all of the existing ones. I am probably just not understanding some of the terminology.
Henrik Jernevad at : Hi, thanks for your comment! 🙂 You are right in that inlining all methods would leave you without any at all (except perhaps a single really big one). That was the effect I was after, but limited to a single class. In situations where I have done this, there has often been one class that has been really messy. It might be a few hundred or up to a thousand lines of code, divided into various methods. But it is not clear or easy to understand. Method names are not very helpful, and the level of abstraction varies greatly between different methods.
Clarification of step 1: Very often there is one or a few public methods which act as entry points to this class. Those methods I do NOT inline. However, I do inline all the other methods (typically private) IN THAT CLASS. That way I end up with one (ideally) or a few very large methods, which probably look even worse than the code did before, but I still find this a better starting point for performing the refactoring.
Clarification of step 2: After step 1, the code still has a bunch of method calls. These are all calls to various dependencies, i.e. other classes. I do NOT inline these. So the variables you create in this step are there to hold the values returned by calls to dependencies, not by local method calls (we’ve just inlined all local methods, after all). The exception to that rule would be if there is another class which is very tightly coupled with the one I’m working on (such as a previous unsuccessful attempt at a “logic” class), in which case I would inline that too.
Hope this clarifies what I mean a bit. If I did not manage to completely answer your question, feel free to ask follow-up questions!
Updates
- 2024-04-24: Republished this post which was originally written for my previous blog.