
Algebraic data types 🧩

So you’ve mastered functional foundations and use immutable data structures every day? What then is the next step on the journey to reap the benefits of functional programming?

A puzzle piece being composed of other pieces.

I would argue that algebraic data types fit the bill. This post will take a quick look at them, from the perspective of a curious object-oriented programmer.

Algebraic Data Types #

Algebraic data types, often abbreviated as ADTs1, are a way to compose types from other types. They can be used to describe complex data in a clean and type-safe way.

What does “algebraic” mean, you may ask. In school, you probably learned to solve equations such as \(2*x + 3 = 7\). That kind of math is called algebra and deals with symbols and the rules for manipulating these symbols. Just like the equation I just wrote is made up of products (\(*\)) and sums (\(+\)), algebraic data types are made up of products and sums, too.

Products and sums #

Algebraic data types are built out of two kinds of types.

  • Product types: A combination of multiple other types. The keyword is “and”, as in \(T = A\ and\ B\ and\ C\). Typical product types include classes, records, or tuples. A simple example could be a Java record defining a Circle as a center Point and a double radius.

    record Circle(Point center, double radius) {}
    
  • Sum types: A choice between multiple other types. The keyword is “or”, as in \(T = A\ or\ B\ or\ C\). Sum types are also called union types. Typical sum types include interfaces and enums. An example in Java could be a sealed interface Shape defined as a Circle or a Square or a Triangle.

    sealed interface Shape permits Circle, Square, Triangle {}
    

    There is another common syntax for union types that you may have seen. Here’s an example in TypeScript.2

    type Shape = Circle | Square | Triangle
    

ADTs are sealed #

An important property of ADTs is that they are sealed, or closed. That means their definition contains all possible cases and no further cases can exist. In contrast to regular interfaces, it is not possible to add more implementations without adding them to the list of permitted types. This has some interesting implications that we will come back to later.

An example #

It is easy to look at the formal definition of ADTs and get the impression that they are a bit complicated. In fact, they are not. They are pretty natural to use, and they are also very useful!

Let’s start with an example of basic geometric shapes such as circles, squares, and triangles. (As usual, the model is overly simplified to fit into a blog post.)

I’ll use Java for the example, as it is well known and has good-enough support for the relevant concepts.

sealed interface Shape permits Circle, Square, Triangle {}
record Circle(Point center, double radius) implements Shape {}
record Square(Point topLeft, double side) implements Shape {}
record Triangle(Point p1, Point p2, Point p3) implements Shape {}

There are some key points to note in this example.

  • We use a sealed interface (sum type) which means there can never be any other implementations than the ones we explicitly permit.
  • The different shapes are modelled as records (product types) and have different properties. (They could have been regular classes as well, but records are more concise.)

Pattern matching #

At this point you may be asking yourself: what is the point of using sealed interfaces? Why add such a limitation when you could just use a plain old interface?

While it is true that the above example could be implemented using a regular interface, there is one important property that we would lose. Since the interface is sealed and explicitly lists the possible cases, the compiler knows at compile time what types implement the interface.

This is very handy if we want to do pattern matching. Pattern matching is a way to make different decisions depending on the type and contents of an object. More specifically, it lets you check whether an object fits a certain pattern, and then access parts of that value directly. It’s like a more powerful and concise version of a traditional switch statement that can work with complex data structures.

An example #

Let’s look at an example of a function that uses pattern matching to calculate the area of a shape.

double calculateArea(Shape shape) {
    return switch (shape) {
        case Circle(Point _, double radius) -> PI * radius * radius;
        case Square(Point _, double side) -> side * side;
        case Triangle(Point p1, Point p2, Point p3) -> {
            var a = p1.distance(p2);
            var b = p2.distance(p3);
            var c = p3.distance(p1);
            var s = (a + b + c) / 2;
            yield sqrt(s * (s - a) * (s - b) * (s - c));
        }
    };
}

There are some interesting things going on here.

  • We can use a Java switch expression (notice the -> syntax) where each branch produces a result. No mutable variable or break keyword is necessary. The yield keyword signals the result from a multi-statement block.
  • We can destructure the object and name the fields we are interested in on the left-hand side, so we can access them directly on the right-hand side. No dots or casting is necessary.
  • There is no default case in the switch expression. This is because the compiler knows that there cannot be any other possible types. We say that the match is exhaustive. This means that if we add a new implementation of Shape, we’ll get a compiler error in calculateArea until we’ve added a case branch for it.
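To make the example runnable end to end, here is a self-contained sketch. The Point record is my own assumption (the post never defines one), and I name the unused record components instead of using the newer unnamed pattern variable _, so the code works on plain JDK 21.

```java
import static java.lang.Math.PI;
import static java.lang.Math.sqrt;

public class ShapeDemo {
    // Assumed Point implementation; the post does not define one.
    record Point(double x, double y) {
        double distance(Point other) {
            return Math.hypot(x - other.x, y - other.y);
        }
    }

    sealed interface Shape permits Circle, Square, Triangle {}
    record Circle(Point center, double radius) implements Shape {}
    record Square(Point topLeft, double side) implements Shape {}
    record Triangle(Point p1, Point p2, Point p3) implements Shape {}

    static double calculateArea(Shape shape) {
        return switch (shape) {
            case Circle(Point center, double radius) -> PI * radius * radius;
            case Square(Point topLeft, double side) -> side * side;
            case Triangle(Point p1, Point p2, Point p3) -> {
                var a = p1.distance(p2);
                var b = p2.distance(p3);
                var c = p3.distance(p1);
                var s = (a + b + c) / 2;
                // Heron's formula
                yield sqrt(s * (s - a) * (s - b) * (s - c));
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(calculateArea(new Circle(new Point(0, 0), 1)));  // ≈ 3.14159
        System.out.println(calculateArea(new Square(new Point(0, 0), 3)));  // 9.0
        System.out.println(calculateArea(
                new Triangle(new Point(0, 0), new Point(4, 0), new Point(0, 3))));  // 6.0
    }
}
```

Note that there is no default branch: the compiler accepts the switch only because the sealed interface makes it exhaustive.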

Why not polymorphism? #

Developers used to object-oriented programming and polymorphism may find this code a bit weird. “This is basically instanceof, which I’ve been taught never to use!” Why is calculateArea a separate function instead of an abstract method on the Shape interface, with each concrete type providing its own implementation?

To an object-oriented programmer, this would look more natural.

sealed interface Shape permits Circle, Square, Triangle {
    double calculateArea();
}  
record Circle(Point center, double radius) implements Shape {
    public double calculateArea() {
        return PI * radius * radius;
    }
}
// The rest of the implementations...

Why would we want to implement the calculateArea function outside of the Shape type hierarchy?

The expression problem #

We are approaching a well-known problem in computer science called the expression problem. It deals with the challenge of making it easy both to add new types and to add new functions operating on those types, without heavily changing existing code. There are two primary ways to approach this.

  • Polymorphism (object-oriented programming): The object-oriented approach using polymorphism (the last example above) is to declare a virtual function on the interface, and require each subclass to implement it. It works well when the system needs to handle new types frequently, but the operations on these types are relatively stable. It allows you to add new types without modifying existing operations. However, adding new operations requires you to modify each existing type.

    The ideal use case would be a plugin system for an application, where new plugins can be loaded dynamically at runtime, but the API provided to those plugins is stable.

  • Pattern matching (functional programming): The functional programming approach is to define functions outside of the data structures (the previous example) and use pattern matching to implement these functions. It works well when the set of types is stable, but the operations on them might change or grow. It allows you to add more operations without modifying the types. However, adding more types requires you to modify each existing operation.

    The classical use case for this approach would be a compiler, where the types of the abstract syntax tree are relatively stable, but the operations to be performed on it vary.

It is worth noting that there is not really a solution to the expression problem (yet). It is the kind of situation where you can’t have your cake and eat it too. There are some problems where one of the approaches is obviously a better fit. In other cases, you need to ask yourself whether the data structures or the operations performed on them are more likely to change in your system.

The two approaches can also be combined. You can have algebraic data types with some stable polymorphic methods, complemented by separate functions using pattern matching. (Don’t tell your fundamentalist FP or OO friends; they may get angry.)

Separation of concerns #

I would like to point out that some operations feel like a good fit for polymorphism, even though they are not.

Serializing objects to JSON is a good example. A direct approach is to add a toJson method to the Shape interface and let each shape implement its own serialization. But what if there are two possible JSON formats to describe shapes? Or five? If serialization is implemented as separate functions, it is easy to add and remove such functions as needed.3

Implementing JSON serialization outside of the Shape type hierarchy also means the shape implementations do not need to depend on any JSON serialization library. That in turn means that you can use the Shape implementations in scenarios which do not need JSON serialization, without pulling in unnecessary dependencies.

This can be thought of as separation of concerns. Let the data structure know how to represent the data. Then, except for operations which are intrinsically tied to the data structure itself, let external functions know what to do with the data.
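As a sketch of this idea, here are two coexisting serialization functions defined entirely outside the shape types. The JSON formats are made up for illustration, and I redeclare a trimmed hierarchy (Circle and Square only, with an assumed Point record) so the snippet stands alone.

```java
public class JsonDemo {
    record Point(double x, double y) {}

    sealed interface Shape permits Circle, Square {}
    record Circle(Point center, double radius) implements Shape {}
    record Square(Point topLeft, double side) implements Shape {}

    // One possible JSON format, implemented outside the Shape hierarchy.
    static String toJson(Shape shape) {
        return switch (shape) {
            case Circle(Point c, double r) ->
                "{\"type\":\"circle\",\"x\":" + c.x() + ",\"y\":" + c.y()
                    + ",\"radius\":" + r + "}";
            case Square(Point p, double s) ->
                "{\"type\":\"square\",\"x\":" + p.x() + ",\"y\":" + p.y()
                    + ",\"side\":" + s + "}";
        };
    }

    // A second format can be added or removed without touching the shapes.
    static String toCompactJson(Shape shape) {
        return switch (shape) {
            case Circle(Point c, double r) -> "{\"c\":" + r + "}";
            case Square(Point p, double s) -> "{\"s\":" + s + "}";
        };
    }

    public static void main(String[] args) {
        Shape circle = new Circle(new Point(1, 2), 3);
        System.out.println(toJson(circle));
        System.out.println(toCompactJson(circle));
    }
}
```

Neither format requires Circle or Square to know that JSON exists at all.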

Enums on steroids #

Another common construct that is similar to sum types is the enumeration (enum in Java). Enums are typically used to represent a known set of constant values, where every value has the same type.

In Java, you could see something like this.

enum ShapeType { CIRCLE, SQUARE, TRIANGLE }

Here, ShapeType is an enum to describe a kind of shape. However, the enum typically does not hold information about each instance. Because all values share the same type ShapeType, each type of shape cannot have different data attached to it. We cannot express that circles should have a radius, and triangles are defined by three points. We also cannot have two different instances of circle, each with a different radius.

One workaround would be to add another object that holds the enum together with any additional information. However, this still does not allow us to have different fields for different types of shapes. The compromise would be to make such fields nullable.

enum ShapeType { CIRCLE, SQUARE, TRIANGLE }
record Shape(
	ShapeType shapeType,
	// Only set for circles (boxed Double, since primitives cannot be null)
	@Nullable Point center,
	@Nullable Double radius,
	// Only set for squares
	@Nullable Point topLeft,
	@Nullable Double side,
	// Only set for triangles
	@Nullable Point p1,
	@Nullable Point p2,
	@Nullable Point p3
) {}

When using algebraic data types, we merge the Shape and ShapeType together, without losing the intentional type limitation embedded in the enum.

In other languages, such as Haskell or Rust, enumerations and sum types are two sides of the same coin. The same construct can be used to define both simple enums like Java, as well as complex enums where each alternative carries data.

Null safety with Optional #

Besides expressing domain models, another common use case for algebraic data types is error handling.

The perhaps most well-known example is Optional. It is a container object which may or may not contain a value. It can be thought of as a collection of size zero or one. In terms of algebraic data types, it is a sum type along the lines of \(Optional\ of\ T = T\ or\ empty\).

An example could be a function which attempts to interpret a rectangle specified by two points as a square. If the rectangle is also a square, the returned Optional contains the corresponding Square instance, and otherwise it is empty.

Optional<Square> getSquareFromRectangle(Point topLeft, Point bottomRight) {  
    var width = abs(topLeft.getX() - bottomRight.getX());  
    var height = abs(topLeft.getY() - bottomRight.getY());  
    if (width != height) {  
        return Optional.empty();  
    }  
    return Optional.of(new Square(topLeft, width));  
}

The big advantage of Optional over using null in Java is that the compiler can catch errors that would otherwise result in a NullPointerException at runtime.

Optional<Square> notASquare = getSquareFromRectangle(
        new Point(0, 0),
        new Point(2, 5) // The sides are not equal
);
// notASquare will be Optional.empty

// You cannot get a NullPointerException as the type system forces you
// to handle the distinction between present and empty.
notASquare.ifPresent(square -> System.out.println(square.side()));
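Optional also composes: instead of checking for presence yourself, you can transform the value with map and fall back with orElse. A sketch building on getSquareFromRectangle (the Point and Square records are redeclared here, with assumed double fields, so the snippet runs on its own):

```java
import java.util.Optional;
import static java.lang.Math.abs;

public class OptionalDemo {
    record Point(double x, double y) {}
    record Square(Point topLeft, double side) {}

    static Optional<Square> getSquareFromRectangle(Point topLeft, Point bottomRight) {
        var width = abs(topLeft.x() - bottomRight.x());
        var height = abs(topLeft.y() - bottomRight.y());
        if (width != height) {
            return Optional.empty();
        }
        return Optional.of(new Square(topLeft, width));
    }

    public static void main(String[] args) {
        // map and orElse let us stay inside the Optional until the very end,
        // without any explicit presence check.
        double area = getSquareFromRectangle(new Point(0, 0), new Point(4, 4))
                .map(square -> square.side() * square.side())
                .orElse(0.0);
        System.out.println(area); // 16.0

        double noArea = getSquareFromRectangle(new Point(0, 0), new Point(2, 5))
                .map(square -> square.side() * square.side())
                .orElse(0.0);
        System.out.println(noArea); // 0.0
    }
}
```

The empty case flows through map untouched, so the fallback in orElse is the only place where absence is handled.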

If getSquareFromRectangle instead returned a Square, or null for other rectangles, we could run into the following problem.

// Calling an unsafe version which returns null instead of using Optional
Square notASquare = getSquareFromRectangle_nullable(
	new Point(0, 0), 
	new Point(2, 5) // The sides are not equal
);
// notASquare will be null

// If you forget this null check, your program will still compile but
// you will get a NullPointerException at runtime.
if (notASquare != null) {
	System.out.println(notASquare.side());
}

Some languages have null safety built in. Kotlin is one example: a variable of type T cannot contain null. To allow null values, you need to declare the type as T? (with a question mark). But then you cannot access properties of T until you have performed a null check on the variable. This is effectively Optional built into the language.

// The type is declared with "?" to allow null values
val notASquare: Square? = getSquareFromRectangle_kotlin(
    Point(0, 0),
    Point(2, 5) // The sides are not equal
)
// notASquare will be null

// If you forget this null check, your program will not compile!
if (notASquare != null) {
    // notASquare is smart-cast to type "Square" without "?"
    println(notASquare.side)
}

It is also common for languages to provide syntax for working effectively with such nullable types. Kotlin uses the ?. operator for null-safe access to nullable types, and Rust uses the ? operator to automatically unwrap a value, propagating None if it is missing. Haskell provides a generic do notation which abstracts away the handling of absent values.4

It takes practice #

In this blog post, I’ve used an over-simplified example model with one interface and three implementing records. In reality, the model would have been much bigger. The Shape type would then likely be used as a part of other product types (such as classes or records), which would have been part of other product types or perhaps as an alternative in a sum type. And so on.

Getting used to product and sum types may take a bit of practice. (Though, in many cases, getting used to the names “product” and “sum” may be harder than actually using the concepts.) Perhaps the biggest change from traditional object-oriented thinking is that you define a sum type out of other types, rather than defining an interface that other types may then choose to implement. The direction of the dependency is, in a sense, reversed: with sum types, the interface knows what types implement it.

Pattern matching may also take some time to get used to. Many developers trained in object-oriented programming have learned the rule “don’t use instanceof” (though they may not know why). With pattern matching, this is turned on its head. But the key here is that it is not uncommon to work in code bases where the model types are more stable than the functions that operate on them. In such cases, you may very well find that pattern matching is a better fit!

I suggest you give it a try! If nothing else, you get a good mental workout. 🏋️

Emil Axelsson: Just wanted to point out that the sum-of-products explanation is probably not the original reason for the term "algebraic data type", as explained here: Where does the name "algebraic data type" come from?

That said, I think sum-of-products is what most people have in mind nowadays, so that doesn't change much 🙂

  1. Note that the abbreviation “ADT” often refers to “abstract data types” which is something else. But within a functional programming context, the term ADT can be used for “algebraic data types”. ↩︎

  2. I should mention that Java’s sealed interfaces are a more limited form of sum types than many other languages allow, in particular functional languages. The Java-style solution of using an interface means that each option in the sum type needs to implement that interface. In the more general form, sum types do not have that requirement. It is perfectly fine to create a sum type of completely unrelated types. ↩︎

  3. The idea of avoiding polymorphism for serialization also applies to serialization annotations such as @Json or @Serializable. How do you annotate your class if there is more than one way to serialize it? (Which reminds me of a former colleague who grumpily quipped that “people who annotate their classes are the same loonies that put CSS in their HTML”. 😉) ↩︎

  4. Haskell’s do notation takes us “dangerously close” to talking about monads. While it is a powerful style of programming, it was intentionally left out of functional foundations, and therefore not further discussed here either. ↩︎