Functional foundations ⚙️
I would like to share my thoughts on a style of programming that I’ve found very powerful.
This is a rather long post, so grab a cup of coffee and follow along. Examples will be in Kotlin because it does a nice job of expressing these ideas, but hopefully they are easy enough to follow even if you have another background.
The rise of functional programming #
During the last decade, functional programming concepts have become more and more popular. Things that used to be discussed only in academia are now common in industry. Mainstream languages like Java and C# have added support for immutable data structures, higher-order functions, lambdas, and much more. On the web front, React has helped popularize a functional style.
Many developers look at functional programming with a bit of hesitation. Fully functional programs tend to look quite abstract or even incomprehensible. This is true for me too. Maybe it is only a matter of training, but going full functional programming sometimes feels like a bridge too far. However, having been exposed to it to some degree, I’ve also seen some of the elegance and power that it holds.
With that said, I think it is possible to get a lot of the benefits of functional programming through a careful selection of basic but powerful concepts. During the last few years, my good friend and former colleague Tobias Gedell and I have converged on such a selection.1
We called these functional foundations.2
Functional foundations #
Functional foundations is a set of functional programming concepts that we have found helpful in everyday programming, even for programmers not trained in functional programming. It includes the following concepts.
- Pure functions: Functions always return the same output for a given input and cause no side effects. (But we will allow some parts of our code to have side effects.)
- Immutable data structures: Variables and data structures are never modified. To change a value, make a copy and apply the change to the copy.
- Collection pipelines: Combine higher-order functions like map, filter, and reduce to process collections of data. Use lambdas to make code concise.
If you are not familiar with functional programming, the meaning of some of these terms may be unclear. In the sections below, I try my best to explain and motivate them from the perspective of someone who is not used to functional programming.3
For those trained in functional programming, you will find that functional foundations is a very light-weight form of functional programming. It is intended to bring much of the value of functional programming to groups of people who do not know or even want to learn functional programming. The advice in this post will often suggest pragmatic compromise over conceptual purity.
Pure functions #
To get started, the concept of pure functions is the first pillar of functional foundations and perhaps the most essential of all functional concepts.
The term pure function is used to describe a function that has these two properties.
- It always returns the same output for the same input. This means the function cannot rely on any state or external resource. Anything that the function can use must be passed in as an argument.
- It does not cause any side effects. This means it does not change any state or cause any I/O, like updating the screen, writing to disk, or causing network traffic. The only way a pure function can make a difference is by returning a result.
They are heavily inspired by mathematical functions like:
\(f(a,b,c) = a^2+b^2+c^2\)
A simple example in Kotlin could look like this.
fun square(number: Int): Int {
    return number * number
}
This function does not depend on anything except its argument number, the type Int, and the * operator, none of which is a cause for impure behavior. The function does not have any effects besides returning a value.
A contrived example of a very impure function could look like this.
import java.io.File
import java.time.LocalDateTime
import kotlin.random.Random

// Top-level mutable state (must be initialized to compile)
var counter: Int = 0

fun notPureAtAll(list: MutableList<String>) {
    // Impure: Both reads and writes state outside the function
    counter += 1
    if (list.isEmpty()) {
        // Impure: Aborts regular execution
        throw Exception("List is empty")
    }
    // Impure: Changes the state of the provided list
    list[0] = "zero"
    // Impure: Depends on the state of the Random number generator
    val maybe = Random.nextBoolean()
    if (maybe) {
        // Impure: Depends on the state of the system clock
        val time = LocalDateTime.now()
        // Impure: Writes to the console
        println(time)
    } else {
        // Impure: Writes data to disk
        File("output.txt").writeText("some output")
    }
}
The value of pure functions is that they are predictable and easy to reason about. You could say that pure functions are honest functions.4 They do what their signature says and nothing else.
Because they do not have any side effects and always return the same output for the same input, they are easier to test and debug. They are also easier to reuse and compose with other functions because they do not depend on any external state.
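To make the testing point concrete, here is a minimal sketch. It reuses square from above; sumOfSquares is a hypothetical helper added only for illustration. A test is nothing more than a call followed by a comparison.

```kotlin
fun square(number: Int): Int = number * number

// Hypothetical helper: composes square with the standard sumOf function.
fun sumOfSquares(numbers: List<Int>): Int = numbers.sumOf { square(it) }

fun main() {
    // No setup, mocks, or cleanup: call the function and compare the result.
    check(square(4) == 16)
    check(sumOfSquares(listOf(1, 2, 3)) == 14)
    check(sumOfSquares(emptyList()) == 0)
    println("All checks passed")
}
```

Because neither function touches outside state, the checks can run in any order, any number of times, with the same result.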
These properties also make pure functions suitable for running in a concurrent or parallel environment, something that only becomes more important as computers get more cores. After all, there is only so much power you can squeeze out of a core before you reach its physical limits.5 Adding more cores has been one of the driving factors behind increasing CPU and GPU performance over the last decade, perhaps even the most important.
Functional core, imperative shell #
Pure functions are great, but there is a catch. A big one. Side effects are necessary to make a program. A program without any side effects will do… nothing. It cannot output anything to the screen, write anything to disk, or do any network traffic.
Functional programming has come up with various pure ways to express these effects, the most common being the IO monad. I won’t go into further detail about how that works, other than saying that it is often perceived as complex. It was intentionally left out of these functional foundations.
The way we think about it is that since you will have side effects at some point, why not just keep it simple and perform those effects straight up? If you need to write to disk, just write that file directly. If you need to send a network request, just do it.
How does one balance writing pure functions with performing side effects when needed? I recommend the notion of “functional core, imperative shell”.6 That means that we should try to write as much code as possible following these functional foundations. Then add imperative code7 which performs the effects “at the edges”, before or after the pure code runs. Just try to keep side effects out of the core as much as you can.
The main rule is that the imperative shell may call the functional core, but not the other way around. The following figure provides a schematic view of the pattern.
Getting this balance right is not always easy. But the good thing is that there is great value in getting even half way there. The more functions are pure, the easier the whole becomes to understand. The more clear the separation is, the simpler the mental model needed to understand the system becomes.
A good place to start is at the “leaves” – the functions that do not call any other functions. Try to move any side effects out of them if possible. Then some pure functions can start calling other pure functions. Eventually, you have moved the border between pure code and code with side effects enough that you start to develop your functional core.
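As a small sketch of what moving a side effect out of a leaf can look like, consider a greeting that depends on the time of day. The function names and the noon cut-off are made up for this illustration.

```kotlin
import java.time.LocalTime

// Before: an impure leaf. The result depends on the system clock.
fun greetingImpure(name: String): String {
    val hour = LocalTime.now().hour
    return if (hour < 12) "Good morning, $name" else "Good day, $name"
}

// After: a pure leaf. The caller passes in the hour.
fun greeting(name: String, hour: Int): String =
    if (hour < 12) "Good morning, $name" else "Good day, $name"

// Imperative shell: reads the clock and writes to the console.
fun main() {
    val hour = LocalTime.now().hour
    println(greeting("Jane", hour))
}
```

The clock access did not disappear. It moved one level up, which is exactly how the border between pure and impure code shifts outward.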
You could also think of “functional core, imperative shell” as separation of concerns. You separate code that computes stuff (pure functions) from code that interacts with the environment (impure functions). Those two types of code often have quite different characteristics and are often best kept apart.
To be pragmatic, it is common that people allow certain types of side effects because they are not considered observable to the rest of the program. Typical examples include debug logging and caching. While we as programmers can see their effects, they are transparent to the rest of the program.
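Caching is a good example of such a tolerated side effect. The sketch below is only an illustration: slowSquare and cachedSquare are hypothetical names, and the simple map used as a cache is not thread-safe.

```kotlin
// Hypothetical stand-in for an expensive pure computation.
fun slowSquare(number: Int): Int {
    Thread.sleep(10)
    return number * number
}

// Mutable state, but private to this file and invisible to callers.
private val cache = mutableMapOf<Int, Int>()

// Behaves like a pure function from the outside: same input, same output.
fun cachedSquare(number: Int): Int =
    cache.getOrPut(number) { slowSquare(number) }

fun main() {
    println(cachedSquare(12)) // 144 (computed)
    println(cachedSquare(12)) // 144 (served from the cache)
}
```

A caller cannot tell the two calls apart except by timing, which is why many teams accept this kind of effect inside the functional core.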
As a final pragmatic choice, throwing exceptions in truly exceptional situations is in the spirit of functional foundations, for example when a programmer calls a function with invalid arguments. The alternative is for the function to return a value which represents the error, but checking and propagating these errors can quickly become complex.
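The two options can be sketched side by side. Both functions are hypothetical examples, and the nullable return type is just the simplest way to represent an error as a value in Kotlin.

```kotlin
// Option 1: throw on a programmer error. require throws IllegalArgumentException.
fun raiseByPercent(salary: Double, percent: Double): Double {
    require(percent >= 0.0) { "percent must not be negative, was $percent" }
    return salary * (1 + percent / 100)
}

// Option 2: return a value that represents the failure (here: null).
fun parsePercent(text: String): Double? =
    text.toDoubleOrNull()?.takeIf { it >= 0.0 }

fun main() {
    println(raiseByPercent(1000.0, 50.0)) // 1500.0
    println(parsePercent("25"))           // 25.0
    println(parsePercent("abc"))          // null
}
```

The second style forces every caller to handle the null, which is the checking and propagating cost mentioned above.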
Immutable data structures #
The second pillar of functional foundations is immutable data structures.
A data structure in this context can be something like a class, struct, array, or tuple. Immutable means that it cannot be “mutated” or changed.
On a small scale, the first implication is that variables should never change. Once a variable has been initialized to a value, it should never be reassigned. Depending on the language, this is often associated with keywords such as const, final, or val.
A simple example of this could look like this.
val salary = 1000.0
salary = salary * 2 // ERROR: Val cannot be reassigned
val newSalary = salary * 2
On a larger scale, data structures as a whole should never change. It should not be possible to update the value of a field on an object or an element in an array. Some languages may use the term “frozen” to describe this. In a traditional OOP context, it could be a class with a constructor and getters but no setters.
An example can look like this, using Kotlin’s record-like data class concept.
data class Employee(
    val name: String,
    val startDate: LocalDate,
    val salary: Double,
)
val employee = Employee("John", LocalDate.parse("2024-01-01"), 10000.0)
employee.salary = 20000.0 // ERROR: Val cannot be reassigned
None of the fields on Employee can be modified. Neither can String, LocalDate, or Double. So what do you do if you want to change a field? A functional programmer’s answer is that you copy the data structure and apply the changes to the copy. After such an update, you have two objects: the unchanged original and an updated copy.
Below is an example using Kotlin’s automatically generated copy function.
val originalEmployee = Employee("John", LocalDate.parse("2024-01-01"), 10000.0)
val updatedEmployee = originalEmployee.copy(salary = 20000.0)
println(originalEmployee)
// Employee(name=John, startDate=2024-01-01, salary=10000.0)
println(updatedEmployee)
// Employee(name=John, startDate=2024-01-01, salary=20000.0)
The copy function is a convenience function that creates an identical copy of the current object, except for the arguments provided. The same effect could have been achieved by creating a new Employee object manually.
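A short sketch can confirm that the two routes end up in the same place. It repeats the Employee class from above so that it can run on its own; withSalary and withSalaryManually are hypothetical helper names.

```kotlin
import java.time.LocalDate

data class Employee(
    val name: String,
    val startDate: LocalDate,
    val salary: Double
)

// Uses the generated copy function.
fun withSalary(employee: Employee, salary: Double): Employee =
    employee.copy(salary = salary)

// Builds the new object by hand, repeating every unchanged field.
fun withSalaryManually(employee: Employee, salary: Double): Employee =
    Employee(employee.name, employee.startDate, salary)

fun main() {
    val original = Employee("John", LocalDate.parse("2024-01-01"), 10000.0)
    // Data classes compare by value, so both results are equal.
    println(withSalary(original, 20000.0) == withSalaryManually(original, 20000.0)) // true
    // The original is untouched.
    println(original.salary) // 10000.0
}
```

The manual version has to be revisited every time a field is added to the class, which is the main practical argument for copy.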
Kotlin’s default collection types like List, Set, and Map are read-only too. You cannot add or remove elements through them, only create new collections that reflect the desired changes.
val employee = Employee("John", LocalDate.parse("2024-01-01"), 10000.0)
val employees = emptyList<Employee>()
employees.add(employee) // COMPILER ERROR: `List` has no function `add`
val updatedEmployees = employees + employee
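The same pattern works for removal and for maps. The sketch below uses plain strings to stay short; hire and leave are hypothetical helper names.

```kotlin
// Each function returns a new list and leaves its argument untouched.
fun hire(names: List<String>, name: String): List<String> = names + name
fun leave(names: List<String>, name: String): List<String> = names - name

fun main() {
    val names = listOf("John", "Jane")
    println(hire(names, "Joe"))   // [John, Jane, Joe]
    println(leave(names, "John")) // [Jane]
    println(names)                // [John, Jane]

    val roles = mapOf("John" to "developer")
    val moreRoles = roles + ("Jane" to "manager")
    println(roles)     // {John=developer}
    println(moreRoles) // {John=developer, Jane=manager}
}
```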
Coming from an object-oriented background, having unchangeable objects feels weird! Why copy the Employee object instead of changing it? Then we have two Employee objects in memory representing the same employee at the same time!
Having immutable data structures may feel like a limitation8 (because it is!). But this is one of those situations where you may have to change your mindset. Think of an Employee object not as the single continuous representation of a certain employee, but rather as a snapshot of the information about an employee at a certain point in time. It’s like having version control instead of saving files on a shared drive.9
While this may sound unnecessary and even inefficient, it provides us with several nice properties.10
- We gain predictability as we know that the value of a variable will be exactly the same anywhere in the scope in which it is defined. It will never change under your feet. You don’t have to think about whether a variable was modified by the function you just stepped over in the debugger.
- We can run a (pure) function that “updates” the salaries as many times as we want without fear of accidentally increasing the salaries twice. In a debugger, you can drop a frame and safely rerun that function if you want.
- We know that data will be in a consistent state even if a function throws an exception. There is no possibility that some changes were applied but others were not.
- We can share Employee objects between threads without fear of concurrent modification. We can safely put them in a cache without fear that they will change. They can safely be used as keys in a hash map.
- We gain the ability to easily compare two snapshots of the same employee. For example, we can use a debugger to compare the Employee objects returned by the salary update function with the ones we sent in.
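A few of these properties can be seen in a short sketch. The Employee class is cut down to two fields, and raise is a hypothetical function used only for this illustration.

```kotlin
data class Employee(val name: String, val salary: Double)

// "Updates" the salary by returning a new snapshot.
fun raise(employee: Employee, amount: Double): Employee =
    employee.copy(salary = employee.salary + amount)

fun main() {
    val before = Employee("John", 10000.0)
    val first = raise(before, 1000.0)
    val second = raise(before, 1000.0) // Rerunning with the same input is harmless

    println(first == second) // true
    println(before)          // Employee(name=John, salary=10000.0)
    println(first)           // Employee(name=John, salary=11000.0)

    // Snapshots never change, so they are safe to use as map keys.
    val notes = mapOf(before to "before the raise", first to "after the raise")
    println(notes[Employee("John", 11000.0)]) // after the raise
}
```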
You may be concerned about the performance implications of copying objects all over the place. In functional programming languages, data structures are often cleverly created to reduce this effect. For example, adding an element to a list really just creates a new object holding that element and pointing to the original list. (That is safe to do because we know the original list will never change.)
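To illustrate the sharing trick, here is a minimal sketch of such a linked list. It is a toy written for this explanation (IntList, Node, prepend, and toList are all made-up names), not how any particular language implements its lists.

```kotlin
// A tiny immutable linked list of integers.
sealed class IntList
object Empty : IntList()
data class Node(val head: Int, val tail: IntList) : IntList()

// Adding an element creates one new node that points at the existing list.
fun prepend(element: Int, list: IntList): Node = Node(element, list)

fun toList(list: IntList): List<Int> = when (list) {
    is Empty -> emptyList()
    is Node -> listOf(list.head) + toList(list.tail)
}

fun main() {
    val original = prepend(2, prepend(3, Empty))
    val extended = prepend(1, original)

    println(toList(original)) // [2, 3]
    println(toList(extended)) // [1, 2, 3]
    // The new list reuses the old one instead of copying it.
    println(extended.tail === original) // true
}
```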
Even in languages where this is not true (like Kotlin), the cost of copying objects is often very low, especially in traditional business applications, where the vast majority of latency is caused by user interaction rather than computation. The JVM (and other runtimes) are also optimized for handling many short-lived objects. In most cases, I think the value gained from making the code more predictable more than makes up for it. And as always, measure before you start optimizing!
Collection pipelines #
The third pillar of functional foundations is collection pipelines.11
This idea is actually two-fold.
- We use higher-order collection functions like map, filter, and reduce to process collections of data. “Higher-order function” is functional programming jargon for a function that accepts another function as an argument or returns one as its result. This approach is an alternative to an imperative solution using loops, conditionals, and mutable data structures.
- We then chain these functions into pipelines, like building a model out of Lego blocks. If you’re a C# developer, you may be used to this coding style under the name LINQ.12 Unix pipes also represent the same concept.
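Both halves of the higher-order definition, and a first small pipeline, can be sketched as follows. applyTwice, multiplierOf, and tripledEvens are hypothetical names chosen for this example.

```kotlin
// Accepts another function as an argument.
fun applyTwice(value: Int, operation: (Int) -> Int): Int =
    operation(operation(value))

// Returns a function as its result.
fun multiplierOf(factor: Int): (Int) -> Int = { number -> number * factor }

// A two-step pipeline: the output of filter becomes the input of map.
fun tripledEvens(numbers: List<Int>): List<Int> =
    numbers
        .filter { it % 2 == 0 }
        .map(multiplierOf(3))

fun main() {
    println(applyTwice(2, multiplierOf(3)))  // 18
    println(applyTwice(5) { it + 1 })        // 7
    println(tripledEvens(listOf(1, 2, 3, 4))) // [6, 12]
}
```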
These collection functions allow programs to describe what to do with the collection rather than how to do it. This minimizes repetitive code patterns and lets us focus on operation logic instead of loop mechanics.13 And as the name “collection pipelines” suggests, you can easily compose different functions, creating sequences of operations where the output of each function becomes the input for the next.
These higher-order functions become much more expressive with the use of lambdas. A lambda is an anonymous function that typically has a very concise syntax.
In Kotlin, lambdas are written using a curly brace syntax as shown below. The explicit parameter list can be omitted in favor of the implicit name it if the lambda only takes a single argument.
{ x: Int, y: Int -> x + y } // lambda with two explicit parameters
{ e: Employee -> e.name } // lambda with one explicit parameter
{ it.name } // lambda with an implicit parameter
Let’s look at some of the most well-known higher-order collection operations.
map applies a function to each element in a collection to create a new collection.14
val employees = listOf(
    Employee(name = "John", /* ... */),
    Employee(name = "Jane", /* ... */)
)
val employeeNames = employees.map { it.name }
println(employeeNames) // [John, Jane]
filter applies a boolean function (known as a predicate) to each element and produces a new collection with those elements where the function returned true.
val employees = listOf(
    Employee(name = "John", isIntern = true, /* ... */),
    Employee(name = "Jane", isIntern = false, /* ... */)
)
val interns = employees.filter { it.isIntern }
println(interns) // [Employee(name=John, isIntern=true, ...)]
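The third function named earlier, reduce, combines all elements into a single value. Here is a minimal sketch with two hypothetical helpers. It also shows fold, the closely related variant that takes an explicit start value and therefore works on empty lists, whereas Kotlin’s reduce throws on an empty collection.

```kotlin
// fold starts from 0.0 and adds one salary at a time.
fun totalSalary(salaries: List<Double>): Double =
    salaries.fold(0.0) { total, salary -> total + salary }

// reduce uses the first element as the start value.
fun highestSalary(salaries: List<Double>): Double =
    salaries.reduce { highest, salary -> if (salary > highest) salary else highest }

fun main() {
    val salaries = listOf(1000.0, 3000.0, 2000.0)
    println(totalSalary(salaries))    // 6000.0
    println(highestSalary(salaries))  // 3000.0
    println(totalSalary(emptyList())) // 0.0
}
```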
A complete but simple example could look like this.
val topTenPaidManagers = employees
    .filter { it.role == "manager" }
    .sortedByDescending { it.salary }
    .take(10)
While it can initially seem a bit foreign, you will quickly get up to speed. Once you do, you will likely find that it better captures the intent of the processing you wanted to do. Put simply, it becomes easier to understand what the code is supposed to do.
My personal favorite is groupBy, which turns a list into a map with groups of elements based on some criteria.
val employeesByDepartment = employees.groupBy { it.department }
Look at how clear that is compared to a typical imperative Java implementation.
Map<Department, List<Employee>> employeesByDepartment = new HashMap<>();
for (Employee employee : employees) {
    Department department = employee.getDepartment();
    if (!employeesByDepartment.containsKey(department)) {
        employeesByDepartment.put(department, new ArrayList<>());
    }
    employeesByDepartment.get(department).add(employee);
}
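For a runnable version of the groupBy idea, here is a small sketch. It assumes a simplified Employee where the department is a plain string, and namesByDepartment is a hypothetical helper that also maps each group to names for readable output.

```kotlin
data class Employee(val name: String, val department: String)

fun namesByDepartment(employees: List<Employee>): Map<String, List<String>> =
    employees
        .groupBy { it.department }
        .mapValues { (_, members) -> members.map { it.name } }

fun main() {
    val employees = listOf(
        Employee("John", "Sales"),
        Employee("Jane", "Engineering"),
        Employee("Joe", "Sales")
    )
    println(namesByDepartment(employees))
    // {Sales=[John, Joe], Engineering=[Jane]}
}
```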
Another advantage is that operations like map and filter describe what should happen to each element without prescribing an iteration order. In libraries that support it, this makes them suitable for running in parallel, making better use of the many cores of today’s computers. (Kotlin’s own list functions still run sequentially and in order.) In comparison, a for loop explicitly defines the iteration order, which makes it much harder to parallelize automatically.
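On the JVM, one way to try this out is Java’s parallel streams, which are available from Kotlin. The sketch below is an illustration only (squaresInParallel is a made-up name), and it is safe precisely because square is pure. For a list this small, the overhead of parallelism will outweigh any gain.

```kotlin
import java.util.stream.Collectors

fun square(number: Int): Int = number * number

// Elements may be processed on several threads, in any order.
// Collecting to a list still yields the results in the original order.
fun squaresInParallel(numbers: List<Int>): List<Int> =
    numbers.parallelStream()
        .map { square(it) }
        .collect(Collectors.toList())

fun main() {
    println(squaresInParallel(listOf(1, 2, 3, 4))) // [1, 4, 9, 16]
}
```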
Apart from the most basic functions mentioned above, you will likely find yourself looking at flatMap, flatten, fold, zip, forEach, and more. How far you want to go will be up to you and your team. Using the fold operation can solve a lot of problems in a compact way, but it can also look intimidating for those not used to it.
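To give a feel for a few of these, here is a short sketch. countByFirstLetter is a hypothetical example of fold carrying an immutable map through the list, and it assumes that no name is empty.

```kotlin
// fold: start with an empty map and produce a new map for every name.
fun countByFirstLetter(names: List<String>): Map<Char, Int> =
    names.fold(emptyMap<Char, Int>()) { counts, name ->
        val letter = name.first()
        val count = (counts[letter] ?: 0) + 1
        counts + (letter to count)
    }

fun main() {
    println(countByFirstLetter(listOf("John", "Jane", "Anna"))) // {J=2, A=1}

    val teams = listOf(listOf("John", "Jane"), listOf("Anna"))
    // flatten: turn a list of lists into a single list.
    println(teams.flatten()) // [John, Jane, Anna]
    // flatMap: map each element to a list, then flatten the result.
    println(teams.flatMap { team -> team.map { it.uppercase() } }) // [JOHN, JANE, ANNA]
    // zip: pair up the elements of two lists.
    println(listOf("John", "Jane").zip(listOf(1000, 2000))) // [(John, 1000), (Jane, 2000)]
}
```

Whether the fold version reads better than a loop with a mutable map is exactly the kind of call a team has to make for itself.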
An illustrative example #
Enough talking! It is time to look at some code showing how these concepts of functional foundations fit together.
Disclaimer: It is always hard to create an example that both represents real-world use and fits in a blog post. This is an attempt to show how the three components of functional foundations can work together in a realistic(ish) case. Still, it is just a short example.
The scenario #
In this example, we are developing a feature in a hypothetical system that is responsible for updating the salaries of all employees.
More specifically, it is supposed to update the salaries of eligible employees with a certain amount determined by their role. Something like “developers get 1 000 more, and managers get 10 000”. Apart from updating salaries in the employee database, we are also expected to return the gap between the highest and lowest salary within each role.
The imperative shell #
Let’s dig into the first function of this example. It is a “controller” function in an HTTP-based API, sending a response to an incoming request.
fun performSalaryUpdate(request: HttpRequest, response: HttpResponse) {
    // Impure: Reads from network
    val requestBodyJson = request.readBodyAsString()
    // Impure: Reads from database
    val employees = database.findAllEmployees()
    // Impure: Depends on external state
    val today = LocalDate.now()
    // Pure calls
    val increasesByRole = fromJson<Map<Role, Double>>(requestBodyJson)
    val eligibleEmployees = determineEligibleEmployees(employees, today)
    val updatedEmployees = updateSalaries(
        eligibleEmployees,
        increasesByRole
    )
    val salaryGapByRole = calculateSalaryGapByRole(updatedEmployees)
    val responseJson = toJson(salaryGapByRole)
    // Impure: Writes to database
    database.updateEmployees(updatedEmployees)
    // Impure: Writes to network
    response.respond(200, responseJson)
}
This is an impure function. It interacts with the network, a database, and the clock. That means we have to know the current state of all of these to be able to understand what this function will do. That makes this function harder to understand, debug, and test. But thankfully, we only have one such beast in this example.
As is often the case with impure functions, it mostly handles communication with external resources and then delegates business logic to pure functions. This is an example of the “functional core, imperative shell” principle at work. The function we are looking at is part of the “imperative shell”.
Let’s look at the other functions. I’ve excluded the fromJson and toJson functions for brevity, but there is no reason they cannot be pure functions.
Isolate pure business logic #
The next function is called determineEligibleEmployees and is pure business logic. It embeds the knowledge that “eligible” means employees who are not interns and started their employment before the start of the current year.
fun determineEligibleEmployees(
    employees: List<Employee>,
    today: LocalDate
): List<Employee> {
    val startOfYear = today.withDayOfYear(1)
    return employees
        .filter { !it.isIntern }
        .filter { it.startDate < startOfYear }
}
However, to evaluate the start date constraint, we need to know what the current date is. We could have directly accessed LocalDate.now(), but that would make the function impure as the clock is external state. Instead, we ask the caller to provide us with the current date on which we can base our calculation. This separates the business rules from the ability to determine the current time. It makes the output of the function simpler to predict, not to mention easier to test.
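Passing in the date pays off as soon as you call the function with a fixed one. The sketch below repeats the function with a cut-down Employee so that it runs on its own; the names and dates are made up.

```kotlin
import java.time.LocalDate

data class Employee(val name: String, val startDate: LocalDate, val isIntern: Boolean)

fun determineEligibleEmployees(employees: List<Employee>, today: LocalDate): List<Employee> {
    val startOfYear = today.withDayOfYear(1)
    return employees
        .filter { !it.isIntern }
        .filter { it.startDate < startOfYear }
}

fun main() {
    val today = LocalDate.parse("2024-06-15") // Fixed date, no clock involved
    val employees = listOf(
        Employee("John", LocalDate.parse("2023-03-01"), isIntern = false),
        Employee("Jane", LocalDate.parse("2024-02-01"), isIntern = false), // Started this year
        Employee("Joe", LocalDate.parse("2022-09-01"), isIntern = true)    // Intern
    )
    println(determineEligibleEmployees(employees, today).map { it.name }) // [John]
}
```

The result is the same no matter what day the code actually runs, which would not hold if the function read the clock itself.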
Update without modifying #
Next up is the function that updates the salaries of the employees. What is noteworthy here is that while this function conceptually updates the salaries, it does not modify any Employee objects. Instead, it returns updated copies.
fun updateSalaries(
    employees: List<Employee>,
    increasesByRole: Map<Role, Double>
): List<Employee> {
    return employees
        // Link each employee to the increase for their role
        .map { it to increasesByRole.getOrDefault(it.role, 0.0) }
        // Filter out employees with no increase
        .filter { (_, increase) -> increase > 0.0 }
        // Update the salary of the remaining employees
        .map { (employee, increase) -> employee.increaseSalary(increase) }
}
data class Employee(
    val name: String,
    val startDate: LocalDate,
    val role: Role,
    val salary: Double,
    val isIntern: Boolean
) {
    fun increaseSalary(increase: Double): Employee =
        // Creates a NEW identical object, except with a new salary
        copy(salary = salary + increase)
}
This code again uses the Kotlin copy function, which gives us a convenient syntax for creating a copy of an object with some fields changed.
The increaseSalary function also shows that pure functions can still be defined as methods on a class, as long as any fields the method references are immutable.
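A quick usage sketch shows the update-by-copy behavior. Since the post does not define Role, a two-value enum is assumed here, and Employee is trimmed to the fields this function needs.

```kotlin
// Assumption: Role is a simple enum; the original type is not shown.
enum class Role { DEVELOPER, MANAGER }

data class Employee(val name: String, val role: Role, val salary: Double) {
    fun increaseSalary(increase: Double): Employee = copy(salary = salary + increase)
}

fun updateSalaries(employees: List<Employee>, increasesByRole: Map<Role, Double>): List<Employee> =
    employees
        .map { it to increasesByRole.getOrDefault(it.role, 0.0) }
        .filter { (_, increase) -> increase > 0.0 }
        .map { (employee, increase) -> employee.increaseSalary(increase) }

fun main() {
    val employees = listOf(
        Employee("John", Role.DEVELOPER, 10000.0),
        Employee("Jane", Role.MANAGER, 20000.0)
    )
    val updated = updateSalaries(employees, mapOf(Role.DEVELOPER to 1000.0))

    println(updated)                  // [Employee(name=John, role=DEVELOPER, salary=11000.0)]
    println(employees.first().salary) // 10000.0 (the input list is unchanged)
}
```

Note that the result only contains employees who actually received an increase, so Jane is absent from the output.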
Processing collections #
The last function is calculateSalaryGapByRole, which gives us an opportunity to talk more about collection processing. We’ve already seen filter and map calls in the previous functions, and here we get to meet groupBy, max, and min as well.
fun calculateSalaryGapByRole(
    employees: List<Employee>
): Map<Role, Double> = employees
    .filter { !it.isIntern }
    .groupBy { it.role }
    .mapValues { (_, employeesInRole) -> calculateSalaryGap(employeesInRole) }
fun calculateSalaryGap(employees: List<Employee>): Double {
    val salaries = employees.map { it.salary }
    val maxSalary = salaries.max()
    val minSalary = salaries.min()
    return maxSalary - minSalary
}
These two functions group all non-intern employees by role and, for each such group, compute the difference between the highest and lowest salary.
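To see the shape of the result, here is a condensed, runnable variant. It is a sketch under a few assumptions: Role is a simple enum, the intern filter is left out, and maxOf and minOf replace max and min so that the code does not depend on a recent Kotlin version.

```kotlin
// Assumption: Role is a simple enum; the original type is not shown.
enum class Role { DEVELOPER, MANAGER }

data class Employee(val name: String, val role: Role, val salary: Double)

fun calculateSalaryGapByRole(employees: List<Employee>): Map<Role, Double> =
    employees
        .groupBy { it.role }
        .mapValues { (_, inRole) -> inRole.maxOf { it.salary } - inRole.minOf { it.salary } }

fun main() {
    val employees = listOf(
        Employee("John", Role.DEVELOPER, 10000.0),
        Employee("Anna", Role.DEVELOPER, 12500.0),
        Employee("Jane", Role.MANAGER, 20000.0)
    )
    println(calculateSalaryGapByRole(employees)) // {DEVELOPER=2500.0, MANAGER=0.0}
}
```

A role with a single employee gets a gap of zero, and groupBy never produces empty groups, so the max and min lookups cannot fail here.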
I hope that these examples show some of the expressive power of using higher-order functions for collection processing.
A note on testing #
In our example, all functions but the first are very easy to unit test.15 You just provide input and verify the output. That means you can spend your time finding edge cases and adding the right tests, rather than setting up complex mocks. Even something “technical” like converting objects to and from JSON is typically pure and can be tested in isolation from the task of actually sending JSON over the network.
How do we test the impure performSalaryUpdate function? It is worth questioning whether this should be unit tested at all. Doing so would require setting up a lot of external dependencies, including simulating a network call, a database, and a “frozen” clock. Even more importantly, it would require us to make a lot of assumptions about how these external dependencies behave. If we fail to anticipate their behavior correctly, the value of our test is severely reduced. Therefore, functions with a lot of impure behavior are often more suitable for tests at the integration, system, or end-to-end level.
Out of scope #
Now we’ve talked a lot about pure functions, immutable data structures, and pipelines of higher-order collection functions. While the concepts are integral to functional programming, they really only scratch the surface.
There is a lot of stuff that was intentionally not included in functional foundations: concepts that can be very powerful, but whose learning curves are steeper and that can therefore cost more than they are worth to a team of developers who are not familiar with functional programming.
- Generous use of higher-order functions.
- Generous use of recursion.
- Monads (for example IO).
- Currying and partial application.
Just to be clear, I’m not saying any of these things are bad. They may well be the best things ever. I’m just saying that I and many developers find them hard to understand.16 And who knows, maybe these “functional foundations” can serve as a gateway drug to full functional programming? Or maybe they are simply good enough for many uses.
Give it a try #
If we take a step back, why should you care? Isn’t functional programming something that only academics care about? Isn’t it just another trend that will pass away?
While trends come and go in programming, I honestly do not think the above functional foundations will go out of style, no matter if you are in academia or industry. Making code easier to understand, clearly expressing intent, controlling side effects, writing code that can be run concurrently, and processing data effectively: these are not needs that will go away. They are some of the most fundamental aspects of software development.
The functional foundations also have the advantage of allowing gradual adoption. You can make one function pure or add a little collection pipeline somewhere without having to change the whole system. Each small step makes the system a little better.
If you were not convinced before, this blog post may not make you “see the light” either. But I hope that I have at least piqued your interest a little bit. If nothing else, give these concepts a try just to flex your mental muscles. Understanding more programming paradigms will only make you a better programmer.
I encourage you to experiment with pure functions, immutable data structures, and collection pipelines. They are very powerful concepts, and while they may take a bit of time to get used to, they can make your code much better!
If you give them a try, I have a strong feeling that you will not regret it. 😊
(And if you’re curious for more, see my post on algebraic data types as well!)
Updates #
- 2024-03-12: Original post published.
- 2024-03-19: Added a note on how “functional core, imperative shell” can be seen as separation of concerns.
- 2024-05-06: Added a footnote about the link between “functional core, imperative shell” and unit testing.
- 2024-06-22: Added a footnote with a link discussing how constraints can be positive.
- 2024-08-28: Added a link to my post on algebraic data types.
- 2024-09-06: Added a footnote linking to my comparison of loops vs higher-order collection functions.
- 2024-10-10: Added a footnote on John Carmack’s thoughts on functional programming in C++.
1. These ideas were developed in a team where three members had a PhD in functional programming, Tobias included. So I dare to say that the limited selection of “functional foundations” was not made out of ignorance.
2. The term “functional foundations” should not be confused with Functional-Light (JavaScript), which goes a lot deeper into functional territory.
3. You may also be interested in reading John Carmack’s thoughts on functional programming in C++. He argues that “No matter what language you work in, programming in a functional style provides benefits. You should do it whenever it is convenient, and you should think hard about the decision when it isn’t convenient.”
4. This paraphrases Michael Feathers who said “functional code is honest code”.
5. Chip designers are coming up against the ultimate limit, the speed of light.
6. The term “functional core, imperative shell” was coined by Gary Bernhardt. My own early attempt to describe similar ideas was extract the logic (what’s left is glue).
7. In imperative programming one specifies how tasks are to be executed step by step to achieve a desired outcome, unlike in declarative programming where one specifies what outcomes to achieve.
8. Don’t forget that limitations can be a good thing.
9. Version controlled state is a pretty good description for Redux and similar React state managers. They use it among other things to enable time-travel debugging.
10. As a thought exercise, think about why String is immutable in almost all popular programming languages.
11. Martin Fowler does a great job of explaining collection pipelines in detail.
12. While LINQ is like functional programming with higher-order functions and lambdas, the syntax was inspired by SQL which more C# developers were familiar with.
13. My post There is no loop explains in more detail the idea of expressing what to do, rather than how to do it.
14. The term “map” comes from mathematics, where a map is a function that associates each element of one set with an element of another set.
15. Adopting the “functional core, imperative shell” pattern is quite beneficial for unit testing. For code to be easily tested, most complexity should be in classes with few dependencies, and most dependencies should be in classes with little complexity.
16. As the joke goes, “it would be a pure function if not for the side effects on your sanity”.