Reduce new problems to known solutions
Reading Lars-Christian Simonsen’s blog, I stumbled upon the following comment by Rachel J. Kwon regarding static site generators like Hugo.[1]
“I still struggle to wrap my head around the concept that the whole site gets rebuilt and republished even if I just change one word on one page.”
As a programmer, I’m pretty used to how systems like this work. So doing what Hugo does seems natural to me. And when you are used to something, it is easy to think that it is obvious and that everyone else will think so too.
Therefore, I really appreciated the above comment as it made me think about why they work that way. This blog post is my attempt to describe why it makes sense for Hugo to generate the whole site from scratch even though only a single word has changed.
Generating a site is easy
The short answer is: because it is easier.
That may sound counter-intuitive. After all, a builder does not tear down the whole house and rebuild it just because the architect made a small change to the blueprint.[2]
But when it comes to computers, things are a bit different. Computers are very fast and very good at following instructions. So generating the whole site from scratch is not a big deal for the computer.
But wouldn’t it be faster to just update the parts that have changed?
Determining what has changed is hard
The problem is determining what needs to change. That single changed word may influence the output in several ways. Let’s say you change the title of a page. Then not only does that page need to change, but so does every other page that links to it, and maybe a “last updated” timestamp in the site footer as well.
Coming up with a watertight way of determining exactly what needs to be updated is harder than it sounds. In Hugo’s case it is even harder, since the output is determined by third-party themes.[3] And a solution which is able to update the output correctly most of the time would not be good enough. In the end, figuring out what has changed is a much harder problem to solve than just generating the site from scratch.
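To make the title example concrete, here is a minimal sketch in Python. It is a toy model and not how Hugo works internally: the `site` dictionary, `render_page` and `build_site` are names I made up for illustration. It builds the site twice, before and after a one-word edit, and lists which output pages differ.

```python
# Toy model (hypothetical, not Hugo's data model):
# a site is a dict of slug -> {"title", "body", "links"}.

def render_page(site, slug):
    """Render one page; the output embeds the titles of the pages it links to."""
    page = site[slug]
    lines = [f"# {page['title']}", page["body"]]
    for target in page["links"]:
        lines.append(f"See also: {site[target]['title']}")
    return "\n".join(lines)

def build_site(site):
    """Render every page from scratch."""
    return {slug: render_page(site, slug) for slug in site}

site = {
    "about": {"title": "About", "body": "Hello.", "links": []},
    "contact": {"title": "Contact", "body": "Mail me.", "links": []},
    "post-1": {"title": "First post", "body": "Text.", "links": ["about"]},
    "post-2": {"title": "Second post", "body": "More.", "links": ["about", "post-1"]},
}

before = build_site(site)
site["about"]["title"] = "About me"  # a one-word edit in a single source file
after = build_site(site)

# Compare the two full builds to see which outputs were affected.
changed = sorted(slug for slug in site if before[slug] != after[slug])
print(changed)  # ['about', 'post-1', 'post-2']
```

Three of the four output pages changed even though only one source file was edited. In this toy the dependencies are easy to see because `links` is explicit. In a real theme they are buried in templates, which is what makes predicting them so hard.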
In fact, I don’t know of any publishing platform which attempts this. Besides static site generators, most platforms tend to render pages per request. So instead of generating a page whenever something changes, they generate it from scratch every time the page is requested by a visitor![4]
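As a rough illustration of per-request rendering, here is a small Python sketch. The `DynamicSite` class and its methods are hypothetical and do not mirror any particular platform. It renders a page when it is requested, keeps the result in a cache, and throws the cache away on every edit. Clearing the whole cache is my assumption of the simplest safe policy.

```python
class DynamicSite:
    """Hypothetical CMS that renders pages when they are requested."""

    def __init__(self, pages):
        self.pages = dict(pages)  # slug -> source text
        self.cache = {}           # slug -> rendered output
        self.render_count = 0     # how many times we actually rendered

    def render(self, slug):
        """Generate the page from scratch."""
        self.render_count += 1
        return f"<html><body>{self.pages[slug]}</body></html>"

    def get(self, slug):
        """Serve a request, rendering only if nothing is cached."""
        if slug not in self.cache:
            self.cache[slug] = self.render(slug)
        return self.cache[slug]

    def edit(self, slug, text):
        """Change a page and discard all cached output (assumed policy)."""
        self.pages[slug] = text
        self.cache.clear()

site = DynamicSite({"home": "Welcome"})
site.get("home")
site.get("home")                   # served from the cache, no new render
site.edit("home", "Welcome back")
site.get("home")                   # cache was discarded, so rendered again
print(site.render_count)           # 2
```

The rendering step is the same "generate from scratch" operation as in a static site generator. Only the moment it runs is different.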
Reduce new problems to known solutions
So when Hugo needs to ensure that the generated site accurately reflects the new changes, it chooses the easiest solution—to generate the whole site from scratch. After all, it already had that capability since it could generate the site in the first place. Coming up with another way to update the site when it changes would have made the solution more complex and error-prone.
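In code, the idea might look something like the Python sketch below. Both functions are invented for illustration and say nothing about Hugo’s actual implementation. The point is only that handling an edit requires no new logic: it applies the edit to the sources and calls the build that already exists.

```python
def build_site(sources):
    """Generate every output page from scratch (the problem already solved)."""
    index = ", ".join(sorted(sources))
    return {name: f"{text}\n[pages: {index}]" for name, text in sources.items()}

def update_site(sources, name, new_text):
    """Handle an edit by reducing it to a full build of the edited sources."""
    edited = {**sources, name: new_text}
    return build_site(edited)

sources = {"a": "first page", "b": "second page"}
updated = update_site(sources, "a", "first page, revised")

# The result is by construction identical to building the edited site fresh.
print(updated == build_site({"a": "first page, revised", "b": "second page"}))  # True
```

Because `update_site` contains no rendering logic of its own, it cannot drift out of sync with `build_site`. That is the kind of error a separate incremental path would risk.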
This is in fact a very broadly applicable principle. If you can make an unknown situation look like one you already know how to handle, you can apply a well-known solution. That often makes sense, even when it requires “unnecessary” steps, as long as performing those extra steps is cheaper than trying to avoid them.
This is also common in mathematics, where the easiest way to solve a new problem is often to transform it into one of the vast number of already solved problems.
There is an old math joke which nicely captures this.
A mathematician is asked to make tea and provided with a kettle, a water tap, a stove and tea leaves. The mathematician fills up the kettle, puts it on the stove to boil, and then adds the tea.
The next day they are asked to make tea again, but provided with an already filled kettle. The mathematician simply dumps out the water from the kettle and exclaims “Now we’re back to the same problem as yesterday, and I already know how to solve that!”
[1] This blog happens to be published using Hugo as well. 😊
[2] Even in the real world, you may be surprised by how often it is more cost-effective to tear down an old house and build a new one than to remodel the old one.
[3] Though it is not recommended, a Hugo theme can even use randomness or time as a factor to determine the output, so the output could be different each time it is generated. That makes it virtually impossible for Hugo to predict what will change.
[4] A platform which generates pages on request will most likely use caching to avoid generating the same thing over and over again. But that cache is only useful as long as the page has not changed. Once it has, the cache must be discarded and the whole page generated again. So in practice, a dynamic content management system becomes very much like a static site generator in terms of generating output. The main difference is when the page is generated.