In 2002 Joel Spolsky mused about The Law of Leaky Abstractions in an essay that explains how simplifications that make it easy to reason about complicated systems eventually break down, and has forever shaped the way I relate to technology. And while many of Joel’s examples of the Law of Leaky Abstractions seem dated today, the law itself has never been more relevant.

Over the years, I’ve seen different people with different reactions to the concept of Leaky Abstractions.

Some people reach the conclusion that “abstractions are bad and we should avoid them.” But this is nonsensical. Abstractions are what allow us to deploy a web service without caring whether the server runs on WiFi or Ethernet, and without caring whether the user is visiting the site from Chrome on Android, Safari on iOS, Edge on Windows, or whatever the heck is running on the center console of their car. The abstraction might leak, and we might have to make some small adjustments so the page looks the same on Edge as it does on Chrome, but that’s a much smaller challenge than avoiding abstractions entirely. Usually, when this approach is tried, it results in a Not Invented Here syndrome that leads to even worse home-grown abstractions.

The next conclusion people tend to reach is that abstractions need to be airtight, so they never leak. This is nice when it works, but when these abstractions do leak (and they eventually do), they tend to be much harder to resolve than abstractions that are designed to work for most use cases and step aside gracefully when appropriate.
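One common way to build that graceful step-aside in Go is the optional-capability pattern (the same idea behind the standard library’s `http.Flusher`): keep the abstraction small, and let callers type-assert for a lower-level interface when the abstraction leaks. The names below (`Cache`, `Evicter`, `mapCache`) are illustrative, not from any particular library:

```go
package main

import "fmt"

// Cache is a deliberately small abstraction: most callers only need Get/Set.
type Cache interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// Evicter is an optional capability. Callers that hit the leak (say, stale
// entries) can type-assert for it, rather than the abstraction pretending
// eviction never matters.
type Evicter interface {
	Evict(key string)
}

// mapCache is one concrete implementation that happens to support eviction.
type mapCache struct{ data map[string]string }

func (c *mapCache) Get(key string) (string, bool) { v, ok := c.data[key]; return v, ok }
func (c *mapCache) Set(key, value string)         { c.data[key] = value }
func (c *mapCache) Evict(key string)              { delete(c.data, key) }

func main() {
	var c Cache = &mapCache{data: map[string]string{}}
	c.Set("greeting", "hello")

	// The graceful step-aside: opt into the lower level only when needed.
	if ev, ok := c.(Evicter); ok {
		ev.Evict("greeting")
	}

	_, found := c.Get("greeting")
	fmt.Println(found) // prints false: the escape hatch worked
}
```

Code written against `Cache` stays simple, and the 20% of callers who hit the leak have a sanctioned way through instead of a wall.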

In his original essay, Spolsky noted that:

Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning.

I find this snippet fascinating for two reasons. First, 23 years ago Spolsky was talking about code generation tools pretending to abstract something out; we’ll come back to that. Second, and more immediately relevant, is the notion that abstractions save us time working but don’t save us time learning. This is certainly true for code generation, and roughly true for other abstractions.

In most cases, I find that it’s sufficient to understand where your knowledge of an abstraction starts to get hazy. I write a lot of code in Go. I have a very thorough knowledge of Go’s semantics, standard library, runtime, etc. I know how to use Go’s compiler, but for the most part I let the compiler be a black box until I end up in a situation where I need to understand it. The important thing is that I know the compiler is a black box. If I start running into weird behaviors that aren’t explained by what I know about the language, I know the next thing I may need to learn about is the compiler. And on some occasions I have had to peek behind the curtain; at one point, while working on the first iteration of PluGeth, I ran into challenging issues with dependencies in dynamic linking and dug into the compiler to better understand the problem. The black box doesn’t have to stay a black box; the important thing is recognizing the current limits of your knowledge and being unafraid to dig deeper as needed.

Much of the time abstractions can save you time learning, but you need to be prepared to learn when the abstraction leaks. I used Kafka for years without needing to understand how messages got routed to specific partitions, until I had a key structure that sent all of my messages to the same partition and had to learn what was going on. I’ve used Django backed by SQLite, MySQL, and PostgreSQL, and only occasionally needed to do something complicated enough that I had to understand how the different databases would behave differently. It’s fine to lean on these abstractions to save time, but you can’t be afraid to dig in and learn what the abstraction is actually doing when the need arises.
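The Kafka leak above is a nice concrete example of the mechanism hiding under the abstraction: the default partitioner hashes the record key modulo the partition count (Kafka’s actual implementation uses murmur2; the FNV-1a hash below is just a stand-in to show the shape), so every record with the same key lands on the same partition, and a skewed key space means a skewed partition load. A simplified sketch:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor mimics keyed partitioning: hash the key, mod the partition
// count. Kafka's default partitioner uses murmur2; FNV-1a here is only a
// stand-in to illustrate the mechanism.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(numPartitions))
}

func main() {
	// Every record with the same key maps to the same partition...
	fmt.Println(partitionFor("customer-42", 12) == partitionFor("customer-42", 12)) // prints true

	// ...so if most of your traffic shares one key, one partition absorbs
	// all of it, no matter how many partitions the topic has.
	counts := make([]int, 12)
	for i := 0; i < 1000; i++ {
		counts[partitionFor("hot-key", 12)]++
	}
	fmt.Println(counts) // one bucket holds all 1000 records
}
```

The abstraction (“produce to a topic, Kafka spreads the load”) holds right up until your key distribution makes the hash stop spreading anything.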

Now, back to code generation.

When Spolsky wrote about code generation tools in 2002, he was talking about template systems that would have been considered fairly simple even by 2020 standards. Code generation in 2026 is a completely different beast, and some of its proponents argue that it breaks through the Law of Leaky Abstractions and actually will save you time learning. And while it might save you a little time learning, it’s probably not as much as you hope.

To be clear, I’m not opposed to using AI for code generation. The website I’m publishing this on was written using ChatGPT Codex, but I made a number of architectural decisions to be sure it was something I could comfortably support if Codex ever falls short.

In my experience, AI code generation tends to work pretty reliably for simple tasks like generating a web page, and tends to work on the first try about 80% of the time for more complex tasks. Further, when you do run into a bug, about 80% of the time you can hand it a compiler error or a stack trace and it’s able to diagnose and fix the problem for you. That’s a heck of an improvement over the simple templatized code generation of yore, but it creates a whole new set of problems.

AI code generation can let you get a whole lot further into the process without any human having to understand what’s going on. If you used a 2002-era code generation wizard, there were people who knew how that code got generated. If you used a more modern library or framework, there were people who understood that codebase, and likely good documentation on how to use it. But with AI code generation, there may not be a single human who understands what the generated code was intended to do, and if you’ve generated tens of thousands of lines of code without a human who understands them, that can be a lot to untangle when the abstractions inevitably leak.

So what do we do?

It starts with recognizing that while AI can help us avoid doing work, it can’t help us avoid learning. We need to keep a human in the loop, reviewing the code AI generates and understanding what that code does, so that when the abstractions leak somebody knows where to start. For somebody new to programming, that may be a tall order, and it may be very tempting to skip the review and hope the abstraction never leaks. But it will.

While AI may not be able to help you avoid learning how your code works, it can help you learn what you need to know. Unlike the 2002 code generation tools Spolsky referenced, AI code generation tools can explain themselves. When you commit to understanding the code AI generates, the same AI that wrote the code can help build your understanding of it. When you use AI to generate code, review it line by line. Understand what it’s doing. If you don’t understand what’s going on, ask questions until you do. Occasionally you may have to break out of the AI chat window and go read some documentation, but much of the time the very tool that generated the code can explain it at whatever level is necessary to help you internalize it. Take advantage of that when you’re building your product in the first place, not when the abstraction starts to leak. You’ll be glad you did.