Stop Refactoring and Start Factoring

Summary: Refactoring is focused on the quality of code, while factoring aims to uncover the underlying beauty of the problem domain, as expressed in code. Instead of cleaning up your code, try factoring.

You have some code. You notice that it's not too readable. Maybe it's a little messy. There are some obvious code smells: some repeated code and large functions.

You start refactoring. After a while, it's a clean, neat bit of code. It's very understandable and will be cheaper to modify next time.

But is it correct?

I don't mean in the "all-the-tests-pass" kind of way, because refactoring takes care of that. I mean: does the code do what it should? Refactoring only says that it does not modify the outward behavior of the code, not make it more correct. And although it's clear what the code does (thanks to all that cleanup), it's not clear that the code does what it should.

I am a big fan of the book Refactoring by Martin Fowler. It's an edifice of analytical thinking and presentation. Go read it now. It will make you a better thinker and programmer. However, I have a slight, semantic beef with refactoring. Here's Fowler's definition from the book:

noun: a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior

verb: to restructure software by applying a series of refactorings without changing its observable behavior

That's a great definition of refactoring. My beef is not with the definition. My beef is with its purpose, which is to "make it easier to understand and cheaper to modify". Again, it's a great thing to make your code easier to understand and cheaper to modify. But that's not what I'm after, most of the time.

What I'm after is code that models the problem. This is the only reliable way to make software that works. Code that inadequately models the problem is littered with nested conditionals for special cases, is unnecessarily bound in time and context, and is generally obtuse. You might be able to understand what the code is doing, but it's unclear whether it should be doing it.

The only known way to write code that models the problem is to factor. Let's get a definition:

verb: to decompose code to reveal the structure of the problem

Factoring is inherently about decomposition. It means splitting functions into smaller functions (along the structural lines of the problem). It means finding those functions which are fundamental to the problem (you can tell they are fundamental because they are used in multiple places). It means revealing symmetries. It means separating concerns. Factoring is about uncovering structural beauty in problem domains.^1 Symmetry, proportion, and harmony.

The problem with factoring is that it takes a long time. And you actually have to understand the domain. You have to explore the problem a lot longer, perhaps trying different variations in the code, before you can be satisfied that the code models the problem. Time is not something we have in our "Just ship it!" modern world.

The feeling of refactoring is like bringing order to a room: you put things away, you label things clearly, you might even throw out some old junk. But the feeling of factoring is like rebuilding a room for a specific purpose. Refactoring is cleaning up the kitchen. Factoring is taking the kitchen apart and building a new kitchen better suited to the styles of the individual chef. It's not practical to rebuild your kitchen all the time, though it is practical to tidy up. But when you do it, it makes all the difference.

That metaphor gets at the other fundamental difference between factoring and refactoring: refactoring does not change the behavior of the code, while factoring might. It might because the code might turn out to be incorrect for the problem. Refactoring can reveal bugs. But if you're going to fix the bug, you've stopped refactoring and gone to something else. In factoring, changing the behavior is just part of the process. From the factoring perspective, you're not fixing a bug. You're correcting the expression of your problem.

Refactoring by design and definition is focused on the code itself. Factoring is more of a process. It's a journey the programmer takes into the heart of the problem. In its wake, the hills and valleys of the problem are mapped out in the code. And the programmer ends, like in most journeys, a different person.


  1. I suggest you choose a good notation.