Review: All the Little Things, by Sandi Metz

Today I'm reviewing a talk by Sandi Metz from RailsConf 2014 called All the Little Things.

In the talk, Sandi Metz refactors a large, nested-if implementation of the Gilded Rose Kata, written in Ruby. What's remarkable is that she never reads the description of the Gilded Rose; she relies only on the assumption that the code is well tested. Through a series of refactorings, she arrives at a simpler solution, as measured with Flog.
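For readers who haven't seen the kata, the starting point looks roughly like the sketch below (abridged, and not the exact code from the talk). Every item type is handled inline in one method full of nested conditionals, which is what drives the Flog score up.

```ruby
# Abridged sketch of the kata's starting point: one method, deeply nested
# conditionals, every item type handled inline by comparing name strings.
def update_quality(items)
  items.each do |item|
    if item.name != "Aged Brie" && item.name != "Backstage passes to a TAFKAL80ETC concert"
      item.quality -= 1 if item.quality > 0 && item.name != "Sulfuras, Hand of Ragnaros"
    else
      if item.quality < 50
        item.quality += 1
        if item.name == "Backstage passes to a TAFKAL80ETC concert"
          item.quality += 1 if item.sell_in < 11 && item.quality < 50
          item.quality += 1 if item.sell_in < 6 && item.quality < 50
        end
      end
    end
    item.sell_in -= 1 if item.name != "Sulfuras, Hand of Ragnaros"
    # ... further nested conditions for expired items omitted
  end
end
```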

Her main message is that you can make code remarkably less complex using refactoring, though sometimes it will get more complicated before it gets better. All you have to do is follow the code smells and you'll see what to do next. Along the way, she has a few large points to make about duplication, team dynamics, and OO design.

I love Sandi Metz. The presentation is wonderful. But I think she misses a major point. Although she does use refactorings, and the place she ends up has a lower Flog score, the reason the code is simpler is that she has found a better domain encoding.

One of my pet peeves over the last 10 years has been an overemphasis on code instead of on domain. That is, we talk about the problems in the code without reference to how they relate to the domain we are modeling. This talk is guilty of that. Metz talks about code complexity and trying to improve it. Flog looks only at the code. While measuring the code complexity is important, I don't think it quite explains what's going on. The code isn't more readable or better because it is less complex. It's more readable because the domain concepts take center stage. And they do that because the code is a better encoding of the domain.

She goes from a large, nested if statement to classes named after domain concepts. Each class happened to need the same four methods, a clear sign that you've found domain/model fit.
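The end state has roughly the shape sketched below. The class and method names are my own illustration of the idea, not necessarily the ones from the talk; the point is that each domain concept gets a small class, and they all answer the same interface.

```ruby
# Illustrative shape of the end state: one small class per domain concept,
# all answering the same interface.
class Generic
  attr_reader :quality, :days_remaining

  def initialize(quality:, days_remaining:)
    @quality = quality
    @days_remaining = days_remaining
  end

  def tick
    @days_remaining -= 1
    @quality -= 1 if @quality > 0
  end
end

class Brie < Generic
  def tick
    @days_remaining -= 1
    @quality += 1 if @quality < 50
  end
end

class Sulfuras < Generic
  def tick
    # legendary item: never ages, never loses quality
  end
end
```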

The interesting question is how she thought to switch primarily on the name variable. She claims it was because the tests were grouped and named that way. Good thing they were. Would she have found it if the tests had bad names? I'm a little skeptical when the tests are nice and clean while the implementation is a mess. I'm not convinced this generalizes; you can't always find the domain concepts so easily.
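For concreteness, here is one plausible way the dispatch on name could look, building on the classes sketched above. Again, this is my own illustration, not a transcript of the talk.

```ruby
# A small factory that makes the switch on name explicit: item names map
# to the domain classes defined above, with Generic as the fallback.
class GildedRose
  CLASS_FOR_NAME = {
    "Aged Brie"                  => Brie,
    "Sulfuras, Hand of Ragnaros" => Sulfuras,
  }.freeze

  def self.for(name, quality:, days_remaining:)
    klass = CLASS_FOR_NAME.fetch(name, Generic)
    klass.new(quality: quality, days_remaining: days_remaining)
  end
end

item = GildedRose.for("Aged Brie", quality: 10, days_remaining: 5)
item.tick
```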

She emphasizes a principle: "Duplication is far cheaper than the wrong abstraction." We tend to DRY up our code, even if it means making bad abstractions. Notice that this is a domain argument. What makes an abstraction "wrong"? An abstraction is a mapping between the concrete domain and a simple idealized domain. An abstraction is wrong if the mapping doesn't fit (is lossy, etc.).

She claims the problem arises because programmers look for the simplest way to make a change: they add an argument to the function, and now the abstraction is even more broken. Over time, the abstractions get worse. But what's to stop them from doing the same thing to "correct" abstractions? The phenomenon is real, but it doesn't point to why a particular abstraction is wrong.
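As a hypothetical illustration of that decay (not code from the talk): imagine two callers sharing a helper, with each new requirement bolted on as another argument, so every caller now pays for every special case.

```ruby
# Hypothetical example of a "wrong abstraction" decaying under flag creep.
def adjust_quality(item, amount, skip_legendary: false, double_near_expiry: false)
  return if skip_legendary && item.name == "Sulfuras, Hand of Ragnaros"

  amount *= 2 if double_near_expiry && item.sell_in < 11
  item.quality = (item.quality + amount).clamp(0, 50)
end
```

At this point, the original callers would arguably have been better off duplicating the few lines each of them actually needed.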

The real problem is that the team doesn't share an understanding of the domain, the "theory" of the code, as Peter Naur put it. If you don't understand why an encoding fits the domain, you're likely to change it in ways that reduce its fit.

She is making an argument about domain modeling without realizing it. She overemphasizes the refactoring process, and underemphasizes her deep intuitive process of seeking out a good domain model.

She also recommends "refactoring through complexity." Each step of refactoring led to more complex code by the Flog metric, until the last one, which deleted unused code. She says that the reason most people don't get to the simpler code is that they don't keep going long enough to find it. I'm skeptical. That is often the case, but in my experience the path is not linear. Sometimes you take wrong turns. So I would amend her advice: there is simpler code out there, but you may have to move further forward than you think, backtrack, and try multiple paths.

And, finally, something is missing. While this was a great example codebase given the constraints of a presentation, it doesn't show what happens when you have a codebase that isn't quite working. I've encountered codebases so complicated that they don't handle corner cases correctly. These cannot be fixed purely by refactoring, since correcting the bugs would be a behavior change, yet they're so tangled that any change risks making them worse. I'm not knocking the presentation, but I wish there were a way to simplify these, where behavior change is part of the improvement. Perhaps a hybrid solution would work.

What's remarkable is that she used code smells and refactoring (some of which are not in the book) to completely rewrite the implementation with a better domain encoding. She discovers the important concepts in the domain along the way, instead of trying to shoehorn them in. It's a testament to the interplay between code problems and domain problems. This phenomenon gives me hope that we don't need to do so much upfront domain analysis, though I'm still skeptical. Perhaps we can refactor our way to better domain models.