PurelyFunctional.tv Newsletter 432: Specific vs. general

Issue 432 - June 28, 2021

Clojure Tip 💡

Specific vs. general

Last week, we left off an exploration of what makes Clojure more direct than some other languages. Just making a more concise Java doesn't get at the difference. Yes, there is less boilerplate, but there is something more to not defining concepts explicitly. We'll talk about that today.

Let's compare one language with itself, using two different styles. That is the approach the talk Stop Writing Classes takes. It's a classic that compares writing classes in Python with not writing classes in Python. The examples come from actual code and are illuminating.

In one example, a real-world API library went from 660 lines to 5 lines, from 20 classes down to 0 classes (just one function). The question is: Where did all the domain concepts go? I'm not exactly sure, since the speaker doesn't present all of the code. But I can guess. How many classes wrapped an existing type, such as a String or a HashMap? How many classes defined redundant ideas, such as new error types that already exist in the standard library? How many classes held unnecessary state that you could pass as an argument? So my guess at the answer is that defining those domain concepts did not add any behavior. Your custom EmailAddress class is constructed with a String and eventually unwraps to a String anyway. Why not just use a String?
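To make that last question concrete, here is a minimal sketch in Clojure. The EmailAddress record, both function names, and the address are made up for illustration; none of them come from the talk or from any real library. The point is only that the wrapper has to be unwrapped before any real work happens.

```clojure
(require '[clojure.string :as str])

;; Style 1: a dedicated type that does nothing but hold a string.
;; (Hypothetical; it stands in for a custom EmailAddress class.)
(defrecord EmailAddress [value])

(defn email-domain-wrapped
  "Unwraps the record, then does ordinary string work."
  [email]
  (second (str/split (:value email) #"@")))

;; Style 2: the plain string, with no wrapper at all.
(defn email-domain
  "Does the same string work directly on a string."
  [email]
  (second (str/split email #"@")))

(email-domain-wrapped (->EmailAddress "pat@example.com")) ;=> "example.com"
(email-domain "pat@example.com")                          ;=> "example.com"
```

Both versions do the same string work. The record only adds a step to get in and a step to get out.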

Domain concepts are very context-dependent. The same idea could mean something slightly different in an adjacent context. A third-party library does not know what context programmers will use it in. If it defines many new concepts, they are probably near duplicates of what you have in your first-party code. Since they are only near duplicates, you'll probably have to write lots of code to translate the subtle differences between them. It would have been better to use a standard type with a well-known interface. Yes, a class for handling URLs is more convenient than a String for many use cases. The URL class would know how to decode the query parameters and pick apart the scheme from the domain. However, Strings are the least common denominator and much more suited to a third-party library's generality level.

That might be true of third-party code, but what about in your first-party code? Should you capture domain concepts and explicitly encode your assumptions about them? This question is harder to answer.

On the one hand, the OOP world has a code smell called Primitive Obsession, which describes the use of built-in classes like numbers instead of defining a new class, like Currency. The suggestion is to identify uses of these built-ins and wrap them in a new class to give them a place for related methods to accumulate. Wouldn't capturing all of the implicit knowledge about those concepts in explicit, centralized code improve your codebase?

I'm very open to being wrong here. I'm nowhere near certain. But I believe that, no, capturing the implicit knowledge leads you down the wrong path. The original Smalltalk team seems to agree. The developers of the Python API library above wrote 20 classes to capture what they could accomplish in one 5-line function. Smalltalk-76 had "about 50 classes" that "included all of the OS functions, files, printing and other Ethernet services, the window interface, editors, graphics and painting systems, ... the famous browsers for static methods and dynamic contexts for debugging in the runtime environment." The Smalltalk team intended for us to use classes differently.

As I said, I'm not confident about what the difference is. But my hunch is that we should encode concepts at a high level of generality. That means the concepts have fewer built-in assumptions. They can therefore travel across contexts more easily. Contexts are continuous gradients of assumptions with no discrete boundaries. Nothing encodes more specific context than the local code using a piece of data. It is a mistake to think that specific code can travel to a more general position, such as into a method on a class, entirely out of context.

I would bet that the 50 classes in Smalltalk-76 were extremely general. They defined things such as linked lists, numbers, points, and rectangles. In contrast, the Python API library used classes to create specific concepts. Smalltalk's classes were at about the same level of generality as Clojure's data structures.

Clojure's data-oriented approach leads to smaller systems because trying to state all your assumptions explicitly bloats your code. Yes, all things being equal, explicit beats implicit. But as your code grows with explicit assumptions, it also becomes harder to read, if only because it is long. Programmers mistakenly make new abstractions (e.g., classes) to describe something specific, when they should use abstractions to describe something general. We are much better off using a general construct to implement a specific thing and allow the code to define the context. It leads to less code and, more importantly, more flexibility.
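Here is a rough sketch of what I mean by using a general construct to implement a specific thing. The subscriber data, its keys, and the active-emails function are all hypothetical, invented just for this illustration. The specific concept (a newsletter subscriber) is a plain map, and the local code supplies the context.

```clojure
;; A specific thing (a subscriber) built from a general construct (a map).
;; The keys are hypothetical, chosen only for this example.
(def subscribers
  [{:name "Ada"   :email "ada@example.com"   :issues-read 12}
   {:name "Grace" :email "grace@example.com" :issues-read 3}])

;; General functions from clojure.core already work on the data.
(defn active-emails
  "Emails of people who have read at least min-issues issues."
  [people min-issues]
  (->> people
       (filter #(>= (:issues-read %) min-issues))
       (mapv :email)))

(active-emails subscribers 10) ;=> ["ada@example.com"]

;; Another context can use the same data with its own assumptions.
(update (first subscribers) :issues-read inc)
;=> {:name "Ada", :email "ada@example.com", :issues-read 13}
```

Nothing here needed a Subscriber class. If a neighboring context means something slightly different by "subscriber", it can read the keys it cares about and ignore the rest.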

Podcast episode🎙

This week on the podcast, I read from David Parnas's important paper on modularity, "On the Criteria To Be Used in Decomposing Systems into Modules."

Book update 📘

Folks, the Kindle and iBooks versions of Grokking Simplicity are unreadable. I apologize if you bought those versions. I've been in discussion with the publisher and they are pulling them temporarily while they improve them. They are getting third-party help to do a better job at those eBook versions. Thanks for your patience.

Meanwhile, the print and PDF versions are excellent! You can order the print book on Amazon. You can order the print and/or PDF versions on Manning.com (use TSSIMPLICITY for 50% off).

Podcast appearance 📢

I was honored to talk about functional programming on Conversations about Software Engineering.

Quarantine update 😷

I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.

I don't want to shame you or anybody that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.

Stay healthy. Wash your hands. Wear a mask. Get vaccinated if you can. Take care of loved ones.

Clojure Challenge 🤔

Last issue's challenge

Issue 431

This week's challenge

Atbash Cipher

The Atbash Cipher is simple: replace every letter with its "mirror" in the alphabet. A is replaced by Z, B is replaced by Y, and so on. Write a function to calculate it.

Examples

(atbash "") ;=> ""
(atbash "hello") ;=> "svool"
(atbash "Clojure") ;=> "Xolqfiv"
(atbash "Yo!") ;=> "Bl!"

Please maintain capitalization and non-alphabetic characters.

Thanks to this site for the problem idea, where it is rated Very Hard in Python. The problem has been modified.

Please submit your solutions as comments on this gist.

Rock on!
Eric Normand