Clojure Error Messages are Accidental
_I've recently shifted my thinking about Clojure error messages. It is more useful to think of them as non-existent than to think of them as bad. We end with the role Spec can play in improving error messages._
I have the good fortune of helping many people learn Clojure. One of the most common complaints is the bad error messages. I have been looking for different ways to improve the error messages for beginners. I've tried better stack trace printers. I've tried implementing some prototype solutions to catch bad arguments to functions. I've been looking at other languages, particularly Elm, for inspiration. I've also just been exploring how error messages are actually implemented in Clojure. What I've discovered has been kind of surprising. I have a completely new perspective on error messages in Clojure, and I want to share that with you.
Before I go on, I want to make something clear: Clojure error messages are bad. They make it harder for beginners. I don't think anyone really argues with that. I like Clojure and think it's well designed in general. I wanted to figure out what was going on with error messages.
It will be important to distinguish between the error messages and the stack traces. The stack traces are those horribly long printouts you get when you get an exception thrown. It's one of those cases where Clojure is relying on the host platform. That's just what JVM stack traces look like.
The worst sin of stack traces is that the most important information is printed first, then followed by less and less important information, usually a few screenfuls of it. In essence, the stack traces are printed backwards in the terminal. You have to scroll way back, looking for the beginning of the trace. The default stack traces are ergonomically difficult.
The second worst sin is that lots of details from Clojure's implementation are included in the stack traces, making them longer, noisier, and, frankly, more intimidating. Both sins suck. I lived with them for years. And they are easily solvable. I happen to use and like Pretty, which reverses the stack trace and filters out noise.
But both of these sins combined are not as bad as the Exceptions themselves. Let's dig into those.
There are actually two major types of errors in Clojure. There are compile-time errors and run-time errors. Compile-time errors are things like syntax errors, unresolvable symbols (missing variables), and errors during macro expansion. These errors tend to be decent, on average. The Exception type makes sense and the error message points to the problem.
(f)
;; throws clojure.lang.Compiler$CompilerException:
;;   java.lang.RuntimeException:
;;   Unable to resolve symbol: f in this context, compiling:(*cider-repl pftv*:2780:15)
It's not beautiful, but it's acceptable. Some Clojure macros had incomprehensible error messages, but that seems to have gotten better. It used to be:
(defn foo a)
;; throws IllegalArgumentException:
;;   Don't know how to create ISeq from clojure.lang.Symbol, compiling:(*cider-repl pftv*:2780:15)
It seems like Clojure just assumed the parameter list was a vector and tried to iterate over it. Now (Clojure 1.8 even) it's much nicer:
(defn foo a)
;; throws java.lang.IllegalArgumentException:
;;   Parameter declaration "a" should be a vector
Runtime errors, on the other hand, are not as good. There has never seemed to be any consistency to them. The Exception type is sometimes correct, but sometimes it seems to have nothing to do with what I am doing. The error messages rarely describe the actual problem. Here's an example where the error message is just wrong:
(def my-val 1)
@my-val
;; throws java.lang.ClassCastException:
;;   java.lang.Long cannot be cast to java.util.concurrent.Future
Who said anything about a Future? Why is Clojure bringing that up? You don't know how many times I've looked for Futures in my code, only to realize I had a stray @ somewhere.
And what's more, very often, no error is thrown at all. Maybe nil is returned, or some other unexpected value. Here's an example:
(keyword 5) ;=> nil
(keyword nil) ;=> nil
That's not very helpful. Try passing different types to keyword to see what happens. Wouldn't you expect an error?
Before I started exploring, these behaviors seemed lazy and neglectful. However, they have been something I've learned to live with. Sometimes the error is something I've seen before. But more frequently, I don't even read the error message. I work in small increments. Any error must be somewhere in the code I just wrote. I didn't pay much attention to error message design until I started exploring it more.
Through that exploration, I've realized that it's not so much that Clojure's errors are bad. It's more that they're accidental. Clojure's core functions are, for the most part, implemented without checks on the arguments. They only code the "happy path". They assume that the arguments are of the correct type and shape, and proceed without caution. When they do explicitly throw an Exception, it's deep inside a conditional where there clearly isn't a way to proceed.
What happens if they're not correct? That's up to chance. Sometimes you get an Exception from something that does implicitly check its arguments. For example, trying to call any method on nil will throw an Exception. And numeric functions will die on non-numbers. But sometimes, after trying all of the tests in a conditional, the conditional falls through to nil. That's what happens with keyword. It appears that it worked, when nothing was done at all, and nil is returned.
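The fall-through can be sketched with a small function of my own. The names to-keyword and to-keyword-strict are hypothetical, and this is a simplified illustration of the pattern, not the actual source of clojure.core/keyword.

```clojure
;; Hypothetical "happy path" function: every branch handles a type it
;; knows about, and there is no :else branch.
(defn to-keyword [x]
  (cond
    (keyword? x) x
    (symbol? x)  (keyword (name x))
    (string? x)  (keyword x)))
;; When no test matches, cond returns nil, so bad input looks like success.

;; The same function with the missing check made explicit.
(defn to-keyword-strict [x]
  (cond
    (keyword? x) x
    (symbol? x)  (keyword (name x))
    (string? x)  (keyword x)
    :else (throw (IllegalArgumentException.
                  (str "Cannot make a keyword from " (pr-str x))))))
```

Calling (to-keyword 5) quietly returns nil, while (to-keyword-strict 5) throws an exception that names the offending value.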
I've come to believe that Clojure's errors aren't bad by design. No, it's something totally different. Clojure's errors are actually missing.
We're used to languages doing runtime argument checks. One robust way to implement checks is to implement them in the most central core of the language. Then you build new functions on top of that core. Those new functions can choose to check their own arguments and throw meaningful errors, or rely on the underlying core's errors if they're sufficient (or you're lazy). It's a systematic way to ensure that bad runtime behavior throws an error. Even a language like JavaScript will eventually bottom out with "undefined is not a function" (meaning method not found) or a null pointer exception. Even if the errors are bad like JavaScript's, at least they exist. That's one way to do it, but that's not what Clojure does.
What has Clojure done?
Clojure, in typical "de-complecting" style, has separated out the implementation of a function from enforcing the preconditions of that function. We have the implementations, which are the "happy paths". What is missing---and has always been missing---are the preconditions. It's almost as if Clojure's functions' implementations assumed some kind of external check on their arguments would happen.
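To make the split between implementation and precondition concrete, here is a minimal sketch using the :pre condition map that defn already supports. The function names are hypothetical and the example is mine, not something from Clojure's core.

```clojure
;; Happy path only: the implementation assumes n is a number.
(defn halve-unchecked [n]
  (/ n 2))

;; Same implementation, with the assumption stated as a precondition.
;; The :pre vector is checked before the body runs (when *assert* is true,
;; which is the default).
(defn halve-checked [n]
  {:pre [(number? n)]}
  (/ n 2))
```

With a string argument, halve-unchecked fails with a ClassCastException from somewhere inside the arithmetic, while halve-checked fails with an AssertionError that quotes the violated condition, (number? n).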
Enter Spec. Spec is that external check. Spec is not there to make the error messages better, as is widely believed. Spec is there to have error messages at all. Since the runtime error messages you see are accidental, any consistently applied error checks will be beneficial. Once we have error messages, we can begin the work of making them better.
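As a rough sketch of what that external check looks like, assuming Clojure 1.9 or later where Spec lives in the clojure.spec.alpha namespaces, the following specs and instruments a small function. The function shout is a hypothetical example of mine.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest]
         '[clojure.string :as str])

;; The implementation: happy path only, no argument checks.
(defn shout [text]
  (str/upper-case text))

;; The precondition, declared separately from the implementation.
(s/fdef shout
  :args (s/cat :text string?)
  :ret string?)

;; Turn on checking of :args for calls that go through the var.
(stest/instrument `shout)
```

After instrumentation, (shout "hi") still returns "HI", but (shout 5) throws an ExceptionInfo describing which argument failed which predicate, before the implementation ever runs.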
Speculation
There's a concern that Spec may make error messages worse, and those concerned point to some macros' error messages where it actually did get worse once the macros were Specced. This is a valid concern. However, I believe that it is far worse to have accidental error messages (like we have now) than consistent error messages, even if they're bad. Spec's messages may be bad (I don't like them that much; they remind me a lot of Haskell's type error messages), but they will cover the core functions of the language. Functions that call those will also get error messages. I used to be skeptical of the benefit of Spec, but now I'm looking forward to its release.
I think Spec is going to surprise us. Once Clojure's core functions are specced, we will be surprised by how much of our code violates the assumptions of the functions we use, but somehow worked anyway by accident. For instance, I can imagine nils flowing through functions like get that seem to tolerate them and returning nil themselves. Since nil is a valid return from get, we handle it and it may work out. But is nil really a valid argument to get? Probably not, and that would be reflected in the Spec. We've probably got a lot of code like that. When we turn on Spec instrumentation and run our tests, we'll have to face all of these violated assumptions that happened to work. There will be many errors in our existing code.
Even those of us who have worked in Clojure for a long time will have to internalize those assumptions. We're not used to precondition checks. We're used to thinking of Clojure's functions as loose and dynamically typed. Before we internalize the logic of Spec, we will write code that doesn't pass the spec the first time. We will be like beginners. And some may not like the language that it will have become because it will be picky in unfamiliar ways. Lucky for them, they can completely turn it off. But we will be better off with instrumentation on.
I think we will be equally surprised by how simple the types are. For example, it's easy to see clojure.set/union as a really complex function. It takes two collections and returns a new collection with all of the elements of both collections. However, sometimes the type it returns is from the first argument, and sometimes from the second argument, depending on which one is bigger. That's so complicated.
(clojure.set/union #{0} [1 2 3 4]) ;=> [1 2 3 4 0]
(clojure.set/union #{0 1 2 3} [4]) ;=> #{0 1 4 3 2}
(clojure.set/union '(0 1 2 3 4) [5]) ;=> (5 0 1 2 3 4)
When we first use union, we are trying to apply rules that make sense in other areas of Clojure. Shouldn't the arguments be coerced to Sets, like all collections are coerced to seqs in the sequence functions? Or shouldn't it really always use the type of the first argument, like in protocol methods? And there can't be something wrong with the types, otherwise it would have thrown an error, right?
But these rules don't apply. All of the union calls above are wrong. The fact that they do anything at all is an accident. The type of clojure.set/union is so simple. It simply assumes all of the arguments are Sets. If that's true, it will always return a Set. In Haskell, you'd say Set -> Set -> Set. And that type does work with Clojure's union. We can predict that Spec will choose exactly that type. Most experienced Clojure programmers will agree to that. So what was once complex behavior will be replaced by simple behavior or an error message.
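Here is a sketch of what a sets-only spec for union could look like, assuming Clojure 1.9 or later. This is my guess at such a spec, not the official one, and it deliberately rejects everything that is not a set.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest]
         'clojure.set)

;; Assumed spec: any number of sets in, a set out.
(s/fdef clojure.set/union
  :args (s/* set?)
  :ret set?)

;; Check :args on calls to union made through its var.
(stest/instrument 'clojure.set/union)
```

With instrumentation on, (clojure.set/union #{0} #{1 2}) still returns #{0 1 2}, while (clojure.set/union #{0} [1 2 3 4]) throws an ExceptionInfo instead of returning a vector. Calling (stest/unstrument 'clojure.set/union) restores the original behavior.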
But I think there will be some cases where we don't agree, or at least we will have to rebuild our understanding.
Finally, there will be times when speccing a function will inform the implementation. Currently, the type for clojure.core/nth is quite complicated. There are eight different types that it can take. And there's no nice abstraction that covers them all. I mentioned this on Twitter and Alex Miller pointed out that we might need a new notion here, perhaps called nthable?, which would simplify the spec.
The least obvious consequence of my new perspective (that we are adding error messages where none really existed) is that we have an opportunity to correct some mistakes in the implementation. For example, should you really be able to get out of anything?
(get (java.io.File. "hello.txt") :foo) ;=> nil
Does that make any sense? Is that the expected behavior? Maybe we should stop wondering about the expected behavior when you pass in garbage input, and talk instead about the expected input. Spec will let us talk about that.
Maybe Rich has learned a lot since a lot of these functions were defined ten years ago and he wants to use the release of Spec as a way to correct those mistakes without breaking backwards compatibility. Or maybe the core specs will be extremely conservative. It wouldn't surprise me either way. But if the official spec doesn't restrict get, Spec will let people redefine the spec for get for themselves. They can choose whatever subset of the type they want. I can imagine many companies adopting strict specs that catch bugs they find in their code, like some companies release their linter configs today.
In fact, you can do this right now in your code. If get is a source of bugs for you, define a spec for it, instrument, and run your tests.
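A minimal sketch of that workflow, assuming Clojure 1.9 or later. I spec a thin wrapper, the hypothetical strict-get, instead of clojure.core/get itself: as far as I can tell, get carries :inline metadata, so the compiler may bypass the var for direct calls and instrumentation on the var would not see them. The set of accepted types is one arbitrary team-style choice, not an official spec.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical wrapper to use in place of get in your own code.
(defn strict-get [m k]
  (get m k))

;; Assumed policy: only maps, sets, and vectors are valid lookups.
;; nil and arbitrary objects are rejected.
(s/fdef strict-get
  :args (s/cat :m (s/or :map map? :set set? :vector vector?)
               :k any?))

(stest/instrument `strict-get)
```

Now (strict-get {:a 1} :a) returns 1, while (strict-get nil :a) and (strict-get (java.io.File. "hello.txt") :foo) throw an ExceptionInfo when your tests exercise them.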
Finally, this new perspective has helped me understand Cognitect's stance on the error message issue. People have been complaining about error messages for years, and the response from Cognitect has been alienating. Perhaps I'm dense, but it hasn't been until I made this mental shift that I've understood some of what they're getting at.
Conclusions
I'm finding it useful to see Clojure error messages as missing. In practice, they're still bad, but this perspective helps me understand why and gives Spec more meaning. Clojure's core specs, where every function has a spec, will finally give us error messages. However they are, they are better than what we have now. Will core specs also uncover lots of problems in our existing code? Will core specs change our understanding of the difference between runtime behavior and valid input? Who knows. But I'm looking forward to a release of clojure.core.specs. It will contain lots of insights into Rich's design and about how to best understand Clojure. And once we have error messages, we can finally begin the work of making them better.