Problems with the JVM

Well, my last few articles might make you think I'm all rosy on the JVM. It's got all sorts of investment in performance, cross-platform compatibility, and backwards compatibility. It has industry adoption and a giant ecosystem of libraries and tooling. But there's still stuff that was just a mistake.

Null pointers

Null pointers are completely unnecessary. They muddy up the whole mess. You have to do null checks before method calls. Their inventor calls them "My Billion Dollar Mistake". Who hasn't been bitten by a NullPointerException?

Java has null pointers. And so does the JVM. Clojure would be a better language without null pointers (in my opinion), but being on the JVM, you're going to get one sooner or later. So Clojure has made nil a common, first-class value. That means you deal with it all the time. I think I use nil way more in Clojure than I ever did in Java. And it's less dangerous, since Clojure uses extensive nil punning.

For instance, in Clojure nil has a type (which is nil), while Java's null has a weird type without a name. nil is a seq and an empty associative collection. It's not perfect, but nils aren't as big of a problem as nulls are in Java.

Tail call elimination

Tail call elimination (TCE) (also incorrectly called tail call optimization) is a standard technique when compiling functions, but the JVM does not do it.

Tail calls are function calls that are the last thing a function will ever do. For instance, in this function:

(defn math [x y z]
  (+ x (* y z)))

The last thing the function will do is add, so that call to + is a tail call. The multiplication isn't a tail call because there's more work to do.

Because it's the last thing the function will do, you don't really need to remember to come back to the function after the addition is calculated. You can skip that and go straight to where the call to math would return. Typically, programming languages will use a stack to store the return address. You can save a lot of memory in the stack by eliminating these unnecessary return addresses. Moreover, you save so much that you eliminate a whole class of stack overflow errors. Languages that use a lot of recursion take advantage of TCE. Recursive definitions are difficult or impossible without it.

It's one of the central techniques Guy Steele and Gerald Sussman talk about in the Lambda the Ultimate papers. Guy Steele has been campaigning for tail call elimination since Java was originally released. Here is a succint and recent plea based on an older essay.

Rich Hickey has commented a bit about the lack of tail calls when asked in presentations. I can't really find the link, but the gist is that he considers function/method calling behavior to be part of the platform. He's not going to hack something in. And once the JVM and JavaScript support tail call elimination, Clojure will get it, too.

However, Clojure has made minor practical compromises to eliminate what would be a problem in a language focused on recursion. First of all, the Clojure compiler will set locals to null just before doing a tail call (called locals clearing). This lets the garbage collector get started on memory that otherwise would have a reference from the stack.

Another compromise is the recur form. It is known as explicit tail recursion. A big reason to have TCE is to allow for tail self-recursion, that is, a function calling itself from a tail position. The recur form does exactly that. It's only allowed in the tail position (the last function called from a function) and it does self-recursion without adding to the stack. It's an interesting compromise and it solves one of the biggest uses of TCE. However, pervasive TCE is still important and un-addressed.

When there is a lot of mutual tail calls (meaning functions that call each other in tail position), Clojure has another feature called trampolines. The solution is clever: instead of doing a tail call, just return a function that would do the tail call. Ugh. It's hard to explain in English, easy in code. Let's say you had this setup:

(defn even? [x]
  (if (zero? x)
    true
    (odd? (dec x))))

(defn odd? [x]
  (even? (dec x)))

Alright, this is totally contrived, and it only works for positive integers. But just notice that odd? and even? are in tail positions. If I call this with a big number (even? 100000000), I get a stack overflow. Each call to even? or odd? adds another frame to the stack. I can't use recur because they aren't tail calls. I have to use a trampoline.

(defn even? [x]
  (if (zero? x)
    true
    (fn [] (odd? (dec x)))))

(defn odd? [x]
  (fn [] (even? (dec x))))

Now I can just call (trampoline (even? 100000000)) and it works just fine. It's slower than TCE, but it doesn't take all of the memory of non-TCE function calls.

All trampoline does is check the value you pass it. If it's a function, it calls it and loops with the return value. If it's not a function, it returns it. So it loops until it doesn't have a function any more. It's quite a simple thing. Check out the source.

Notice also that, if you know it, you could use the # syntax to turn those tail calls into functions, which makes it look a little nicer:

(defn even? [x]
  (if (zero? x)
    true
    #(odd? (dec x))))

(defn odd? [x]
  #(even? (dec x)))

The last thing I want to mention about tail call elimination with Clojure is that someone did try to add TCE to the Clojure compiler by doing source code transformations. I don't recommend you use it, but it might be interesting to check out.

Startup times

Are JVM startup times really slow? This article has gone long, so I'll deal with this next time.

Meanwhile, if you learned something about Clojure and the JVM and want to learn more, check out JVM Fundamentals for Clojure. It's a five-hour course covering lots of JVM stuff that I use every day when programming in Clojure. It is not a complete JVM course, of course, just the important stuff for using Clojure. Buy it today or buy a membership.