Clojure Gazette 1.80


Lean Clojure: An Interview with Reid McKenzie


Issue 1.80 June 15, 2014

Dear Clojurists,

Our next interview of the Google Summer of Code in Clojure is with Reid McKenzie.

Clojure Gazette: Are you a student? Where and what are you studying?

Reid McKenzie: I am an undergraduate at the University of Texas at Austin majoring in Computer Science, technically on the Turing Honors degree plan, which places an emphasis on research and near-graduate-level work. In the fall I will be returning to UT for my 4th and final year of school. I hope to repeat Ambrose B. S.'s success in using GSoC work for my undergraduate thesis, hence part of my interest in GSoC.

CG: How did you get started with Clojure?

RM: I have been playing with lisps since high school, when they were recommended to me as an inroad to functional programming. The idea of macros also appealed to me, as it was evident that they were tightly coupled with the way the language's compiler worked, and I've always been interested in how human-readable code gets translated to machine code.

Clojure was suggested to me by William Ting, a friend from school, as a lisp-family language that would satisfy the discontent I was feeling in exploring Scheme and Common Lisp. I picked up Clojure in the spring of 2012 through a combination of reading The Joy of Clojure, Practical Clojure, and Clojure Programming, and, most importantly, asking freenode's #clojure for help. Clojure did indeed satisfy my complaints against other Lisp dialects, and it was interesting to learn: the data immutability model forced me to move away from the imperative update style I was used to from C, Python, and other Lisps towards a much more functional style. Today Clojure has replaced Python as my go-to language for side projects and other work where I have language flexibility, because I find that I'm considerably more productive writing Clojure than C or Python — my bugs are less involved, and, anecdotally, I have more fun writing it.

CG: What project are you working on?

RM: My GSoC project is to build an alternative Clojure compiler which takes a whole-program optimization approach to emitting efficient bytecode. In order to enable the dynamic, REPL-driven development model that we're familiar with and love in Clojure, the core Clojure compiler trades off execution speed and memory footprint in favor of dynamic behavior like runtime var redefinition and metadata introspection. While these features are awesome from a developer's standpoint, their costs are still felt at deployment time, when the benefits of runtime dynamism are for the most part no longer needed.

My project leverages the work that Nicola Mometto did last year on tools.analyzer(.jvm) and tools.emitter.jvm to build an alternative bytecode emission system which discards runtime dynamic behavior wherever possible, in the hope of reaping performance and memory usage improvements. The final product will likely include changes to Leiningen allowing users to specify compilation modes as part of their AOT configuration, at a minimum providing a choice between the existing Clojure core compiler, Nicola's tools.emitter compiler, and my tools.emitter-derived compiler. My hope is that this helps open the way for building other Clojure emitters and compilation modes playing more advanced tricks, such as using core.typed to emit more tightly typed bytecode, or compiler introduction of transients where possible. Both of those goals are beyond the scope of this summer unless things prove much easier than I expect.
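For readers who haven't seen it, tools.analyzer.jvm turns a form into a plain data AST that downstream passes (and emitters) can walk. A minimal sketch, assuming the org.clojure/tools.analyzer.jvm library is on the classpath:

```clojure
;; Assumes org.clojure/tools.analyzer.jvm is on the classpath.
(require '[clojure.tools.analyzer.jvm :as ana.jvm])

;; Analyze a simple form; the result is an ordinary Clojure map,
;; not a class hierarchy of node types.
(def ast (ana.jvm/analyze '(if true 1 2)))

(:op ast)                      ;; => :if
(select-keys ast [:op :form])  ;; node type plus the original form
```

Because the AST is just maps and keywords, a new pass is just a function over data — which is exactly the "data driven" property discussed later in the interview.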

CG: What kinds of optimizations are you planning on building into the compiler?

RM: The two big tricks that I'm looking to play are tree shaking and var elimination.

Tree shaking is simple: if some var "foo" is defined but never used, don't define it. This sounds silly, but it can be really interesting when deploying a production application. I'll pick on clojure.pprint here, but the choice of victim is arbitrary. Say I use one function from some library in my entire application, and I never use clojure.pprint at all. At present, all of the source code for that library and for clojure.pprint will be packaged into your application's uberjar, because no attempt is made to prove that the unused parts will never be called. If you do in fact never use clojure.pprint, its inclusion just wastes memory and load time, as does the rest of that barely-used library — it's just baggage. The first goal of my compiler is to be able to analyze exactly what code can be reached at runtime and simply discard everything else ahead of time. This of course relies on your never using clojure.core/eval, tools.emitter.jvm/eval, or clojure.core/symbol at runtime, but by blacklisting those symbols and aborting tree shaking when they occur, I expect to be able to eliminate significant subsets of many codebases at compile time. This is what I'm working on at present.
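The reachability analysis described above can be sketched as a graph walk: start from the program's entry points and keep every var transitively referenced, discarding the rest. This toy version (plain Clojure; the call graph is hand-written for illustration rather than extracted from real code, and the my.app names are made up) shows the idea:

```clojure
;; Toy call graph: var -> set of vars it references.
(def call-graph
  {'my.app/main           #{'my.app/handler}
   'my.app/handler        #{'clojure.set/union}
   'clojure.set/union     #{}
   ;; Never referenced from main -- a tree shaker would drop these:
   'clojure.pprint/pprint #{'clojure.pprint/write}
   'clojure.pprint/write  #{}})

(defn reachable
  "All vars transitively reachable from the entry points."
  [graph entry-points]
  (loop [seen #{} frontier (set entry-points)]
    (if (empty? frontier)
      seen
      (let [v    (first frontier)
            refs (remove seen (graph v #{}))]
        (recur (conj seen v)
               (into (disj frontier v) refs))))))

(reachable call-graph ['my.app/main])
;; => #{my.app/main my.app/handler clojure.set/union}
```

Everything outside the returned set is dead code — unless something like eval can conjure a reference at runtime, which is exactly why those symbols have to be blacklisted.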

Var (or rather IFn) elimination is more involved. Prior to 1.8 and its introduction of bytecode lambdas, the JVM had no way to move a "function" around as a value. The workaround Rich came up with when Clojure was first designed was the IFn interface, which standardizes the .invoke() method and defines a "function" to be anything that can be .invoke()ed. When a function is compiled, a function object is created that implements IFn appropriately. The wrapping namespace compiles to a class with an init method which installs a Var in the global Clojure var table under the appropriate name, with an instance of the implementing IFn as its value. Uses of this function by symbol go through the global symbol table to find the Var, dereference it, and call the .invoke() of the resulting IFn. This means that even within the same namespace, uses of a Var indirect through the global symbol table. This is reasonable in the context of REPL-driven development, because it ensures that the "most recent" definition of "foo" is always used; however, most applications do not redefine symbols at runtime except in the well-defined case of dynamic vars.
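The indirection described above is easy to see at the REPL. In this runnable sketch, a caller picks up a redefinition because every call goes through the Var, while a captured fn value does not:

```clojure
(defn foo [] :old)
(defn bar [] (foo))   ;; calls foo *through* the Var #'foo
(def saved-foo foo)   ;; captures the current IFn object as a value

(defn foo [] :new)    ;; redefine the Var's root binding

(bar)        ;; => :new  -- the call indirected through the Var again
(saved-foo)  ;; => :old  -- the captured IFn is still the old object
```

It is precisely because most production code never exercises the first behavior that the indirection can, in principle, be compiled away.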

This means that, for the most part, all these little objects with .invoke() methods are wasteful: the IFn class is needed only when a function is taken as a value, and the var indirection is needed only when a symbol may be redefined. When the var is not redefined and the function is not taken as a value, there is no reason to build a separate object and incur the load time and memory footprint of an extra class when we can get away with merging multiple fn implementations into a single class. Joining several fns into one class should yield a load time benefit, but more interestingly it should yield a performance benefit, because fns implemented as different methods of the same class can simply invoke each other as methods rather than doing a var table lookup. For those fns which are taken as values, we have to preserve an IFn class that can be passed around, but that value class need not carry its own implementation: it can defer to the merged (and faster!) fn blob. This opens the door to some really fun tricks, such as merging fns at the namespace level, at some sort of "module" or "library" level, or even across the entire application, which would allow near-total elimination of var indirection.
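As a rough illustration of the payoff (this is not Oxcart's actual output), you can approximate the "merged" form by hand in stock Clojure: hoisting a helper into letfn makes each call bind directly to a local fn object instead of going through the global var table. True class merging would go one step further and turn it into a plain method call on the same class:

```clojure
;; Stock Clojure: helper and entry are separate vars; every call to
;; helper from entry indirects through the global var table.
(defn helper [x] (* x x))
(defn entry  [x] (+ (helper x) 1))

;; Hand-"merged" version: helper* is a local binding, so the call
;; site resolves directly with no var lookup. A merging compiler
;; would do this automatically, and emit a direct method call.
(defn entry-merged [x]
  (letfn [(helper* [x] (* x x))]
    (+ (helper* x) 1)))

(entry 3)         ;; => 10
(entry-merged 3)  ;; => 10, same result, fewer indirections
```

The observable behavior is identical; only the call path changes, which is why the transformation is safe exactly when no one redefines or captures the helper.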

CG: Wow. That sounds really ambitious. Do you have any idea what kind of speedup that will give?

RM: To be blunt, I have no idea. I expect that for large applications with many dependencies I will be able to achieve at least a 10-25% reduction in code size by eliminating unused parts of Clojure core and of library dependencies. This number may well be larger, but any system for measuring dead code will be at least as complicated as the one I need to build, so I plan on just finding out.

As far as the function squashing goes, again I have no scientific way to estimate the speedup. However, the var dereference operation chases through a volatile field, which prevents the JVM from inlining the referenced IFn application. Since a memory reference is amazingly expensive compared to a branch to an address already in the processor's instruction cache, I could potentially see a factor-of-two or more speedup in some applications. The real issue I expect to run into is that most Clojure applications are memory bound. Without compiler-introduced transients or aggressive programmer use of transients, the functional update style that Clojure promotes has a huge memory footprint and plays hell with hardware prefetching structures, and thus with performance. Looking at Clojure benchmarks against, say, Scala, Java, and Haskell, it's clear that while we often do well in terms of LoC and total runtime, we often "feature" a factor of two or more increase in memory usage. So for applications involving complex functional updates and copying of data structures the speedup may not be huge, but again there's only one way to find out.
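The transient trick mentioned above is already available to programmers today; a compiler would just apply it automatically where it can prove the intermediate value never escapes. A minimal sketch of the manual version:

```clojure
;; Functional update: every conj allocates a new (partially shared)
;; persistent vector.
(defn build-persistent [n]
  (reduce conj [] (range n)))

;; Transient update: one internally mutable vector, frozen at the end
;; with persistent!. This is safe only because the transient never
;; escapes this function -- exactly the property a compiler would
;; have to prove before introducing it for you.
(defn build-transient [n]
  (persistent!
    (reduce conj! (transient []) (range n))))

;; Both produce the same value; the transient version allocates less.
(= (build-persistent 5) (build-transient 5) [0 1 2 3 4])  ;; => true
```

The results are indistinguishable to callers, which is what makes this a candidate for a purely compiler-driven optimization.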

CG: What are the major challenges of the project?

RM: I expect that the biggest challenge I'll have to deal with this summer is feature creep and keeping myself focused. Most compiler infrastructures have well-developed, well-defined, type-driven structures for what can exist in a program tree. Because, as Tim Baldridge put it, tools.analyzer is data driven and self-descriptive, I don't have that to work from. That represents a significant architectural challenge for me, because I need to unlearn working with a type-driven, operation-limited program tree.

Figuring out at what level various transformations are legitimate is going to be difficult too. I made function merging sound really easy earlier; unfortunately, it's something that will need to take place at emission time, at the level of the namespace (or whatever the merge unit turns out to be). Clojure compilers to date have only ever done emission at the expression level, which means that I'm architecturally on my own. All of this combined means that I get to design the first nontrivial compiler and compiler infrastructure for Clojure, with all the consequent opportunities to explore architectural rabbit holes and features that won't help me get my core product delivered. It also doesn't help that I've got a dozen other ideas for tricks my finished compiler could be modified to play which don't contribute to my stated MVP.

CG: Sounds like quite a big task. Do you think this will dig up inconsistencies in Clojure's treatment of namespaces? Will this be able to clean things up?

RM: Almost certainly not. As the saying goes, I already have enough rope to hang myself with. Since there is no explicit standard laying out, shall we say, "common Clojure" as there is for ANSI Common Lisp, any deviations from the Java reference implementation that I introduce would just muddy the waters further. No benefit will come to Clojure as a language standard if I introduce more porting concerns than already exist between CLJ and CLJS. There will likely be some minor deviations, such as making the ^:static annotation meaningful again and making implementation details like what metadata is preserved (if any) user-tunable. However, any code that runs atop Oxcart (the working name for my project) should run without changes, and with identical results, atop standard JVM Clojure and tools.emitter.jvm.

CG: You mentioned before using types for traversing an AST. What language was that in? Can you describe that project?

RM: C++, unfortunately. For the graduate compilers class I was involved with this last semester, we were assigned to implement a variety of traversals and optimizations over the LLVM C++ API, which makes aggressive use of in-place mutability and a custom pattern-matching construct to achieve very nice programmer semantics and a tight memory footprint, as these things go. We did a couple of things, including implementing constant folding and partial evaluation. We were ultimately assigned to build an iterative dataflow analysis framework using the template library and to implement dead code elimination with it. It wasn't a whole lot of fun, but it was awesome exposure to relatively modern research compiler techniques and some of the formal methods involved.

CG: Sounds challenging. Is there anything else you'd like to say before the end of the interview?

RM: I'd like to quickly thank Timothy Baldridge and Nicola Mometto for their previous work in this area and continued support, as well as Daniel Solano Gomez and the rest of the Clojure GSoC team for this opportunity to do something awesome and bloody hard with my summer, but that's all I've got.

CG: Where can people follow along in your progress?

RM: The source code for the current version of Oxcart is online. Once it's functional, Oxcart will likely be restructured into contrib libraries as the core team deems appropriate; however, I'm delaying those architecture decisions until after I have a working product. The above comes with the obvious disclaimer that it's being developed full time by an undergrad, is subject to daily changes, and comes with no warranty. At present Oxcart does little more than tools.analyzer.jvm does, but it is able to analyze and emit both itself and Clojure core, and to build ASTs for loaded code. Documentation on what the Oxcart compiler provides and how it differs from the standard Clojure runtime is forthcoming and subject to change.

CG: Nice. Thanks, Reid, for a great interview!

RM: Thanks Eric, take care!

Copyright © 2014 LispCast, All rights reserved.