Clojure Gazette 1.38
Issue 1.38 --- April 21, 2013
I have been exploring virtual machines lately with the idea of writing my own to learn how they work. What I have found is that there is a lot of great stuff out there on the internet. There's way more than I can possibly present here, but here is a selection of some cool readings/watchings. There are some histories, some deep technical stuff, some JVM stuff, and some academic papers.
P.S. Feel free to email me any time. I love hearing from readers.
If you have not checked it out yet, (def newsletter) is a great, weekly collection of links. It has a slightly different focus from The Gazette. The Gazette is more focused on high-quality content that a Clojure programmer might be interested in. (def newsletter), from what I gather, is more focused on new things in the Clojure world. The eleventh issue has just come out. I enjoy it every time it comes out. Do check it out.
The Smalltalk specification includes an implementation of the virtual machine in Smalltalk. The VM is a stack-based machine which defines how to access the data structures which represent objects. I really like how small and clean the VM is.
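To give a feel for what "stack-based" means, here is a minimal sketch of a stack machine in Python. The opcode names (`PUSH`, `ADD`, `MUL`) and the `run` function are made up for illustration and are not taken from the Smalltalk specification; the real VM also deals with objects, message sends, and contexts, none of which appear here.

```python
# A toy stack-based VM. Opcode names are hypothetical, chosen for
# illustration only; they are not the Smalltalk bytecodes.

def run(program):
    """Execute a list of (opcode, operand) pairs and return the top of the stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            # Push a literal operand onto the stack.
            stack.append(arg)
        elif op == "ADD":
            # Pop two values and push their sum.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            # Pop two values and push their product.
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# The expression (2 + 3) * 4 written as stack code.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
```

Notice that the arithmetic instructions name no operands at all. Everything they need is implied by the stack, which is a big part of why machines like this can stay so small.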
LLVM is a set of low level libraries for creating compilers with a well-decomplected architecture. It separates the frontend (which translates source code into LLVM's intermediate representation), the optimization, and the machine code generation. This chapter from the excellent book (The Architecture of Open Source Applications) describes the rationale and evolution of LLVM.
The JVM is a stack-based machine which runs JVM bytecode. This is the complete technical specification for how it should operate.
How The JVM Spec Came To Be (video presentation)
An enlightening talk by the creator of the JVM, James Gosling, which explains many of the decisions behind the JVM as we know it.
Fast Bytecodes for Funny Languages (video presentation)
For real JVM geeks, Cliff Click presents performance improvements to several JVM languages (including Clojure).
Darek Mihocka engineers a bytecode interpreter loop that compiles and runs efficiently across many CPUs and compilers. It takes into account cache misses and branch prediction.
The dirty secret is that many bytecode VMs waste most of their time dispatching to the code that implements the op codes. Ian Piumarta and Fabio Riccardi show how dynamically inlining common bytecode sequences into their own op code can improve performance drastically.
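The idea of folding a common sequence into one op code can be sketched roughly as follows. This is only a loose Python analogy under my own assumptions: the handler names and the `fuse` helper are hypothetical, and the loop inside the fused handler stands in for what the paper does at the machine code level. The point is just that the interpreter's main loop dispatches once for the fused sequence instead of once per op code.

```python
# Hypothetical sketch of fusing a run of op codes into a single "macro"
# op code. None of these names come from the paper.

def push(n):
    """Return a handler that pushes the constant n."""
    def op(stack):
        stack.append(n)
    return op


def add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def mul(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


def fuse(ops):
    """Combine several handlers into one handler that runs them back to back."""
    ops = tuple(ops)

    def fused(stack):
        # In a real VM this would be straight-line code with no dispatch
        # in between; the Python loop is only an approximation of that.
        for op in ops:
            op(stack)

    return fused


def run(program):
    """Run a list of handlers, counting how many times the main loop dispatches."""
    stack = []
    dispatches = 0
    for op in program:
        dispatches += 1
        op(stack)
    return stack.pop(), dispatches


# (2 + 3) * 4, first as five separate op codes...
plain = [push(2), push(3), add, push(4), mul]
plain_result, plain_dispatches = run(plain)

# ...then with the common "push, push, add" sequence fused into one op code.
inlined = [fuse([push(2), push(3), add]), push(4), mul]
inlined_result, inlined_dispatches = run(inlined)
```

Both programs compute the same value, but the second one goes through the dispatch loop three times instead of five.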
A paper which compares the performance of stack- and register-based machines. Stack machines require more instructions for a given task, but the instructions are smaller and the compilation is more straightforward. Register machines require fewer instructions, but result in larger code. Since instruction dispatch is the biggest expense of a VM, register machines tend to be more efficient. Read the paper to go deeper.
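Here is a small sketch of the instruction-count side of that tradeoff, using the statement `c = a + b`. Both instruction sets and both interpreters are invented for illustration and are not the ones measured in the paper. The stack version needs four small instructions; the register version needs one instruction that carries three operands.

```python
# Two toy interpreters for the same statement, c = a + b.
# Instruction formats are hypothetical, for illustration only.

def run_stack(program, variables):
    """Run stack code; return the number of instructions dispatched."""
    stack = []
    dispatched = 0
    for instr in program:
        dispatched += 1
        op = instr[0]
        if op == "LOAD":
            stack.append(variables[instr[1]])
        elif op == "STORE":
            variables[instr[1]] = stack.pop()
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return dispatched


def run_register(program, registers):
    """Run three-operand register code; return the number of instructions dispatched."""
    dispatched = 0
    for op, dst, src1, src2 in program:
        dispatched += 1
        if op == "ADD":
            registers[dst] = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return dispatched


# Stack code: operands are implicit, so each instruction is short.
stack_vars = {"a": 1, "b": 2}
stack_code = [("LOAD", "a"), ("LOAD", "b"), ("ADD",), ("STORE", "c")]
stack_dispatches = run_stack(stack_code, stack_vars)

# Register code: a single instruction names its destination and both sources.
registers = {"a": 1, "b": 2}
register_code = [("ADD", "c", "a", "b")]
register_dispatches = run_register(register_code, registers)
```

A one-line example like this only shows the dispatch counts (four versus one) and the wider register instruction. It says nothing reliable about total code size, which is what the paper actually measures across real programs.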
Dawson Engler's paper presenting a system for dynamic machine code generation which costs six to ten instructions per emitted instruction.