I was fortunate enough to interview Richard Möhn, who has been working on a Google Summer of Code project. His project is to bring better documentation support to Clojure and its libraries. You can find the interview below.
Clojure/conj is looking for talks. The submissions close August 14, so get going!
And Replete, the free iOS ClojureScript REPL, is now available in the App Store!
Please enjoy the issue!
PS Want to be more attractive? Subscribe!
PPS Want to advertise to smart, talented, attractive Clojure devs?
Sponsor: Homegrown Labs
Unit tests exercise the individual parts of your system. And acceptance tests make sure the business requirements are met. If you're worried about how much failure in production will cost in your complex, distributed, and unpredictable system, Homegrown Labs can help build your confidence. Homegrown Labs specializes in Simulation Testing, which tests your system as a whole by simulating a realistic workload. Please support the Gazette by visiting Homegrown Labs and signing up to learn more about Simulation Testing at the bottom of the page.
LispCast: How did you get into Clojure?
Richard Möhn: I don't have a good story here. Maybe I had heard of Clojure before, but the first time I really became aware of it was at goto Berlin 2013. It was mentioned several times and people seemed both awed and afraid. At that conference I also attended a talk by Richard P. Gabriel, which got me more interested in Lisps. Loosely following that, I did the Clojure Koans, started reading Structure and Interpretation of Computer Programs and some time later On Lisp.
Until summer 2014 I didn't do much in Clojure, just played around a bit. Most of my functional programming experience was in Haskell and Scheme. But then I had to decide what I wanted to do, and I decided on Clojure, because it was fun(ctional) and appeared to be comparatively widely adopted. I read The Joy of Clojure and I got a part-time job developing a web application in Clojure and ClojureScript. Quite a rare chance, I'd say, especially if one knows as little as I did. And now the Google Summer of Code. Another rare chance.
Well, as I said, not a very good story.
LC: Can you describe your project?
RM: I'm developing a model for representing data about Clojure objects like artifacts, namespaces and what is defined inside namespaces, i.e. functions, macros, protocols etc. "Data" includes doc strings, signatures, examples, notes and annotations for macros that help IDEs provide hints and semantic editing. Anything you might want to attach to something. The model just defines the rules you have to follow.
Those data can be scraped from existing Clojure code (JVM Clojure, ClojureScript, ClojureCLR, Ox or whatever platform), from Clojure code where you add data manually and also from external files. They can then be merged together, packaged up in JARs with the classifier "datadoc" (currently, subject to change) and uploaded to Clojars. Developers can download these JARs and do something with them.
For example, we want to provide a comprehensive package of documentation about Ring and the libraries around it. Luckily, the authors of Ring and those libraries have uploaded Datadoc JARs to Clojars. We also have a repository of notes and examples for the various namespaces and functions in those libraries. (For some reason, people didn't want to include them with the libraries themselves.) We write a small program (using the library from my project) that reads in all these data, merges them together and produces a new Datadoc JAR. This could be uploaded to a web application that can render the contents of Datadoc JARs, or pushed to Clojars, where your favourite IDE can find it, download it and let you browse it.
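The merging step in the Ring scenario can be pictured with a minimal sketch. It is not the project's actual API: the data shape (a map from coordinate vectors to maps of attached data), the keys, the placeholder library org.example/some-lib and the function merge-data are all assumptions made for illustration.

```clojure
;; Hypothetical data scraped from a library's own code.
(def frobnicate-coords
  ["org.example" "some-lib" "0.1.0" "clj" "some-lib.core" "frobnicate"])

(def scraped
  {frobnicate-coords {:doc "Frobnicates x."
                      :arglists '([x])}})

;; Hypothetical notes and examples kept in a separate repository.
(def external-notes
  {frobnicate-coords {:examples ["(frobnicate 1)"]}
   ["org.example" "some-lib" "0.1.0" "clj" "some-lib.core"]
   {:notes ["Start with frobnicate."]}})

(defn merge-data
  "Merges collections of data. When several collections describe the same
  thing, their data maps are merged key by key; later collections win."
  [& colls]
  (apply merge-with merge colls))

(def merged (merge-data scraped external-notes))
```

Here merged holds one entry for the namespace and one for frobnicate, the latter carrying the doc string, the signature and the example together. A real merge would also need a policy for conflicting values, which this sketch settles by letting the later collection win.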
Note, though, that my project will only provide the infrastructure for all this: the format, reading, transforming and merging collections of data about Clojure APIs, and packaging and distribution. Also, of course, an architecture and tools that allow people to write arbitrary extensions. Here are two examples of things people have asked for. The first is structured examples in the code: an extension could extract and run them to make sure that what is written corresponds with reality. (I imagine something like Python's doctest.) The second is declaring the markup language a docstring uses: documentation generators could take this into account when rendering docstrings to HTML.
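A doctest-like check of structured examples might look roughly like the following. This is only a sketch under assumptions: the example format (a map with :expr and :expected strings) and the function names are invented here and are not part of the project.

```clojure
(require '[clojure.edn :as edn]
         'clojure.string)

;; Hypothetical structured examples: an expression and its expected result,
;; both stored as strings. The second one is deliberately wrong.
(def examples
  [{:expr "(map clojure.string/reverse [\"warbler\" \"sheep\" \"jellyfish\"])"
    :expected "(\"relbraw\" \"peehs\" \"hsifyllej\")"}
   {:expr "(+ 1 1)"
    :expected "3"}])

(defn example-holds?
  "Evaluates the example's expression and compares the result with the
  expected value. eval runs arbitrary code, so this is only appropriate
  for examples from a trusted source."
  [{:keys [expr expected]}]
  (= (eval (read-string expr))
     (edn/read-string expected)))

(defn failing-examples
  "Returns the examples whose results differ from what was written down."
  [examples]
  (remove example-holds? examples))
```

Calling (failing-examples examples) returns only the (+ 1 1) example, since the first one matches its recorded result.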
LC: Does this mean someone could provide their own documentation for clojure.core that could be, for instance, tuned for beginners, that anyone could add to their IDE?
RM: Yes, that's a nice scenario. Let's say it's Dorothy the Documenter who wants to write beginner docs. She's plowing through clojure.core alphabetically, has reached M and now wants to document map. Here's what Dorothy might write:
    ["org.clojure" "clojure" "1.7.0" "clj" "clojure.core" "map"]

    map is a funny function. It lets you reverse every word in a list of
    words. Example:

    user=> (map string/reverse ["warbler" "sheep" "jellyfish"])
    ("relbraw" "peehs" "hsifyllej")
The first line tells us that the documentation is for the function map in clojure.core in the JVM Clojure code of Clojure version 1.7.0. A few hours later, Dorothy has finished take-while and taken a break. She feeds these files into a little program, which parses them into a data structure compatible with the library I'm developing, checks that the examples are correct, merges it with the existing docs, packages and uploads it as
Then Alfred the Imperative, who is dipping his toes in Clojure, can load org.dorodocs/sane-clojure into his IDE, which hopefully supports Datadoc JARs. (Cider already has support for Clojure Grimoire, for example.)
Of course, Dorothy could also skip the parsing and write the docs in the proper format in EDN. And she could skip the merging and leave it up to the documentation browser which bits of documentation it displays for a given function or namespace.
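To make the EDN route concrete, here is a sketch of how Dorothy's entry for map could be written as data and read back. The keys :coords and :doc, the function describe-coords and the reading of the first two coordinates as group and artifact are assumptions for illustration; the project's actual format may look different.

```clojure
(require '[clojure.edn :as edn])

;; Dorothy's entry for map written directly as EDN (hypothetical keys).
(def entry-edn
  "{:coords [\"org.clojure\" \"clojure\" \"1.7.0\" \"clj\" \"clojure.core\" \"map\"]
    :doc \"map is a funny function. It lets you reverse every word in a list of words.\"}")

(def entry (edn/read-string entry-edn))

(defn describe-coords
  "Names the parts of a coordinate vector, assuming the order
  group, artifact, version, platform, namespace, name."
  [[group artifact version platform nspace def-name]]
  {:group group
   :artifact artifact
   :version version
   :platform platform
   :namespace nspace
   :name def-name})
```

(describe-coords (:coords entry)) then yields a map in which :namespace is "clojure.core" and :name is "map", matching the way the first line of Dorothy's file is explained above.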
By the way, I guess Dorothy was having a bit of a low there halfway through clojure.core. Normally her writing is less frivolous.
LC: What are some of the biggest challenges?
RM: There is one big challenge and that is the design of the system. There are two parts to that: a model for data about Clojure objects and the architecture of the system that implements that model.
My goal with the model is to make it fairly rigorous: when Conrad the Puzzled gets data according to the model, he should know exactly what is there and what isn't and he should know what all that means.
But the model should also be open, that is, it should accommodate arbitrary data that people may want to work with. Documentation, examples, giving clues about the markup language used in doc strings, machine-understandable information about macro definitions etc. That's not too difficult to model, I'd say. It's just sticking a key-value structure somewhere and asking people to please specify their keys and values in the rigorous spirit of the model.
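One way to picture such an open key-value structure is a map in which every extension puts its data under keys in its own namespace. The sketch below is an assumption about how that could look, not the model itself; the keys and the function extension-data are hypothetical.

```clojure
;; Data attached to one thing. Plain keys stand for core data, namespaced
;; keys for data contributed by (hypothetical) extensions.
(def thing-data
  {:doc "Frobnicates x."
   :org.example.markup/language :markdown
   :org.example.macros/indent 1})

(defn extension-data
  "Returns only the entries whose keys live in the namespace ext-ns."
  [data ext-ns]
  (into {}
        (filter (fn [[k _]] (= ext-ns (namespace k))))
        data))
```

A documentation generator that only understands the markup extension could call (extension-data thing-data "org.example.markup") and ignore everything else attached to the thing.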
Unfortunately, Clojure is a rather wobbly basis for a model for representing data about Clojure objects. You have all sorts of different objects like functions, macros, deftypes, protocol implementations and so on. Some of them have to do with vars, some with classes, some with both, some with none. If you go to ClojureScript, you have different objects again. And all those objects have different kinds of names and different ways of finding them and different ways of getting information about them.
I don't know enough to put all that into the model up front. So the model itself has to be extensible. People (including me) have to be able to "add" to the model as they learn about what is there and what is needed. To design this is hard for me. And of course everything should be reasonably easy to use.
The architecture of the system partly follows from the model, but also has to address all the real-world issues the model doesn't need to concern itself with. Here the difficulties are not so much about finding out what I should do and how I should do the individual bits. I agonize more over how to fit all this together, so that it feels good.
Apart from that, there were some technical challenges as well, for example involving a clean second Clojure instance in the same JVM or writing a largish code generation thing. But those are usually fun and rather challenge my judgement of whether solving them is actually necessary…
LC: What's the state of the project? What's left to accomplish?
RM: Hard to say. I'm a bit sloppy with my planning. I have released versions of the model and the library that work, but the model is both too inflexible and not rigorous enough. Currently I'm reworking that.
What I at least want to deliver is a solid basis to build on. So, apart from updating model and library, I still need to work out the exact layout for the Datadoc JARs, consolidate the APIs and write some documentation, including examples. There's already a bunch of documentation there, but it's a bit labyrinthine. And I should write a small Leiningen plugin that enables library authors to easily create and deploy Datadoc JARs.
Of course, there's much more that I can do if I find some time left over in the end. But things always take longer than you expect, as we all know.
LC: How can people contribute to the project? Is it ready for that? Where can they follow the progress?
RM: My progress is recorded in a bunch of Git repositories grouped under "Grenada".
I think contributing code would be a bit premature at the moment. But I'm always happy when people contribute their thoughts. There are two places where this would be especially helpful: the API example and even more especially the model.
LC: If people are interested, where can they follow you online?
RM: I'm a bit lazy with publicity. You can go to my GitHub profile and work from there.
LC: If Clojure were an animal, what kind of animal would it be?
RM: A herring, of course.
LC: Thanks for the interview!