5 Differences between clojure.spec and Schema
Summary: Schema and clojure.spec aim to solve similar problems. There are significant differences, though, that might not be obvious at first.
Schema came out in 2013 and I started using it right away. At the
company I was working at, we had a few API endpoints and we were having
the classic problem of having to write custom checkers for our data.
Schema seemed to solve the problem of describing the shape of the data,
along with expected types at the leaves. Because it was mostly just
data, it composed well. For instance, you could def an Address
schema and reuse it wherever you needed an address. We also experimented
with the coercion facilities of Schema to convert data from the JSON
endpoint into better Clojure equivalents. For instance, we converted
date strings to java.util.Date objects.
That was three years ago and Schema has since been used quite widely. It's used in many talks at Clojure conferences. And in general, it felt like it solved the problem pretty well, across Clojure and ClojureScript. Now, out of the blue, the Clojure team announced clojure.spec. I know when Rich Hickey writes a blog post, it's something important and insightful. So I take it seriously and try to parse it. And let me say, I had some trouble. It's apparent that Rich went deeper than I have on this problem.
To understand clojure.spec a little better, it helped me to compare it to Schema, which I already understood. Here are the main similarities and differences:
1. clojure.spec is not a "Data DSL".
Schema focuses foremost on describing a data shape by using data in that shape. It is a "Data DSL", where a map means "expect a map" and a vector means "expect a vector". That means that the schema looks similar to the data it specifies.
clojure.spec takes a different approach. It's not a Data DSL. Specs do
not aim to look like the data they are describing. The library is a
collection of small tools that do different jobs and can be used
together. There is a tool for maps (called keys)
that checks for the presence of required and optional keys and checks
that each value conforms to the spec registered under its key's name.
There is a tool for sequences that uses regular
expression operators. And, at bottom, conformance is checked by
predicate functions.
;; Schema (s is an alias for schema.core)
(def Person {:first-name s/Str
             :last-name s/Str
             :email #"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,63}"
             (s/optional-key :phone) s/Str})
;; clojure.spec (here s is an alias for clojure.spec)
(s/def :com.lispcast.person/first-name string?)
(s/def :com.lispcast.person/last-name string?)
(s/def :com.lispcast.person/email
  (s/and string? #(re-matches #"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,63}" %)))
(s/def :com.lispcast.person/phone string?)
(s/def :com.lispcast/person
  (s/keys :req [:com.lispcast.person/first-name
                :com.lispcast.person/last-name
                :com.lispcast.person/email]
          :opt [:com.lispcast.person/phone]))
This example is borrowed and modified from the spec Guide.
In this simple example, I think I prefer Schema. Its intention is much clearer. However, once the Schema meets the real world, it turns out that you can throw the "your Schema looks like the data" pipe dream out the window. For instance, what if we need either an email or a phone or both? In Schema, that means they're both optional, but you need an extra check afterward, which kind of ruins the elegance of the DSL. You're trying to specify that the phone and email have a relationship to each other. The presence of one key depends on the presence of the other. There are several ways to do it in Schema. And I don't like any of them.
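In clojure.spec, the and combinator passes the already parsed (conformed)
value from one predicate to the next, so a later predicate can check how
the pieces relate to each other.
Carin Meier has a great example of constraining different values to be
in relationship in One Fish Spec Fish.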
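To make the email-or-phone case concrete, here is a minimal sketch. It assumes a Clojure release where the library ships as clojure.spec.alpha (the namespace it has lived under since the 1.9 alphas). The spec names :com.lispcast/reachable-person and :example/ordered-pair are made up for illustration. The first part uses the or form that keys accepts inside :req, and the second shows and handing a parsed value to the next predicate.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :com.lispcast.person/email
  (s/and string?
         #(re-matches #"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,63}" %)))
(s/def :com.lispcast.person/phone string?)

;; Hypothetical spec name. `or` inside :req means at least one of the
;; listed keys must be present; any that are present are still validated.
(s/def :com.lispcast/reachable-person
  (s/keys :req [(or :com.lispcast.person/email
                    :com.lispcast.person/phone)]))

(s/valid? :com.lispcast/reachable-person
          {:com.lispcast.person/phone "555-0100"}) ;; => true
(s/valid? :com.lispcast/reachable-person {})       ;; => false

;; s/and passes the conformed value along, so the second predicate
;; receives the map produced by s/cat, not the raw vector.
(s/def :example/ordered-pair
  (s/and (s/cat :lo number? :hi number?)
         #(<= (:lo %) (:hi %))))

(s/valid? :example/ordered-pair [1 2]) ;; => true
(s/valid? :example/ordered-pair [2 1]) ;; => false
```

The relationship between the keys lives in the spec itself, so there is no separate check to remember to run afterward.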
Takeaway: I'm interested to see what uses these smaller pieces can be put to. I don't understand them well enough yet. I look forward to experimenting with them.
2. clojure.spec prefers namespaced keywords.
While both clojure.spec and Schema allow namespaced and un-namespaced
keywords, clojure.spec clearly encourages giving each unique keyword a
single, global meaning.
The keys macro's :req option takes a vector of required keywords, which
must be namespaced (un-namespaced keys go through the separate :req-un
and :opt-un options). Those keywords play double-duty. They check for the presence
of required keys and they name the spec that the value must conform
to. Schema is more relaxed and does not show that preference for
namespaced keys.
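A small sketch of that double duty, assuming the clojure.spec.alpha namespace of current Clojure releases. The spec name :com.lispcast/person-un is hypothetical. It shows the same registered keyword naming both the required key and the spec for its value, in the namespaced (:req) and un-namespaced (:req-un) styles.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :com.lispcast.person/first-name string?)

;; :req looks for the namespaced key in the map.
(s/def :com.lispcast/person
  (s/keys :req [:com.lispcast.person/first-name]))

;; :req-un looks for the bare key :first-name, but still checks its
;; value against the spec registered under the namespaced keyword.
(s/def :com.lispcast/person-un
  (s/keys :req-un [:com.lispcast.person/first-name]))

(s/valid? :com.lispcast/person
          {:com.lispcast.person/first-name "Luke"})           ;; => true
(s/valid? :com.lispcast/person-un {:first-name "Luke"})       ;; => true
(s/valid? :com.lispcast/person-un {:first-name 42})           ;; => false
```

Either way, the spec for the value is looked up by a namespaced name, which is where the push toward global names comes from.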
Takeaway: Rich Hickey clearly stated that we should be naming specs for global consumption in the Cognicast interview. I'm not sure what my position is on this, but I trust he's thought about it more than I have. I will definitely have to play with it before I come to an opinion.
3. clojure.spec has powerful sequence validation.
clojure.spec has a full suite of regular expression operators for
describing data in a sequence. While in general, vectors tend to be
either homogeneous (e.g., a vector of Strings) or used as
tuples (e.g., [:person "Luke" "Skywalker"]), clojure.spec does not forget that code
is data, too. And code means complex lists. Look at the usage string
from clojure.core/defn:
(defn name doc-string? attr-map? ([params*] prepost-map? body)+ attr-map?)
It is clearly expressed with regex operations in mind. It uses ?, *,
and +, which are classical symbols for regex operators. clojure.spec
makes writing a checker to validate defn forms a straightforward
translation of this documentation.
Schema did have some useful operators for talking about heterogeneous vectors. But they were nowhere near as powerful as regular expressions.
;; Schema with a heterogeneous vector
(def FancySeq
  "A sequence that starts with a String, followed by an optional Keyword,
   followed by any number of Numbers."
  [(s/one s/Str "s")
   (s/optional s/Keyword "k")
   s/Num])
This example comes from the Schema readme.
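For comparison, here is how I would sketch the same sequence with clojure.spec's regex operators. This assumes the clojure.spec.alpha namespace of current Clojure releases, and :example/fancy-seq is a name I made up. It is my own translation, not something taken from either library's documentation.

```clojure
(require '[clojure.spec.alpha :as s])

;; A String, then an optional Keyword, then any number of Numbers.
;; s/cat names each part; s/? and s/* are the regex operators.
(s/def :example/fancy-seq
  (s/cat :s    string?
         :k    (s/? keyword?)
         :nums (s/* number?)))

(s/valid? :example/fancy-seq ["a" :b 1 2 3]) ;; => true
(s/valid? :example/fancy-seq ["a" 1 2 3])    ;; => true
(s/valid? :example/fancy-seq [:b "a"])       ;; => false
```

Because the operators nest, the same pieces scale up to forms like the defn usage string, with alternation and repetition of whole sub-sequences.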
Takeaway: Because it will be so easy to describe the expected
arguments to a macro, we should expect better error messages in macros
in the core library and beyond. Jonathan Claggett and Chris Houser
demonstrated something similar with Sequence Expressions.
And Colin Fleming uses full recursive grammars to parse macros
in Cursive. Another bonus is that clojure.spec/fdef attaches specs to
functions and macros without modifying their code.
4. clojure.spec combines checking with parsing.
So often, when writing a macro, I need to parse out the pieces of the
arguments that I need for each section of logic. clojure.spec requires
that you name each piece of the regular expression.
clojure.spec/conform uses those names to create a map of all of the
pieces. So you're checking that the arguments conform as well as parsing
them into parts. And since it's a regular expression, it's pretty
powerful. Schema doesn't really check sequences like that. Check out
David Nolen's comments on clojure.spec for an example of parsing.
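Here is a minimal sketch of checking and parsing in one step, assuming the clojure.spec.alpha namespace of current Clojure releases. The spec :example/greeting is hypothetical and stands in for the arguments of a small macro: a name, an optional docstring, and an argument vector.

```clojure
(require '[clojure.spec.alpha :as s])

;; Each part of the sequence gets a name, which becomes a key in the
;; map that conform returns.
(s/def :example/greeting
  (s/cat :name symbol?
         :doc  (s/? string?)
         :args vector?))

(s/conform :example/greeting '[hello "Says hi" [x]])
;; => {:name hello, :doc "Says hi", :args [x]}

;; The optional part is simply absent when it is not supplied.
(s/conform :example/greeting '[hello [x]])
;; => {:name hello, :args [x]}

;; Input that does not match yields the invalid marker.
(s/invalid? (s/conform :example/greeting '[hello 42]))
;; => true
```

A macro body can then destructure the conformed map instead of hand-rolling checks like "is the second argument a string?".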
Takeaway: The parsing feature is going to be really important. Regular expressions are great for defining a set of inputs far larger than the expression itself. Branching and backtracking are built in. I'm really excited for what this means for macros. They'll be easier to make and have better error messages.
5. clojure.spec has tight test.check integration.
test.check is Clojure's implementation of generative testing. I really like generative testing. It covers a large number of cases with higher-order properties. clojure.spec specs can automatically be turned into test.check generators. If you define specs for the arguments and return value of a function, the function can be tested automatically.
Takeaway: I will be more confident in my code when I use
clojure.spec. I think it's going to make generative testing more
accessible, as well. It's not that generative testing is hard, but the
learning curve on spec is easier.
clojure.spec/fdef and clojure.spec/fspec describe a function's arguments
and return value, and those specs are what the generative tests check
the function against.
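As a rough sketch of what that looks like, assuming current Clojure releases (where the namespaces are clojure.spec.alpha and clojure.spec.test.alpha) and assuming test.check is on the classpath. The clamp function is a made-up example. The :fn spec relates the return value to the arguments, which is the kind of higher-order property generative testing is good at.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical function under test: pin x into the range [lo, hi].
(defn clamp [lo hi x]
  (max lo (min hi x)))

(s/fdef clamp
  ;; Three ints, with the extra constraint that lo <= hi.
  :args (s/and (s/cat :lo int? :hi int? :x int?)
               #(<= (:lo %) (:hi %)))
  :ret  int?
  ;; The result must land inside the requested range.
  :fn   #(<= (-> % :args :lo) (:ret %) (-> % :args :hi)))

;; Generates arguments from the :args spec, calls clamp, and checks
;; :ret and :fn. A result map gains a :failure key when a check fails.
(def results (stest/check `clamp))

(every? #(nil? (:failure %)) results)
```

No generators were written by hand here. They are derived from the predicates in the :args spec.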
Conclusions
I'll confess: when I first saw clojure.spec, I was neither impressed nor excited. I was more baffled than anything. Was this what the Clojure team was working on? Weren't there more pressing matters? But when I read what the core team had produced, worked through the API docs, and listened to the Rich Hickey interview, I started to see some exciting possibilities. I'm really happy this is getting attention as a language feature. It shows that the team is listening to the community.