A Model of Interceptors
I've been gearing up for what I think will be a fun and fruitful project. I wanted to review some of the fundamental abstractions we use in Clojure for building web applications, and see if I can't find something super robust for building on top of.
The "classical" model is Ring, which defines Adapters, Middleware, and Handlers. Briefly, Handlers are functions from request to response. Middleware are higher-order functions that take a Handler and return a new Handler. They're Handler transformers. Finally, Adapters are used to insert the Ring model into a "host". For instance, the Java Servlet system could be a host. A Servlet Adapter will convert the Servlet request to a Ring request, and also convert the Ring response to a Servlet response.
The Ring model is pretty good, but it has the problem that it makes it hard to do asynchronous servers. Besides Adapters, the only things that handle requests are the Handlers. Those are just functions, so they consume a whole thread. It's one thread per request, which doesn't scale well and doesn't let you do asynchronous operations.
Enter Pedestal. Pedestal introduced the idea of Interceptors. In Ring
Middleware, you could transform the request before the Handler got it,
and/or transform the response after the Handler returned. Interceptors
defined the transform-on-the-way-in (:enter
function) and
transform-on-the-way-out (:leave
function) as two separate operations
(both functions). This basically reifies the two uses of middleware into
a distinct object. If you add core.async to the mix, you can get
asynchrony. Pedestal also adds an :error
function for the third use
case of Middleware, which was wrapping handlers in try-catch to
transform Exceptions into responses.
Pedestal has machinery for running a request through an Interceptor
chain. This machinery uses a stack and a queue for holding the :leave
and :enter
functions. The stack and the queue are held in a Context
map, and that is what gets passed through each Interceptor. Each
Interceptor returns a modified Context, which could modify the HTTP
Request or Response, or it could modify the stack and queue. This lets
Interceptors add other Interceptors to the end of the chain.
One thing I miss about Ring when using Pedestal is the cleanliness of the composition model. Handlers are general purpose functions, and Middleware, which just transform the Handler in the first argument, are easily composed using a nice thread-first macro. In the end, you've got one big function to call.
In Pedestal, they say "everything is an Interceptor". But it's not quite true. While a lot of the functionality of your service is written as Interceptors, the Interceptor chain itself is not an interceptor. You also have the ability to add Interceptors to the chain from within an Iterceptor. This makes it hard to predict what's going to happen when an Interceptor is called. Since it's adding Interceptors to the end, but the Interceptor itself doesn't know what's after it, there's room for lots of trouble.
What I do like about Interceptors is that they separate out the three main uses of Middleware (transform request, transform response, and handle Exceptions). They reify that concept and eliminate the bit of boilerplate every Middleware has to implement (the wrap/call).
A first pass
Let's simplify and formalize the concept of Interceptors.
First, an Interceptor is made of three functions, the :enter
,
:leave
, and :error
. Their types are as follows.
;; :enter :: Request -> Request
;; :leave :: Response -> Response
;; :error :: Request -> Exception -> Response
All of them are optional. Let's leave aside the :error
function for
now, and deal only with :enter
and :leave
. We'll add :error
back
later. That means we have two functions. Both are of the form "take a
thing, return it transformed".
Now, given two of these, how do we compose them? It's obvious that we can use function composition.
(comp enter2 enter1) ;; call enter1, then pass the return to enter2
That's a really good sign. We can make a function that uses comp
to
compose two interceptors:
(defn chain [i1 i2]
{:enter (comp (:enter i2 identity)
(:enter i1 identity))
:leave (comp (:leave i1 identity)
(:leave i2 identity))})
I'm calling it chain
because that's the language from Pedestal. Notice
that the :leave
functions are composed in the oposite order from
:enter
functions. We notice that chain
is a monoid because it just
uses comp
, which is a monoid. That's sweet! People talk about monads
all the time, but monoids are where it's at.
And we definitely want chain
to be a monoid. I imagine Interceptors
being assembled in different places, then finally being assembled into
one big interceptor. Monoids give us that property.
Also notice that I've put the Interceptors in the order we expect to see them in. They're kind of backwards in function composition.
Now we need a function that will run this chain. It simply threads the value through the enter and leave functions.
(defn run [interceptor value]
(let [{:keys [enter leave]} interceptor]
(-> value
enter
leave)))
Let's build a test to ensure that we maintain this associative property. (Monoids are associative).
First, we'll need a generator for some simple functions.
(def gen-integer-fn
(gen/elements
[identity
(constantly 0)
(constantly 1)
inc
dec]))
Nothing too fancy, because we'll be generating some long Interceptor chains.
Now to generate Interceptors themselves.
(def gen-interceptor
(gen/let [enter gen-integer-fn
leave gen-integer-fn]
{:enter enter
:leave leave}))
And the test:
(defspec chain-associative
1000
(prop/for-all [a gen-interceptor
b gen-interceptor
c gen-interceptor
x gen/int]
(let [x (mod x 100)
i1 (chain a (chain b c))
i2 (chain (chain a b) c)]
(= (run i1 x) (run i2 x)))))
(Note: all of this testing is done using clojure.test.check.)
Converting to Vectors
I simplified the model of Interceptors to get a first approximation. The function composition version really feels nice to me. However, it's missing a lot of stuff. I did try to add error handling and asynchrony to the function composition model, but it was really complicated and I wasn't sure how to add some features. I won't go over that here. It was an interesting exercise but ultimately I think it's a dead end.
What I decided to do instead was to keep the Interceptors separate, make
the run
function a little smarter. Instead of composing the functions,
the run
function loops through the Interceptors first from begining to
end, running the :enter
functions, then back again to the begining
running the :leave
functions.
I still need chain to be a monoid. I shouldn't have to change the test.
There's a nice existing monoid that will serve our purpose well. It's
Clojure's very own Vector. If we keep the Interceptors in a Vector, we
can define chain
in terms of Vector concatenation, which is also a
monoid.
Let's go the full monty and define chain
using Clojure's monoid
pattern. But first, I want to define an Interceptor a little
differently. Now, an Interceptor is a Vector of zero or more maps. Each
map can have :enter
and :leave
functions. We'll get to :error
in a
bit.
Because we're now defining Interceptors as a Vector, I will define a
function normalize
that will convert different stuff to that format.
(defn normalize [i]
(cond
(nil? i)
[]
(= {} i)
[]
(seq? i)
(vec i)
(vector? i)
i
(fn? i)
[{:enter i}]
(map? i)
[i]))
That should be pretty straightforward. The notable things are that the
empty Vector is the identity of chain
, and that functions alone are
converted into :enter
functions.
Now chain has the full monoid definition, which includes all arities.
(defn chain
([]
[])
([i1]
(normalize i1))
([i1 i2]
(let [i1 (normalize i1)
i2 (normalize i2)]
(into i1 i2)))
([i1 i2 & is]
(apply chain (chain i1 i2) is)))
I'm not sure that's the most performant possible version, but I'm not
concerned about performance here. You build the chain once and run it
many times. run
will need to be performant.
run
is an interesting situation now. If I were in a different Lisp,
I'd have tail call elimination, and I'd just jump between functions in
tail position with almost zero cost.
But Clojure doesn't have tail call elimination. So I'll use two
functions, one for calling :enter
functions and one for calling
:leave
functions. While it's moving in one direction, it can just
recur
in a tight loop. Once it reaches the end of the :enter
side,
it defers to the other function that handles the :leave
functions.
(defn run-enter [i v idx]
(if (contains? i idx)
(let [e (:enter (get i idx) identity)]
(recur i (e v) (inc idx)))
(run-leave i v (dec idx))))
(defn run-leave [i v idx]
(if (contains? i idx)
(let [l (:leave (get i idx) identity)]
(recur i (l v) (dec idx)))
v))
These look pretty similar to each other. The difference is that
run-enter
defers to run-leave
when it runs off the end. And
run-leave
returns the value when it runs off the beginning.
Now, run
just defers to run-enter
:
(defn run [i v]
(run-enter i v 0))
Rerun the tests and they pass.
Adding error handling
One of the important use cases of Middleware is to wrap a handler in a
try/catch
and handle errors. We want to enable that with our
Interceptors model.
We augment the Interceptor definition. Now it's a Vector of zero-or-more
maps, where each map has an :enter
, :leave
, and :error
function
(all functions are optional).
Errors act differently depending on if you're in the :enter
or
:leave
side. If you're entering, if an :enter
function throws a
Throwable (or returns a Throwable), we immediately begin the :leave
side with the Throwable as the value.
(defn run-enter [i v idx]
(if (contains? i idx)
(let [e (:enter (get i idx) identity)
v' (try
(e v)
(catch Throwable t
t))]
(if (instance? Throwable v')
(run-leave i v' idx)
(recur i v' (inc idx))))
(run-leave i v (dec idx))))
When we're leaving, if we have a Throwable, we call the :error
function. Otherwise, we call the :leave
function. That means that at
any point, the :error
function can return a non-Throwable value and it
will go back to the :leave
functions.
(defn run-leave [i v idx]
(if (contains? i idx)
(let [status (if (instance? Throwable v) :error :leave)
l (get-in i [idx status] identity)
v' (try
(l v)
(catch Throwable t
t))]
(recur i v' (dec idx)))
v))
The tests run, but we aren't testing the error path. We need an :enter
function to throw. This test sets up a chain where the last Interceptor
in the chain throws an error. The first Interceptor has an :error
function that returns its own error. We can test that we get that error
out at the end.
(defn constantly-error [_]
(throw (ex-info "Error" {})))
(defspec chain-errors
1000
(prop/for-all [i (gen/vector gen-interceptor)
x gen/int]
(let [t (ex-info "My error" {})
i (chain {:error (constantly t)}
i
{:enter constantly-error})
x (mod x 100)]
(identical? t (run i x)))))
We also want to test what happens when we don't catch the error.
(defspec chain-error-not-caught
1000
(prop/for-all [i (gen/vector gen-interceptor)
x gen/int]
(let [i (chain {:enter constantly-error} i)
x (mod x 100)]
(instance? Throwable (run i x)))))
We could really test this a lot, but I'll skip over that.
Adding async
Another thing that Pedestal has is asynchrony. If your Interceptors return a core.async channel, Pedestal's machinery will consider that an async Interceptor.
I would rather use Manifold, which is a cool abstraction for asynchronous values. That means that in my Interceptors, if your function returns something Manifold understands as a deferrable thing, the Interceptor chain will be considered an async chain. This change is not too difficult:
(defn run-leave [i v idx]
(if (contains? i idx)
(let [status (if (instance? Throwable v) :error :leave)
l (get-in i [idx status] identity)
v' (try
(l v)
(catch Throwable t
t))]
(if (d/deferrable? v')
(-> (d/->deferred v')
(d/chain #(run-leave i % (dec idx)))
(d/catch #(run-leave i % (dec idx))))
(recur i v' (dec idx))))
v))
(defn run-enter [i v idx]
(if (contains? i idx)
(let [e (:enter (get i idx) identity)
v' (try
(e v)
(catch Throwable t
t))]
(cond
(instance? Throwable v')
(run-leave i v' idx)
(d/deferrable? v')
(-> (d/->deferred v')
(d/chain #(run-enter i % (inc idx))
(d/catch #(run-leave i % idx))))
:else
(recur i v' (inc idx))))
(run-leave i v (dec idx))))
We can test this by making sure async chains return the same value as non-async chains.
(defn make-async [i]
(-> i
(update :enter #(comp defer-val %))
(update :leave #(comp defer-val %))))
(defspec chain-async
1000
(prop/for-all [i (gen/not-empty (gen/vector gen-interceptor))
x gen/int]
(let [ai (mapv make-async i)
x (mod x 100)]
(= (run i x) @(run ai x)))))
(defspec chain-async-one
1000
(prop/for-all [i (gen/not-empty (gen/vector gen-interceptor))
idx gen/int
x gen/int]
(let [idx (mod idx (count i))
ai (update i idx make-async)
x (mod x 100)]
(= (run i x) @(run ai x)))))
(defn constantly-error-async [_]
(let [d (d/deferred)]
(d/error! d (ex-info "Async error." {:async true}))
d))
(defspec chain-error-async
1000
(prop/for-all [i (gen/vector gen-interceptor)]
(let [e constantly-error-async]
(:async (ex-data @(run (chain i e) 0))))))
Adding early return
A final thing that Middleware could do was to not call the handler. It
could do something different, instead. I'm calling this "early return".
What I want to happen is have some way for an :enter
function to
immediately begin the :leave
phase instead of continuing to the next
:enter
function.
We'll take a page from the Clojure playbook and use reduced
. reduced
is used in reduce
for early returns. It signals that the computation
is over. reduce
checks for it, then unpacks the reduced
value. We'll
do the same. Only :enter
functions can do this.
(defn run-enter [i v idx]
(if (contains? i idx)
(let [e (:enter (get i idx) identity)
v' (try
(e v)
(catch Throwable t
t))]
(cond
(instance? Throwable v')
(run-leave i v' idx)
(d/deferrable? v')
(-> (d/->deferred v')
(d/chain #(run-enter i % (inc idx))))
(reduced? v')
(run-leave i (unreduced v') idx)
:else
(recur i v' (inc idx))))
(run-leave i v (dec idx))))
We can test this by saying that a chain of a
should be equivalent to
a
early returning before b
. It's clearer in code.
(defn make-reduced [i]
(update i :enter #(comp reduced %)))
(defspec chain-early-exit
1000
(prop/for-all [a gen-interceptor
b gen-interceptor
x gen/int]
(let [x (mod x 100)
i1 (chain a)
i2 (chain (make-reduced a) b)]
(= (run i1 x) (run i2 x)))))
That's it! Check out the code here.
Conclusions
I'm happy with where this wound up. I've got a nice, concise implementation of Interceptors, ready for use in async web servers---though it's not specific to the web. I've also got a nice suite of tests. The model is not quite as formal as I'd like. Some of the tests seem a little too specific. I'd love for them to be a few simple properties that guarantee the behaviors I'm looking for. Right now they feel ad hoc. I probably spent too much time on this. But I really enjoyed it! I'm looking forward to hear what you have to think about it.