A Model of Interceptors

I've been gearing up for what I think will be a fun and fruitful project. I wanted to review some of the fundamental abstractions we use in Clojure for building web applications, and see if I can't find something super robust for building on top of.

The "classical" model is Ring, which defines Adapters, Middleware, and Handlers. Briefly, Handlers are functions from request to response. Middleware are higher-order functions that take a Handler and return a new Handler. They're Handler transformers. Finally, Adapters are used to insert the Ring model into a "host". For instance, the Java Servlet system could be a host. A Servlet Adapter will convert the Servlet request to a Ring request, and also convert the Ring response to a Servlet response.

The Ring model is pretty good, but it has the problem that it makes it hard to do asynchronous servers. Besides Adapters, the only things that handle requests are the Handlers. Those are just functions, so they consume a whole thread. It's one thread per request, which doesn't scale well and doesn't let you do asynchronous operations.

Enter Pedestal. Pedestal introduced the idea of Interceptors. In Ring Middleware, you could transform the request before the Handler got it, and/or transform the response after the Handler returned. Interceptors defined the transform-on-the-way-in (:enter function) and transform-on-the-way-out (:leave function) as two separate operations (both functions). This basically reifies the two uses of middleware into a distinct object. If you add core.async to the mix, you can get asynchrony. Pedestal also adds an :error function for the third use case of Middleware, which was wrapping handlers in try-catch to transform Exceptions into responses.

Pedestal has machinery for running a request through an Interceptor chain. This machinery uses a stack and a queue for holding the :leave and :enter functions. The stack and the queue are held in a Context map, and that is what gets passed through each Interceptor. Each Interceptor returns a modified Context, which could modify the HTTP Request or Response, or it could modify the stack and queue. This lets Interceptors add other Interceptors to the end of the chain.

One thing I miss about Ring when using Pedestal is the cleanliness of the composition model. Handlers are general purpose functions, and Middleware, which just transform the Handler in the first argument, are easily composed using a nice thread-first macro. In the end, you've got one big function to call.

In Pedestal, they say "everything is an Interceptor". But it's not quite true. While a lot of the functionality of your service is written as Interceptors, the Interceptor chain itself is not an interceptor. You also have the ability to add Interceptors to the chain from within an Iterceptor. This makes it hard to predict what's going to happen when an Interceptor is called. Since it's adding Interceptors to the end, but the Interceptor itself doesn't know what's after it, there's room for lots of trouble.

What I do like about Interceptors is that they separate out the three main uses of Middleware (transform request, transform response, and handle Exceptions). They reify that concept and eliminate the bit of boilerplate every Middleware has to implement (the wrap/call).

A first pass

Let's simplify and formalize the concept of Interceptors.

First, an Interceptor is made of three functions, the :enter, :leave, and :error. Their types are as follows.

;; :enter :: Request  -> Request
;; :leave :: Response -> Response
;; :error :: Request  -> Exception -> Response

All of them are optional. Let's leave aside the :error function for now, and deal only with :enter and :leave. We'll add :error back later. That means we have two functions. Both are of the form "take a thing, return it transformed".

Now, given two of these, how do we compose them? It's obvious that we can use function composition.

(comp enter2 enter1) ;; call enter1, then pass the return to enter2

That's a really good sign. We can make a function that uses comp to compose two interceptors:

(defn chain [i1 i2]
  {:enter (comp (:enter i2 identity)
                (:enter i1 identity))
   :leave (comp (:leave i1 identity)
                (:leave i2 identity))})

I'm calling it chain because that's the language from Pedestal. Notice that the :leave functions are composed in the oposite order from :enter functions. We notice that chain is a monoid because it just uses comp, which is a monoid. That's sweet! People talk about monads all the time, but monoids are where it's at.

And we definitely want chain to be a monoid. I imagine Interceptors being assembled in different places, then finally being assembled into one big interceptor. Monoids give us that property.

Also notice that I've put the Interceptors in the order we expect to see them in. They're kind of backwards in function composition.

Now we need a function that will run this chain. It simply threads the value through the enter and leave functions.

(defn run [interceptor value]
  (let [{:keys [enter leave]} interceptor]
    (-> value
        enter
        leave)))

Let's build a test to ensure that we maintain this associative property. (Monoids are associative).

First, we'll need a generator for some simple functions.

(def gen-integer-fn
  (gen/elements
   [identity
    (constantly 0)
    (constantly 1)
    inc
    dec]))

Nothing too fancy, because we'll be generating some long Interceptor chains.

Now to generate Interceptors themselves.

(def gen-interceptor
  (gen/let [enter gen-integer-fn
            leave gen-integer-fn]
    {:enter enter
     :leave leave}))

And the test:

(defspec chain-associative
  1000
  (prop/for-all [a gen-interceptor
                 b gen-interceptor
                 c gen-interceptor
                 x gen/int]
    (let [x (mod x 100)
          i1 (chain a (chain b c))
          i2 (chain (chain a b) c)]
      (= (run i1 x) (run i2 x)))))

(Note: all of this testing is done using clojure.test.check.)

Converting to Vectors

I simplified the model of Interceptors to get a first approximation. The function composition version really feels nice to me. However, it's missing a lot of stuff. I did try to add error handling and asynchrony to the function composition model, but it was really complicated and I wasn't sure how to add some features. I won't go over that here. It was an interesting exercise but ultimately I think it's a dead end.

What I decided to do instead was to keep the Interceptors separate, make the run function a little smarter. Instead of composing the functions, the run function loops through the Interceptors first from begining to end, running the :enter functions, then back again to the begining running the :leave functions.

I still need chain to be a monoid. I shouldn't have to change the test. There's a nice existing monoid that will serve our purpose well. It's Clojure's very own Vector. If we keep the Interceptors in a Vector, we can define chain in terms of Vector concatenation, which is also a monoid.

Let's go the full monty and define chain using Clojure's monoid pattern. But first, I want to define an Interceptor a little differently. Now, an Interceptor is a Vector of zero or more maps. Each map can have :enter and :leave functions. We'll get to :error in a bit.

Because we're now defining Interceptors as a Vector, I will define a function normalize that will convert different stuff to that format.

(defn normalize [i]
  (cond
    (nil? i)
    []

    (= {} i)
    []

    (seq? i)
    (vec i)

    (vector? i)
    i

    (fn? i)
    [{:enter i}]

    (map? i)
    [i]))

That should be pretty straightforward. The notable things are that the empty Vector is the identity of chain, and that functions alone are converted into :enter functions.

Now chain has the full monoid definition, which includes all arities.

(defn chain
  ([]
   [])
  ([i1]
   (normalize i1))
  ([i1 i2]
   (let [i1 (normalize i1)
         i2 (normalize i2)]
     (into i1 i2)))
  ([i1 i2 & is]
   (apply chain (chain i1 i2) is)))

I'm not sure that's the most performant possible version, but I'm not concerned about performance here. You build the chain once and run it many times. run will need to be performant.

run is an interesting situation now. If I were in a different Lisp, I'd have tail call elimination, and I'd just jump between functions in tail position with almost zero cost.

But Clojure doesn't have tail call elimination. So I'll use two functions, one for calling :enter functions and one for calling :leave functions. While it's moving in one direction, it can just recur in a tight loop. Once it reaches the end of the :enter side, it defers to the other function that handles the :leave functions.

(defn run-enter [i v idx]
  (if (contains? i idx)
    (let [e (:enter (get i idx) identity)]
      (recur i (e v) (inc idx)))
    (run-leave i v (dec idx))))
(defn run-leave [i v idx]
  (if (contains? i idx)
    (let [l (:leave (get i idx) identity)]
      (recur i (l v) (dec idx)))
    v))

These look pretty similar to each other. The difference is that run-enter defers to run-leave when it runs off the end. And run-leave returns the value when it runs off the beginning.

Now, run just defers to run-enter:

(defn run [i v]
  (run-enter i v 0))

Rerun the tests and they pass.

Adding error handling

One of the important use cases of Middleware is to wrap a handler in a try/catch and handle errors. We want to enable that with our Interceptors model.

We augment the Interceptor definition. Now it's a Vector of zero-or-more maps, where each map has an :enter, :leave, and :error function (all functions are optional).

Errors act differently depending on if you're in the :enter or :leave side. If you're entering, if an :enter function throws a Throwable (or returns a Throwable), we immediately begin the :leave side with the Throwable as the value.

(defn run-enter [i v idx]
  (if (contains? i idx)
    (let [e (:enter (get i idx) identity)
          v' (try
               (e v)
               (catch Throwable t
                 t))]
      (if (instance? Throwable v')
        (run-leave i v' idx)
        (recur i v' (inc idx))))
    (run-leave i v (dec idx))))

When we're leaving, if we have a Throwable, we call the :error function. Otherwise, we call the :leave function. That means that at any point, the :error function can return a non-Throwable value and it will go back to the :leave functions.

(defn run-leave [i v idx]
  (if (contains? i idx)
    (let [status (if (instance? Throwable v) :error :leave)
          l (get-in i [idx status] identity)
          v' (try
               (l v)
               (catch Throwable t
                 t))]
      (recur i v' (dec idx)))
    v))

The tests run, but we aren't testing the error path. We need an :enter function to throw. This test sets up a chain where the last Interceptor in the chain throws an error. The first Interceptor has an :error function that returns its own error. We can test that we get that error out at the end.

(defn constantly-error [_]
  (throw (ex-info "Error" {})))

(defspec chain-errors
  1000
  (prop/for-all [i (gen/vector gen-interceptor)
                 x gen/int]
    (let [t (ex-info "My error" {})
          i (chain {:error (constantly t)}
                   i
                   {:enter constantly-error})
          x (mod x 100)]
      (identical? t (run i x)))))

We also want to test what happens when we don't catch the error.

(defspec chain-error-not-caught
  1000
  (prop/for-all [i (gen/vector gen-interceptor)
                 x gen/int]
    (let [i (chain {:enter constantly-error} i)
          x (mod x 100)]
      (instance? Throwable (run i x)))))

We could really test this a lot, but I'll skip over that.

Adding async

Another thing that Pedestal has is asynchrony. If your Interceptors return a core.async channel, Pedestal's machinery will consider that an async Interceptor.

I would rather use Manifold, which is a cool abstraction for asynchronous values. That means that in my Interceptors, if your function returns something Manifold understands as a deferrable thing, the Interceptor chain will be considered an async chain. This change is not too difficult:

(defn run-leave [i v idx]
  (if (contains? i idx)
    (let [status (if (instance? Throwable v) :error :leave)
          l (get-in i [idx status] identity)
          v' (try
               (l v)
               (catch Throwable t
                 t))]
      (if (d/deferrable? v')
        (-> (d/->deferred v')
            (d/chain #(run-leave i % (dec idx)))
            (d/catch #(run-leave i % (dec idx))))
        (recur i v' (dec idx))))
    v))

(defn run-enter [i v idx]
  (if (contains? i idx)
    (let [e (:enter (get i idx) identity)
          v' (try
               (e v)
               (catch Throwable t
                 t))]
      (cond
        (instance? Throwable v')
        (run-leave i v' idx)

        (d/deferrable? v')
        (-> (d/->deferred v')
            (d/chain #(run-enter i % (inc idx))
            (d/catch #(run-leave i %      idx))))

        :else
        (recur i v' (inc idx))))
    (run-leave i v (dec idx))))

We can test this by making sure async chains return the same value as non-async chains.

(defn make-async [i]
  (-> i
      (update :enter #(comp defer-val %))
      (update :leave #(comp defer-val %))))

(defspec chain-async
  1000
  (prop/for-all [i (gen/not-empty (gen/vector gen-interceptor))
                 x gen/int]
    (let [ai (mapv make-async i)
          x (mod x 100)]
      (= (run i x) @(run ai x)))))

(defspec chain-async-one
  1000
  (prop/for-all [i (gen/not-empty (gen/vector gen-interceptor))
                 idx gen/int
                 x gen/int]
    (let [idx (mod idx (count i))
          ai (update i idx make-async)
          x (mod x 100)]
      (= (run i x) @(run ai x)))))

(defn constantly-error-async [_]
  (let [d (d/deferred)]
    (d/error! d (ex-info "Async error." {:async true}))
    d))

(defspec chain-error-async
  1000
  (prop/for-all [i (gen/vector gen-interceptor)]
    (let [e constantly-error-async]
      (:async (ex-data @(run (chain i e) 0))))))

Adding early return

A final thing that Middleware could do was to not call the handler. It could do something different, instead. I'm calling this "early return". What I want to happen is have some way for an :enterfunction to immediately begin the :leave phase instead of continuing to the next :enter function.

We'll take a page from the Clojure playbook and use reduced. reduced is used in reduce for early returns. It signals that the computation is over. reduce checks for it, then unpacks the reducedvalue. We'll do the same. Only :enter functions can do this.

(defn run-enter [i v idx]
  (if (contains? i idx)
    (let [e (:enter (get i idx) identity)
          v' (try
               (e v)
               (catch Throwable t
                 t))]
      (cond
        (instance? Throwable v')
        (run-leave i v' idx)

        (d/deferrable? v')
        (-> (d/->deferred v')
            (d/chain #(run-enter i % (inc idx))))

        (reduced? v')
        (run-leave i (unreduced v') idx)

        :else
        (recur i v' (inc idx))))
    (run-leave i v (dec idx))))

We can test this by saying that a chain of a should be equivalent to a early returning before b. It's clearer in code.

(defn make-reduced [i]
  (update i :enter #(comp reduced %)))

(defspec chain-early-exit
  1000
  (prop/for-all [a gen-interceptor
                 b gen-interceptor
                 x gen/int]
    (let [x (mod x 100)
          i1 (chain a)
          i2 (chain (make-reduced a) b)]
      (= (run i1 x) (run i2 x)))))

That's it! Check out the code here.

Conclusions

I'm happy with where this wound up. I've got a nice, concise implementation of Interceptors, ready for use in async web servers---though it's not specific to the web. I've also got a nice suite of tests. The model is not quite as formal as I'd like. Some of the tests seem a little too specific. I'd love for them to be a few simple properties that guarantee the behaviors I'm looking for. Right now they feel ad hoc. I probably spent too much time on this. But I really enjoyed it! I'm looking forward to hear what you have to think about it.