PurelyFunctional.tv Newsletter 371: Chain map, filter, and reduce

Issue 371 - March 30, 2020

Clojure Tip 💡

Chain map, filter, and reduce

I've been thinking a lot lately about chaining map, filter, and reduce. After all, it's the main topic of an upcoming chapter in my book. The challenge is how to teach it.

In Clojure, we are very used to doing long chains of map, filter, and reduce (along with other sequence functions). But it's not so common in many languages. For instance, even though JavaScript has .map(), .filter(), and .reduce() in its standard library, it is much less common there to structure a computation as a long chain of them. JavaScripters might use each of those functions in isolation for simple tasks, but rarely in longer chains. It's one of the practices I am trying to address in the book.

In Clojure, we often think of these long chains of sequence operations as data transformation pipelines. A collection of data comes in one end, and the result of a calculation comes out the other end. In the middle, there are small steps that bring the data we have closer to the data we want.

Here's an example. Let's say we have some data on the staff at a local hospital.

(def hospital
  {:hospital "Hans Jopkins"
   :staff {1 {:first-name "John"
              :last-name  "Doe"
              :salary 40000
              :position :resident}
           2 {:first-name "Jane"
              :last-name "Deer"
              :salary 100000
              :position :attending}
           3 {:first-name "Sam"
              :last-name "Waterman"
              :salary 0
              :position :volunteer}
           ...
           }})

Now we want to figure out the average salary of attending doctors. We can break it down into steps.

(defn average-of-attendings []
  (let [staff (-> hospital :staff vals)]
    (->> staff
      (filter #(= :attending (:position %))))))

This first step will go through all of the staff and return all of the attendings.

Next step: we can pluck out all of the salaries. We are confident we only have attendings at this step, so we just map over them.

(defn average-of-attendings []
  (let [staff (-> hospital :staff vals)]
    (->> staff
      (filter #(= :attending (:position %)))
      (map :salary))))

Now we have only the salaries of the attendings. While filter removes items from the sequence, map transforms each item without removing any. In this case, map transformed the structure from a staff entity to a single number, the salary.
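To make the difference between the two steps concrete, here is a small sketch you can paste into a REPL. The `sample-staff` vector is a hypothetical stand-in for the hospital data, trimmed to three people.

```clojure
;; Hypothetical sample data: three staff members, one of them an attending.
(def sample-staff
  [{:first-name "John" :salary 40000  :position :resident}
   {:first-name "Jane" :salary 100000 :position :attending}
   {:first-name "Sam"  :salary 0      :position :volunteer}])

;; filter leaves each item unchanged but may return fewer of them.
(def attendings
  (filter #(= :attending (:position %)) sample-staff))

;; map returns exactly one result per item, but each result can have a new shape.
(def salaries
  (map :salary sample-staff))

(count attendings) ;=> 1
(count salaries)   ;=> 3
salaries           ;=> (40000 100000 0)
```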

Here's a question: do these steps need to be done in that particular order? Do we need to do the filter, then map? Or can we do these the other way around?

Final step: average. Well, average is actually two steps. The average is the sum divided by the count, which are two distinct operations. Let's first do the sum.

(defn average-of-attendings []
  (let [staff (-> hospital :staff vals)]
    (->> staff
      (filter #(= :attending (:position %)))
      (map :salary)
      (reduce + 0))))

That was easy! Summing is just adding them up, starting from 0. But now that we have the sum, how can we get the count? We need the count of only the attendings, so this requires restructuring the code.

(defn average-of-attendings []
  (let [staff (-> hospital :staff vals)
        attendings (->> staff
                     (filter #(= :attending (:position %))))
        sum (->> attendings
              (map :salary)
              (reduce + 0))]
    (/ sum (count attendings))))

Unfortunately, this breaks our nice chain. We have a couple of options to reinstate it. We'll go over a very straightforward way.

We could notice that, although the average function does have two steps (sum and divide by count), it is a generic function---it has nothing at all to do with doctors, only with numbers. So we could define it separately:

(defn average [nums]
  (/ (reduce + 0 nums) (count nums)))

Then we use it as the last step of the chain:

(defn average-of-attendings []
  (let [staff (-> hospital :staff vals)]
    (->> staff
      (filter #(= :attending (:position %)))
      (map :salary)
      (average))))

That works great. We also notice that we could collapse the two chains in that function into one:

(defn average-of-attendings []
  (->> hospital
    :staff
    vals
    (filter #(= :attending (:position %)))
    (map :salary)
    (average)))

And now the chain takes up the entire function. This is extremely common in Clojure code. And doing this average as a last step is also common.
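If you want to run the finished chain yourself, here is a minimal sketch. The hospital map above is elided with `...`, so this uses a small stand-in called `sample-hospital`; the fourth staff member and her salary are made up for illustration, and passing the hospital in as an argument is my own assumption to keep the example self-contained.

```clojure
;; Hypothetical stand-in for the elided hospital map.
;; Staff member 4 is invented so that there are two attendings to average.
(def sample-hospital
  {:hospital "Hans Jopkins"
   :staff {1 {:first-name "John" :last-name "Doe"      :salary 40000  :position :resident}
           2 {:first-name "Jane" :last-name "Deer"     :salary 100000 :position :attending}
           3 {:first-name "Sam"  :last-name "Waterman" :salary 0      :position :volunteer}
           4 {:first-name "Ana"  :last-name "Example"  :salary 120000 :position :attending}}})

;; Generic average: sum divided by count.
(defn average [nums]
  (/ (reduce + 0 nums) (count nums)))

;; The same chain as in the text, but taking the hospital as an argument.
(defn average-of-attendings [hospital]
  (->> hospital
       :staff
       vals
       (filter #(= :attending (:position %)))
       (map :salary)
       (average)))

(average-of-attendings sample-hospital) ;=> 110000
```

One thing to watch for: if nobody on the staff is an attending, average divides by a count of zero and throws an ArithmeticException, so you may want to decide what an empty average should mean before relying on it.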

I asked before whether we could do the map step before the filter step, and I don't think it's possible. If we do the map step first, we're only left with a salary. We don't know who the salary belonged to. We can't select only the attendings' salaries after. The map step loses information that we need at a later step.
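Here is a quick sketch of what goes wrong if you simply swap the two steps. The `sample-staff` data is hypothetical. After the map, each item is a bare number, and looking up :position on a number returns nil, so the filter matches nothing.

```clojure
;; Hypothetical sample data: three staff members, one of them an attending.
(def sample-staff
  [{:first-name "John" :salary 40000  :position :resident}
   {:first-name "Jane" :salary 100000 :position :attending}
   {:first-name "Sam"  :salary 0      :position :volunteer}])

;; map first: the positions are gone by the time filter runs.
(def map-first
  (->> sample-staff
       (map :salary)
       (filter #(= :attending (:position %)))))

(:position 100000) ;=> nil
map-first          ;=> ()
```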

We could do something else, like having the map step mark the items that a later step should omit. That might look something like this:

(defn average-of-attendings []
  (->> hospital
    :staff
    vals
    (map #(when (= :attending (:position %))
            (:salary %)))
    (filter some?)
    (average)))

But we're adding a conditional. The steps are not as simple and elegant as the other way. Part of the art of chaining is to order your steps so that each step is straightforward. I suggest practicing this art. Explore different orderings. Explore smaller and bigger steps. Push the limits. That's the way to get comfortable and develop your style.
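As one example of exploring a bigger step, here is a sketch that merges the map and the filter into a single step with keep from clojure.core, which applies a function to each item and drops the nil results. This is just an alternative to compare against, not a recommendation; the conditional is still there. The `sample-staff` data is hypothetical, and taking the staff as an argument is my own assumption to keep the example self-contained.

```clojure
;; Hypothetical sample data: two attendings and one resident.
(def sample-staff
  [{:first-name "John" :salary 40000  :position :resident}
   {:first-name "Jane" :salary 100000 :position :attending}
   {:first-name "Ana"  :salary 120000 :position :attending}])

;; Generic average: sum divided by count.
(defn average [nums]
  (/ (reduce + 0 nums) (count nums)))

;; keep = map + drop nils, so one step replaces map followed by (filter some?).
(defn average-of-attendings [staff]
  (->> staff
       (keep #(when (= :attending (:position %))
                (:salary %)))
       (average)))

(average-of-attendings sample-staff) ;=> 110000
```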

Quarantine update 😷

I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.

I don't want to shame you or anybody into thinking that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.

Stay healthy. Wash your hands. Stay at home. Take care of loved ones.

Clojure Challenge 🤔

Last week's challenge

The challenge in Issue 370 was to determine 3 hashtags from a news article headline. You can see them here.

You can leave comments on these submissions in the gist itself. Please leave comments! You can also hit the Subscribe button to keep abreast of the comments. We're all here to learn.

This week's challenge

Inverse Yoda

Talk like this, Yoda does. Translate back to normal, you must.

(ayoda "Talk like this, Yoda does.") ;=> "Yoda does talk like this."
(ayoda "Translate back to normal, you must") ;=> "You must translate back to normal."
(ayoda "Fun, Clojure is. Learn it, I will.") ;=> "Clojure is fun. Learn it, I will."
(ayoda "Do or do not. There is no try.") ;=> "Do or do not. There is no try."

Some notes:

  • Expect at most one comma in the sentence. If there is no comma, it doesn't need translation.
  • Try to handle capitals correctly.
  • There may be more than one sentence. Handle all of them.

Thanks to this site for the challenge idea.

As usual, please reply to this email and let me know what you tried. I'll collect them up and share them in the next issue. If you don't want me to share your submission, let me know.

Rock on!
Eric Normand