PurelyFunctional.tv Newsletter 397: unknown keys in maps

Issue 397 - September 28, 2020 · Archives · Subscribe

Clojure Tip 💡

unknown keys in maps

In the last issue, I hinted at the power that the constraint of limited knowledge gives you. [a] -> Int has few useful implementations compared to [Int] -> Int. Therefore, we are more sure of what the function does because it limits our knowledge of the type.

That is not to say that it limits it down to a single implementation, or that the type is all you need to know about a function to know its purpose. It is a relative statement: Limiting knowledge can improve certainty. It's an apparent contradiction.

Yet, when examined, it isn't really a contradiction at all. It merely talks about the level of generality of the function. Limiting knowledge increases generality.

We can illustrate this principle with a social example. I can say with certainty "Unknown Person X likes food". But about my friend, I can say "Bob likes gala apples". When I speak of an unknown person, I must speak in generalities. They like something, but I can't say anything specific.

This translates into being able to guess what they are going to say. If I ask someone "what does Person X like to eat?", I can guess pretty well that they can't say much more than "food". But if I ask Bob's friend "what does Bob like?", I can't guess as well, even if I know Bob. The friend can answer "Bob likes risotto" and "Bob likes lemon on broccoli." There are many specific statements that are true. Thus, I am less certain of the specific answer, just like I am less certain of the implementation of the function.

We use this relationship between knowledge and generality a lot in idiomatic Clojure code. A very common practice in Clojure is to write a function that accepts an "open" map. What do we mean by open?

An open map is a map where we know some of the keys and values, but we don't restrict other, unknown keys and values. For instance, I could have a function that computes the distance from the origin of a point:

(defn distance [point]
  (Math/sqrt (+ (square (:x point)) (square (:y point)))))

We expect point to contain :x and :y, but it will work if point has other keys as well. We could express this unknown as a type, like this:

{:x Number :y Number ...} -> Double

Note the ellipsis (...) indicating that the map is open. Like I said before, this is a very common pattern. It lets us write more general functions. Just like a indicates an unknown, the ... indicates an unknown. Both force our functions to be more general.

I think that this open map type is not possible to express in the standard Haskell type system. I am told languages like Elm and PureScript do allow this. (This shows the importance of design choices in a type system; it's not a simple matter of typed vs. untyped.)

Now, this leads to yet another common pattern in Clojure. When we are making data pipelines, we will augment and pass along the data given at each step. So let's write a function that adds the distance to the map.

(defn augment-distance [point]
  (assoc point :distance (distance point)))

We can now use augment-distance as a step in a threading macro. The pattern is this: accept an open map, add one or more keys to it, and return the result. There is a double unknown: we don't know what keys are passed to us and we don't know what keys we return. But we do know that the keys passed in are the same as the keys passed out (plus the extra keys we added).

How would you write the type for this? Let's make up a notation:

a:{:x Number :y Number ...} -> a:{:distance Double}

We are saying "the function takes an open map, called a with :x and :y keys (plus zero or more keys), and returns a with :distance key. I think this pattern is impossible to type in Haskell, but is likely possible in Elm and PureScript. To be clear, it's not just saying"I return a map with :distance." It says "I return all the keys I was passed, plus :distance."

Encapsulating volatility

David Parnas explained that module boundaries should be drawn to encapsulate volatility. The point of this pattern is to do just that. What is volatility? It is unexpected changes. We can look at two dimensions of changes to see what kinds of volatility this module is robust against.

1. Changes over time

As requirements change, we may have to add, remove, and change keys to this map over time. As we do, the augment-distance function is robust against almost all of those changes. The only changes that would affect it are the ones that remove or change :x or :y.

2. Changes over "space"

distance and augment-distance can be used in different places in the code. The context, and thus the keys in the open maps, will be different in different contexts. Thus, these functions are robust to changing the "space" where they are used. This is also more commonly known as "reusability".

Conclusion

Clojure code commonly uses open maps as both arguments and return values. The open maps specify that less is known about their actual keys. The limited knowledge increases the generality of the functions. Being more general protects them against changes over time. And it makes them more reusable in different contexts. It would be very nice if a type system for Clojure could express these two patterns.

Quarantine update 😷

I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.

I don't want to shame you or anybody that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.

Also, if you just want to subscribe for a paid membership, I have opened them back up for the moment. Register here.

Stay healthy. Wash your hands. Stay at home. Wear a mask. Take care of loved ones.

Clojure Challenge 🤔

Last week's challenge

Issue 396

There are so many ways to implement it!

Please do participate in the discussion at the submission links above. It's active and it's a great way to get comments on your code.

This week's challenge

Big change: Please submit your solutions as comments on the gist linked below.

Wolf, sheep, cabbage

There's a really common problem called "Wolf, sheep, cabbage". (There are other names for it as well). The problem goes like this:

You own a wolf, a sheep, and a cabbage. You need to cross a river and you only have a boat big enough for you plus one other passenger (your wolf, sheep, or cabbage). The trouble is, the animals are hungry. If you leave the wolf with the sheep, the wolf will eat it. If you leave the sheep with the cabbage, it will eat it. The wolf, being a carnivore, will not eat the cabbage. How do you move them safely to the other side with nothing being eaten?

It's a fun problem and I suggest you give it a try on paper if you don't know the solution. If you can't figure it out, search for it online and you'll find solutions.

Your task is to write a function that generates solutions to this puzzle, in the form of boat crossings. Each crossing should state:

  • the direction you are going
  • the passenger in your boat
    (defn wolf-sheep-cabbage []) => ([...
                                      {:direction :across
                                       :passenger :cabbage}
                                      {:direction :return
                                       :passenger :sheep}]
                                     ...)

Bonus

Write a function that validates a solution.

Please submit your solutions as comments to this gist. Discussion is welcome.

Big change: Please submit your solutions as comments on the gist linked above.

Rock on!
Eric Normand