PurelyFunctional.tv Newsletter 397: unknown keys in maps
Issue 397 - September 28, 2020 · Archives · Subscribe
Clojure Tip 💡
unknown keys in maps
In the last issue, I hinted at the power that the constraint of limited
knowledge gives you. [a] -> Int
has few useful implementations
compared to [Int] -> Int
. Therefore, we are more sure of what the
function does because it limits our knowledge of the type.
That is not to say that it limits it down to a single implementation, or that the type is all you need to know about a function to know its purpose. It is a relative statement: Limiting knowledge can improve certainty. It's an apparent contradiction.
Yet, when examined, it isn't really a contradiction at all. It merely talks about the level of generality of the function. Limiting knowledge increases generality.
We can illustrate this principle with a social example. I can say with certainty "Unknown Person X likes food". But about my friend, I can say "Bob likes gala apples". When I speak of an unknown person, I must speak in generalities. They like something, but I can't say anything specific.
This translates into being able to guess what they are going to say. If I ask someone "what does Person X like to eat?", I can guess pretty well that they can't say much more than "food". But if I ask Bob's friend "what does Bob like?", I can't guess as well, even if I know Bob. The friend can answer "Bob likes risotto" and "Bob likes lemon on broccoli." There are many specific statements that are true. Thus, I am less certain of the specific answer, just like I am less certain of the implementation of the function.
We use this relationship between knowledge and generality a lot in idiomatic Clojure code. A very common practice in Clojure is to write a function that accepts an "open" map. What do we mean by open?
An open map is a map where we know some of the keys and values, but we don't restrict other, unknown keys and values. For instance, I could have a function that computes the distance from the origin of a point:
(defn distance [point]
(Math/sqrt (+ (square (:x point)) (square (:y point)))))
We expect point
to contain :x
and :y
, but it will work if point
has other keys as well. We could express this unknown as a type, like
this:
{:x Number :y Number ...} -> Double
Note the ellipsis (...) indicating that the map is open. Like I said
before, this is a very common pattern. It lets us write more general
functions. Just like a
indicates an unknown, the ...
indicates an
unknown. Both force our functions to be more general.
I think that this open map type is not possible to express in the standard Haskell type system. I am told languages like Elm and PureScript do allow this. (This shows the importance of design choices in a type system; it's not a simple matter of typed vs. untyped.)
Now, this leads to yet another common pattern in Clojure. When we are making data pipelines, we will augment and pass along the data given at each step. So let's write a function that adds the distance to the map.
(defn augment-distance [point]
(assoc point :distance (distance point)))
We can now use augment-distance
as a step in a threading macro. The
pattern is this: accept an open map, add one or more keys to it, and
return the result. There is a double unknown: we don't know what keys
are passed to us and we don't know what keys we return. But we do know
that the keys passed in are the same as the keys passed out (plus the
extra keys we added).
How would you write the type for this? Let's make up a notation:
a:{:x Number :y Number ...} -> a:{:distance Double}
We are saying "the function takes an open map, called a
with :x
and
:y
keys (plus zero or more keys), and returns a
with :distance
key. I think this pattern is impossible to type in Haskell, but is
likely possible in Elm and PureScript. To be clear, it's not just
saying"I return a map with :distance
." It says "I return all the keys
I was passed, plus :distance
."
Encapsulating volatility
David Parnas explained that module boundaries should be drawn to encapsulate volatility. The point of this pattern is to do just that. What is volatility? It is unexpected changes. We can look at two dimensions of changes to see what kinds of volatility this module is robust against.
1. Changes over time
As requirements change, we may have to add, remove, and change keys to
this map over time. As we do, the augment-distance
function is robust
against almost all of those changes. The only changes that would affect
it are the ones that remove or change :x
or :y
.
2. Changes over "space"
distance
and augment-distance
can be used in different places in the
code. The context, and thus the keys in the open maps, will be different
in different contexts. Thus, these functions are robust to changing the
"space" where they are used. This is also more commonly known as
"reusability".
Conclusion
Clojure code commonly uses open maps as both arguments and return values. The open maps specify that less is known about their actual keys. The limited knowledge increases the generality of the functions. Being more general protects them against changes over time. And it makes them more reusable in different contexts. It would be very nice if a type system for Clojure could express these two patterns.
Quarantine update 😷
I know a lot of people are going through tougher times than I am. If you, for any reason, can't afford my courses, and you think the courses will help you, please hit reply and I will set you up. It's a small gesture I can make, but it might help.
I don't want to shame you or anybody that we should be using this time to work on our skills. The number one priority is your health and safety. I know I haven't been able to work very much, let alone learn some new skill. But if learning Clojure is important to you, and you can't afford it, just hit reply and I'll set you up. Keeping busy can keep us sane.
Also, if you just want to subscribe for a paid membership, I have opened them back up for the moment. Register here.
Stay healthy. Wash your hands. Stay at home. Wear a mask. Take care of loved ones.
Clojure Challenge 🤔
Last week's challenge
- Set game - submissions
There are so many ways to implement it!
Please do participate in the discussion at the submission links above. It's active and it's a great way to get comments on your code.
This week's challenge
Big change: Please submit your solutions as comments on the gist linked below.
Wolf, sheep, cabbage
There's a really common problem called "Wolf, sheep, cabbage". (There are other names for it as well). The problem goes like this:
You own a wolf, a sheep, and a cabbage. You need to cross a river and you only have a boat big enough for you plus one other passenger (your wolf, sheep, or cabbage). The trouble is, the animals are hungry. If you leave the wolf with the sheep, the wolf will eat it. If you leave the sheep with the cabbage, it will eat it. The wolf, being a carnivore, will not eat the cabbage. How do you move them safely to the other side with nothing being eaten?
It's a fun problem and I suggest you give it a try on paper if you don't know the solution. If you can't figure it out, search for it online and you'll find solutions.
Your task is to write a function that generates solutions to this puzzle, in the form of boat crossings. Each crossing should state:
- the direction you are going
- the passenger in your boat
(defn wolf-sheep-cabbage []) => ([...
{:direction :across
:passenger :cabbage}
{:direction :return
:passenger :sheep}]
...)
Bonus
Write a function that validates a solution.
Please submit your solutions as comments to this gist. Discussion is welcome.
Big change: Please submit your solutions as comments on the gist linked above.
Rock on!
Eric Normand