PurelyFunctional.tv Newsletter 324: Tip: Use tuples for local data, entities for global data

Written by Eric Normand. Published: April 29, 2019.

Eric Normand's Newsletter

Software design, functional programming, and software engineering practices

Over 5,000 subscribers

Issue 324 - April 29, 2019 · Archives · Subscribe

Clojure Tip 💡

use tuples for local data, entities for global data

Tuples and entities serve a common purpose. They are both used to group related data together. The difference between them is that tuples use positions to denote the pieces of data, and entities use names (keywords). For example, I could represent a name as:

;; Name entity
{:first-name "Douglas"
 :last-name "Engelbart"}

That would be an entity. It's a map with keyword keys. But I could just as easily do this:

["Douglas" "Engelbart"]

I know that the first element will be the first name, and the second will be the last name.

Tuples are short and concise. But the concision comes at the cost of 2 things: human readability and extensibility. Tuples are not as human readable. If you printed that out, and read it later, you may guess what the parts mean. But look at this tuple: [20.3 11.54]. What does that mean? Almost impossible. For this reason alone, you shouldn't use tuples for data that will spill out of a very local context. They lose their meaning once the context is gone.

They are also not extensible very easily. You can really only add to the end without messing up existing code. And once you start adding lots of stuff, you'll probably need to put nils in just to mark places that are missing.

Imagine this entity, which evolved a lot from the name entity to describe more about the person:

{:first-name "Douglas"
 :last-name "Englebart"
 :birthplace "Portland, OR"}

Guess what the equivalent tuple looks like. You can't! So I'll give it to you:

["Douglas" "Englebart" nil nil nil "Portland, OR"]

What are those three nils? They are the middle name (which we don't know for Doug), the date of birth (we don't know it), and the person id (from the database, that we don't care about). Those places in the tuple recount the history of the evolution of the tuple, which is not very meaningful for our program or for anything outside of our program that has no context. The entity has become more concise!

But there is a place for tuples. Tuples are great in local context. The tuple can evolve with the context. If you want to add a new element to the tuple, you can add it anywhere, because only a small amount of code has to change to work with it---that's what local means. That makes their concision even more powerful, because there's much less typing and naming.

Currently Recording 🎥

I am currently recording a course called Repl-Driven Development in Clojure. There are a several new lessons ready this week, including one re-recording of an existing lesson.

Guidelines for Repl-Driven Development contains 7 rules of thumb I use to help me maximize my flow and productivity.
In Ergonomics - Tools to write code I go over three tools to make the experience of writing code easier.
In Structural editing - Tools to write code I explain the four different structural editing modes and my recommendations. (Spoiler: choose Parinfer if you haven't gotten good with Paredit yet).
Dealing with errors talks about how to work with exceptions at the repl, and how to read stack traces.
Scientific debugging shows a method of debugging where you run small expressions at the repl to locate the bug.
println debugging talks about the tried-and-true debugging method of the ages.
Keeping code in sync deals with the problem of your repl and your code not agreeing. It's a common problem, and I give some practical advice for how to stay sane when things get crazy. This one went long because I go deep.
Batch vs. Interactive has been re-recorded. I wanted to emphasize the relationship to repl-driven development a little more clearly than I did. Even though computers are faster than they were in 1960, the idea of fast feedback is a much stronger goal in interactive (repl-driven) programming. Batch languages (almost every non-lisp) force you to re-run your whole program each time you want to change it. Lisps let you change little pieces. The result is better flow!

I've been getting lots of great advice from people watching it. Thanks! This is going to be a much better course because of it. If you watch it, please let me know how it can improve.

The course is available as of right now as part of an early access program (read serious discount). If you buy now, you'll receive updates to the course as new lessons come out. There are already 5.5 hours of video, and more coming. If I had to guess, I'd say 7-8 hours total when I'm done. But I can't be sure. That's what the discount is for.

Of course, members of PurelyFunctional.tv will get the course as part of their membership for no extra cost. Just another benefit of being a member.

Check out the course. The first lesson is free.

Functional Programming Media 🍿

I've been thinking a lot about the untimely death of Joe Armstrong, his legacy, and what I want my legacy to be. Of course, I thought of this interview he did with Alan Kay. Joe had an interesting perspective and did incredibly practical work. He's truly someone to learn from. But Alan is the real deal. He's going down in history.

Watch the video here (YouTube).

B

rain skill 😎

Study before you sleep. The last cognitively challenging thing you do will keep working in your mind while you sleep. That means you can do something easy like showering or doing the dishes between studying and sleep. But be careful: screens and other bright lights can disturb your sleep. Best not to study on a computer or tablet.

Clojure Puzzle 🤔

Last week's puzzle

The puzzle in Issue 323 was to consolidate ranges into a minimum form.

You can see the submissions here.

I got some good submissions, though I didn't test them thoroughly.

Once I started trying to solve it myself, I realized it was a tricking problem with some tough corner cases. When that I encounter something like that, I pull in the big guns: property-based testing.

First, my simple generators:

;; a random range (integers only)
(def gen-range (gen/tuple gen/int gen/int))
;; a random vector of ranges
(def gen-ranges (gen/vector gen-range))

I came up with two properties that I think capture the entirety of the solution:

1. Same union property

Consolidating the ranges should not change what numbers are contained in the union of the ranges. You should neither add nor remove.

(defn ints-in
  "Generate a set of all integers in the given range r."
  [r]
  (let [[a b] (normalize r)]
    (set (range a (inc b)))))

(defn union
  "Generate the set of all integers in all ranges rs."
  [rs]
  (set (mapcat ints-in rs)))

(def consolidate-union
  (prop/for-all [rs gen-ranges]
    (= (union rs)
       (union (consolidate rs)))))

That proves that you haven't messed things up, but you could just return the original input and satisfy this property. That brings us to #2.

2. No overlap property

After consolidation, a number should never be contained in two ranges. You want to make sure that the consolidation was complete.

(def consolidate-overlap
  (prop/for
-all [rs gen-ranges
                 x gen/int]
    (let [containing-ranges (filter #(contains? (ints-in %) x)
                                    (consolidate rs))]
      (or (= 0 (count containing-ranges))
          (= 1 (count containing-ranges))))))

That proves that you consolidated overlapping ranges.

Those two properties corresponded to the two fears I had: that I would miss consolidating two ranges, or that I would somehow consolidate two that I shouldn't. The properties gave me a safety net. Once I had those, then it was easy to focus on the implementation. I knew that I wouldn't miss any corner cases.

The implementation I went with was to sort the ranges, then run through them in order, trying to consolidate a range with its neighbor to the right.

(defn normalize [range]
  (vec (sort range)))

(defn consolidate*
  "Consolidate 2 normalized ranges."
  [a b]
  (let [[[a1 a2 :as a] [b1 b2 :as b]] (sort-by first [a b])]
    (if (>= a2 b1)
      [[a1 (max a2 b2)]]
      [a b])))

(defn consolidate [ranges]
  (loop [ranges (vec (sort-by first (map normalize ranges)))
         i 0]
    (if (not (contains? ranges (inc i)))
      ranges
      (let [c (consolidate* (get ranges i) (get ranges (inc i)))]
        (if (= 1 (count c))
          (recur (-> (subvec ranges 0 i)
                     (into c)
                     (into (subvec ranges (+ 2 i))))
                 i)
          (recur ranges (inc i)))))))

Property-based testing is awesome for this. Let me know if you're interested because I want to make a course about it.

This week's puzzle

Langton's ant (from Rosetta Code)

Langton's ant is a cellular automaton that models an ant sitting on a plane of cells, all of which are white initially, the ant facing in one of four directions.

Each cell can either be black or white.

The ant moves according to the color of the cell it is currently sitting in, with the following rules:

If the cell is black, it changes to white and the ant turns left; If the cell is white, it changes to black and the ant turns right;
The ant then moves forward to the next cell, and repeat from step 1.

This rather simple ruleset leads to an initially chaotic movement pattern, and after about 10,000 steps, a cycle appears where the ant moves steadily away from the starting location in a diagonal corridor about 10 cells wide. Conceptually the ant can then walk infinitely far away.

Model this problem and run it out to 10,000 steps. A 100x100 grid should be sufficient. You can stop if it goes outside of that grid, or if you hit 10,000 steps.

Bonus points for visualizing it (send me pictures!).

As usual, please send me your implementations. I'll share them all in next week's issue. If you send me one, but you don't want me to share it publicly, please let me know.

Rock on!
Eric Normand