PurelyFunctional.tv Newsletter 367: What about errors?

Issue 367 - March 02, 2020

Clojure Tip 💡

What about errors?

So it turns out CSV parsing is not as simple as it first appears. Last week I showed this code for converting the vectors of strings into a sequence of maps:

(defn rows->maps [csv]
  (let [headers (map keyword (first csv))
        rows (rest csv)]
    (map #(zipmap headers %) rows)))

I wouldn't expect a beginner to write this themselves since it contains a handful of difficult ideas. Last week I suggested that it could be a learning moment for a beginner. It can introduce those ideas and the more general philosophy of transforming data.

Astute reader Alan pointed out that it doesn't have any error reporting. That is, if one of the rows has the wrong number of columns, it will still return a value, even though there is clearly an error in the CSV. That makes it even harder for beginners.
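To make Alan's point concrete, here is a small sketch that runs the same function on a made-up CSV with one short row and one long row. The sample data is invented for illustration. zipmap stops as soon as either the keys or the values run out, so both bad rows produce a map without any complaint:

```clojure
(defn rows->maps [csv]
  (let [headers (map keyword (first csv))
        rows (rest csv)]
    (map #(zipmap headers %) rows)))

;; Hypothetical sample data: the last two rows have the wrong number of columns.
(def sample
  [["name" "age"]
   ["Ann" "34"]
   ["Bob"]                ; missing a column
   ["Cy" "51" "extra"]])  ; one column too many

(rows->maps sample)
;; => ({:name "Ann", :age "34"} {:name "Bob"} {:name "Cy", :age "51"})
```

The short row silently loses its :age key and the long row silently loses its extra value.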

Alan didn't suggest this, but I thought it highlights another wrinkle in the ongoing question: would it be better for beginners for the library to handle this aspect of parsing? What would a correct solution even look like?

Again, I don't know. Answers to these questions form the design decisions that make for a good or bad library, given the task at hand. It's hard to speak of them in the abstract. However, I can say this: errors are hard.

Errors are a hard design problem. For instance, I've worked with large CSV files that I processed lazily. After 1,000,000 rows and hours of work, I'd hate for it to crash on a row just because there was a misplaced comma. I'd much rather it stash the bad row somewhere for me to take a look at later. But maybe for short CSVs, you'd want it to just fail when you parse it if any expectations about the data are not met.
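Those two preferences can be sketched side by side. This is only one possible shape, not a recommendation. The names check-row, rows->tagged, and rows->maps-strict are made up for this example, and the only check is the column count:

```clojure
(defn check-row
  "Tags a data row as :ok or :bad by comparing its column count to the headers."
  [headers row-number row]
  (if (= (count headers) (count row))
    {:status :ok :row-number row-number :value (zipmap headers row)}
    {:status :bad :row-number row-number :row row}))

(defn rows->tagged
  "Lazy and lenient: never throws, so a bad row late in a big file
  cannot crash the run. The caller sorts good from bad afterwards."
  [csv]
  (let [headers (map keyword (first csv))]
    (map-indexed (fn [i row] (check-row headers (inc i) row))
                 (rest csv))))

(defn rows->maps-strict
  "Eager and strict: throws on the first bad row. mapv forces the
  whole input, so the failure happens at parse time."
  [csv]
  (mapv (fn [{:keys [status value] :as result}]
          (if (= :ok status)
            value
            (throw (ex-info "Wrong number of columns" result))))
        (rows->tagged csv)))

;; Hypothetical sample data.
(def sample
  [["name" "age"]
   ["Ann" "34"]
   ["Bob"]])

;; Stash the bad rows to look at later:
(filter #(= :bad (:status %)) (rows->tagged sample))
;; => ({:status :bad, :row-number 2, :row ["Bob"]})
```

Even this tiny version has to decide what counts as an error and what to report about it, which is the kind of decision that depends on context.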

That's the thing: you need a whole theory of errors and error handling for this to be useful. It would be great if the theory could be enshrined in the CSV library. Does such a theory exist? Is it the right one? What problems does it solve and what does it create?

It reminds me of many libraries I've used in the past where you needed to read lots of documentation and code just to understand what the errors meant. This happened to me frequently in the Haskell ecosystem, where there doesn't seem to be one error system that satisfies everyone. People come up with their own error types that have to be understood. I'm glad we just accept exceptions as the flawed yet default standard. I would sure hate to pass off learning a whole new error system to beginners.

One thing a CSV library could do is aggregate different error handling strategies. Each strategy could be enshrined somehow in code. Then it becomes the library's job to help the programmer figure out which strategy they should use. But notice again the philosophy of pulling things apart. No pretenses that any of the strategies are correct for all contexts.
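As a rough sketch of what "strategies enshrined in code" might look like, each strategy below is a plain function that decides what to do with a bad row, and the caller picks one. Every name here (rows->maps-with, throw-on-bad-row, skip-bad-row, pad-bad-row) is hypothetical and not taken from any existing CSV library:

```clojure
(defn throw-on-bad-row
  "Strategy: stop immediately."
  [headers row]
  (throw (ex-info "Wrong number of columns" {:headers headers :row row})))

(defn skip-bad-row
  "Strategy: drop the row. Returning nil means 'emit nothing'."
  [_headers _row]
  nil)

(defn pad-bad-row
  "Strategy: fill missing columns with nil and ignore extra ones."
  [headers row]
  (zipmap headers (concat row (repeat nil))))

(defn rows->maps-with
  "Like rows->maps, but hands rows with the wrong column count to on-bad-row.
  keep drops any nil the strategy returns."
  [on-bad-row csv]
  (let [headers (map keyword (first csv))]
    (keep (fn [row]
            (if (= (count headers) (count row))
              (zipmap headers row)
              (on-bad-row headers row)))
          (rest csv))))

;; Hypothetical sample data.
(def sample
  [["name" "age"]
   ["Ann" "34"]
   ["Bob"]])

(rows->maps-with skip-bad-row sample)
;; => ({:name "Ann", :age "34"})

(rows->maps-with pad-bad-row sample)
;; => ({:name "Ann", :age "34"} {:name "Bob", :age nil})
```

None of the three is right for every context. The sketch only pulls the decision apart from the parsing.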

So, where does that leave us? I don't know. I still have a bias toward doing as little as possible and passing the complexity back to the programmer. A library should present a simpler interface than the problem it solves, otherwise the library is not useful. The error handling would have to be simpler than what I can do on my own, with all the domain knowledge and context that I have. Do the "beginner-friendly" libraries out there actually achieve this? Or are they just agglomerations of ad hoc rules that only seem simple because you can google a solution on Stack Overflow? Is that what "good for beginners" actually means? These are questions I wrestle with all the time, and I don't think I'll find an answer any time soon.

State of Clojure 2020 Results 📋

The results are in! Check them out.

Book update 📖

I've done a significant revision to the first six chapters of Grokking Simplicity. I submitted them to the publisher, they've been approved, and the gears are turning to have those released to the public. Stay tuned.

This revision includes 19 new pages, significant updates and clarifications, better code diff formatting, and a bunch of typos fixed. It also splits chapter 6, which was long, in two (chapters 6 and 7).

You can buy the book and use the coupon code TSSIMPLICITY for 50% off.

Chapters 8 & 9 are next. Those are the vaunted stratified design chapters that mark the end of Part 1. They are already written. They just need a bit of layout polishing before I can submit them. I am very excited.

Clojure Challenge 🤔

Last week's challenge

The challenge in Issue 366 was to calculate the combinations (as in 52-choose-5). There were many submissions. You can see them here.

You can leave comments on these submissions in the gist itself. Please leave comments! You can also hit the Subscribe button to keep abreast of the comments. We're all here to learn.

This week's challenge

Caesar's Cipher

Julius Caesar had many secrets. He used a simple form of encryption to hide his secrets from his enemies. The cipher was easy: shift the alphabet over by a number of letters and replace each letter with the shifted version.

For instance, if you shift the alphabet by two, you get:

a b c d e f g h i j k l m n o p q r s t u v w x y z
c d e f g h i j k l m n o p q r s t u v w x y z a b

When encrypting a string, we replace instances of the top letter with the lower letter. m becomes o. And notice that the alphabet wraps around.
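The wrap-around is the only tricky part, so here is a minimal sketch of it for a single lowercase letter. It is not a full answer to the challenge: it ignores capitals, non-letters, and whole strings. The name shift-lower is made up for this example.

```clojure
(defn shift-lower
  "Shifts one lowercase ASCII letter by n places, wrapping around past z."
  [n c]
  (let [a (int \a)
        offset (- (int c) a)]           ; position in the alphabet, 0 to 25
    (char (+ a (mod (+ offset n) 26)))))

(shift-lower 2 \m) ;; => \o
(shift-lower 2 \y) ;; => \a
```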

This was cutting edge technology back in those times! You can get this as a prize in a Cracker Jack box :)

Anyway, write a function that will encrypt a string. It should take the shift number and the string. Leave non-letters alone and keep caps in place. You only have to deal with regular ASCII letters for simplicity.

Bonus: make it so you can decrypt by passing in a negative number.

Thanks to this site for the idea.

As usual, please reply to this email and let me know what you tried. I'll collect them up and share them in the next issue. If you don't want me to share your submission, let me know.

Rock on!
Eric Normand