What's the problem with using arrays for pizza toppings?

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

I discuss why arrays aren't great for representing pizza topping selection.


[00:00:00] What's the problem with using arrays for pizza toppings?

[00:00:08] Hello, my name is Eric Normand, and this is another episode of my podcast. Welcome.

[00:00:18] So we've been talking about using an array to hold the names of the pizza toppings for our pizza domain, our data model.

[00:00:38] Now, I've also mentioned that there are some problems with it. In the last episode, we went over the problem that collections are, in practice, unbounded: you can have any number of toppings in there, whereas your business probably wants to limit it. And I said use stratified design [00:01:00] to defer that decision to the business layer.

[00:01:03] But there's another problem with using arrays, which is that I lied. I said that the array perfectly matches the possible states, except that you can have more than the limit. And that's not true.

[00:01:26] So the real problem is: if I add mushrooms and then olives (so I have an array that's mushrooms, olives), is that really a different pizza from olives, mushrooms? And obviously if you built the pizza and you put the mushrooms on first and then the olives, or you put the olives on first and then the mushrooms, no one would say, "These are so different. This is not what I ordered." They would not return the pizza. [00:02:00] It doesn't make a difference.

[00:02:05] So the problem is we're representing the same state twice: a pizza with mushrooms and olives, and a pizza with olives and mushrooms. So we have two different data model representations, two different encodings that map to the same state in the conceptual model.
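A minimal sketch of that double representation, assuming Python (the episode names no language, and the variable names here are made up for illustration). Lists compare position by position, so the two orderings of one conceptual pizza come out unequal:

```python
# Two encodings of the same conceptual pizza, built in different orders.
pizza_a = ["mushrooms", "olives"]
pizza_b = ["olives", "mushrooms"]

# Lists compare element by element, so order matters.
same_encoding = pizza_a == pizza_b
print(same_encoding)  # False
```

The customer would call these the same pizza, but the data model does not.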

[00:02:24] Right? And so this is a problem with fit, like we talked about before. It's not that the data model maps to something that's not possible, that doesn't make sense. It's that two encodings map to the same thing.

[00:02:41] And that becomes a big problem when you start to ask questions like how many pizzas had mushrooms and olives? Well, you have to actually check both. And it gets worse because it's not just two. When you think [00:03:00] about it, what if you have three toppings? Well, now you have mushrooms, olives, and artichoke hearts. We also have artichoke hearts, mushrooms, and olives; mushrooms, artichoke hearts, olives. You have all these orderings, six of them for three toppings, and they all describe the same pizza.

[00:03:15] They all have mushrooms and olives, and so now your query has to check all of those, right? So it's multiplied the awkwardness of your query. It's not convenient to write anymore.
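To get a feel for how fast the redundant encodings pile up, here is a rough sketch, again assuming Python. The helper `array_encodings` and the sample `orders` list are hypothetical, invented only for this illustration:

```python
from itertools import permutations

def array_encodings(toppings):
    """Return every distinct ordering of the given toppings (hypothetical helper)."""
    return sorted(set(permutations(toppings)))

two = array_encodings(["mushrooms", "olives"])
three = array_encodings(["mushrooms", "olives", "artichoke hearts"])

print(len(two))    # 2
print(len(three))  # 6

# An exact-match query against stored orders has to try each encoding.
orders = [
    ["olives", "mushrooms"],
    ["mushrooms", "olives"],
    ["olives", "artichoke hearts"],
]
matches = sum(1 for order in orders if tuple(order) in two)
print(matches)  # 2
```

The number of orderings grows factorially with the number of distinct toppings, so every extra topping makes an order-sensitive comparison more awkward.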

[00:03:31] States in your data model that are meaningless, and also states that are redundant, both cause your code to be messier, worse, and wordier. There's just more to write, there's more checks.

[00:03:49] So you wanna avoid these scenarios. So how do we avoid it? You go back to fit. How do we more [00:04:00] precisely capture the states? If olive and mushroom is the same as mushroom and olive, it means we need a collection that doesn't count order.

[00:04:13] The two main ones that don't count order are set and map. Okay, let's look at set. We can choose a set, but then what if someone wants double mushrooms? So they want mushrooms and mushrooms? That's possible in our conceptual model. It was also possible in arrays, but with set, you can't have duplicates.
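A small sketch of that trade-off, assuming Python's built-in `set` (the variable names are illustrative only). Order stops mattering, but the second serving of mushrooms disappears:

```python
# Order no longer matters with a set...
order_independent = {"mushrooms", "olives"} == {"olives", "mushrooms"}
print(order_independent)  # True

# ...but a set cannot hold duplicates, so "double mushrooms" is lost.
toppings = set()
toppings.add("mushrooms")
toppings.add("mushrooms")  # the customer asked for a second serving
print(len(toppings))  # 1
```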

[00:04:38] So let's just cut that one out. Now with hash maps, you can't have duplicate keys, but you can represent the count as the value. So you can have something like mushrooms: two as [00:05:00] the key-value pair in your map to represent the fact that you want two servings of mushroom toppings on your pizza.

[00:05:10] So then we have a few rules we have to put in. Well, if it's not in the map, if there's no key for that topping in the map, then it means zero. Like that's the default. And then we have to have a rule, like before, that the sum of all the values is three or fewer. That we would put in the business domain.
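Those rules could be sketched roughly like this, assuming Python dicts as the map. The function names and the `MAX_TOPPINGS` constant are hypothetical; the limit of three is the one mentioned in the episode:

```python
MAX_TOPPINGS = 3  # assumed business limit ("three or fewer")

def add_topping(toppings, name):
    """Return a new map with one more serving of `name`."""
    updated = dict(toppings)
    updated[name] = updated.get(name, 0) + 1
    return updated

def servings(toppings, name):
    """A missing key means zero servings of that topping."""
    return toppings.get(name, 0)

def within_limit(toppings):
    """Business-layer rule: total servings must not exceed the limit."""
    return sum(toppings.values()) <= MAX_TOPPINGS

pizza = add_topping(add_topping({}, "mushrooms"), "mushrooms")
print(pizza)                         # {'mushrooms': 2}
print(servings(pizza, "olives"))     # 0
print(within_limit(pizza))           # True

too_many = add_topping(add_topping(pizza, "olives"), "olives")
print(within_limit(too_many))        # False
```

Keeping `within_limit` separate from `add_topping` is one way to leave the limit as a business-layer decision rather than baking it into the data structure.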

[00:05:34] So that's lucky that we can find this representation because now what happens is our map has equality semantics, right? So no matter what order we put stuff in, if I have mushrooms: one, olives: one, those toppings maps will be equal, whether I put [00:06:00] them in mushrooms first or olives first, right? So equality, it turns out, is actually a really useful operation because it forces you to make sure that different ways of generating the same pizza (the different orders of adding toppings) result in the same representation of the pizza.
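Here is a quick check of that equality property, assuming Python, where dict equality ignores insertion order. The `add_topping` helper is a hypothetical one written for this sketch:

```python
def add_topping(toppings, name):
    """Return a new map with one more serving of `name` (hypothetical helper)."""
    updated = dict(toppings)
    updated[name] = updated.get(name, 0) + 1
    return updated

mushrooms_first = add_topping(add_topping({}, "mushrooms"), "olives")
olives_first = add_topping(add_topping({}, "olives"), "mushrooms")

# Dict equality compares keys and values, not insertion order.
print(mushrooms_first == olives_first)  # True
```

Not every language gives maps this kind of value equality out of the box, so it is worth checking what your own language does before relying on it.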

[00:06:23] That's hinting that equality is important. It's sort of always there and implied. Secondarily, it's pointing at the level three, which is algebraic, because if you don't have equality, it's really hard to define algebraic properties.

[00:06:41] All right. That's about all I have to say about this topic. My name is Eric Normand. This has been another episode of my podcast. Thank you for listening, and as always, rock on!