← Contents · Runnable Specifications by Eric Normand · Work in progress · Comments
Chapter 1
Data Lens Part 1
Chapter objectives
Learn to analyze the structure of a domain and en-
code it in your language.
Learn to evaluate data models based on fit.
Understand how domain models are constructed by
abstraction, encoding, and feedback.
This chapter presents a challenge to me as an author: I’ve
got to reteach something most programmers already do
intuitively without making the topic seem obvious and
boring.
You probably have an intuitive sense of how to model
the data of a domain. You likely do it every day. But some-
times we need to relearn the skills we rely on at a deeper
level so we can build on top of the new understanding.
The data lens is all about encoding the relationships
we find in our domain using features of our language that
have the same structure.
[Figure: by the end of this chapter, you'll understand these three diagrams. One shows the loop between domain, model, and code: 1. abstract, 2. encode, 3. evaluate, 4. look anew. One shows the "one of" alternative: Super, Mega, Galactic in the model, encoded as "super", "mega", "galactic".]
Welcome to MegaBuzz!
MegaBuzz is the premier fast-food coffee shop. We pride
ourselves on giant servings, coffee that needs milk to taste
good, and whatever avors you need to feel special.
Our barristas have been doing a good job, but were
growing like crazy. We need help from some software.
Can you help us design the data model?
Each coffee consists of one of three sizes, one of three
roasts, and optional add-ins.
Coffee model
Sizes: Super, Mega, Galactic
Roasts: Raw, Burnt, Charcoal
Add-ins: Soymilk, Espresso, Almond, Chocolate, Hazelnut

Coffee encoding: an example coffee in JSON
{
  "size": "super",
  "roast": "burnt",
  "addIns": ["espresso", "soy"]
}
Each coffee encodes the choices of the customer. The JSON
above represents one possible coffee. It may seem
obvious how to encode a coffee as that JSON, but let’s dive
deep into the process to really understand it.
Encoding the size of a coffee
We’ll take the encoding one piece at a time. Let’s start with
the size.
In our model, to choose a size, we have to select one
among the three different sizes. This structure comes up
so much, we can give it a name. We’ll call it alternative. Al-
ternatives mean choosing one from a set of options.
We want to choose an encoding that has the same structure
as our model. In this case, we want to preserve the
“one of” structure that characterizes alternatives.
We chose to encode the size by representing each choice
as a string, then representing the “one of” structure as a
TypeScript union type. It means that a Size has to be one
of those three strings.
[Figure: the “one of” alternative in the model (Super, Mega, Galactic) is encoded as a union type in code ("super", "mega", "galactic").]

type Size = "super" |
            "mega" |
            "galactic";

This is a TypeScript type declaration. The union type is indicated by |.
Note
I’m using TypeScript types when the notation is very
clear. Please don’t take it to mean that domain modeling
can be done only with TypeScript or only with
static types. It’s just a convenient way to express the
encoding.
Encoding the roast of a coffee
We encode the roast in a similar way to the size. It is a
choice of one roast among many options, so it too is an
alternative.
We choose to encode the roast as a union type of three
strings. Each string corresponds to one of the choices. And
the union type maintains the “one of” structure from the
model. Since the structure between the roasts is the same
as the structure between the sizes (“one of”), it makes
sense to use the same encoding.
[Figure: the “one of” alternative in the model (Raw, Burnt, Charcoal) is encoded as a union type in code ("raw", "burnt", "charcoal").]

type Roast = "raw" |
             "burnt" |
             "charcoal";
[Figure: each AddIn is an alternative (“one of”), and the add-ins together form a collection (“zero or more”).]

type AddIn = "soy" |
             "espresso" |
             "hazelnut" |
             "chocolate" |
             "almond";
type AddIns = AddIn[];
Encoding the add-ins of a coffee
The add-ins have a different structure from alternatives.
First of all, you don’t choose one. You can choose multiple.
You can also repeat the same add-in, such as two espresso
shots. We’ll break down the structure into two parts:
1. Choosing the add-in
2. Collecting them together
We must choose each add-in in the collection, which is
very much an alternative. Each one is one of five choices.
Then when we collect them together, there is a zero-or-more
structure to the add-ins. We will call this kind of
structure a collection.
[Figure: the alternative in the model (Soymilk, Espresso, Almond, Chocolate, Hazelnut) is encoded as a union type in code ("soy", "espresso", "almond", "chocolate", "hazelnut"). The collection in the model is encoded as an array in code.]
We’ve chosen to encode the collection of add-ins as an array.
There are many choices we could make since there are
many types of collections. We’ll revisit this choice soon.
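
As a small illustration of the zero-or-more structure, here is a sketch of what the array encoding permits. The helper `countAddIn` is hypothetical, introduced only for this example. Note that an array also keeps its elements in order, which the model has not asked for.

```typescript
type AddIn = "soy" | "espresso" | "hazelnut" | "chocolate" | "almond";
type AddIns = AddIn[];

// Zero add-ins: the empty array is a legal collection.
const plain: AddIns = [];

// Repeats are allowed: two espresso shots are just "espresso" twice.
const doubleShot: AddIns = ["espresso", "espresso", "soy"];

// Hypothetical helper (not from the text): count how often one add-in appears.
function countAddIn(addIns: AddIns, wanted: AddIn): number {
  return addIns.filter((addIn) => addIn === wanted).length;
}

console.log(countAddIn(plain, "espresso"));      // 0
console.log(countAddIn(doubleShot, "espresso")); // 2
```

A value such as `["vanilla"]` would be rejected by the type checker, because each element must still be one of the five add-ins.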
[Figure: the “all of” combination of Size, Roast, and AddIn[]. These are the types we just defined.]

type Coffee = {
  size  : Size;
  roast : Roast;
  addIns: AddIn[];
};

Here’s an example coffee encoded in this way:

{
  "size"  : "super",
  "roast" : "burnt",
  "addIns": [
    "soy",
    "espresso"
  ]
}
Encoding the whole coffee
Now that we’ve got the three components of a coffee, we
can combine them together. This time, the structure is
“all of” instead of “one of” because the coffee needs a size,
roast, and collection of add-ins (which could be empty).
We call the “all of” structure a combination.
We chose to encode this combination as a JS object type in
TypeScript. There were other possibilities. We will explore
those later.
But, good news! We’ve finished describing how we’ve
encoded this model. Now we will take a closer look at the
choices we could have made but didn’t.
[Margin note on the size diagram: this is just one possibility. We should consider more.]
Revisiting the size encoding
Here’s the same view of the size encoding we saw on page
43, in “Encoding the size of a coffee.”
We’re going to zoom in on the bottom half of the dia-
gram, the part showing the encoding. We’ll keep the mod-
el (the top half) as given and we’ll explore different choic-
es we have for encoding the same model.
One thing I will emphasize repeatedly is that we should
consider as many options as possible for each design deci-
sion. The quality of your design is proportional to how
many possibilities you consider. While we’re here, we
should look at different ways we could encode the size al-
ternative.
Turn the page to zoom in on the bottom half of this dia-
gram and see other ways to encode it.
[Figure: the “one of” alternative in the model (Super, Mega, Galactic) is encoded as a union type in code ("super", "mega", "galactic").]

type Size = "super" |
            "mega" |
            "galactic";
[Figure: Many options for encoding an alternative. The alternative in the model (Super, Mega, Galactic) is shown with four possibilities in code.]

Strings + union type

type Size = "super" |
            "mega" |
            "galactic";

(Another possibility is to use care instead of static types. The same three string values would be legal; we just have to make sure not to use anything else.)

Strings + enum

enum Size {
  super    = "super",
  mega     = "mega",
  galactic = "galactic",
}

Classes + interface

interface Size {
  name: string;
}
class Super implements Size {
  name = "super";
}
class Mega implements Size {
  name = "mega";
}
class Galactic implements Size {
  name = "galactic";
}

Numbers

1, 2, 3
Possible ways to encode an alternative
Let’s zoom into the encoding of the size alternative. Any
time we encode something, we have a choice in how it is
encoded. Our programming language gives us particular
constructs. We have to choose, among those constructs,
the ones that have the same (or similar) structure as the
model we are encoding.
In this case, we are encoding an alternative in Type-
Script. We can list the constructs TypeScript gives us that
share the “one of” structure. We will see next how we can
evaluate them to choose the best option.
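
One of the options, using care instead of static types, can be made concrete with a runtime check. The following is only a sketch: `SIZES`, `isSize`, and `parseSize` are hypothetical names, not something the chapter defines. Deriving the type from the list of legal strings keeps the static and runtime views of the alternative in step.

```typescript
// The three legal strings, listed once. `as const` keeps them as literal types.
const SIZES = ["super", "mega", "galactic"] as const;

// The union type is derived from the list: "super" | "mega" | "galactic".
type Size = typeof SIZES[number];

// Hypothetical runtime check: true only for one of the three legal strings.
function isSize(value: unknown): value is Size {
  return typeof value === "string" &&
    (SIZES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical use at a boundary where static types cannot help, such as parsed JSON.
function parseSize(input: unknown): Size {
  if (!isSize(input)) {
    throw new Error("Not a valid size: " + String(input));
  }
  return input;
}

console.log(isSize("mega"));  // true
console.log(isSize("venti")); // false
```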
Fit: evaluating our encoding
We have lots of options for how to encode our models. We
need some way to compare them. Each option is slightly
different. We need some way to know which ones are bet-
ter for our model.
The secret is that there is no one way to evaluate them.
Why? Software design is hard. It’s multidimensional. It’s
too dependent on context for any simple scheme to work
every time.
This book is full of lenses, and each lens gives us a different
way to evaluate our options. Here in the data lens,
we are going to use a concept called fit.
Let’s evaluate the fit of a simplified coffee, one that has
only size and roast. To evaluate the fit, we need to count
the number of states in the model.
Counting the states in a combination
Our coffee is a combination of two alternatives, each
with three options. When counting the states of a
combination, we multiply the states of the components.
So in this case, 3 sizes times 3 roasts equals 9
possible combinations (3 x 3 = 9).

Counting the states in a TypeScript object type
We encode a coffee as a TypeScript object type. To
count the states, we multiply the states of the two
components. In this case, 3 sizes times 3 roasts equals
9 possible states (3 x 3 = 9).
Both the model and the code allow the same number of
states. This is very important. We’ll call that perfect fit, and
we’ll see this graphically in a moment. In addition, the
correspondence between the model and the code is clear.
type Coffee = {
  size : Size;
  roast: Roast;
};

The 9 states of this type:

"super", "raw"         "mega", "raw"         "galactic", "raw"
"super", "burnt"       "mega", "burnt"       "galactic", "burnt"
"super", "charcoal"    "mega", "charcoal"    "galactic", "charcoal"
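
The multiplication rule can be checked by enumerating every value of the simplified type. This is a minimal sketch. The constants `SIZES` and `ROASTS` and the function `allCoffees` are hypothetical names introduced for the example.

```typescript
// The options of each alternative, listed as values so we can loop over them.
const SIZES = ["super", "mega", "galactic"] as const;
const ROASTS = ["raw", "burnt", "charcoal"] as const;

type Size = typeof SIZES[number];
type Roast = typeof ROASTS[number];
type Coffee = { size: Size; roast: Roast };

// Build every combination: one coffee per (size, roast) pair.
function allCoffees(): Coffee[] {
  const coffees: Coffee[] = [];
  for (const size of SIZES) {
    for (const roast of ROASTS) {
      coffees.push({ size, roast });
    }
  }
  return coffees;
}

console.log(allCoffees().length);          // 9
console.log(SIZES.length * ROASTS.length); // 9
```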
[Venn diagram: Model (size x roast) and Code (JS Object). Unrepresentable: 0 states. Representable: 9 states. Meaningless: 0 states.]
Fit: Measuring the encoding with the model
Fit gives us a way to quantitatively judge an encoding and
how well it represents the same possibilities as the mod-
el. Fit is not the only way to judge an encoding, but some-
times it is enough to show that one encoding is clearly
worse than another. Fit means we compare the states in
our encoding with the states in a model.
The best way to understand fit is with a Venn diagram.
In one circle, we put the states that the model can represent.
In the other, we put the states our code can represent.
The overlap shows which states of the model are representable in our
code. The two non-overlapping sections we will call unrepresentable
(states in the model that the code cannot express) and meaningless
(states in the code that mean nothing in the model).
Perfect t
When we compare the simplied model (the combination
of size and roast) to using a JS Object type to encode it,
we see that they have perfect t. Perfect t means that the
states the model can represent and the states the encod-
ing can represent are exactly the same. In other words,
the unrepresentable and meaningless parts are both zero.
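
The three regions of the Venn diagram can be counted mechanically. The sketch below is one hypothetical way to do it, not a method the chapter prescribes. `measureFit` takes the model states (written as strings), the code states, and a `meaning` function that says which model state each code state stands for. It assumes no two code states stand for the same model state.

```typescript
type Fit = { unrepresentable: number; representable: number; meaningless: number };

// Count the regions of the Venn diagram for one encoding.
function measureFit<C>(
  modelStates: string[],
  codeStates: C[],
  meaning: (codeState: C) => string | undefined
): Fit {
  const meanings = codeStates.map(meaning);
  // Model states that at least one code state stands for.
  const representable = modelStates.filter((m) => meanings.indexOf(m) !== -1).length;
  // Code states that stand for nothing in the model.
  const meaningless = meanings.filter(
    (m) => m === undefined || modelStates.indexOf(m) === -1
  ).length;
  return {
    unrepresentable: modelStates.length - representable,
    representable,
    meaningless,
  };
}

// The simplified model: size x roast.
const modelStates: string[] = [];
for (const size of ["Super", "Mega", "Galactic"]) {
  for (const roast of ["Raw", "Burnt", "Charcoal"]) {
    modelStates.push(size + " " + roast);
  }
}

// The encoding: every value of the object type.
type Coffee = {
  size: "super" | "mega" | "galactic";
  roast: "raw" | "burnt" | "charcoal";
};
const sizes: Coffee["size"][] = ["super", "mega", "galactic"];
const roasts: Coffee["roast"][] = ["raw", "burnt", "charcoal"];
const codeStates: Coffee[] = [];
for (const size of sizes) {
  for (const roast of roasts) {
    codeStates.push({ size, roast });
  }
}

const capitalize = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);
const meaning = (c: Coffee): string => capitalize(c.size) + " " + capitalize(c.roast);

const fit = measureFit(modelStates, codeStates, meaning);
console.log(fit); // unrepresentable: 0, representable: 9, meaningless: 0
```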
Degenerate case: booleans for size
It’s often very useful to look at a very obviously bad case
when you’re trying to understand a concept. We call that a
degenerate case: obviously not the right answer. Let’s look
at a degenerate case for encoding size, namely using a
Boolean.
Booleans have exactly two states: true and false. However,
our model needs three states for the size: super, mega,
and galactic. Let’s take a look at the Venn diagram.
It’s clear that we shouldn’t use a Boolean to represent the
size. A Boolean can really only represent two sizes.
This analysis may seem obvious, but it’s only because
you’ve done the analysis. The same thinking extends to
many existing encodings. Without doing the analysis, you
may have a similar situation where you have states from
your model you cannot represent in your code.
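
To see the unrepresentable state concretely, here is a sketch of the Boolean encoding. The mapping in `sizeFromBoolean` is an arbitrary assumption made for illustration. Any mapping from two Boolean values leaves at least one of the three sizes out.

```typescript
type Size = "super" | "mega" | "galactic";

// One arbitrary way to read a Boolean as a size (an assumption for illustration).
function sizeFromBoolean(flag: boolean): Size {
  return flag ? "mega" : "super";
}

const modelSizes: Size[] = ["super", "mega", "galactic"];

// Every size the Boolean encoding can reach.
const reachable: Size[] = [true, false].map(sizeFromBoolean);

// Model states with no Boolean that stands for them.
const unrepresentable = modelSizes.filter((size) => reachable.indexOf(size) === -1);

console.log(unrepresentable.length); // 1
console.log(unrepresentable[0]);     // galactic
```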
Prefer having meaningless states over having unrepresentable
states. Very rarely can we encode a model with
perfect fit. The world is nearly infinitely varied and we
have finite tools in our languages. We’ll soon see how to
deal with meaningless states with normalization functions
and validation functions.
[Venn diagram: Model (size) and Code (Boolean). Unrepresentable: 1 state. Representable: 2 states. Meaningless: 0 states.]
4 problems encoding coffees with numbers
[Figure: size x roast = nine states, numbered 1 through 9.]
I mentioned before that we can encode the size using numbers.
We can also encode these 9 states in the size x roast
model using the first nine natural numbers.
1. Bad t
This encoding has several drawbacks. The
rst is that the t is not great. Check out
the t in the Venn diagram to the right.
We can represent every state using Num-
ber, but there are many meaningless states.
In this case, we use JavaScript numbers,
which are 64-bit numbers. That means
that the vast majority of the possible state
space doesnt have any meaning.
2. Human readability
The next problem is that the encoding is
arbitrary. What does 5 represent? What
about 8? The human readability is very low.
[Venn diagram: Model (size x roast) and Code (JS Number). Unrepresentable: 0 states. Representable: 9 states. Meaningless: about 1.845 x 10^19 states.]
3. Difcult operations
We’ll look at operations more closely in the next chapter,
but just imagine what it might be like to change the size of
a coffee from super to mega. Or even writing code to deter-
mine the size of a coffee becomes a challenge.
4. Extra operations
You can add two numbers, but can you really add two cof-
fees? What about multiplication? And less than (<)? No,
these are meaningless operations.
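
The four problems show up quickly in code. The sketch below assumes one particular numbering of the nine states (the text does not specify one): 1 through 9, with the size changing fastest. The names `decode`, `encode`, and `withSize` are hypothetical.

```typescript
const SIZES = ["super", "mega", "galactic"] as const;
const ROASTS = ["raw", "burnt", "charcoal"] as const;
type Coffee = { size: typeof SIZES[number]; roast: typeof ROASTS[number] };

// Assumed numbering: 1 = super/raw, 2 = mega/raw, 3 = galactic/raw, 4 = super/burnt, ...
function decode(n: number): Coffee | undefined {
  // Anything other than a whole number from 1 to 9 is a meaningless state.
  if (Math.floor(n) !== n || n < 1 || n > 9) return undefined;
  return { size: SIZES[(n - 1) % 3], roast: ROASTS[Math.floor((n - 1) / 3)] };
}

function encode(coffee: Coffee): number {
  return ROASTS.indexOf(coffee.roast) * 3 + SIZES.indexOf(coffee.size) + 1;
}

// Difficult operations: changing the size means decode, modify, re-encode.
function withSize(n: number, size: Coffee["size"]): number | undefined {
  const coffee = decode(n);
  return coffee === undefined ? undefined : encode({ ...coffee, size });
}

// Human readability: nothing about 5 says "mega, burnt" until we decode it.
const five = decode(5);
console.log(five ? five.size + ", " + five.roast : "meaningless"); // mega, burnt

console.log(withSize(1, "mega")); // 2

// Bad fit: the type checker accepts numbers that mean nothing.
console.log(decode(10));  // undefined
console.log(decode(2.5)); // undefined

// Extra operations: adding two coffees type-checks but produces nonsense.
console.log(decode(4 + 7)); // undefined
```

With the object encoding, the same size change is a single property update, and the type checker rejects any attempt to add two coffees.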