What is Immutability?
This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.
Functional Programmers will talk about immutable values. What do they mean? How can you write software where none of the values change?
Eric Normand: What do we mean when functional programmers talk about immutability? Do things really have to never change?
I want to talk about some of the more practical aspects of it, not just the theoretical. The theoretical says that an immutable value cannot be changed. Now, in practice, there are several ways that you can enforce that.
We don't mutate
Sometimes you don't have immutability in your language so you have to do something else. It's really just how you interpret that "cannot." One way is that the language could enforce it, or the particular object, the data itself, might not have any methods on it that allow it to be mutated.
That is one way of enforcing this discipline, this rule, that things shouldn't be mutated, they can't be mutated. Another way is you could just have a, let's call it, a developer policy that says, "We don't mutate."
Maybe the object is mutable. You could but you don't. You just apply the rule, but you have no help from the language, your compiler, or anything. That's the practice of it.
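As a rough illustration of those two kinds of enforcement, here is a minimal Python sketch. The `Point` class and the `try_to_mutate` helper are hypothetical names made up for this example. The frozen dataclass is refused mutation by the runtime, while the plain dict is protected only by the "we don't mutate" policy.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    """Hypothetical value type; frozen=True makes attribute assignment raise."""
    x: int
    y: int

def try_to_mutate(p):
    """Return True if the assignment went through, False if it was rejected."""
    try:
        p.x = 99
        return True
    except FrozenInstanceError:
        return False

enforced = Point(1, 2)          # the runtime refuses to change this
by_policy = {"x": 1, "y": 2}    # mutable; only the team's discipline protects it

mutated = try_to_mutate(enforced)
```

Nothing stops someone from writing `by_policy["x"] = 99`. With the policy approach, code review and habit are the only guard.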
Discipline #1: copy-on-write
There are actually two and a half disciplines for doing this. The first one is called copy-on-write. Whenever you need to make a modification to an object, you don't know who else has a pointer to that object. You don't want to change it, because then you're breaking your rule of immutability. Instead, you make a copy and then change the copy.
You're changing it. You're using mutation, but no one has a reference to it yet. No one has seen it but you. Then, once you give it out now, boom, the immutability rule is now enforced and no one else can change it. You're not going to change it and you're saying no one else can change it.
The rule has to be applied unilaterally. Once you give away a pointer to it, you cannot change it. If you get something that you didn't create yourself, you can't change it either, because someone else might already have it. Creating it yourself means you know exactly the code path that it went through to get to you.
Those are the two rules. If you get something, you can't change it. If you need to make a change, you have to make a copy and then change that copy that you control. That's called copy-on-write.
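Here is a minimal sketch of copy-on-write in Python, assuming a cart represented as a plain list. The function name `with_item_added` and the sample data are made up for illustration.

```python
import copy

def with_item_added(cart, item):
    """Copy-on-write: return a new cart with `item` added; `cart` is never touched."""
    new_cart = copy.copy(cart)   # shallow copy that only this function can see
    new_cart.append(item)        # mutation is safe: nobody else has a reference yet
    return new_cart              # once returned, treat it as immutable

original = ["apple", "pear"]
updated = with_item_added(original, "plum")
```

Anyone still holding `original` sees the same two items as before. A shallow copy is enough here because only the top-level list changes. For nested data you would also copy each nested object you intend to modify.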
Discipline #2: copy-on-read
There's another one called copy-on-read. This one applies when you get something from someone and you're not sure that the library or API you're using believes the same things you do about immutability.
Maybe you're not even sure it's implemented correctly or that they enforce the rule properly, so you don't believe that this thing, this system, is immutable. What do you do? You make your own copy of the thing.
That means that the library can modify the one that it gave you all it wants. What's important is that the one that you keep is under your control and you're only going to pass it to things that you believe will be immutable.
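A small Python sketch of copying on the way in. `UntrustedApi` is a hypothetical stand-in for a library that hands out its own internal, mutable object and keeps changing it afterward.

```python
import copy

class UntrustedApi:
    """Hypothetical library that keeps mutating the object it handed out."""
    def __init__(self):
        self._settings = {"retries": 3, "hosts": ["a.example"]}

    def get_settings(self):
        return self._settings            # leaks its internal, mutable object

    def reconfigure(self):
        self._settings["retries"] = 0
        self._settings["hosts"].append("b.example")

api = UntrustedApi()
mine = copy.deepcopy(api.get_settings())  # copy-on-read: keep the copy, drop theirs
api.reconfigure()                         # the library changes its own object later
```

The deep copy matters in this sketch because the nested `hosts` list would otherwise still be shared with the library.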
Rules of the 2 disciplines
Let's talk about the rules of this. If you get something from an unreliable source and you don't trust that it enforces immutability, you make a copy and you throw away the old one. You throw away the one they gave you. You just immediately make a copy and that's the one you keep.
Then, if you're going to give your ostensibly immutable thing to something else that you don't trust to maintain your discipline of immutability, you have to make a copy first and give them the copy.
You can say like, "You can do whatever you want with this. I've got the original copy. You can tear that up. You can write all over it. I've got the original."
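The outgoing direction might look like this in Python. `untrusted_sort_in_place` is a hypothetical stand-in for third-party code that writes all over whatever it is given.

```python
import copy

def untrusted_sort_in_place(items):
    """Hypothetical stand-in for third-party code that mutates its argument."""
    items.sort()
    items.pop()
    return items

scores = [40, 10, 30]   # our value, immutable by policy
# Hand over a copy; the callee can tear it up, we still hold the original.
result = untrusted_sort_in_place(copy.deepcopy(scores))
```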
Those are the two main disciplines. If you're living in a language or a runtime that allows for mutability of your main objects, you're going to have to do those kinds of things, those two disciplines.
One is defensive and one is more offensive. You're going to probably have to do both depending on the situation. If you're dealing with unreliable code, you want to do copy-on-read.
If you're dealing with your own objects that you maintain, you're going to do copy-on-write. Because, within your own code, you trust that you're being immutable, that you're enforcing your rule of immutability. You don't have to copy every time you read. You only have to copy when you need to modify it.
Discipline #2½: append-only
The half one, I said there were two and a half. The other one is called append-only. This is only a half because it doesn't really apply in most cases. It's just not applicable. You can't use it.
There are certain data structures that let you apply an append-only discipline which means you don't modify the thing, you only add to the thing in a way that doesn't modify the original.
Just as an example, if you have a stack and let's say you only push, you never pop. It's append-only. It's push-only in this case. If you have this stack, things can push without modifying anything underneath the top of the stack. You can share this thing around because the stack is only growing. None of the old stuff is being modified.
It's immutable in that way. Now, if it's an immutable implementation of a stack, meaning every time you push, you're actually creating a copy, then that's even better. I'm just talking about a regular mutable stack where you're adding stuff.
All you can do is add. You can't remove and you can't modify anything that's on there already. That has immutability to it. It's only growing in a known way without modifying the old stuff.
In theory, if you had a pointer to this stack up to this height, you know that stuff is only getting added on top. I can still read this safely and be sure it's never going to change.
Of course, this is just another discipline. This is append-only. Does your language help you do this? Does your stack have a pop method that someone might accidentally call without realizing that they shouldn't do that in your system?
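One way to get some help with the append-only discipline is to wrap the stack so that pop simply isn't there. This is a minimal Python sketch. `AppendOnlyStack`, `push`, and `read_up_to` are hypothetical names, and the underscore on `_items` is only a convention, so Python will not actually stop a determined caller.

```python
class AppendOnlyStack:
    """Hypothetical push-only stack: no pop, no item assignment."""
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)   # grows the top, touches nothing below
        return len(self._items)    # the height after the push

    def read_up_to(self, height):
        """Everything at or below `height`; stable because lower items never change."""
        return tuple(self._items[:height])

log = AppendOnlyStack()
log.push("a")
height = log.push("b")             # a reader remembers the height it saw
before = log.read_up_to(height)
log.push("c")                      # later pushes land above that height
after = log.read_up_to(height)
```

A reader that remembers `height` gets the same answer before and after the later push, which is the property described above.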
HashMaps and Key Values
That's a question of how much the language affords immutability. Some languages like Clojure enforce these automatically. It does not enforce copy-on-read. It doesn't need to because the objects are immutable. There are no methods on them that can change the state inside the objects. They are immutable.
It does do a copy-on-write. If you want to add a key value to a HashMap, it actually makes a new HashMap with all the existing key values plus the new one that you're adding. It does that very efficiently because it's going to share most of what already exists with the old copy.
You can have both copies in memory. Even if there are a thousand keys and values in that HashMap and you add a new one, in theory, you have two HashMaps, one with 1,000 key values and one with 1,001 key values.
Most of it is shared between the two. Only a very small amount of memory has to be duplicated to have both. Those are called persistent data structures. What persists is the structure of the preexisting HashMap when you make a copy to a new HashMap.
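To give a feel for that sharing, here is a deliberately simplified Python sketch. It is not how Clojure implements its HashMap. It uses a plain unbalanced binary search tree as a stand-in, and `Node`, `assoc`, and `get` are names chosen for this example. Adding a key copies only the nodes on the path to the new entry, and everything else is shared with the old version.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    """Immutable tree node; NamedTuple fields cannot be reassigned."""
    key: str
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def assoc(node, key, value):
    """Return a new tree with key set; only nodes on the search path are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=assoc(node.left, key, value))
    if key > node.key:
        return node._replace(right=assoc(node.right, key, value))
    return node._replace(value=value)

def get(node, key):
    """Look up key, returning None when it is absent."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

old = None
for k, v in [("m", 1), ("c", 2), ("t", 3)]:
    old = assoc(old, k, v)
new = assoc(old, "x", 4)   # copies the "m" and "t" nodes, shares the "c" subtree
```

Both `old` and `new` stay usable. `old` has no `"x"`, `new` does, and `new.left is old.left` holds because the untouched subtree is the same object in both versions.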
That's immutability. I hope that explains what it is and how it's done. I didn't really explain very much about why, but that could be a topic for another of these episodes.
Thank you. See you later.