Should we waste memory?
This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.
We have so much memory now, compared to the 1970s, that it often seems like we have memory to burn. I misspoke in a previous episode where I made it seem like I'm in favor of wasting memory. But what did I mean instead?
Do I believe that we should be wasting memory? We've got so much memory these days. What's the point of conserving it? What's the point of trying to find a more efficient way of doing it? Shouldn't we just be throwing it away?
My name is Eric Normand and these are my thoughts on functional programming. Those are the questions I'm going to answer right now. A few episodes back, I think I mentioned something about wasting memory and how we have so much of it these days. I probably didn't express myself very clearly.
I probably said something about how we have so much, yet we're still treating it like we're living in the '70s, when we had much less memory and it was very expensive. Why not just waste it, right? Why not just use it without trying to conserve it and make it efficient? That's not really what I intended.
What I meant was we've got a situation where a lot of our systems and our software development practices were created at a time when we did have much less memory than we have now.
One of the ways to work around that was to use a lot of mutable state. Even the records in our databases, stuff that's supposed to be persisted to a permanent medium like a disk or a tape, we would make mutable as well. What I really didn't express well was that we waste so much memory already.
We waste it, mostly because we can. We just bloat everything with huge stacks of abstractions. We're bad at it now in a way that you couldn't even have imagined back then, because we didn't have the same amount of memory. While we're doing all that wasting, we still throw away our records. We still overwrite stuff. We still use mutable state as if we didn't have the memory.
My main point is about what happens when we develop an information system outside of the computer, when you want to track some paperwork. A good example is a doctor's office. They're legally required to keep good records. At least here in the US they have to do that.
Part of keeping your license is keeping your records up to a certain standard, keeping them archived for a long time, having them ready if someone asks you for them, and keeping them private. There's this whole ethics around the records and maintaining them.
You see this over and over. Information systems in the real world have put a lot of effort into maintaining records basically forever, in perpetuity. It seems nuts that we would allow our digital records to be handled in a way that just throws them away or overwrites them willy-nilly.
We spend so much time keeping our paper records archival quality. Acid-free paper, pens whose ink isn't going to fade. We scan it. We make a backup. Then, in our database, if someone is fired or leaves a job, we just erase their row.
I'm sure professional systems don't do this. Enterprise-grade software does not just throw away a row because someone leaves the company. But they do overwrite stuff all the time. I remember I had to do some work on PeopleSoft once. PeopleSoft is an enterprise-grade employment software system.
It's supposed to keep employee records: who gets paid, what their benefits are, all that stuff. You can just go in and change someone's name in the system, and it's as if it had never been different. There's no record, no trace that the name had been changed. Change the field in the database and the person's name is different. It's like it had always been that way.
That's really what I meant. It's crazy that we are simultaneously wasting our memory as if we had so much that we don't need to conserve it, even though we probably do. At the same time, we are not using the memory for the good thing it could give us, which is permanent records, which we work so hard to keep outside of the computer.
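As a rough illustration of the alternative to overwriting a name field, here is a minimal Python sketch of an append-only name history. The record shape (`NameRecorded`), the helper names, and the sample employee are all hypothetical, made up for this example; it is not how PeopleSoft or any particular product works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NameRecorded:
    """One entry in an employee's name history (hypothetical record shape)."""
    employee_id: int
    name: str
    recorded_on: date


def record_name(history, employee_id, name, recorded_on):
    """Return a new history with one more entry; the old history is untouched."""
    return history + (NameRecorded(employee_id, name, recorded_on),)


def name_as_of(history, employee_id, day):
    """Latest name recorded for the employee on or before `day`, or None."""
    matching = [e for e in history
                if e.employee_id == employee_id and e.recorded_on <= day]
    if not matching:
        return None
    return max(matching, key=lambda e: e.recorded_on).name


# The history is a tuple, so nothing in it can be edited in place.
history = ()
history = record_name(history, 7, "Pat Smith", date(2015, 3, 1))
history = record_name(history, 7, "Pat Jones", date(2019, 6, 15))

print(name_as_of(history, 7, date(2016, 1, 1)))  # name before the change
print(name_as_of(history, 7, date(2020, 1, 1)))  # name after the change
```

The name change costs one extra small record, and in exchange the system can still answer what the name was at any earlier date.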
One of the benefits of having immutable records is that they're very good for sharing, for concurrency. You could think of it as a shared resource. If it's immutable, then any number of threads could share that piece of data at the same time and none of them would unexpectedly see a change.
One of the hard parts of parallel programming is that you have to put locks around shared resources so that only one thing is changing something at a time, and so that no one sees an inconsistent write. You avoid that whole problem entirely by not letting anybody write, basically. This is a different issue from making data permanent forever.
This is simply about data while it's in memory. While data is being processed, you also want it to be immutable. The two are closely related, because you write data to disk and it gets read in later. It's still in process. It goes back to that definition of data that we had for functional programming, which is that data is a fact about an event that you can use for a later computation.
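To make the no-locks point concrete, here is a small Python sketch in which several threads read the same read-only data. The price table, the orders, and the function name are invented for the example; `MappingProxyType` and `ThreadPoolExecutor` are from the Python standard library.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

# A read-only view over a dict that nothing else holds a reference to,
# so no thread can change the prices while others are reading them.
PRICES = MappingProxyType({"apple": 3, "pear": 5, "plum": 2})


def order_total(order):
    """Pure function: reads the shared prices and returns a new value."""
    return sum(PRICES[item] * qty for item, qty in order)


orders = [
    (("apple", 2), ("pear", 1)),
    (("plum", 4),),
    (("pear", 3), ("apple", 1)),
]

# No lock is needed: every thread only reads PRICES.
with ThreadPoolExecutor(max_workers=3) as pool:
    totals = list(pool.map(order_total, orders))

print(totals)  # [11, 8, 18]
```

If the threads were allowed to update the price table, each read and write would need to be coordinated; because nobody can write, there is nothing to coordinate.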
It's a fact about an event. Whether it's accurate information or not, it is a fact that I read this. The sensor could be malfunctioning, but I read this from the sensor at this time, or the sensor has a certain margin of error, but this is the reading I read at that time. It's the same with a file on a disk. This is what I read from the disk at that time.
What the file says is what was read at a previous time from user input and then saved. If you think of your data like that, you actually gain a shift in perspective, a shift up in perspective. It's no longer, "I know the user's name. I have captured this information. I know their name."
No, what you know is what was written to disk in the last measurement of their name, which was what was captured from a web request, which was what, presumably, was typed into their browser and submitted as part of a form, which came from their memory and their typing. There's all these levels of translation and potential sources of error in that chain of stuff.
Modeling that as an information system is what we're trying to do. You're not saying, "Oh, their name was wrong. Let me correct it in the system." What you're saying is, "I got a new piece of information. The system is reporting a name because of the chain of data that led to it having this last piece of data as the name.
Now I'm getting new information that says that name is wrong. Here is the correct name, from this other person or from the same person who's correcting it." To think of data in that way, you're starting toward this model of an information system. It's less of a computer program and more of an information system.
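One way to picture data as a fact about an event, with its chain of sources, is the following Python sketch. The `Observation` shape, the source labels, and the sample names and timestamps are assumptions made up for illustration; the only idea taken from the discussion above is that each value carries where and when it was observed, and that a correction is a new observation instead of an overwrite.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """A value, where it was observed, and when (hypothetical shape)."""
    value: str
    source: str
    observed_at: datetime
    derived_from: Optional["Observation"] = None


def provenance(obs):
    """List the sources a value passed through, most recent first."""
    chain = []
    while obs is not None:
        chain.append(obs.source)
        obs = obs.derived_from
    return chain


def latest(observations):
    """The most recently observed fact; earlier ones remain on record."""
    return max(observations, key=lambda o: o.observed_at)


# The chain that led to the name the system reports.
typed = Observation("Jon Doe", "typed into browser form",
                    datetime(2024, 1, 5, 9, 0, 0))
received = Observation(typed.value, "web request",
                       datetime(2024, 1, 5, 9, 0, 1), typed)
stored = Observation(received.value, "row read from disk",
                     datetime(2024, 1, 5, 9, 0, 2), received)

# New information arrives later. It is added next to the old fact.
correction = Observation("John Doe", "phone call with the user",
                         datetime(2024, 2, 1, 14, 30, 0))

name_facts = (stored, correction)

print(provenance(stored))
print(latest(name_facts).value)
```

The system can report the corrected name while still being able to say what it believed before and how it came to believe it.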
All right. I don't have much more to say today on the nature of information systems. As homework, I'd love for you, dear listeners, to think about pre-computer information systems that you might know of. I'll give you two examples, and I'll leave it at that.
The first is the one I talked about before: medical records. They are a complete information system that allows for doctors to be wrong. They make a diagnosis and often it's not right. Then there has to be a way of correcting it. What they don't do is cross out or throw away the paper where they made the wrong diagnosis.
They have some other mechanism, which is usually to add another paper that says this other one was wrong. They made another measurement. That diagnosis was wrong. The second one is accounting, where you want to keep track of every transaction that happens, and sometimes you make a mistake.
There's all sorts of stuff about, "Well, this transaction happened, but we didn't record it until later, so we have to backdate it." There are all sorts of processes for adding transactions to the register out of order, for making corrections, and for dealing with inconsistencies, which is called reconciliation.
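The accounting register can be sketched the same way. The Python below is a minimal illustration, not real accounting software: the `Entry` fields, the amounts, and the memos are invented. Each entry keeps both the date the transaction happened and the date it was written down, so a late entry can be backdated, and a mistake is fixed by adding a correcting entry instead of editing the old one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    occurred_on: date   # when the transaction happened
    recorded_on: date   # when it was written into the register
    amount_cents: int   # positive for money in, negative for money out
    memo: str


def post(register, entry):
    """Return a new register with the entry appended; nothing is edited."""
    return register + (entry,)


def balance(register, occurred_by, known_on=None):
    """Sum of entries that occurred on or before `occurred_by`.

    If `known_on` is given, count only entries already recorded by that
    day, which reproduces what the books showed at the time.
    """
    return sum(
        e.amount_cents
        for e in register
        if e.occurred_on <= occurred_by
        and (known_on is None or e.recorded_on <= known_on)
    )


register = ()
register = post(register, Entry(date(2024, 1, 10), date(2024, 1, 10),
                                10000, "deposit"))
register = post(register, Entry(date(2024, 1, 12), date(2024, 1, 12),
                                -2500, "supplies"))
# Backdated: happened on the 11th, written down on the 20th.
register = post(register, Entry(date(2024, 1, 11), date(2024, 1, 20),
                                -4000, "rent, recorded late"))
# Correction: supplies were overstated by 5.00, so add an offsetting entry.
register = post(register, Entry(date(2024, 1, 12), date(2024, 1, 21),
                                500, "correction to supplies"))

mid_january = date(2024, 1, 15)
print(balance(register, mid_january))                        # 4000
print(balance(register, mid_january, known_on=mid_january))  # 7500
```

Both answers are available: what the balance on the 15th really was, and what the books showed on the 15th before the late entry and the correction arrived.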
This is a complete information system. This is the kind of thing that functional programming moves toward when you're dealing with bigger software problems. All right. Thank you so much. I'll see you later. Bye.