A Theory of Functional Programming 0006

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

Subscribe: RSSApple PodcastsGoogle PlayOvercast

What I want to talk about is this issue of what is an action and what is a calculation in terms of timeliness, because we know that deep down in the computer, everything is an action. Every operation depends on what is that particular locations in memory at the time that the operation is run.

Transcript

My name is Eric Normand, and I'm writing a book called "A Theory of Functional Programming." I want to talk about a certain topic from that book and I'm going to use the transcript of this video to help me write the book. Everything here might make it into the book. At least the topic is going to. Who knows what's going to happen with the editing.

What I want to talk about is this issue of what is an action and what is a calculation in terms of timeliness, because we know that deep down in the computer, everything is an action. Every operation depends on what is that particular locations in memory at the time that the operation is run.

Even like addition, an addition in the mathematical sense is a calculation, it's timeless, but when you run the add instruction, it reads two locations in memory which are mutable, and it writes to a location in memory, which is also mutable.

It is an action deep down and how the machines work, because our machines are based on the idea of a turing machine. It's all actions all the way down.

https://twitter.com/ericnormand/status/1073669879625068544

How is it that we can claim that we actually have calculations and it's just through discipline, the discipline of our compiler, or even just as programmers, we set it up so that those registers don't have anything important in them so we can overwrite them and we're going to pull the value right out of the register and put it on the stack or stored in a variable right away.

We've got these disciplines that allow us to do the operations in a way that doesn't depend on when it's run. Like I said, the disciplines are either enforced by the programmer or by the compiler, and both of them are fine. The compiler makes it a lot easier on the programmer and avoids a lot more mistakes.

https://twitter.com/ericnormand/status/1008001425417306112?ref_src=twsrc%5Etfw

In the sense, it's all an illusion. The whole idea of timelessness of calculations, but it's a very valuable illusion. We can make it pretty safe. Pretty reliable.

There's another issue though which is that you have language like Haskell just as a really good example of a functional language. It has a very hard line that it draws between calculation and action. The calculations are function types. All of the functions are pure and so their calculations and I/O is reserved for actions.

The thing is that's a great line. That is where it should be drawn as a language making this choice for the programmer. That is the right place to put it. There are some actions, let me put it this way, what if I am writing and reading to a file? That's definitely an action.

You can't really control the file system. It's shared by all of the programs that are running on the computer at the same time. They can read and write to it as the same time as I do. It's definitely a time evolving. It's bound up in the time because I could write before they write or I could write after they write.

It really changes things. What if I were to add a lot of discipline to my program to lessen the chances almost to zero that someone else would be writing and reading at the same time as me? I would generate a new temporary file with a random name so that no one could guess it.

I read and write to it very quickly and then immediately delete it. Basically there's near-zero chance that at any moment another program could guess the file, read from it, write to it, and mess up my code.

In the same way, the operating system will protect the memory, saying, "You can't read that memory from this other program. You can't write to the memory of the other program." You're enforcing a similar discipline. Would that still be considered an action?

Does it matter if it's a near-zero probability that someone can mess with my code? I do it in such a way that it's a pure calculation. I just needed a temporary file to hold some values for me. You can argue — and it's the right way to argue — that that is not an action.

You're using discipline to ensure that it doesn't depend on when it is run, or how many times it is run. We can move that into a calculation. In fact, Haskell gives you a way to take something that is typically an I/O, meaning disk reading and writing — disk I/O — any kind of I//O but just in this example disk I/O is an I/O.

They give you a thing called "perform unsafe I/O." What that does is it takes an I/O and turns it into a calculation. That's great. Haskell knows that this is a thing that you're going to run into, where sure it's I/O in the general sense. In this particular case, I want a compiler that trusts me, that I know what I'm doing and it should be a calculation. That's great.

https://twitter.com/ericnormand/status/1027932794620719105?ref_src=twsrc%5Etfw

Similarly, if you have a calculation...it's a pure function. It's like a statistical calculation, but it uses big data. It takes a long time to run. It's a very complicated, convoluted algorithm, but it's pure. If you run it twice, you get the same answer.

The issue is it takes 24 hours to run. In general, this would be considered a calculation because it's pure, but when you start running it, you're going to wish that you had started 24 hours ago. That's yesterday. You're not going to know the answer until tomorrow. It does matter when you run it.

All calculations take some time. Usually it's so little time that we don't care. 10 milliseconds on the human time scale, we don't care. Hundred milliseconds, we don't really care. A minute, maybe we start to care, but 24 hours, we definitely care. That's starting to get into, "Oh, is this going to be done by the end of the week?" territory.

https://twitter.com/ericnormand/status/1067086416163860480

What do we do? We should call that an action. It does matter when you run that. What I'm trying to get at is this principle that this is an illusion that we are building. In many cases, we can get away with it because everything takes time. Everything has a side effect... [coughs] Excuse me.

Everything has a side effect of using memory. It's putting memory pressure on the garbage collector. It's using one of the CPUs while it's running. That's affecting the scheduler, and what other things can run at the same time, how long other things take to run.

It all does have an effect. It's generating heat. It's using electricity. All those things. We ignore those. Like in a physics problem at school, we would ignore friction. Even though we know it's there. We know that there must be friction. We just don't count it. We know there's air resistance. We just don't count it. The answer you get is good enough without it even if you ignore it.

We do the same. We say this calculation is timeless enough. We don't care about the time it takes.

[pause]

Eric: Some languages, like I've said, give you a little bit more support in the discipline. For instance, Haskell gives you a lot of support because its functions are all pure by themselves. Clojure gives you a lot of support because it has immutable data structures.

https://twitter.com/ericnormand/status/996827734931361792?ref_src=twsrc%5Etfw

I believe that functional programming is something you can do if you know what you're doing. It is possible to do it with purely programmer discipline and not explicit support from the language. This is why, because it's all a choice anyway. It's all a system that's built up by the compiler and the programmer anyway.

This has been me talking about the illusion that we're creating when we're doing functional programming. An illusion of timelessness, an illusion of timelessness of data as well. A lot languages don't have immutable data structures. Everything is mutable. What's popular now is having a variable that is unchangeable after it is set.

In JavaScript, you have const. Even if you use a const, if you assign an array or an object to that const, you can still mutate the array or the object. It's not really immutable. It's only one level of immutability. What you want is to guarantee that when you generate data it is an actual bit of data. It is a value. It is not a place that can change. It is not a place for storing stuff.

It is a record. A record means something that you keep forever. You've recorded and it's stored forever. I don't know if I've talked about that enough. Data is a recorded fact about an event that happened. This is the definition of data we talked about before. By record, it means that we want to keep it. You record something to keep it around later.

https://twitter.com/ericnormand/status/1006551766308130817?ref_src=twsrc%5Etfw

We've lost that idea of record generally in computing, in software, mostly because we had a very limited amount of storage space. Hard drives were expensive. RAM is certainly expensive.

We're starting to remember that in any other information systems outside of the computer, say in an accounting system or a medical records system, when you say I have a record, it means it's permanent. You want to keep that record around forever. You don't want to change it.

The typical threat at school — at least here in America — any absences will be marked on your permanent record. That's exactly what we want. We want things to be stored forever. It's permanent.

Of course, that permanence of storage doesn't really count for stuff in RAM. If we turn off the computer, boom, you've lost it. We do want it as a transient place to store something before it gets recorded. Putting a record immutably in memory so that we don't have to worry about it being overwritten, that is very useful.

As a discipline, functional programming moves toward this immutable data structure approach because it allows you to reason very locally. You don't have to worry about who else has a pointer to this data structure if it's immutable. You can just rely on it never changing.

Whereas if you have mutable data structures, you have to have a very strong trust, a very strong contract with the clients that are using you about ownership and what you're allowed to mutate. The language will allow mutation. There's a lot more non-local reasoning that you have to do.

You have to say, "Look, I'm going to give you this value. It's not yours. I'm just letting you read it. Do not write on it. I'm letting you borrow it but return it in the same way you found it." Whereas in an immutable thing that's enforced by the run time.

Often what we'll do is we will to help enforce the safety we will use a discipline called Copy on Write or something even stricter, which is Copy on Share. Every time someone asks for a value that is mutable in the language, say in an object in JavaScript or an array in JavaScript, every time someone asks for it, you make a copy and pass them the copy so they get a fresh one.

https://twitter.com/ericnormand/status/1043213984483213312?ref_src=twsrc%5Etfw

You don't have to trust them anymore. You can write all over this if you really want to, it's not going to affect me.

The problem is that's very wasteful in terms of memory and it can easily get out of hand. For instance, copying an array is a linear operation. If you don't know that it's copying, because this is part of the contract, you have to tell them, "Hey, this is now...Reading this value is now a linear operation."

Now, it can become accidentally quadratic, because what if they ask for the value, not knowing that is linear? Let's say they just assume it's constant time, it becomes accidentally quadratic, because they're asking for each element.

See? It complicates things to have mutable things, because if you want to enforce it through discipline, you have to have all these checks. You have to have a trust, you have to have a contract that's better understood. Whereas a mutable is a very simple contract.

I'm going to give you something, it just can't write to it. You can try, it's not going to work. It's easy. It's very easy to reason about.

I've talked enough about this. Thank you so much for listening, tell the robots what you like, so they know how to share and recommend to other people. They don't have emotions, they can't tell what's good. Let them know. Let them know what you like. Hit those buttons, mash them, subscribe.

Sorry about the audio for some previous videos. I really think it's my headphones, not these. I switched back to these. These don't work so well, but I think the microphone is fine. Sorry about the audio. I bought a new pair, they're in the mail. We'll see how they work.

See you later. Bye.