A Theory of Functional Programming 0005

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

Subscribe: RSS · Apple Podcasts · Google Play · Overcast

There are different patterns that we use as functional programmers to reduce the possible states so that it becomes easier to reason about. I think that this is something that we should talk about a little bit more, because it's actually something that isn't talked about much in imperative programming.

Transcript

Eric Normand: Hello. My name is Eric Normand. I'm writing a book called "A Theory of Functional Programming." In this show, I explore the ideas, and then I'm going to have them transcribed, and I'm going to use it to start my book.

You get to hear them, hear the ideas before the book comes out, hear them raw, and listen to me develop them. If you're into that, welcome. I'm talking about a definition of functional programming.

We've already gone through the different parts of functional programming: everything is broken down into three pieces — data, calculations, and actions. During the actions explanation, I was talking a lot about how it's about collapsing the different states, the possible states.

https://twitter.com/ericnormand/status/1002203223808204804?ref_src=twsrc%5Etfw

There are different patterns that we use as functional programmers to reduce the possible states so that it becomes easier to reason about. I think that this is something that we should talk about a little bit more, because it's actually something that isn't talked about much in imperative programming.

Even though everything in imperative programming is an action, meaning everything potentially depends on when it is run or how many times it is run, there's almost no safety analysis of what that entails.

What does that entail? Without this analysis, your program can't run in a parallel environment. Resource sharing is all done ad hoc, and usually in a buggy way. It's very hard to have modularity, hard to have modules, because everything is bound up in time.

https://twitter.com/ericnormand/status/1075420919932301312

Sorry, there's a garbage truck. You can't trust that the thing that you're using is going to be safe to use because something else might be using it. The time that you use it, when you use it, becomes an issue. If you can't control when you use it, then you can't control how your system will execute.

What you often see is that these problems come up much more in parallel systems. In a sequential program, you can control when things run because there's only one thing running at a time. Every imperative statement happens after the last.

Even if you get into something like JavaScript, which has a single thread of control, there are still multiple things happening, for instance Ajax requests that are in flight. You can't control when the callback for that Ajax request is going to be called.

Stuff is going to be happening on the server so you're in a distributed system. Even in a system as simple as JavaScript, you start to face these issues of order, of when things happen, of how many times they happen.

You can't rely on those shared things because at any point a callback could fire. You can't rely on the order of callbacks. There are a lot of ad hoc solutions to these problems that you see springing up.

What functional programming really tries to do is to collapse these possible states into a reasonable subset of the states that you can rely on. I talked about this. I'll just give a few more examples. I'll repeat some examples.

https://twitter.com/ericnormand/status/1047562445143887873?ref_src=twsrc%5Etfw

In a system with transactional actions, the transactions let you do a series of actions within a transaction while maintaining the ACID semantics, so that they're all or nothing. What you get is serialized actions. You compose multiple actions together into a single action.

That single action happens as a transaction so that those actions will never be interleaved with other actions in that same transaction boundary or context. What this guarantees is that you don't have any simultaneous actions within that context. Otherwise, if you didn't have the transactions, you could interleave them in a number of ways.

The more steps you have in that action, the more ways you can interleave. Now what you have is only two ways to interleave: A happens and finishes before B starts, or B happens and finishes before A starts. You've collapsed this insurmountably huge number of possible states into two. It still depends on the order.
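To put a number on that, here's a small sketch (in Python, since the episode doesn't use any particular language) counting the interleavings of two concurrent actions. With n steps in each action, the count is the binomial coefficient C(2n, n); transactions collapse each action to a single atomic step, leaving only two orders.

```python
from math import comb

def interleavings(steps_a, steps_b):
    """Number of ways the steps of two concurrent actions can interleave
    while each action's own steps stay in their original order."""
    return comb(steps_a + steps_b, steps_a)

# Without transactions, two 5-step actions can interleave in 252 ways.
print(interleavings(5, 5))  # 252

# With transactions, each action is one atomic step, so only two
# serialized orders remain: A-then-B or B-then-A.
print(interleavings(1, 1))  # 2
```

That jump from 252 possible executions down to 2 is exactly the collapse of states being described.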

https://twitter.com/ericnormand/status/1083393507505700865

They could give you two different answers, but those two states are both acceptable in many situations. You have to decide whether those two possible situations are acceptable in your circumstances, in your domain. If you do need to maintain complete order, you probably have to rely on something else. That's OK, because there are other options.

Another thing: if you need complete order, so you need to guarantee that A happens before B, what you can use is a queue with a single consumer, a single worker reading off of that queue while producers add to the end. It completely orders everything. A lot of what we do in functional programming is this kind of reasoning about which possible states are acceptable to us.
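A minimal sketch of that pattern: one queue, one worker thread. Because a single consumer drains the queue, items are processed in exactly the order they were enqueued, which gives a total order over the actions.

```python
import queue
import threading

q = queue.Queue()
processed = []

def worker():
    # The single consumer: drains items one at a time, in enqueue order.
    while True:
        item = q.get()
        if item is None:  # sentinel value signals shutdown
            break
        processed.append(item)

t = threading.Thread(target=worker)
t.start()

# Producers add to the end of the queue.
for action in ["A", "B", "C"]:
    q.put(action)
q.put(None)  # tell the worker to stop
t.join()

print(processed)  # ['A', 'B', 'C'] -- always, regardless of timing
```

The guarantee comes from having exactly one consumer; add a second worker and the total order disappears again.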

https://twitter.com/ericnormand/status/1053662855801782272?ref_src=twsrc%5Etfw

Each one has its advantages. What we're more worried about are the shortcomings: how much are we willing to give up, or how much can we give up, in terms of control and still maintain our invariants? If we don't really care about the order but we can't have any interleaving, then we can give up order. We're fine, because in our situation we don't care.

I feel like I need a worked example for this one. I need to go through an example and show, "Hey, we don't care about this in this case, so we can use XYZ. We do care about that here, so we can't." A good example that I like to use a lot when talking about concurrency and parallel systems is a bank. There's a lot of concurrency in a bank, a lot of shared resources.

One thing that banks maintain and really make easy is writing checks. Let's say you write five checks to five different businesses. Each business is going to have its own process for depositing that check in the bank. They might mail it in, and then the mail is going to be delivered in different orders.

Even when they all get delivered on the same day, the bank gets them in a big bag of stuff. The bank is going to take them out in a different order from the order that you wrote those checks in. You wrote check 200, then 201, then 202, then 203, then 204. The bank might not get them in that order. Probably won't.

Does that matter? No, it doesn't. That's the whole point of a checking account: the way they process checks, it shouldn't matter. In fact, most banks will let you go negative during the day. You could have a negative balance on your account because you [inaudible 11:20]. You've been spending and depositing, so you take a credit and a debit on your account.

They allow for more debits than credits because they don't reconcile the account until the end of the day. Once all the credits are in for that day, then they say, "Hey, you've overdrawn." They don't want the order to matter, at least during that day. That's a reasonable amount of time for things to get out of order and then be put back together.

It's just addition and subtraction, credits and debits. The order shouldn't matter, nor how much or how often you do it. Of course, they do have policies like, "Oh, you've been overdrawn for two days." Whatever their policy is, if you've been overdrawn for two days, then they're going to charge you a fee.

As long as you're making those deposits and taking money out back and forth, they're not going to dock you for going under twice in the same day and then coming back over. They won't do that because they know things come in out of order. This is a concurrency thing.

It allows for a banking system where everyone is like a node in this network, making debits and credits to their centralized account, using slow, unreliable networks like the mail. I don't know. [laughs] I should break this down into how this works in functional programming as actions. Think of it like the book of your account.

The transaction ledger, the record of every transaction, is a data structure, an immutable, append-only data structure that allows things to come in out of order. Of course, if things come in in different orders in two different universes (say check one is deposited before check two in one universe, and check two before check one in the other), you're not going to get the same book.

https://twitter.com/ericnormand/status/1006974540038262785?ref_src=twsrc%5Etfw

The checks are going to be written down in the ledger in a different order, but the final balance will be the same. That is what you care about. You can let the order slip as long as the final balance comes out the same, which is all that matters in the end. The same checks get recorded regardless of the order, and the balance comes out the same.
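That property is just the commutativity of addition and subtraction, and it's easy to sketch. The amounts below are hypothetical, but the point is general: two ledgers recording the same checks in different arrival orders yield the same final balance.

```python
# The same four checks, arriving in two different orders
# (two "universes"). Amounts are hypothetical.
universe_one = [+500, -200, -75, -120]
universe_two = [-120, -75, +500, -200]

def balance(opening, ledger):
    """Fold credits and debits over an opening balance, in ledger order."""
    total = opening
    for amount in ledger:
        total += amount
    return total

# The ledgers differ, but because + and - commute, the balances agree.
print(balance(1000, universe_one))  # 1105
print(balance(1000, universe_two))  # 1105
```

The ledgers themselves are different data, but the answer we actually care about, the balance, is order-independent.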

This is one of those invariants that you have to have to have a good banking system. It allows for a lot of leeway in how things get implemented and in the policies at your bank: what order things happen in, or whether you can let different people share the work.

One of your employees will handle all of the blue-colored checks and another will handle all the red-colored checks, or one person will handle all of the odd-numbered checks and another the even-numbered checks. You can group them however you want. That's getting into some pretty cool algebra territory, which I want to get to in a different section.

We won't talk about that right now.

All this is to say that you are collapsing the number of possible states. The more states there are, the harder things are to manage, because states compound. Say there are two states in variable A, which is just a Boolean. Then you have two states in variable B, another Boolean.

That's four possible states. A can be true or false, B can be true or false. It's true-true, true-false, false-true, false-false. Four possible states. Then you add a third variable that's also a Boolean. Now you have eight. You add a fourth variable, you have sixteen. It doubles each time you add a variable.

And that's just for Booleans, which have only two states each. Now you're getting into a state space that is becoming very hard to manage. What you would rather do is have the states collapse at some points. Did the message from A come in first, or the message from B? That's a Boolean: A first or B first.

It's two different possibilities. If the order doesn't matter, boom, we collapse them down to one. It doesn't multiply anymore. This is a very important property that we can take advantage of to simplify the possible states that we have to reason about in our heads and that our software has to deal with. Every state you have is possibly another conditional, another if statement.
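The doubling, and the collapse when one variable stops mattering, can be sketched directly. This enumerates the state space of n Boolean variables and then merges the states that differ only in a variable we've decided is irrelevant (say, "did A's message arrive before B's?").

```python
from itertools import product

def state_space(n_booleans):
    """All combinations of n independent Boolean variables: 2**n states."""
    return list(product([True, False], repeat=n_booleans))

# The doubling: each added Boolean multiplies the state count by 2.
print(len(state_space(2)))  # 4
print(len(state_space(3)))  # 8
print(len(state_space(4)))  # 16

# Suppose the first variable ("A's message arrived before B's") turns
# out not to matter. States that differ only in that variable merge,
# collapsing the space by half.
collapsed = {state[1:] for state in state_space(4)}
print(len(collapsed))  # 8
```

Every variable you can make irrelevant halves the space you have to reason about, which is why these order-collapsing patterns pay off so quickly.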

https://twitter.com/ericnormand/status/1043153342229299201?ref_src=twsrc%5Etfw

This has been about 17 minutes. I think I'm done on this topic. I do want to go over a better example. I promise I'll get one soon in a future episode. Please, the AIs have...Those poor robots, they have no emotions. They feed off of your emotions.

The only way they can make good recommendations is by having massive amounts of data about our emotions as humans. If you please, press the Like button, press the five-star rating, whatever you have to do to tell those robots that you like this. Just do it, because they need it. Those poor robots. We haven't invented artificial emotions yet.

Please help them out. Click those buttons. Let them help other humans by recommending what's good. All right, see you later. Bye.