Reduce complexity at every step

This is an episode of Thoughts on Functional Programming, a podcast by Eric Normand.

Subscribe: RSSApple PodcastsGoogle PlayOvercast

Postel's Law states that a program should be liberal in what it accepts and strict in what it sends. What does this have to do with functional programming? How can this help us reduce complexity?

Transcript

Eric Normand: Postel's Law states that a program should be liberal in what it accepts and strict in what it sends. What does that have to do with functional programming and complexity reduction?

Hello, my name is Eric Normand and these are my thoughts on functional programming. Postel's Law is this thing that is used in the definition of the web, that you're supposed to for instance, make as much sense as you can out of HTML even if it's badly formatted.

If there is a missing and div tag, you're supposed to do your best as a program to keep parsing it, display what you can, make what sense you can out of it and try to present the data. At the same time if you're sending data, you're supposed to make it as strict and well formed as you can.

There is several elaborations on this law. It's also used in networking, that if you're receiving a packet, you're supposed to try to interpret it. But if you're sending a packet or you're using a protocol, you're supposed to restrict yourself to the strictest possible subset of the protocol that you that you need. Don't use some obscure part of the protocol.

In general, I think people look at this as one of the reasons why the web actually works as well as it does that everyone is trying to be strict, but then they're also being lenient when they're accepting something. But there are some criticisms of it too.

One of the criticisms is that. The biggest criticism, I guess, is that what happens is because programs are being lenient, it hides problems until later. If you do leave out that div tag and the browsers let it happen, and they still show the web page as. They might guess correctly and put the div tag in the right place and you look at their page and even though there's a big mistake in it, it still looks fine, and you don't notice as a user.

Then later someone comes in and they have to write software to read your HTML, and they don't implement the error correction in exactly the same way as the browser, so it's like a bug for a bug implementation. The software doesn't read it properly and it doesn't know what to do and it gets the wrong answer.

Whose fault is that? Is it the fault of this new software for not being exactly like a browser, or is it the fault of the HTML for giving the bad output? It kind of doesn't matter whose fault it is because the end result is the same, the software that you write doesn't work. But if you are the one writing that software and you're under contract, they're going to blame you not getting it to work and you might not get paid.

This is one of the problems with it, that it hides problems until later. You could be writing terribly formatted HTML, mistakes all over the place, but the browser hides those problems from you. I have had to deal with HTML that was badly formatted and it's a pain. It's a real hassle.

My general conclusion is that Postel's Law doesn't actually do what it's supposed to do, which is make a system more robust. What it does is it might make it more robust in the short term, so a new entrant into the system, like a new piece of HTML, is going to work more likely.

It's going to be more likely to work despite errors because all the existing infrastructure there is being lenient, and you are coding against that, the current leniency. But in the long run, that HTML might outlast some of those things. New things come in and they have to deal with all these problematic data that is out there on the network.

In the long run it's actually bad for robustness. The history of it is it comes from this principles in electronics, that if you are making a component like a resistor or a capacitor, you are supposed to accept a wider range of voltage than you output. If you even imagine all these things being chained up in series, that one could be noisy. Noisy meaning it is not sending a clean signal out from its output, but the next one in the line is going to clean up that signal and the next one in the line is going to clean up that signal.

So any noise introduced, each component is doing a little bit of work to clean up that signal. The end result is, you should have a clean signal, even if there are faulty components on your...I don't want to say faulty, noisy components in your circuit.

This works, because the components are like these self-contained modules. A single person is putting them together and is going to test them as a whole. They're not like a browser that is talking to hundreds of different computers all the time, getting signals from all of them and new things are entering into your circuit all the time.

It's a much more controlled system. It works better. Plus, voltage is a much less complex signal than an HTML message. I don't know if it makes sense to try and interpret this tree of HTML nodes. Do the best you can on it. It's not like a voltage that's a little too high. It's a different kind of animal.

It's a digital signal. That HTML is just not a tree. It shouldn't even be considered HTML. Anyway, I think about this a lot because we need to come up with, in software, these kinds of principles. But they need to also work for us. We don't want to just blindly do them. One thing that we don't have so well-established is this notion of real components.

https://twitter.com/ericnormand/status/1026845611738443776?ref_src=twsrc%5Etfw

I was talking to Jerry Sussman when he was at my conference, Clojure SYNC, back in February. He gave the example that back before...he said it was something like 50 years between the formalization of electricity and the scientific formalization of the rules around it, the laws, and the development of the different components that we take for granted today; The resistors, the capacitors.

It took a lot of time to figure out that you just want a little thing that's a unit of resistance that you can reason about and put together into a bigger circuit. You don't want to be thinking about the resistance on the line. "How do I get a resistance in there?" "How do I do a little capacitor?"

These components didn't exist at first. It took time to develop those. We're kind of in that same era with programming. That we don't know the components that can let us have a very reliable system that we just build up out of these pieces.

I think a lot about that, and I don't have an answer. I think that a lot of the systems that we think of as most reliable are attempts at that. As an example, Unix is trying to make everything into processes, so programs and text streams.

The programs can communicate through text streams, you can pipe between them, they can read in files from the file system which are text streams, etc. You have something like Smalltalk, which is everything's an object with message passing between the objects. This is a little communication system that you've set up within your software. Something like Erlang is similar. Its processes, talking with messages. Even something like Lisp. It's like everything is a [inaudible 10:27] . You have this unifying system going on.

Those attempts, we need to come up with something. One of the things that I've been thinking about is, instead of saying we need to be liberal in what we accept and strict in what we send, maybe there's some other thing that we could do that would be useful now, even if we don't have some unifying theory of programming.

It would be useful now in your software. What would that be? What I've been thinking about is that part of the role of functional programming. The thought process you go through when you're doing functional programming is you want to reduce complexity. So if we could make each function reduce the complexity of the output.

https://twitter.com/ericnormand/status/1025758650080616453?ref_src=twsrc%5Etfw

This goes back to the episode I did on variants. Where the idea is, you want to choose an output type that is only as complex as it needs to be, to represent the thing that it is calculating. That is the idea I'm proposing now. It's not fully fleshed out. This a discussion, I'm opening a discussion.

If you make each function reduce the complexity instead of expand the complexity, your system will be less complex overall. What increases complexity? Anytime you have cornering cases, any kind of branching that you need to do, all of that is increasing complexity.

If you can make it reducing the number of corner cases. Not just corner cases, cases in general. You are reducing the complexity to what is strictly needed to represent the thing you want to represent. As an example, it's very common to use a list or some other container to represent multiple values. But sometimes, there's no use for that empty container.

There's never going to be a time when you have zero things. That's just like a little corner case that you're introducing into your system. I know we don't typically have containers that cannot be empty. They'll always require a thing. It's a place where we often add complexity because we're giving it a case that cannot possibly exist. We're able to represent it, but it shouldn't be ever possible.

Also, as dynamic typers, we have a tendency to return different types from the same function. We'll maybe add some meaning to different types. "A string means this and a number means that." That thing is also adding the complexity. Any time you return a different type from the same function, you have to check that type somewhere else.

https://twitter.com/ericnormand/status/1025033607478501378?ref_src=twsrc%5Etfw

You're adding in an if statement, at least one because you might use that function in multiple places. You have to add an if statement somewhere else that knows all the types that this function is going to return.

You've smeared that complexity all over your software. It would be much better to capture it some other way as a single type with one case. All right. This has been a little rant where I'm trying to explain this idea of complexity reduction. Our software might get more robust. It's just an idea. I don't know how you might test this kind of thing.

I don't apply this everywhere in my code. I can't even recommend it. It's just an idea. If you like this idea, if you hate this idea, let me know. I want to discuss this. I want to understand this better. I want to see bounce out of other people.

On Twitter, I'm @ericnormand. You can also reach me by email, eric@lispcast.com. Please subscribe and or like the podcast wherever you get your podcast. All right, thanks so much. See you later.