My response to Out of the Tar Pit

Out of the Tar Pit came out 14 years ago, and it was a big influence on my thinking. I've thought about it a lot, and I want to share some extensions and refinements of the ideas in the paper. Specifically, I hope to present a more objective definition of complexity and to refine the idea of Essential vs. Accidental complexity.

Transcript

Is software complexity really synonymous with "hard to understand"? And is the line between essential and accidental complexity really the line we want to draw? Hi, my name is Eric Normand, and I help people thrive with functional programming. In this episode I want to give my response to Out of the Tar Pit. In the last episode I read some excerpts from it and commented on them, but I kept those comments mostly to interpreting what the paper says. I've also thought a lot about this paper, which has been very influential on me, and I feel like in writing my book, Grokking Simplicity, I've taken these ideas a little further and made them a little more useful. So let's get started. The two things I want to focus on are the two things I asked questions about: the idea of complexity, and the idea of essential versus accidental. In the paper they talk a lot about complexity, but they never define it, which I found problematic after several readings. I wanted to know more, but the hint is that they're really talking about anything that makes software hard to understand. The cause and effect there are a little blurred, but in the end they just parenthetically gloss it: complexity (anything that makes software hard to understand). I think complexity needs to be defined a little better. It needs to be defined objectively, not just subjectively as "hard to understand," because different people come from different backgrounds, and what is hard for one person might not be hard for another. I have to briefly mention Rich Hickey's work here, where he does some etymology research on the word "simplicity," which is more or less the opposite of complexity. In Rich Hickey's sense, simplicity means that different concepts, different parts, are not intertwined together.
It's not about the number of parts; it's about how intertwined they are, and a simple system has less intertwining than a complex one. If you can pull apart ideas, relationships, dependencies, those kinds of things, into smaller pieces that don't depend on each other, then your system is simpler, even if you now have to put them back together. The simplicity is in the fact that they are separate. I think this is a good definition. It's objective, and it's useful in that you can start to point out complexity in systems where things are unnecessarily dependent on each other. But I don't think that's the kind of complexity they're talking about in the paper. What I think they're trying to get at is that there are a lot of nonlinearities in software: things that grow faster than linear. That means geometrically, exponentially, quadratically, and there are combinatorial explosions, meaning factorial growth. Those are the complexities that make software difficult. The paper goes over a couple of sources of complexity: state, and what they call flow of control. State is definitely a source of complexity, and the key to why is that adding one bit of state doubles the total number of possible states of the system. You add bits linearly, and each bit doubles the state space, which is exponential growth. It's super-linear. And this is the case with mutable or immutable state: even if it's immutable, adding one more bit doubles the total number of possible states. Now, this is considering unbounded, undifferentiated state. Mutable state makes it worse, because now you're dealing with time and change over time, and it becomes an incalculable combinatorial explosion of all the possible histories that led up to the current state you're in.
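The bit-doubling can be sketched in a few lines. This is just an illustration of the arithmetic, not anything from the paper:

```python
# Each independent bit of state doubles the number of possible
# states the system can be in: n bits give 2**n states.
def state_count(bits: int) -> int:
    return 2 ** bits

print(state_count(10))  # 1024
print(state_count(11))  # 2048 -- one more bit, double the states
print(state_count(64))  # about 1.8e19, far beyond exhaustive testing
```

Adding a bit always exactly doubles the count, no matter how many bits you already have, which is why the growth is exponential rather than quadratic.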
It becomes impossible even to analyze how many different states you could be in and what they would all mean. So why is this important? Well, the more states you have, the more difficult the system is to understand, of course, but you also have to add code to handle all those states. Only some states are correct. Imagine a chessboard: some positions could have come out of an actual game, but there are many positions you can reach by moving pieces around randomly that could never arise from the rules of chess. Although you could physically arrange the pieces that way, the system should never get into those states. Likewise in software, you might have an invariant like: if this Boolean is true, this other piece of state has to be less than 10. If the Boolean is true and the number is greater than or equal to 10, you have a problem; you're in an inconsistent state. So you need more code to make sure you stay in the consistent states and out of the inconsistent ones. A lot of work has been put into this. You can control it with type systems, and certainly with more advanced tools like dependent types or liquid types, where you flow the constraints on each piece of state and refuse to accept a program that could reach an inconsistent state. All of that is basically more machinery to address this problem. The more state you have, and it grows exponentially, the bigger the problem is: how do you stay in a consistent state with all this stuff? Mutable state makes it even worse, way beyond that, probably combinatorial. That's why it's a bad thing; that's why it makes software hard to write. Another nonlinear thing is branches.
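The Boolean-and-number invariant mentioned above can be written down directly. The names and the value range here are hypothetical, just to make the consistent/inconsistent split countable:

```python
# Hypothetical invariant from the example: when `flag` is True,
# `value` must be less than 10. Any other combination of the two
# pieces of state is inconsistent.
def is_consistent(flag: bool, value: int) -> bool:
    return value < 10 if flag else True

# Enumerate a small slice of the state space and count each side:
states = [(f, v) for f in (True, False) for v in range(100)]
ok = sum(1 for f, v in states if is_consistent(f, v))
print(ok, len(states) - ok)  # 110 consistent, 90 inconsistent
```

The point is that both the consistent and inconsistent regions grow with the state space, and every inconsistent region needs code (or types) to fence it off.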
Each branch you add, say an if statement with a then and an else, a very simple branch with just two possible outcomes, doubles the number of paths through your software. The interesting thing is that our software is already so complex that we don't even think about this. We write a feature that needs a new branch and we just add it. We don't think, "Oh, I just doubled the number of paths through my code," because once you're at ten to the sixty-fourth possible paths through the code, well, what's ten to the sixty-fifth? It feels the same. You can't sense the scale at that point, and it's the same with any kind of exponential growth: we just can't sense how quickly it gets out of hand. So branches, again, make the code hard to understand. Getting rid of goto was really useful here, because at least you can see the structure of the branching in the code. But it doesn't make the problem go away; it just makes it a little more contained. A really hard one is what I call timelines. This is a generalization of threads, but in a distributed system you also have different machines working in parallel. They're not threads running in a single process on one machine; they're on different machines. But each one is progressing forward through time, and at some point they might communicate with a message over the network. Without coordination, you have uncoordinated message passing: you don't know which one will send its message first, and when they do send, you don't know which message will arrive first. It turns out that this is a combinatorial explosion as well.
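The path doubling is easy to see if you treat each two-way branch as a binary choice. This sketch assumes the branches are sequential and independent (real code often has fewer reachable paths, which is exactly the hard part to reason about):

```python
from itertools import product

# Each independent if/else contributes one binary choice, so the
# distinct paths through n sequential branches are the n-bit tuples.
def paths(n: int):
    return list(product((True, False), repeat=n))

print(len(paths(3)))  # 8 paths
print(len(paths(4)))  # 16 -- one more `if`, double the paths
```

One added branch always doubles the count, which is why nobody notices any single addition even as the total explodes.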
The number of possible orderings of the steps between two threads, or two timelines on different machines, explodes. That's the reason distributed systems are so hard: this explosion of different possible ways the system can progress. So we need to add more code, more work, to get it under control and make sure that only the orderings of message passes that lead to the correct result can happen. But we have this factorial explosion of orderings, and there are just a lot of them. If you have two threads running and each thread has 12 steps, the number of possible orderings is already over 2.7 million. And 12 steps is not a lot, and two threads is not a lot. So you have millions of different ways those two processes can execute. Are they all correct? Probably not. So how do you make sure you don't land in an ordering that's rare but very incorrect? That's the problem with distributed systems. Okay, so in the paper they suggest two things: you can avoid or reduce the complexity, and you can separate out the complexity. I think reducing is the most powerful. Can you reduce the number of states? Say you count the number of bits of state in your system. If you add a bit, the state space doubles, but the nice thing is that if you remove a bit, it's cut in half. Every bit you remove cuts the number of states you can be in by half. That is really nice. Now, usually there's a limit: you can't just keep removing bits and still function correctly. But you can also impose structure, where your variables are going to be thought of in terms of typed pieces, right?
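The ordering count has a closed form: the interleavings of an m-step thread and an n-step thread (each thread's own steps staying in order) number "m+n choose m". A quick check of the figures above:

```python
from math import comb

# Two threads with m and n atomic steps can interleave in
# comb(m + n, m) distinct orders: choose which slots in the merged
# sequence belong to the first thread.
def interleavings(m: int, n: int) -> int:
    return comb(m + n, m)

print(interleavings(3, 3))    # 20 -- small enough to audit by hand
print(interleavings(12, 12))  # 2,704,156 -- hopeless to audit
```

So two 12-step threads already give about 2.7 million possible executions, and the count grows factorially as threads and steps are added.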
And so you're chopping up the state: this piece will represent an integer, this one a double, this one a string. Those types really limit the number of states as well. There are other techniques; I won't go into them all. Managing mutable state is also a huge issue. One of the biggest problems is multiple threads operating on mutable state, where they can write over each other, or one reads while another is mid-write and sees an incomplete value. You want to avoid that entirely; some kind of transaction, something like that, can really help. The other thing about viewing complexity as growth is that if you can keep the mutable state really small, it's not that big of a problem. At the beginning of the curve, if you have two threads and only three actions in each thread, there are only 20 possible orderings. You can enumerate 20; you can say, these ones are right, these ones are wrong, and I'm going to avoid the wrong ones in some ad hoc way. Similarly, this happens a lot in Clojure systems: you'll have just one mutable variable. One mutable variable is not that hard to manage; with two or three, maybe it's still possible. Okay, so the other thing I want to talk about is the divide between essential and accidental. In the paper, they are extending, or modifying, an idea from the original paper, No Silver Bullet, by Fred Brooks, who in turn was using an idea from Aristotle. What Brooks was trying to say is that software development is hard, and that there are some parts that are essentially hard, which we will never be able to get rid of, and some parts that are accidentally hard, meaning they could become easier.
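One standard way to get the "some kind of transaction" mentioned above is to guard the shared mutable state with a lock, so each read-modify-write becomes one atomic step and the dangerous interleavings disappear. A minimal sketch (the counter and step counts are made up for illustration):

```python
import threading

# A single mutable counter shared by two threads. Without the lock,
# the read-modify-write of `counter += 1` could interleave between
# threads and lose updates; with it, each increment is atomic.
counter = 0
lock = threading.Lock()

def worker(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:  # the "transaction": read + write as one step
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 20000, deterministically
```

This is the "keep the mutable state small" strategy in miniature: one variable, one lock, and the million-way ordering problem collapses to a sequence of atomic increments.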
His examples of the accidentally hard were things like: you have to print out the punch cards, get in a queue to have your code compiled and run, wait two weeks to get your results back as a printout, then go through the whole cycle again, so it takes weeks and weeks to debug your code. In the paper, he talks about how that was already being removed: people were getting their own workstations or terminals, there was time-sharing, and you could send your code over the network and get results back within minutes. All of that was shrinking; that's the accidental complexity. The essential complexity was having this huge list of requirements, thinking hard about the problem, and designing a solution. All of that, he said, was essential: you'll never be able to get rid of the mental part of programming. A lot of people disagreed with him, on the grounds that stuff he called accidental used to be thought of as essential. For example, you used to have to do your own memory management and design a memory management system. With garbage collection, you don't have to do that anymore, so it has become accidental, but only because someone was clever enough to see past limitations we had been treating as essential. The authors of Out of the Tar Pit were thinking along those lines: you get that cleverness by doing a thought experiment. If you imagine the ideal world, how would it look? Then you could draw a much stricter line between essential and accidental.
Okay, so the line they draw in this paper is: imagine the ideal world, where you don't have to think about performance, or cost, or memory usage, or anything like that, or about how hard the problem is to describe, because in the ideal world we'll have the perfect language for describing it. What would you do? Basically, they say you would take the informal requirements from the users and formalize them: get rid of ambiguities, express them more clearly and more formally, and then run the formal specification. You just hand it to the system and it runs. By doing this, they can say that anything that is not part of the formal specification is accidental; in theory we could get rid of it. You still have the mental work of translating the informal into the formal, but everything else is accidental. Now, I actually think this is a great definition; it makes things objective. But when I run it by people who haven't read the paper and aren't versed in the culture around it, they're confused by it. They wonder what the point of calling it accidental even is. So I've broken this down differently, because there are still a lot of practical considerations these days. For instance, what platform are you going to run on? iOS? The web? Is this a distributed system, or does it run on one desktop machine? These questions are business requirements, and they aren't really answered by the idea of formalizing the informal requirements and running them. Another thing is that a lot of the informal requirements are about speed; they are about performance.
It's about things like: this has to run on an embedded device, so it has to work within one megabyte of memory. The idea that that is just accidental and optional is not the case; it's a part that will have to translate into the formal requirements. So I think this line doesn't really work in the real world. It works in some academic idea of what the software development process is: you ask the users, they say, "We need to print out a report that looks like this," and then you just formalize that and run it. That's not how software development works. So I've chopped up complexity into three different areas. First, there's domain complexity. The classic example: if you're making rocket science software, your domain is rocket science, and rocket science is complex. It has a lot of moving parts, ideas, and related concepts, and you have to capture all of that. That's the first source of complexity; there's no way you can justify calling yourself rocket science software if you don't capture the complexity that's in that domain. Then there's architecture complexity. Architecture complexity comes from choices like: we're going to run on a workstation, so we can assume a certain processor speed, a certain amount of RAM, a single user, all sorts of things. Or: no, we have to run on the web; it's a business requirement; it needs to be in the browser with a server back end. Now you're already introducing the complexity of dealing with HTML and of having a distributed system, because you're talking to the back end. And then there's development complexity. This is all the complexity we introduce because of how we program the system.
Sometimes we make mistakes, or we don't do it in the most ideal way possible. This is stuff like the complexity of managing a thread pool: we say we need to use all our cores, so we're going to manage this thread pool. Or we decide to do this with mutable state. That's introducing development complexity. Or we use conditionals, and now we have the kind of complexity I talked about before. Okay, so those are the three kinds of complexity. Let me summarize the contrast again. The paper cuts things up into accidental versus essential, with the idea that, at least in principle, you can get rid of 100% of the accidental. In practice you might need a little bit, but in principle you can get rid of all of it. And they make statements like: you should be able to run just the essential part of the system, although it might be really slow; the accidental part might be optimization hints for the essential part, or caching, or something like that. I'm suggesting that it's much more useful to talk about the level at which the choices that introduce the complexity happen. Domain complexity corresponds roughly to the essential complexity of the thing: it's the requirements. This software has to do this function; that's the domain. Architecture is the choice of what language to use, what machines it's going to run on, whether you choose microservices or something like that; all of that is architecture. And development complexity is all the stuff we have to do to manage the complexity we introduced through the other two sources. Once you've made the choice to be in JavaScript in the browser, you've got a bunch of complexity you have to deal with.
And you're going to have to add more development complexity to handle and manage that other complexity. When I run this by people, they say, "Oh yeah, that's true. That's a very practical way of thinking about it." You ask: where did this decision that's causing a lot of pain come from? Oh, that's a really hard formula in rocket science; that's why it's hard for us, but we have to do it, because there's no way we can run the business without it. Or: we chose to be on the web, so we have to deal with CSS; that's why we're having this pain, but we need to be on the web, so we're just going to have to deal with it. Or: we chose this library to handle this thing, and it's giving us a lot of pain, and we can get rid of it, because it's a development choice, and those we can change relatively easily. Okay, so let me summarize. This has been my response to Out of the Tar Pit. The paper came out 14 years ago, so there's been a lot of thought and discussion on it; it's a very influential paper. The reason I'm taking the ideas further, perhaps tightening them up, is that it's a good enough base to build on. I'm not in any way saying they should have gotten it better. These are the kinds of discussions that happen over long time periods: Brooks's original paper was written in the 80s, and it took 20 years to respond with this level of depth. Now I'm responding; I don't know how deep my response is, and you can be the judge of that. Okay, so I say that complexity is nonlinearity, and nonlinearities are inherently hard to understand. They're hard to get right, because when something grows very quickly, it's creating this huge multidimensional space.
There are going to be a lot of parts of that space that are okay to be in, and a lot of parts that are not okay to be in, and making sure you stay in the okay part becomes harder. It needs more code, more understanding; even if you do understand it, you need more tools, more analysis, more work to stay in that space, because both sides are growing. The consistent-state side and the inconsistent-state side are both growing exponentially. But the other thing about it is that if you can keep it small, it's not that big of a deal. If you have a small number of threads, it's actually not hard. If you have a small number of mutable variables, it's not that hard. It's once you get to the elbow of the curve that it gets super complicated really fast. Some examples of sources of complexity: state, both mutable and immutable; branches; threads, or timelines. There are more, but I'm not going to go into all of them. These are things that really grow exponentially, and our intuition about them can be wrong. When we add a new bit to our system, we're not thinking that we're doubling the number of states, doubling the difficulty of our problem. But we are. It's just that it's been doubled so many times already that it's too big to comprehend. It's like saying, "We're doubling the size of Mount Everest today." I can't even imagine how big it is already; it's already huge, so what's the difference? So you should be spending work to reduce this. If you remove one bit of state from your system, you are halving the difficulty of the problem. One thing I forgot to mention before: another thing we can do is use something with super-linear power to manage it.
You could address the problem linearly, saying, well, I'm going to add a line of code to handle each of these impossible or inconsistent states. But the inconsistent states are growing exponentially, so as the state space doubles and doubles and doubles, I'd have to double and double and double the size of my code. If instead you have a system managing your state whose capacity grows super-linearly, you can keep up with the problem without growing your code base out of control. We need to be looking for those kinds of solutions. One idea for a solution, not a silver bullet, but it's going to help, is using trees, because the amount of data a tree can store grows exponentially with its depth. Put the other way around, the depth grows only logarithmically with the amount of data. By using that, you're able to manage more and more memory without the depth of the tree getting out of hand. We need to be thinking of things like that in order to manage these sources of complexity. All right, next is the division between essential and accidental. I disagree with this division. In a practical system, we have to identify what level of choice led to the complexity, because just because we can get rid of it in the ideal world does not mean we can totally ignore it in the real world. So I think that dividing it into domain, architecture, and development makes more sense. It lets you say: okay, we chose this complexity; we're dealing with it because we said we wanted to run on embedded devices, and that was an architectural choice we made. Or: we're doing rocket science software, so we have to deal with all these different kinds of rocket fuel.
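The tree point can be made precise: a tree with branching factor b and depth d can hold up to b**d leaves, so the depth needed for n items is about log base b of n. Clojure's persistent vectors, for instance, use 32-way trees. This sketch just computes the required depth:

```python
from math import ceil, log

# Depth of a b-ary tree needed to hold n leaves: ceil(log_b(n)).
# Capacity grows exponentially in depth, so depth grows only
# logarithmically in the amount of data.
def depth_needed(n: int, branching: int = 32) -> int:
    return max(1, ceil(log(n, branching)))

print(depth_needed(1_000))          # 2 levels of a 32-way tree
print(depth_needed(1_000_000))      # 4
print(depth_needed(1_000_000_000))  # 6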
Each one has a different routine for calculating how it behaves. But it has to run on this embedded device, which is very limited in resources, so it's complicated. Right. Okay, well, that's it. If you want to respond to my response, I am totally open. Please go to lispcast.com/podcast. There you'll find links to subscribe, ways to find me on social media and email, and all the past episodes. I'm looking forward to hearing from you. The ideas I've been talking about are things I've been thinking about for my book, so there's some synergy going between this podcast and the book: I'm trying to talk things out and then write them into the book, going back and forth. I really appreciate it when I get questions and comments, because it helps me solidify my understanding. Awesome. My name is Eric Normand. This has been my thought on functional programming. Thanks for listening, and see you next time.