Lambda: The Ultimate GOTO

In this episode, I read from Lambda: The Ultimate GOTO. We learn whether avoiding GOTOs makes your code better and how to make function calls fast.

Download the paper here

Transcript

Folklore states that GOTO statements are cheap while procedure calls are expensive. This myth is largely a result of poorly designed language implementations. The historical growth of this myth is considered. Both theoretical ideas and an existing implementation are discussed which debunk this myth. It is shown that the unrestricted use of procedure calls permits great stylistic freedom. In particular, any flowchart can be written as a structured program without introducing extra variables. The difficulty with the go-to statement and the procedure call is characterized as a conflict between abstract programming concepts and concrete language constructs. Hello, my name is Eric Normand and I help people thrive with functional programming. Today we are reading a paper from October 1977. It is AI Memo 443 from the Massachusetts Institute of Technology, Artificial Intelligence Laboratory. This one actually has three alternative titles; the author seemed to have fun with the naming of this paper, so I'll read all three: Debunking the Expensive Procedure Call Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO. This is by Guy Lewis Steele Jr. Just a little background on Guy Steele: he is one of the main implementers and designers of the Scheme programming language. The research that's published in this paper is part of that work, part of developing a very fast compiler for Scheme. I would call it an essentialist project to figure out what made Lisp what it was, and what was the minimum language required to build a very powerful system, a very important question. This paper comes out at a time when people were discussing structured programming, should we use go-tos or not, and this paper tries to shed a little bit of a different perspective from what I think most people thought, which was: what alternatives do we have? This is trying to shed a different light on it, so I really appreciate this paper.
It is part of a series of papers that use Lambda: The Ultimate Blank in the title, and those papers are longer and more technical. This one is a little less technical than the others, and so I thought it would be good to read this one. The other ones are like, "Here's a bunch of code," and then it's like a three-page code listing that doesn't make for great podcasting, but we'll be able to touch on some of the same themes that we see in those papers, and I'll probably figure out a way to talk about those anyway in the future, but today we're on this paper. There is a little bit of code, and I'm going to skip over most of it. There's some assembly code as well. I'll just have to give a high-level description; I'll do my best. The paper is freely available, so you can look at the code yourself if you want to, but we're just going to skim through it. Another thing is, this is a very dense paper. I don't know if you've seen the previous episode where we did Lisp: A Language for Stratified Design. That was a very dense paper as well, and I went through it with a highlighter, marking what's important, and I ended up highlighting every line; that's not going to work. I can't read the whole thing. I kind of feel like that with this one: there's not much to leave out, it's very dense. I do leave out quite a bit, and so we're just going to try this. It might be a two-hour one again. This is from the introduction, I'll start. For the past five to ten years, the computing community has been embroiled in debate over structured programming, and as a special case, the go-to statement. Just right there, just right there, you know that this is starting off really strong. This "as a special case", he's bringing a perspective to this. The broader discussion is about structured programming, but there's a specific discussion within that about the go-to statement.
This seems like an obvious thing, but as a writer myself, I know how precious these truths that you can find and encode in language in your paper are. They don't happen that often. You might know a thing, but you don't think to write it down. I don't know, these kinds of things are the things that really shine to me and show how dense the paper is going to be. He's already done it in the first sentence. He's made this distinction. On the one hand, many authors have observed that programs written without go-tos are often more perspicuous than programs written with go-tos by the same programmer. I had to look this one up; it's not an everyday word that I would ever come across. It just means easy to read, clear. That is, the avoidance of go-tos is a discipline which has been observed to produce programs which are easier to understand and maintain. This has been observed, we believe that, but then there are all these objections. On the other hand, a certain portion of the computing community has rebelled against this discipline. There are several reasons for this rebellion. Among these are: one, some programmers fear that their expressive power or style will be cramped if go-to is taken away from them. He points out that this is true in some languages; some languages are kind of designed around the go-to and you wouldn't be able to do everything you needed to do. But there are a lot of more powerful languages that make that not an issue. In all fairness, it should be pointed out that go-to is a basic universal primitive with which, in conjunction with a conditional, almost any other control construct can be synthesized if the language designer neglected to include it. What he's saying is, go-to is universal; the jump to another piece of code, combined with a conditional, gives you everything you need to be Turing-complete. In some cases, you might consider it a kind of escape hatch for the language designer.
It's like the designer saying: I don't know if this language is giving enough, so I'm going to give them this one thing that they might need to use sometimes, because I'm not sure if they've got enough power. Two, some programmers feel that the imposed discipline doesn't make any difference because one can write incomprehensible programs without go-to. This piece of logic is slightly false, for while avoiding go-to does not guarantee comprehensibility, it often does increase the probability of producing comprehensible code given our current cultural biases and programming style. Yes, you can write bad code without a go-to, but it's more likely that you're going to write good code, considering all the other disciplines we have. Three, there is some feeling that go-to is a cheap statement, while other constructs, such as do, case, etc., are more expensive. Everyone knows that a go-to gets compiled into one machine instruction while do produces many instructions. This is the main thing he's going to go into in the paper. The paper's first title is Debunking the Expensive Procedure Call Myth. Go-to is considered cheap because it's just a jump, whereas all these other things can get compiled into these multi-instruction things. The example I wish to consider here is the procedure call. Everyone knows that unlike go-to statements, procedure calls are expensive. Indeed, these feelings are not unjustified given past and current programming language implementations. Because procedure calls are expensive, we tend to avoid using them in our code. Reading between the lines, he's saying language designers and compiler writers have made a mistake. They've made procedure calls expensive when they don't need to be. They've perpetuated this myth that procedure calls are inherently expensive. Recent practitioners of programming discipline have advised us not to be afraid to use procedure calls, as the readability they afford is worth the inefficiency.
This is like saying that cars with square wheels are all right because transportation is worth a bumpy ride. We really ought instead to concentrate on improving our wheels. He's not that kind of apologist saying it's worth the inefficiency, that we should use procedure calls anyway because they make our code more readable. He is saying they make our code more readable, but he's also saying they don't have to be slow. They can be really fast. Of course, he's got the compiler to back it up, because he's working on Scheme. I would like to suggest here that procedure calls have been given a bad press for a number of reasons. There is no reason that procedure calls must be expensive. Okay, then he outlines the paper. I'm not going to do that; we're just going to move forward. That was the introduction. Section A: procedure calls as GOTO statements. Remember, the third title of this paper is Lambda: The Ultimate GOTO. What he's saying is that lambdas are a kind of go-to, and maybe an even better kind of go-to. Right, so he's going to explain that here. In studying the nature of procedure calls, it is appropriate to study the LISP language. LISP, more than any other language, uses procedure calls as the primary means of expression. In particular, the arbitrary distinction between built-in operators like multiplication and user functions is almost completely non-existent. LISP is an expression language in that every program is a function which returns a value, but this value may be ignored. Many programs produce interesting side effects and then return nil. Okay, so LISP is made out of functions, that's what he's saying, and there's very little distinction between the built-ins and the user-defined ones. So what better language to study to find out how to make function calls fast than LISP, right? And they use the word procedure a lot. I don't know why, instead of function, they use the word procedure.
My guess is that they want to distinguish it from a mathematical function. These don't have to be mathematical; they don't have to be pure in LISP. And also it's probably the term that most other programming language communities use, that these are procedure calls. And basically, you can simulate a pure function using a procedure, because it has a return value. Okay. The usual way to implement function calls in LISP is by using a stack. When a function is called, a return address is pushed onto the stack. When a function is done, it pops a return address and goes there, after stashing the value to be returned in a register. This is, in fact, how procedure calls are implemented in many languages. You have a stack, you push the arguments, you push the return pointer, and then you jump into the code of the procedure. And then the return statement is going to look up the return pointer that you pushed before and jump back there. Okay, this is where he shows a small procedure and how you would push onto the stack with some assembly instructions and jump and stuff like that. So I'm going to try to explain. He shows how to safely replace a push and a jump instruction. There's one instruction called PUSHJ, which pushes and jumps, and a POPJ, which pops and jumps, and he replaces the pair with a single jump instruction, and he does that very effectively with assembly. He shows: look, we've got a PUSHJ and a POPJ in a row; we don't need both of those, we can just jump to where we need to go. So you don't need to push the return pointer onto the stack and then pop it right off directly. You don't need to do that. You can just jump directly there. Okay: all that has changed is the useless pushing of BAR3, which is the return pointer, and the encoding of a procedure call as a jump. That is, a go-to.
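As a quick aside, the push-a-return-address discipline he describes can be modeled in a few lines of Python. This is a toy sketch of my own, not the paper's code (the paper works in PDP-10 assembly); all the names here are invented for illustration:

```python
# Toy model of the classic call discipline: push a return address,
# then jump to the callee; on return, pop the address and jump back.
return_stack = []

def call(func, *args):
    return_stack.append("return-address")  # push (models pushing the PC)
    result = func(*args)                   # "jump" into the callee's code
    return_stack.pop()                     # pop and "jump back" (the POPJ)
    return result

def square(x):
    return x * x

print(call(square, 5))  # the return stack is empty again afterward
```

The point of the model is only that every call pairs a push with a pop; the paper's observation is that when a PUSHJ is immediately followed by a POPJ, both can be replaced by a plain jump.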
Okay, so he's shown that in a lot of cases, a procedure call is just a jump, just one instruction, just as cheap as a go-to. This approach is quite general. If we rewrite a program in LISP notation, it is clear that the last thing that program does is a procedure call of some kind. If the last thing a procedure does is call another procedure, that call can be compiled as a go-to. Such a call is called tail-recursive, because the call appears to be recursive, but it's not, since it appears at the tail end of the caller. Okay, so tail-recursive; you'll also hear such a call called a tail call, to indicate it's not really recursive, because it's not always calling itself; it could be calling another procedure. So this is a tail call, and a tail call can be compiled just as efficiently as a go-to. This approach can be made even more uniform by compiling every procedure call as a go-to. I think I should explain this tail call a little bit more. The normal idea, the sort of default for a procedure call, is you jump into the code of the procedure, you run it, and then it jumps back to where you were in that outer function. So I'm in function A, I call function B, that jumps into B, runs the code, and then jumps back to where I was in A, like the next line. So you have to do a push; you have to save the pointer to the place in the code where you're going to jump back to. So they use the stack; you use stack frames to do that. Stack frames hold the arguments and the return pointer. Maybe they also hold other stuff like local variables, but let's ignore that for right now. So what he's saying is, if the last thing you do is call another function, the very last thing, why are you coming back? Why are you jumping, running procedure B, and then jumping back to the end of A? There's nothing there to do. You're done. You could just jump to the code of B, and run it using the same stack frame, right?
You'd have to manipulate the arguments in the stack frame, but you just jump to it, and then when you're done with B, you jump back to whoever called A, right? Because the return value of B is going to be the same as the return value of A, so you've got the right thing. So you're able to compile this call as a go-to. Okay, so now he's going to go even further. This approach can be made even more uniform by compiling every procedure call as a go-to, not just tail-recursive ones, but every procedure call. The idea is that a return address is pushed on commencement of the evaluation of an argument rather than application of a function. This provides a uniform compilation discipline. Okay, so what he's saying here is, let's say you've got this function F, and it takes an argument, but that argument is another function call, right? So you have to call that inner function first before you can pass the value that it returns to F. So instead of pushing the return pointer when you're about to do the function call, you do it when you first start evaluating the arguments. Okay, so he shows some code, and it shows that we're going to push a return pointer, then set up the arguments, and then jump. And this is very common: every language implementation has a kind of stack discipline where procedure calls are going to follow the same convention, right? So you could say, well, we're going to always push the return pointer first and then push the arguments, because the arguments are variable in number; sometimes there's one, sometimes there's five, sometimes there's zero. So why don't we make the known thing, which is that there's always a return pointer, the thing we put first, right?
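As an aside, the tail-call-as-go-to idea he explained a moment ago is easiest to see by rewriting a tail-recursive procedure as a loop, which is effectively what a compiler that treats tail calls as jumps produces. This is a hypothetical Python sketch of mine, not the paper's example:

```python
# A tail-recursive sum: the recursive call is the LAST thing done,
# so there is no work left in the caller's frame to come back to.
def sum_to_rec(n, acc=0):
    if n == 0:
        return acc
    return sum_to_rec(n - 1, acc + n)  # tail call: nothing to do after

# What a tail-call-optimizing compiler effectively emits instead:
# rebind the arguments in place, then "go to" the top of the procedure.
def sum_to_loop(n, acc=0):
    while n != 0:
        n, acc = n - 1, acc + n        # set up the new arguments
    return acc                         # the loop IS the go-to

print(sum_to_rec(100), sum_to_loop(100))
```

The two functions compute the same thing, but the second reuses one frame forever, which is exactly the "run B in the same stack frame" move described above. (CPython itself does not perform this transformation; that's why the hand-written loop is needed here.)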
Then you push all the arguments, then you jump, and maybe the top thing you push is the number of arguments you were called with, something like that, some discipline that allows you to interpret what's on the stack, right? So in the first case he shows, you push the return pointer, then you push the arguments, then you jump, right? And then the function you jump to is going to pop all the way back down to that return pointer and then jump back. Okay, but then he shows: what if we did the opposite, and we pushed the arguments first, then we pushed the return pointer, and then we did a jump, right? So he's using jump; it's just a go-to with one extra push. So he's using this uniform compilation discipline: every function call is just a jump, but sometimes there's a return pointer that you push on first. But then he shows, look, we've got a push right next to a jump; that is, we could use a peephole optimizer. Well, let's read. If our machine has a PUSHJ instruction, then a peephole optimization can now transform the push-jump sequence into a single PUSHJ instruction. Thus we see that a procedure call can be uniformly treated as a go-to, with the PUSHJ instruction considered an optimization, rather than vice versa. So he's trying to reverse the default, right? Before, the default was: oh, we always need to push the return pointer and then jump and then jump back to where we pushed. He's trying to say, no, the default is always just jump into it, but sometimes you need to save where you're going to return to. And having done that, we can use, like he said, a peephole optimization, which just means you look at a few instructions at a time and see if you can make them more efficient; your machine might have some instruction that does two things in less time than the two separate instructions, right?
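That peephole idea can be sketched as a tiny pass over a made-up instruction list. The PUSHJ mnemonic is from the PDP-10 family the paper discusses, but the list encoding here is invented purely for illustration:

```python
# A minimal peephole pass: scan adjacent instruction pairs and fuse
# a PUSH of a return address followed by a JUMP into one PUSHJ.
def peephole(code):
    out = []
    i = 0
    while i < len(code):
        if (i + 1 < len(code)
                and code[i] == ("PUSH", "RETADDR")
                and code[i + 1][0] == "JUMP"):
            out.append(("PUSHJ", code[i + 1][1]))  # fused call instruction
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

full_call = [("PUSH", "RETADDR"), ("JUMP", "F")]  # needs a return
tail_call = [("JUMP", "G")]                       # no push was emitted

print(peephole(full_call))  # the pair becomes a single PUSHJ
print(peephole(tail_call))  # a tail call stays a plain jump
```

This mirrors the reversal he describes: the compiler emits plain jumps everywhere, and PUSHJ appears only as a later optimization of push-then-jump, rather than being the primitive that every call must use.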
So he's showing that if you reverse the way you think about it, you can actually make what seemed to be a special case just some optimization you can do in a later pass that you were going to do anyway. All right, but there are a couple of difficulties with this idea. One is that the stack is often used to hold information other than return addresses, such as stack-allocated data structures and intermediate results. This is no problem, as it turns out. It is merely necessary to clean up the stack just before calling F rather than just after calling F. Okay, let me explain that. A lot of languages, when they compile their procedures, will use the stack for local variables, right? They'll allocate some space on the stack for local variables. Why? Because it's easy to clean up. The garbage collection is trivial, and you know that once this function returns, I'm not going to need these values anymore; they're just local. Plus, you want your functions to be re-entrant, which means you can call them multiple times, deeper and deeper into the stack; recursive functions are this way, but it happens other times too. You don't want to allocate the locals inside the code of the function itself, because then you would clobber someone else higher up the stack who is using that space. So use the stack. It's just this really cheap data structure where, with one instruction, you just pop pop pop pop, and you just forget that stuff, forget that memory. Okay, but the problem is, now if we call into B, and we never return to A, because it's a tail call, what about all those local variables? We haven't popped them, right? Well, all you have to do is pop them before you jump to B.
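Here is a toy model of that cleanup-before-the-jump rule (my own sketch, not the paper's code): the caller pops its own stack-allocated locals before transferring control, since after a tail call it will never come back to pop them.

```python
# Toy stack: the caller allocates locals on it, then must pop them
# BEFORE a tail call, because control never returns to clean up after.
stack = []

def procedure_a(tail_target):
    stack.extend(["local_x", "local_y"])  # A's locals, on the stack
    # ... body of A uses its locals here ...
    del stack[-2:]                        # cleanup moved BEFORE the jump
    return tail_target()                  # the tail call: just a go-to

def procedure_b():
    return len(stack)                     # B sees no leftovers from A

print(procedure_a(procedure_b))           # B finds a clean stack
```

The only change from the usual discipline is where the cleanup happens: before the final jump rather than after a return that never occurs.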
Okay, so A knows what local variables it has allocated on the stack, and it just has to pop pop pop pop, right, before it jumps, because normally you'd put that after, right? There's like a cleanup step at the end, right before you do your final return jump. Okay, this leads to a second difficulty, which is that some languages would allow F to refer globally to stack-allocated structures within BAR, such as the dynamic values of X and Y. In this case, we cannot clean up the stack until after calling F. There is, however, some evidence that such global variables are as harmful as go-to statements. In any case, it is a good idea to minimize their use and to define variables to be lexical in scope by default. All right, this one might need some more explanation, because it's not something we do commonly anymore, unless you're using something like Emacs Lisp, which, you know, honestly you might be if you're listening to this. But dynamic variables used to be very common, sometimes almost the default; in Emacs Lisp, it is the default. It's a common technique for passing data down the stack. Okay, so instead of having an argument, meaning you pass data directly to a function, you know, sometimes you have this problem where I don't need it in this function, it needs to be used in the next function call, then the next one, and then it's used like ten levels down in the stack. So I have to add an argument everywhere along that path so that I can get the data down in there. One alternative is dynamic variables. What dynamic variables are is, you declare them; let's say in your language you have to declare them as dynamic, just to state that they're not the default lexical scope. What it means is, if I set the variable at one level of the stack and then make a call, and that makes a call and that makes a call, that setting carries down, right? So basically, the language has to go up the stack to figure out what the value is.
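A minimal simulation of that lookup-up-the-stack behavior, in Python. The names are hypothetical, and this shows how dynamic scoping behaves, not how Emacs Lisp actually implements it:

```python
# Toy dynamic scoping: a variable's value is found by searching the
# binding stack from the most recent frame down, so a binding made by
# a caller is visible in every callee until that caller returns.
binding_stack = []

def dynamic_bind(name, value):
    binding_stack.append((name, value))

def dynamic_get(name):
    for n, v in reversed(binding_stack):  # search up the call stack
        if n == name:
            return v
    raise NameError(name)

def dynamic_unbind():
    binding_stack.pop()

def callee():
    return dynamic_get("x")   # no argument passed: found up the stack

def caller():
    dynamic_bind("x", 42)
    try:
        return callee()
    finally:
        dynamic_unbind()      # the binding dies when its binder returns

print(caller())
```

Notice that the binding's lifetime is tied to the caller's stack frame; that is exactly why a procedure that binds dynamic variables cannot have its frame discarded early by the tail-call trick.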
And so if you're now messing with this stack discipline, how do you do these dynamic variables? And he's basically saying, we cannot clean up the stack in that case, but you probably shouldn't be using dynamic variables by default. And later on, I'm not going to read it, it's not that interesting, but he talks about how maybe in those cases you special-case it, right? You special-case these dynamic variables so that the functions that use them do maintain a stack frame. You can't do this trick of tail call removal, where you just jump instead of having it return; you can't do that anymore for those particular functions that do set dynamic variables. But just don't make it the default and it won't be such a problem. But what's cool is you can see he's been thinking about this stuff, probably because he's had to implement it, right? He's implementing Scheme, or he had implemented Scheme; he's probably been working on the internals of the compiler. Okay, so Section B: procedure calls can be fast. The above examples have shown procedure calls encoded as a simple jump, or at worst, a PUSHJ. What then makes procedure calls so expensive? Hey, like I've just shown you, it's just a jump, or sometimes a push and a jump. Like, why, what's going on in all these other languages? The answer seems to be that most implementations are rather thoughtless or careless in this regard. It is usual for a compiler to slap standard prologue and epilogue code around every procedure. This code typically sets up complex procedure frame structures, allocates all declared variables, and sometimes even saves and restores all registers. Auslander and Strong report that one simple procedure call compiled by the OS/360 PL/I optimizing compiler pushes 336 bytes onto the stack. So that's just the procedure call; that's not talking about arguments and stuff.
I mean, this is a pretty damning statement. He's just saying that compiler writers have implemented them, for whatever reason, in an inefficient way, that they're being maybe too safe. They think: oh, what if someone used a register? Well, let's save them all just in case. This was seen as the job of the procedure call discipline, so that programmers don't have to worry that their registers are going to get overwritten when they call a procedure. Of course, we know now that when we're dealing with high-level programming languages, we're not thinking about registers. It's the last thing we want to think about. We don't think about the stack so much either. So I believe that this paper, and this body of work, not just this one paper, has really changed how compilers are written. Okay, so that section just analyzes why they're slow, and now we're getting into Section C: why procedure calls have a bad reputation. The very notion of procedure call in a programming context was only worked out painfully after computers had already existed for some time. Okay, it's historical reasons, that's what he's saying, historical reasons. Since procedure calling instructions were not planned ahead of time in the way that arithmetic, conditional, and branch operations were, one would assume they were implemented on early computers in a rather clumsy fashion. Machines before 1960 typically had only one instruction for subroutine calling, which saved the instruction counter in a register or in the first memory location of the called subroutine. The PDP-1 had three instructions which were all variants on this theme. The IBM 360 had only one such instruction, but the programmer had the additional choice of which register to store the program counter into.
As far as I know offhand, stack instructions did not generally appear in non-stack-oriented machines until the mid-1960s, in such machines as the PDP-6, which in addition to PUSHJ offered three other subroutine calling instructions. In retrospect, this seems to have been more out of uncertainty as to which would be used than out of necessity for a variety of instructions. The PDP-11, this is 1970 or so, is the earliest machine I know of which was not essentially stack-oriented but which provided only a stack-pushing subroutine call instruction. So you can see he's trying to go back through the history of these machines. Basically, people didn't know how to implement procedure calls, what people might need, what these machines were going to be used for, so they were guessing: we're going to need something to do with that. There were probably also some efficiency concerns; before 1960, you really had to optimize your use of registers, and pushing onto the stack might have seemed expensive at that time. You'd want to pre-allocate everything you needed and have a plan for every byte, because the machines were so small. Okay. The interesting thing is that all of these examples have attempted to compress the procedure call operation into a single instruction, as may be inferred from the discussion above and in Steele's 1976b paper, which I don't have the bibliography with me, but it's probably another one of the Lambda the Ultimate papers, since it's around the same time. This isn't necessarily the right way to do it. The procedure call may conceptually be divided into three phases: push a return address if necessary, set up the arguments, go to the called procedure. It is important to note that the first step naturally comes first and that it is conditional. Push a return address if necessary; it's conditional. The mistake that we make is to attempt to combine the first and third steps unconditionally into a single universal instruction.
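That three-phase decomposition can be sketched as a toy code generator (the instruction names are invented for illustration; the point is only that the push is conditional):

```python
# The three-phase view of a procedure call, with phase 1 conditional.
def compile_call(target, args, is_tail_call):
    code = []
    if not is_tail_call:                 # phase 1: push, IF necessary
        code.append(("PUSH", "RETADDR"))
    for a in args:                       # phase 2: set up the arguments
        code.append(("PUSH", a))
    code.append(("JUMP", target))        # phase 3: go to the procedure
    return code

print(compile_call("F", ["x"], is_tail_call=False))
print(compile_call("F", ["x"], is_tail_call=True))  # no return address
```

Fusing phases 1 and 3 into one mandatory call instruction is exactly the mistake he names: it forces the `PUSH` even for tail calls, where it is pure waste.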
The result is that the return address is always pushed whether it is necessary or not. This is really interesting. He's actually deconstructing what a procedure call requires. It probably needed this long, right? People were guessing what procedure calls might look like, what kinds of stack operations people wanted, but from the viewpoint of 1977, or 1976, which is the paper that he's referring to, it was clear: these are the three steps we need. Push if necessary, set up the arguments, and then go to the called procedure. And we want them separate. We want to be able to control: do we push or not? What arguments are we pushing? Maybe there's some trick we can do there. And then the go-to, obviously, is necessary, but it doesn't have to be a special go-to. Okay, so I'll say this again. He's decomposing this into its independent parts and saying we just need one instruction for each part. The mistake that we make is to attempt to combine the first and third steps unconditionally into a single universal instruction. The result is that the return address is always pushed whether it is necessary or not. All right, he continues: compiler writers have often simplified their job at the expense of the procedure call by adopting certain standard protocols. One of these is that the called procedure should save all registers that it uses. This is in turn often simplified to: save all registers. So we go from save the registers that it uses to save all registers, because sometimes the analysis of what it's using is difficult, so just default to saving everything. It is seldom the case that all of these registers actually need to be saved. Indeed, such languages as Fortran... sorry, I just got lost. Indeed, often no registers need to be saved across a procedure call. Okay, sometimes you don't need any. Why are you saving all of them?
The great speed of compiled MacLISP procedure calls (MacLISP was an implementation of Lisp) is primarily due to its taking the inverse approach: the caller is responsible for saving any registers that it will need after calling another procedure. So the calling code is responsible, not the called code. Before, the convention was that when you call a procedure, the first thing it does is save all the registers. So you call procedure B, procedure B saves all the registers, does its work, restores all the registers, then returns, right? This reverses it and says: hey, A knows which registers it actually needs saved, so it should save them. A calls B, B just tramples all over the registers, does whatever it wants, and then when B is done, it jumps back, and A restores the few registers that it needs. Okay, the caller is responsible for saving. It might be thought that this would lead to a much greater code size than the other convention, but this is offset by three effects. One is that few registers actually need to be saved in practice. Another is that the compiler can know which registers are not destroyed by built-in functions and avoid saving such registers unnecessarily. So you might know that B doesn't even touch register one, but A uses it; then A doesn't need to save register one, because B is not going to trample it. The third is that the architecture of the PDP-10 does not make references to stack values unduly difficult; thus it is often just as easy to keep a variable on the stack rather than in a register in the first place. And that's kind of where we're at now. Saving stuff in registers is definitely faster, but we just allocate memory on the stack or on the heap.
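As a back-of-the-envelope sketch of why the caller-saves convention wins: count the save/restore operations each convention performs on a toy call. The register names and cost model here are made up for illustration; the paper describes MacLISP's convention only in outline.

```python
# Compare register-saving conventions on a toy call:
# "save everything" (naive callee-saves) vs. caller-saves.
ALL_REGS = {"r0", "r1", "r2", "r3"}

def save_everything_cost():
    # Naive convention: the callee saves and restores every register.
    return 2 * len(ALL_REGS)

def caller_saves_cost(live_in_caller, clobbered_by_callee):
    # Caller-saves: save and restore only registers that are BOTH live
    # across the call and actually clobbered by the callee.
    return 2 * len(live_in_caller & clobbered_by_callee)

# Caller needs only r1 afterward; callee only touches r2 and r3:
print(save_everything_cost())                      # 8 operations
print(caller_saves_cost({"r1"}, {"r2", "r3"}))     # 0: nothing to save
print(caller_saves_cost({"r1", "r2"}, {"r2", "r3"}))  # 2: just r2
```

This is the first two of his three effects in miniature: few registers are live across a call in practice, and knowing what the callee clobbers eliminates most of the rest.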
We're allocating tons of memory all the time, and we've kind of forgotten about the need for this register management. That's the compiler's job. Okay. At the source language level, there are other factors which contribute to the procedure call's poor reputation. So this is not implementation; this is more like language design, source language, syntax. Nearly all algebraic computer languages draw a syntactic distinction between operators and user functions, if not also a semantic distinction. Okay. We still see this in languages; like in JavaScript, the plus operator is totally different, syntactically, from a function you would call. And we also have a notion that, oh, this is faster, that it somehow compiles to an add instruction in the machine, which might not even be true. We don't even know. In some languages it does, right? But we do have this idea that a function call is probably more expensive than a plus. There is no reason one cannot accept the general case and handle important special cases specially. For example, if a user should try to pass a statement function or intrinsic function as an argument in Fortran, the compiler could jolly well provide a reference to an external version of that routine while continuing to use the internal version where applicable. All right, so this is a very interesting idea: why don't we make everything act like a function, and then special-case the operators that can be compiled in a fast way? So for example, with plus, why can't you pass plus in as an argument to a function? In that case, it would basically box it up and make a function that takes two arguments a and b and does a plus b inside. Whereas, if you just use plus inline in your code, it could still compile into an add instruction.
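Python happens to show this same boxing idea directly: `+` is syntax, but the standard library's `operator.add` is its boxed, passable version, so an operator can be handed to higher-order functions like any user procedure. A small sketch:

```python
from functools import reduce
import operator

# Inline, + is primitive syntax (a fast built-in operation)...
total = 1 + 2

# ...but boxed as a function, the same operation can be passed around
# like any user-defined procedure:
print(reduce(operator.add, [1, 2, 3, 4]))         # sums the list
print(list(map(operator.add, [1, 2], [10, 20])))  # pairwise addition
```

The general case (operators are values) is accepted, and the special case (inline `+` stays a primitive operation) is handled specially, which is exactly the design Steele is arguing for.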
Most languages require user functions to be referenced in a rather awkward manner and subroutines in even more awkward ways. Fortran requires subroutines to be invoked using the keyword CALL. COBOL is even worse. It uses the longer keyword PERFORM for internal subroutines and two keywords, CALL and USING, for external subroutines. Moreover, for many years, it took three statements to call an external procedure. So here's the COBOL; I'm going to read it, because COBOL is supposed to be readable. Enter linkage, period. Call foo using arg one, arg two, arg three, period. Enter COBOL. That's one function call. I'm very glad that we've optimized the syntax for function calls, even in stuff like JavaScript or Java, into just, you know, name the function, open paren, put the arguments in there, close the paren. Much better. It is generally true that we tend to believe that the expense of a programming construct is proportional to the amount of writer's cramp it causes us. We think of addition as cheap, partly because we can notate it with a single character, plus. Even if we believe that a construct is expensive, we will often prefer it to a cheaper one if it will cut our writing effort in half. Now, I think this is still true today, but less so. I remember hearing a talk by the designer of Java. Gosling. Yeah, Gosling. He was talking to some C++ programmers who were asking for a language feature that they have in C++ called operator overloading. There you can basically program the compiler so that, based on the types involved, when I use plus on this type, don't do addition, do concatenation, right? If you use plus on, let's say, binary trees, it's going to merge them together. You can make the plus operator do whatever you want based on the types that are passed to it. He said, wait, you don't need that. Just define a static method that takes the two arguments; to the compiler, it's the same.
You avoid this problem of overloading the symbols to do all these things, which maybe is hard to read. The fact is that he, as the compiler writer, as the virtual machine maker, knew that that static method is probably going to be inlined; it's probably going to have all these optimizations done by the JIT. It's going to be very tight code, but it's syntactically heavier than the plus. When you think about binary tree dot add versus the single plus character, it's a lot heavier weight syntactically. It just doesn't feel right. This is a lesson that practitioners of programming discipline have been trying to tell us, but it is a good one only if our programming languages are designed to cooperate with our natural tendency toward brevity. It's this little nod that there's something there psychologically: programmers are always going to feel this, and a programming language designer has to take that into account, that the syntactic weight of the operation can carry meaning. That meaning is usually going to be: how often, what should you prefer? Should you prefer the plus, or should you prefer using a function call? I'm going to skip ahead a little bit. He mentions Dijkstra, that Dijkstra prefers syntactic constructs rather than doing stuff with recursion and defining a recursive procedure that's universal, like a while-do. As an additional defense of the procedure call, it should be pointed out that it constitutes a universal construct when properly implemented. It ought not be forgotten, however, that procedure calls can easily simulate sequencing, conditionals, and repetition, while the converse is decidedly not true. He's basically saying, like, function calls are universal. This is the Church-Turing thesis, or hypothesis, I don't remember what it's called anymore, but the lambda calculus is Turing complete.
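Here's a small sketch of Gosling's point in JavaScript. The `Money` type and its `add` method are my hypothetical example, not anything from the paper or the talk:

```javascript
// Sketch: instead of overloading `+` for a user-defined type, define an
// ordinary static method. Semantically it's the same operation; it's
// just syntactically heavier, which was exactly the complaint.
class Money {
  constructor(cents) { this.cents = cents; }
  // The "static method" alternative to writing `a + b` on Money values.
  static add(a, b) { return new Money(a.cents + b.cents); }
}

const total = Money.add(new Money(150), new Money(250));
// total.cents is 400, the same result an overloaded `+` would give.
```

To the compiler or JIT, `Money.add(a, b)` and a hypothetical `a + b` are the same call; the difference is purely in how much we have to write.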
He even cites Church, which I think is actually really cool, that in the 70s you could still refer to Church, like, hey, remember this guy who invented the lambda calculus and proved all these things about it? It's still relevant. People might not have been that familiar with it; they might have known, oh yeah, there's this math thing, but it's not that practical. But he's like, no, it is; it's just go-to's, if you know what you're doing. Okay. Additional tricks are easily played, such as escape expressions, call-by-name, procedural data types, et cetera. So he's talking about how they're universal, and you can implement all these different things that don't seem obvious at first. But if you get into the lambda calculus, you can see how you can actually use function calls to simulate call-by-name, basically lazy evaluation; procedural data types; and escape expressions, which is, you know, basically you do a go-to in the middle of a loop when you know you're done, so you don't have to finish the loop. You can simulate that with function calls, and his other Lambda the Ultimate papers actually show how to implement them as function calls. This forms the basis of the Scheme compiler: it can transform all this stuff into, like, tail-recursive functions, and then you can do peephole optimizations like, oh, we don't need to jump, we can just inline this function, and inline this. And, you know, it's an interesting approach: you basically turn your program into lambda calculus, and now you can do all these interesting lambda calculus optimizations on it before you get into machine code, and you can implement the semantics of any language you want in terms of lambda calculus. Okay, D, what are we doing about it?
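In the spirit of the Church encodings Steele alludes to, here's a minimal JavaScript sketch of simulating a conditional with nothing but function calls (the names are mine):

```javascript
// Sketch: branching with no `if` statement, only function calls.
// A boolean becomes a function that picks one of two thunks to call;
// the thunks also give the call-by-name flavor, since neither branch
// is evaluated until it is chosen.
const TRUE  = (t, f) => t();
const FALSE = (t, f) => f();
const IF = (cond, thenThunk, elseThunk) => cond(thenThunk, elseThunk);

const r1 = IF(TRUE,  () => "yes", () => "no");
const r2 = IF(FALSE, () => "yes", () => "no");
```

Mostly a curiosity, as the paper says, but it shows why procedure calls are the universal construct: the conditional falls out of them for free.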
Okay, so he's going over some of the other work that's been done to improve the situation with procedure calls, that they're too expensive. All right, so there's been some mathematical work done on recursion removal, which is aimed both at converting procedure calls to go-to statements and at transforming programs into other forms requiring less recursion. They report that this technique improves runtime by 40% and space use per level of recursion from 336 bytes to 9, a 97% saving. This seems impressive, until we realize that their transformations are almost entirely straightforward and mechanical, and could easily be made a part of the PL/I compiler. And furthermore, that they are essentially techniques which have been used by the MacLisp compiler and others for almost a decade. That's MacLisp, like, to a T: turning procedure calls into go-to's when possible, and avoiding pushing variable values unless necessary. He's saying, y'all, we've been doing this for years, the Lisp community; like, just listen to us. We know what we're talking about when it comes to procedure calls. All right, so he talks about how they conclude that, man, we shouldn't put these in the compiler. Let's just teach this as a discipline to programmers, so that they can always do this and save this 97%. I'll read his response to this. We are now so afraid of procedure calls that, rather than fix our compilers, we wish to teach programmers complex techniques for using go-to to circumvent the shortcomings of our compilers. Such a desire is completely outrageous. Not only does it violate the philosophical principles of clarity in language design and programming style we have slowly been forced to accept, but it is demonstrably ridiculous. Because, while the complete generality of these techniques has perhaps not been implemented in a compiler, a good part of it has, and has worked for eight to ten years at least. Such is the ludicrous position we have come to.
All right, so this is funny, because he's not pulling any punches in terms of talking about how ridiculous this is. Such a desire is completely outrageous. Not only does it violate the philosophical principles, but it's demonstrably ridiculous. It's ludicrous. These are his words, and he's right. In their defense, I do have to say that I think the Fortran compiler (these are just figures I have off the top of my head; I'm not sure if they're right) took hundreds of thousands of man-hours to develop. At that time, when they developed Fortran, we didn't have the parsing theory, and frankly also the memory, the computer memory, to do what we do today. It took a long time to design and develop, lots of people working on it for a long time. To go back and tell them, "Your procedure calls are kind of slow. Do you think you could redo a lot of that work you just did?" That probably seemed impossible, and so saying, "We're just going to teach programmers to do it," probably seemed like a much cheaper way to go. Look, we've got these languages. They work well enough. They're better than what we had before. Let's just teach people how to use them better. It's going to be a little awkward, but it's better than going through that pain again of writing a new language. Further, you have Algol, with a huge committee designing this thing, and then the implementations are still lagging behind. How do you implement these features? You have to implement them on all these different computers. It seems like a lot of work. On the other hand, you had the Lisp community, who were basically grad students, and probably some professors too, working on this, but they had developed multiple generations, multiple branches of this programming language, of this family of programming languages, and experimented with all these different techniques.
While people at IBM were developing Fortran, doing one language, the Lisp folks were developing them all on their own and experimenting: how do you get more performance out of these machines? How do we design machines that can run this code better, with better procedure calls? What instructions should we use? They had been doing all this tinkering kind of research. And he doesn't talk about that so much here, but he's saying there's a lot of good stuff there that other compilers can learn from. One of the things is that it doesn't have to be so hard to write a language. You can spend much more time designing it in an iterative way, rather than coming up with this big design by committee and then implementing it, you know, waterfall-style. Okay, so another section, E, the expressive power of procedure calls. So he's going to use this example of someone who was talking about structured programming and managed to turn some spaghetti code into a state machine with a simple discipline: you have a variable that stores the current state, and there's a big branch statement that looks at what the current state is, you know, are you in state one, state two, state three. And then if you're in state three, you do this code, and then you set the new state, and then you jump back to the beginning and do a loop, right? And then if you hit state zero, that's the end, right? So if something wants to end the whole thing, it just sets the state to state zero, and the next iteration through the loop, it'll be done. Okay, so he shows a little bit of this code, and now he's going to make some observations on it. The outer loop of our program may be compared to a hardware CPU and the modules to the instructions of that CPU. It has a program counter, which is the state, you know, what state of this machine you're in; a program counter which takes us from instruction to instruction.
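A tiny JavaScript sketch of that discipline, to have it in front of us (the states and the work they do are my own toy example, not the code from the paper):

```javascript
// Sketch: the "structured" state machine described above. One state
// variable, one big branch, state 0 means halt. This toy just counts
// the items in an array, one per trip through the loop.
function run(input) {
  let state = 1;
  let count = 0;
  while (state !== 0) {    // the outer loop: the "CPU"
    switch (state) {       // the big branch on the "program counter"
      case 1:              // state 1: anything left to process?
        state = count < input.length ? 2 : 0;
        break;
      case 2:              // state 2: process one item, loop back
        count += 1;
        state = 1;
        break;
    }
  }
  return count;
}
```

The `state` variable is doing exactly what a program counter does, which is Steele's observation in the next passage.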
Our nice program has become a rat's nest of jump statements. This is not at all surprising, since the structure of our program merely reflects the structure of our problem as exhibited in the state transition diagram, and that too is a rat's nest. Our attempt to improve the program by using a nice structured approach has resulted in the effective use of go-to all over the place. I forgot to mention something: in the original program, he wasn't using go-to's, he was using procedure calls. So, you know, if you're in state three, you do a couple of checks to make sure everything looks fine, and then you'd run module three, which did the code and then set the program counter, or the state, to the next one, right? And so it's very structured in the sense of, like, I'm not using go-to's, I'm just using procedure calls. But Steele does a few simple tweaks, like, well, let's replace these procedure calls with go-to's, and let's change the name of the state to program counter, and look, you've just created a little CPU that does go-to's and jumps all around. And that's the way the problem was framed. It was framed as a state machine, and if you look at state machines, if they get complicated, they look like, you know, this one has 23 states or something, with lines and arrows between all of them. It's a rat's nest. So your program is going to eventually look like that, because the problem is defined that way. So it's not about using go-to's or not; the rat's nest is just inherent in the problem. All right. So: we can certainly express all programs in a structured form by introducing flag variables, but the preceding series of reasonable transformations merely renders it unstructured again, demonstrating that we had not really made the program structured in nature, but only in form. There is another way out. He's going to start going Lispy again, showing, like, this is what the people in the Lisp community have been doing.
You should just, you know, maybe look over here a little bit. Let us not use go-to statements to jump between modules, and let us not use flag variables to signal what are effectively go-to statements to an outer control loop. Instead, let us consider what a single module does. It performs its bit of processing and then invokes or otherwise designates another module to complete processing. Suppose therefore that at the end of every module we had a procedure call to the next module. Then he shows some code. This will certainly do what we want. So you know, if you're in state 23, you do some stuff, and then the next state needs to be 21, so you just call function module 21 and it jumps there. This will certainly do what we want. If module 23 calls module 10, then module 10 will carry on, calling other modules in the process, and eventually the program will complete. It is also structured: the only constructs used are sequencing, possibly conditionals, and procedure calls. It uses no go-to statements. There is also an additional bonus. If module 23 wants to pass some information to module 10, it can pass them as parameters rather than having to use global variables. This can help prevent conflicting usages of global variables among disjoint module sets. So in his system, it's just functions calling each other, but you also don't have global variables, which sounds good, you know, from our perspective in the future. From this example, we can induce the following theorem. Any flowchart can be written as a program which uses only sequencing, conditionals, and procedure calls. This theorem is not new; it appears in McCarthy's 1960 paper. Again, hey, us Lispers over here doing our thing. It is quite easy to prove. We assume without loss of generality that the boxes of the flowchart are of two sorts: process boxes with one exit and conditional boxes with two.
A process box may contain any single-exit computation whatsoever; a conditional may contain a predicate which decides which of two exits to take. So we had that phrase, without loss of generality. You know, what if you need three branches? Well, you can imagine it as two nested ones, and you get the three. So basically you have two kinds of boxes: one that just has one exit, just one arrow pointing out of it to the next box (we're talking about flowcharts here, right?), and then you have the one that has two, and there's a choice. Do I do yes or no? You know, we've all seen these flowcharts. For each box in the flowchart, we construct a procedure. If a box A is a process box whose exit branch leads to box B, then the procedure for A is: procedure A, begin processing, call B, end. So you define A: it does its process and then just calls B at the end. That's it. A simple transformation rule. If a box C is a conditional box whose exit branches lead to boxes D and E, the procedure for C is: you just do the if with the predicate, then call D, else you call E. Very straightforward. Every module is structured in form. Moreover, the modules are not necessarily as tiny in size as the examples indicate, since a process box may contain an arbitrarily large one-in-one-out program. So this is actually the style that Scheme uses, at least the last time I looked at Scheme a few years ago. If you wanted to code a state machine, they would suggest this approach: each function represents a state, and you use the tail call to go to the next state, and because it's a tail call, you're not going to return. It takes a little bit of getting used to, a little mental effort to say, okay, this isn't a procedure call where I'm expecting to come back here, right? Because with a state, you don't think of it as, okay, I'm going to go to state two, which will come back, right?
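The transformation rule can be sketched directly in JavaScript. The boxes here are my invention (a toy that sums the numbers from n down to 1); note that JavaScript engines generally don't guarantee tail-call elimination, so this is a sketch of the style, safe only for shallow runs:

```javascript
// Sketch of the flowchart-to-procedures theorem: each box becomes a
// procedure whose last act is to call its successor. No goto, no loop.
function boxA(n, acc) {   // conditional box: two exits
  return n === 0 ? boxDone(acc) : boxB(n, acc);
}
function boxB(n, acc) {   // process box: one exit, back to A
  return boxA(n - 1, acc + n);
}
function boxDone(acc) {   // terminal box
  return acc;
}

const result = boxA(4, 0); // control flows A -> B -> A -> ... -> Done
```

In a Scheme with proper tail calls, each of those calls compiles to a jump, which is the whole point of the paper.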
And you're just thinking of it as: the next state is four, and then I do some stuff, and then the next state is three, and I do some stuff, and the next state is ten. It does require a little bit of a mental shift somewhere in there. Okay. An additional observation should be made, of course. In the example above, the use of procedure calls hasn't endowed the program with any more structure than the use of flag variables (or program counter, that's the state variable) did, compared with the go-to version. All it has done is possibly make the code more palatable. This may be a useful psychological illusion, but it is as much a myth as the belief that procedure calls are inherently expensive. So he's saying, like, this whole discussion about structured programming, maybe it's not a thing, like it doesn't really help; maybe this is more palatable. But it's just as much of an illusion, because it's still go-to's, right? We've shown that this function call is just a go-to. Maybe we haven't really gained that much, except to make it a little bit more readable. F, procedure calls and modularity. So, that's rough, right? What he's just shown, it's very unexpected. When I read it, I was like, oh, he's not saying that this is amazing. He's saying it's just kind of not much better. It's like one myth or the other myth; which are you going to take? That's rough. Okay, F, procedure calls and modularity. I would like to suggest that our difficulties in dealing with programming and programming languages stem in part from a confusion between the abstract notions of program organization and the concrete embodiment of these notions as programming language constructs. Okay, he's going deeper here.
This is not the practical how-do-you-compile-functions stuff. He's bringing up a philosophical notion, that maybe we too easily confuse these programming language constructs with the concepts behind them. In order to simplify our thinking, we have attempted to enforce a one-to-one mapping between these two sets, and it doesn't work very well. For example, we decree that procedures are the method of producing modularity, that while-do loops are the way to iterate, that if-then-else is the way to produce conditionals, that go-to is the way to produce peculiar program structures. I'm going to continue reading, because I think this is deep enough that we need a lot of material to talk about, and I don't want to refer to stuff we haven't read yet, so I'm going to keep reading. The fact is that this just isn't so. Consider the notion of modularity. While procedure calls are indeed a method of modularizing programs, there are other methods. The PL/I INCLUDE construct or the COBOL COPY construct are one alternative. Another is the print-level feature in some Lisp systems, which allows you to print the overall structure of a program while suppressing the detail of computations below a certain level of nesting. A third example is the common Fortran trick of breaking up a single complex assignment statement into several smaller ones. And so he shows you could do, like, one big statement, a plus b plus c plus d, or you could do it in multiple lines: you do a plus b on one line and save it to a local variable, c plus d on another line and save it to a local variable, and then you add them all up at the end. You're using assignment, the local variable. If someone had asked me whether assignment statements were an agent of modularity in programming languages, I should certainly have replied in the negative before seeing this example. That's cool, right?
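The Fortran trick, sketched in JavaScript (the variable names are mine, just to show the shape):

```javascript
// Sketch: assignment as an agent of modularity. Same computation,
// written two ways.
const a = 1, b = 2, c = 3, d = 4;

// One big statement:
const total1 = a + b + c + d;

// The "modular" version: named intermediate assignments break the
// big expression into labeled pieces.
const firstPair  = a + b;
const secondPair = c + d;
const total2 = firstPair + secondPair;
```

The local names carve the expression into pieces you can read and reason about one at a time, which is exactly what modularity is for.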
He's saying we need to divorce our minds from the idea that modularity is that syntactic construct called a module, that for modularity I need to use a module. There are all sorts of ways to get modularity in a program. Even the assignment statement can be used that way. I want to give a personal story to help illustrate this concept, because I think this is the meat of it, and, like, I did not know the paper was going here, in this kind of language design, philosophical direction. I was working in Haskell with another person who was learning Haskell too. I was learning Haskell. She was learning Haskell. This was a very new company, and she was more used to Java; I was more used to Clojure. She was feeling that Haskell was poor because it didn't have stuff like a class construct. She had to learn all these new concepts, like modules and how functions worked, and don't forget about monads and stuff. Having done Lisp for a long time, I had the mindset that Steele is talking about. Sure, the thing about classes in Java is that they're used for so much. They're a unit of modularity: if you need a place to put some functions, like static methods, you make a class and you put them in there. That's what a module in Haskell is doing. It's also this data structure thing: if you need to store two values together, you make a class and have the two fields. If you need to implement some API that requires you to implement an interface, you need a class. It also hooks into the type system. The fact that this one syntactic, or semantic, construct, the class, was doing all this different work: she hadn't separated those pieces out yet. Talking to her about that, she had an epiphany, like, "Oh, I see, in Haskell when you want to put functions in a place so that you can use new names, you put them in a module." If you want to store data together, you make a new data type. If you need to do some type hierarchy stuff, or the equivalent in Haskell, you might use type classes.
All of these things are separated out in Haskell. It's not to say that you can't do modularity with another feature, but there's not this one overarching, dominant construct confusing all of these different intents together. He's trying to open up our minds, to show that we too often confuse the if statement with conditional branching. But something else could do conditional branching; what if there's another construct, maybe even a commonly used one, that you just hadn't thought of? You could say, "I don't need to branch, because I can just do a lookup in a hash map that returns a function." That function, I just call it, because I know it's going to do the right thing for that value. There are all sorts of ways to do conditionals. You can see this in some object-oriented circles. They'll say, "Don't use conditionals." But what do they use? They use the polymorphism of their inheritance hierarchy, or duck typing, or something. Basically, you don't have to write an if, but the branch still happens in the machine somewhere. There's still a lookup, there's still a branch somewhere. It's done in the vtable, in the method table of the class. You're doing it somewhere; it's just done in a different place. He continues: "As shown earlier, procedure calls can be an agent of iteration." He defines the recursive procedure that iterates until some condition is found, just to show that, look, a function call can do this. It is even possible to use procedure calls to implement conditional expressions, though this has heretofore been largely a curiosity for students of the lambda calculus. Ah, here's where he cites Church 1941. I just think that's so cool. I'd love to write some paper on programming language design and be like, "Oh yeah, and Pythagoras defined this, remember that?" That just sounds so cool. What do they call that? Like a deep link? Just some real deep thing, like, "Ah, remember that guy? He did this." All right.
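The hash-map-of-functions idea, as a minimal JavaScript sketch (the events and handlers are my own toy example):

```javascript
// Sketch: branching without an `if`, via a lookup table of functions.
// The branch still happens, it's just done in the lookup, much like
// a vtable does it for method dispatch.
const handlers = {
  start: () => "starting up",
  stop:  () => "shutting down",
};

function dispatch(event) {
  // Fall back to a default handler for unknown events.
  const handler = handlers[event] ?? (() => "unknown event");
  return handler();
}
```

Adding a new case means adding an entry to the table, no conditional in sight; the lookup is the branch.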
Finally, many assignment statements can be modeled and even implemented through the use of procedure calls. I have written two Lisp compilers which use procedure calls to implement all assignments and iterations within the compiler. And in his other papers, these Lambda the Ultimate papers, he definitely does that. I mean, it's exhaustive, kind of boring, to read all this old Lisp code, like, "Oh man, is that how you implement stuff? Oh, that's crazy." So many parentheses. I have used these compilers to compile themselves, and there seems to have been no demonstrable sacrifice of speed due to the use of procedure calls. Moreover, the code, totaling some 70 pages of source text, has been extremely easy to modify and maintain. So this is the same point I was trying to make before. Fortran, I think I said, took hundreds of thousands of man-hours, right? That's off the top of my head, but you can look it up. It's a lot. It's years of work. And here he is, like, "Yeah, I did two compilers myself." This is one of those misunderstood, unappreciated ideas about Lisp that is kind of dear to me. Maybe too much so. I have a very strong bias towards technology that I can understand. I just wonder if we've gone wrong where, for instance, we're running on the JVM, which required billions of dollars of research to create, where you can never recreate it, right? It hasn't thrown off, like, "Oh, and here's how you do it yourself," which is kind of what I want. I really appreciate these papers that are like, "Oh yeah, you could do this. See, it's just a jump." Oh, cool. Thank you. I'm going to go do that this weekend. That, to me, is what I really like about Lisp: it was designed to be something that a single person, or a small team, could work on and make something significant with.
It might take some time, but often the third iteration, the fourth iteration, takes less time, because you've learned all these lessons and can just immediately use them, and write papers about them, and then they're available to everyone. I'm also reminded of the Smalltalk system, where they created a complete GUI environment. They invented WYSIWYG and desktop publishing and all this stuff, and it was just a small team of researchers working on a small budget, and it was just this iteration: let's build it again, let's build it again, let's build it again. The last time they built it, it's, I don't know, a few thousand lines of code, and it does everything that we expect a modern GUI system to do. Well, not everything; it doesn't have a web browser, because that's really complicated. But they were able, in a sandbox, to create everything they needed. Even today, you look at something like Squeak, the Squeak implementation of Smalltalk, and it's got an MPEG decoder in it, and it's still a very small system, whereas if you download the codec for MP4, it's like hundreds of megabytes. How did that happen? I don't know. It just seems like we've gone way off, like orders of magnitude from where it should be. You know, the codec should be small enough that it can fit inside the video file as an executable, and you just run it. Sorry. I'm ranting now. Let's get back to the paper. The important point is that for each important programming concept we have found useful (so: modularity, iteration, assignment, conditionals), there may exist more than one programming language construct which can embody that concept in a reasonably natural manner. Furthermore, it sometimes requires more than one construct to properly embody a given concept. For example, while-do would be utterly useless in expressing iteration if some form of assignment statement were not also used. You're doing this loop, and you need to get a value out.
You need to store that, and maybe accumulate a value over the iterations in some variable, whereas something like a recursive function has a return. So the last time through the loop, it's going to return the value, and you don't need a separate thing besides the procedure call. I think it is super important to pick apart: why am I using this construct? What am I trying to achieve? Am I trying to achieve modularity? Am I trying to achieve openness, like, this is an open protocol, I need other bits of code that I haven't even thought of yet to be able to participate in this? There are all sorts of things that we're doing that might not be directly implemented as a construct. I'm writing my book, or I was writing my book (I'm trying to tell a story in the present tense, but I was writing the book), Grokking Simplicity, and it's all about functional programming, so I want to show some higher-order functions for, let's say, sequencing Ajax requests. You need to make multiple Ajax requests, and so you have to wait. Let's say you click the buy button, and that actually triggers: call this, and then call that, and then call this Ajax, and then finally on the server it is bought. But if they click it again, those Ajax calls are going to get interleaved, so you want some way of sequencing the independent clicks so that you don't interleave the calls to Ajax. I created a little primitive that has a little queue: instead of the button actually doing the work, it puts the work into the queue, and the queue will process until the last callback is done and then start the next one. So I created this little thing, and it's, I don't know, 20, 30 lines of code. I mean, it's not huge. And I'm showing it to my editor, and he's like, wait, did you just create a queue? I was like, yeah. And he's like, well, you know, why don't you use an existing queue?
And I'm like, because it's 30 lines of code, and I get to show how it's done. And he's like, wow, no other author has ever done something like that. They've always imported the library from NPM or something. And he wanted me to focus on that idea: maybe it's like a permission, the freedom that we have, a right, a privilege, to create our own constructs to get the thing that we need done, done. In most language communities, they don't have that value; they don't value creating it yourself when it's a simple thing. They value using someone else's thing, or, you know, you just can't do that because it's not provided by the language. And I think that Lisp has this core value that, hey, we all implemented Lisp ourselves and we made it easy to extend; go ahead, build something yourself. This probably leads to the curse of Lisp, right? But at the same time, it also helps all the people who are developing those things to develop this deep understanding of the difference between the programming constructs and the intent that you're trying to get out of them, sort of the logical construct that you're doing. So if you're doing, let's say, what we would think of as a map, you could use a for loop, or you could use the built-in map. But, like, I mean, this is my personal experience: if you write map yourself, people are like, what are you doing? Not because it already exists, but because they're like, I didn't know you could do that. I didn't think it was legal, right? They just don't have this idea that you can build these general-purpose, low-level constructs yourself, right? It's almost like they're deferring their programming abilities to whatever other libraries give them. Like, oh yeah, the Lodash gods made that, so, you know, it's okay.
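For the curious, here is a minimal sketch of the kind of sequencing queue described above. This is NOT the actual code from Grokking Simplicity, just my illustration of the idea: each job is a function that takes a `done` callback, and jobs run strictly one at a time, so independent clicks can't interleave their asynchronous steps:

```javascript
// Sketch: a work queue that serializes asynchronous jobs.
// A job is a function (done) => { ...async work...; done(); }.
function makeQueue() {
  const items = [];
  let working = false;

  function runNext() {
    if (working || items.length === 0) return;
    working = true;
    const job = items.shift();
    job(() => {        // the job calls done() when its last callback fires
      working = false;
      runNext();       // only then does the next queued job start
    });
  }

  // enqueue: the button handler pushes work here instead of doing it
  return function enqueue(job) {
    items.push(job);
    runNext();
  };
}
```

Usage: the buy button's click handler calls `enqueue(done => { /* chain of Ajax calls, then done() */ })`, and repeated clicks just line up behind each other.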
But I'm always on the lookout for some cool idea in a language that I could implement myself in the language I'm using. Okay, that was another personal story; I'll get back to the paper. In understanding a piece of a program, it is necessary to determine not only what construct is being used, but the concept it is intended to express. While we may prefer to think of a procedure call as meaning "go do this other module and then come back," this is only one possible interpretation. We must realize that there are situations where coming back doesn't matter. Those are the tail-recursive cases. And these situations may be exploited. Just as a concept such as modularity may be expressed by diverse constructs, so may a language construct be interpreted in various ways, some of which may lead to superior compilation techniques. Okay, so this is definitely language design stuff. It is not unreasonable to want to be able to infer the intent of a piece of code from the particular construct used to express it. If only go-to is used to express all control structure, it is hard to identify the conceptually important notions of conditional, iteration, and escape occurring in the program. It is important that alternative modes of expression exist, but the mere banishing of one abused alternative is unlikely of itself to cause correct usage of the others. Instead a language should be so designed that one is encouraged to use a construct if and only if it is appropriate. It must also provide enough... sorry, there is a typo in the paper... It must also provide enough constructs to cover all reasonable programming concepts. Otherwise programmers will merely misuse the constructs they are given, and most programmers are very good at this. The structure of a program is dictated largely by the structure of the problem. If the problem solution is best expressed as a finite state automaton, for example, then no amount of structured camouflage will help that fact.
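The "coming back doesn't matter" idea is easiest to see side by side. A sketch, assuming the classic factorial example (not from the paper itself): in the second version the recursive call is the very last thing the function does, so there is nothing to come back for, and a compiler with proper tail calls (as Scheme guarantees) can turn it into a plain jump:

```javascript
// NOT tail recursive: after the recursive call returns, there is still
// a multiplication to do, so the caller must be remembered on the stack.
function factorial(n) {
  if (n === 0) return 1;
  return n * factorial(n - 1);
}

// Tail recursive: the recursive call is the last act, with the running
// product carried in an accumulator. "Coming back" would accomplish
// nothing, so the call can be compiled as a go-to that passes parameters.
function factorialIter(n, acc = 1) {
  if (n === 0) return acc;
  return factorialIter(n - 1, n * acc);
}
```

(Note that most JavaScript engines do not actually perform this optimization; the sketch shows the shape of the code, while Scheme shows the payoff.)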
This is the essential frustration we have experienced with go-to. We discovered that go-to was often being used even in situations where more appropriate and expressive constructs were available, and that it was being used for all sorts of purposes we hadn't anticipated, and so we sought to eliminate unwanted concepts and programming styles by banning a construct. This is as fruitless as eliminating iteration by banning do loops would be, or eliminating recursion by banning procedure calls. We need to get a better grasp on organizational concepts and their relationship to the individual constructs which make up our languages. Wow. He basically uses this paper to rebuff the entire go-to-considered-harmful movement: we need to look deeper than just the go-to statement. It's not its fault that our code looks bad when we use it. It is more likely, like he said at the beginning, that our code isn't really better without the go-to statement; the mess is still there, and it's because we couldn't see the conceptual notions, all we could see was the syntactic use of go-to. We can't see that this go-to is used for iteration, that it was making a loop, and that go-to was used to escape out of the loop, and this go-to is used like a procedure call. We need to have constructs that map a little more cleanly to these ideas, these logical constructs. We want to use each one if and only if it is appropriate. We should have an escape statement; a lot of languages have that now, like a break or a continue, something like that, to get out of a loop. You probably don't need that; it's not as universal as the function call. And then you have procedure calls, but you also have iteration, so, you know, while loops and for loops are better than go-tos.
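The escape concept named above is worth one tiny illustration. A sketch in JavaScript: `break` is the structured stand-in for one of the classic uses of go-to, jumping out of a loop the moment the answer is known:

```javascript
// Escape: break out of the loop as soon as a match is found, instead of
// simulating the early exit with a go-to or an extra flag variable.
function findIndex(pred, array) {
  let found = -1;
  for (let i = 0; i < array.length; i++) {
    if (pred(array[i])) {
      found = i;
      break;              // escape: no need to scan the rest
    }
  }
  return found;
}
```

The construct names the intent: a reader sees `break` and knows this loop can end early, without having to trace where a go-to lands.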
I appreciate this, that he's trying to show that this is actually a language design problem, that in some languages go-to may seem like the right thing to do because they don't have any other way to escape, right? There's no, let's say, return statement in the middle of a function. Okay, I did not know he was going here when I first read the paper. I thought it was kind of a surprise ending. It just leaves me with that feeling, you know, like when you watch a really good movie with a lot of philosophical ideas and a weird ending, one that doesn't really tie a ribbon on everything, and you walk out of the movie theater like, huh, what do I think about it? And you know you're going to be thinking about it for a week. Okay, let's go to the conclusions, where he just summarizes a lot of the stuff. Procedure calls are demonstrably not inherently as inefficient as computing folklore would lead us to believe; there are implementations of higher-level programming languages in which procedure calls are almost as cheap as go-to statements. Not all procedure calls need push a return address; tail-recursive procedure calls can be compiled almost as if they were simple go-to statements. In fact, procedure calls can uniformly be treated as go-to statements which pass parameters; such a technique reduces the amount of stack space needed. Procedure calls permit an extraordinarily powerful style of programming which, even though it is completely structured, includes most rat's nests of go-to statements as a subset. This style merely involves writing a procedure call where one would ordinarily write a go-to at the end of a procedure. We might wonder why rat's nests are not written using procedure calls in practice, when they are certainly possible and violate no rules of structured programming.
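The "procedure call where one would ordinarily write a go-to" style can be sketched with the paper's own favorite shape, a finite state automaton. Each state is a function, and "going to" a state is just calling it in tail position. This example (a parity checker over a string of bits) is my illustration, not taken from the paper:

```javascript
// A two-state automaton written as mutually tail-calling functions.
// The tail call IS the go-to: control transfers to the next state and
// never comes back. (Scheme compiles these calls as jumps; most JS
// engines don't, so this sketch is only safe for short inputs here.)
function even(bits, i) {           // state: seen an even number of 1s
  if (i === bits.length) return 'even';
  return bits[i] === '1' ? odd(bits, i + 1) : even(bits, i + 1);
}

function odd(bits, i) {            // state: seen an odd number of 1s
  if (i === bits.length) return 'odd';
  return bits[i] === '1' ? even(bits, i + 1) : odd(bits, i + 1);
}
```

Notice there is no loop, no mutable state variable, and no go-to, yet the control flow is exactly the automaton's transition diagram; that is the structured rat's nest the conclusion describes.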
The answer is probably that go-to statements, being cheap, are used freely enough to produce rat's nests, while procedure calls, being expensive, are used sparingly. We therefore come to the paradoxical conclusion that improving the implementation of procedure calls is a mixed blessing. We can improve our programs both in time and space, but we may bring on the same problem we had with go-to by encouraging the use of procedure calls in stylistically diverse ways. I have a little note in the margin: callback hell. That's what he's talking about; he's kind of foreseeing callback hell, like you can still write terrible code with these callbacks that you see in JavaScript. Function calls are now cheap, there's no go-to, so let's just make a mess. Okay: we should respect procedure calls because they are powerful, and not because they are expensive. That's his concluding sentence, and I like it: they're powerful, not expensive. There are a bunch of notes in this one. I'm not going to read them, but they're pretty cool. I also like that he thanks the people he discussed key points with, and there are some really great people in here, but he's got Richard Stallman in there too, which I think is cool, just a little connection between a couple of people that you don't always think of in the same breath. But yeah, Richard Stallman, he wrote GCC, you know. Maybe there was some cross-collaboration between the Scheme compiler and the GCC compiler; I imagine that was happening, and it's pretty cool to think that the Scheme world had this big effect on C through the GCC compiler. All right, I'm going to wrap this up. I hope it leaves you in a kind of thoughtful mood, as it has me, this paradox that if we make procedure calls faster, they're just going to be used and abused as much as go-to statements are. That's rough. So, my name is Eric Normand and this has been my thought on
functional programming. As always, rock on.