Can Programming Be Liberated from the von Neumann Style?
This is an episode of The Eric Normand Podcast.
Subscribe: RSS | Apple Podcasts | Google Play | Overcast
In 1977, John Backus presented an algebraic vision of programming that departed from the von Neumann fetch-and-store semantics. This seminal paper has influenced functional programming in many ways. In this episode, I read excerpts from and comment on John Backus's 1977 Turing Award Lecture paper Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs.
Short biography - Turing Award Lecture
Transcript
Conventional programming languages are growing ever more enormous but not stronger. Inherent defects at the most basic level cause them to be both fat and weak: their primitive word-at-a-time style of programming inherited from their common ancestor, the von Neumann computer, their close coupling of semantics to state transitions, their division of programming into a world of expressions and a world of statements, their inability to effectively use powerful combining forms for building new programs from existing ones, and their lack of useful mathematical properties for reasoning about programs. An alternative functional style of programming is founded on the use of combining forms for creating programs. Functional programs deal with structured data, are often non-repetitive and non-recursive, are hierarchically constructed, do not name their arguments, and do not require the complex machinery of procedure declarations to become generally applicable. Combining forms can use high-level programs to build still higher level ones in a style not possible in conventional languages. Hello, my name is Eric Normand and today I'm reading a paper by John Backus: Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs. John Backus won the Turing Award in 1977, and for those awards the honoree gives a lecture, and this was the written form of that lecture, published in 1978 by the ACM. So I like to bring up the dates because the historical context is very important, especially for this one. In the part I read, which is part of the introduction to this, the abstract I guess, he talks about functional style, functional programs, and he's using the term in a way we don't use today. So typically when we say functional programming or functional style we mean something like you find in Haskell or Clojure, and he is not talking about that, and we'll get more into what he is talking about in this paper. Okay, so just a little bit more about him, because he uses really strong language in this introduction. He was at IBM, he led the group that developed Fortran. He also was very important on the Algol committee, and he is also known as the B in BNF, which is Backus normal form or Backus-Naur form. That form is a way of expressing a grammar, and he used this form to present a language in 1959, and now basically every language's syntax is defined in something like that. So for those reasons he won the Turing Award in 1977. Okay, in this paper he is trying to escape from a problem that he saw happening in programming languages, which was that as they were developed they were getting more complex, more sophisticated. They would take all the features of the language from before and just add new features, and he was saying we need to go all the way back to how we even conceptualize the computer and start over, and this was the result of that. Now remember, he was on the Algol committee, he developed Fortran, and so he is largely responsible for the kinds of languages he's talking about. So when he uses the strong language, he's not some outsider who's like, y'all are all doing it wrong. He's saying: what we did is the stuff we need to learn not to do; it's not the right way to do it. Okay, so he is a person who can use such strong language and be taken very seriously as somebody who knows what he's talking about. Okay, so I'm just gonna go through it. It is a long paper and there are a lot of, I guess, program listings. 
They're very short, but they're impossible to read, and they're presented often just as examples of programs and sort of algebraic manipulations, algebraic proofs. I'm not gonna read those; I'm not gonna try. I hope that the stuff I will excerpt and comment on will give you enough to read the code listings and proofs yourself. I don't think it's worth me trying to comment on them specifically because he does explain a lot in the text, so I'm just gonna talk about that. Okay, so remember I said he's talking about a different thing when he talks about functional programming. This is at a time when that term wasn't used, and so he could use it the way he wanted, and so he's using something different. So here's where he starts explaining. Associated with the functional style of programming is an algebra of programs whose variables range over programs and whose operations are combining forms. Okay, so the variables are not values; they represent programs, and the only operations you have are combining forms, meaning I guess you could think of them as functions from a program to another program. Okay, this algebra can be used to transform programs and to solve equations whose unknowns are programs in much the same way one transforms equations in high school algebra. These transformations are given by algebraic laws and are carried out in the same language in which programs are written. Combining forms are chosen not only for their programming power but also for their associated algebraic laws. General theorems of the algebra give the detailed behavior and termination conditions for large classes of programs. Okay, so he's trying to bring a much more mathematical, algebraic view of programming than the one we have today even, which is that the machine is carrying out a bunch of instructions and we are simply controlling which instructions get executed on the CPU. Okay, now it's important to talk about algebra. You know, we often learn algebra in high school and we associate it very strongly with, you know, solving equations and things like that, and he's trying to say that there is an algebra of programs. So instead of ranging over numbers, like variables in high school algebra usually do, right? So you have x and it's an unknown in the equation and you have to solve for x, which is going to be some number. In this algebra, the operations aren't plus and times and minus. They are ways of taking a function, or taking multiple functions, and combining them into a single function, and the variables, the unknowns, are functions. Okay, and you could do the same kinds of manipulations you do in algebra: subtract x from both sides, divide both sides by the same number, do the same operation on both sides of an equation. You could solve for the unknown, let's call it f, which is the function that you're looking for, by doing this kind of algebraic manipulation. So I see this as kind of like the end goal that he's talking about, but he's going to go deep and try to uproot the whole notion of how we develop software today. Okay, so this is part one, section one. Each successive language incorporates with a little cleaning up all the features of its predecessors plus a few more. Some languages have manuals exceeding 500 pages. The plain fact is that few languages make programming sufficiently cheaper or more reliable to justify the cost of producing and learning to use them. 
So he's saying that just adding more pages to your manual, you know, changing the syntax, adding some new features like a new way of accessing the contents of an array or something like that, which is what a lot of languages have. It's like, oh, look at all these cool new features. You're just making the manual fatter, and that doesn't justify the cost. It doesn't make programming that much easier. But there is a desperate need for a powerful methodology to help us think about programs and no conventional language even begins to meet that need. There's no methodology for thinking about programs. Okay, the language doesn't help with that. Discussions about programming languages often resemble medieval debates about the number of angels that can dance on the head of a pin, instead of exciting contests between fundamentally differing concepts. This is a criticism that we still have today, where you're rewarded as a programmer for knowing the minutiae of a programming language construct, like, oh, you can't use this, or there's some weird case where you have to put a semicolon here, and if you don't, then the compiler is going to misunderstand. And, you know, even stuff like in Ruby, like what's the difference between a block and a lambda? Like there are differences, and you're rewarded for knowing that because you can avoid problems in your programs by knowing them. But they're not different enough that it's like some exciting debate. Right. What we need is some way of having really different ideas that we can debate. Okay, so this is section two, models of computing systems. Underlying every programming language is a model of a computing system that its programs control. Some models are pure abstractions. Some are represented by hardware and others by compiling or interpretive programs. Okay, that's interesting. I don't know if that's true, but like, I think a lot of programming languages are based on a computing model. But every one? I'm not sure. It's a very strong statement. And, you know, maybe in 1977, that was true. Okay, so he's trying to kind of figure out, what are the criteria? What are the characteristics of these computing systems? So, the criteria for models. Foundations: is there an elegant and concise mathematical description of the model? History sensitivity: does the model include a notion of storage so that one program can save information that can affect the behavior of a later program? It's basically state. Type of semantics: does a program successfully transform states until a termination is reached? Or can a program be successfully reduced to simpler programs to yield a final normal form program? Okay, this is kind of a grab bag. This whole semantics thing: how does the language operate? What are the semantics? What is the meaning of a program? And then clarity and conceptual usefulness of programs: are programs of the model clear expressions of a process or computation? Do they embody concepts that help us to formulate and reason about processes? Oh, this is interesting. This is a kind of meta idea that, like, well, we have to be able to reason about these programs. So does this model of computing even allow that? Okay, so he uses these criteria, and he's trying to kind of crudely figure out what kind of classes of computer models there are out there. Okay, so he has three in this. There are the simple operational models, and the examples are Turing machines. They just have this very simple, very elegant definition, very basic. 
They don't, you know, allow for much, but they're useful for doing mathematical proofs with. Then there's the applicative models, such as Church's lambda calculus. Okay, so these are concise and useful, but the problem is they have no storage and they're not history sensitive. So you can express every computation, but you don't have state. So then he's saying the programs can be clear. Now the third one is von Neumann models. Basically every conventional programming language, those are his examples. So foundations, remember these are the criteria: complex, bulky, not useful. History sensitivity: yes, they have storage. Semantics: state transition with complex states. And program clarity: programs can be moderately clear and are not useful conceptually. Okay, so now the whole section three is about von Neumann computers. Remember, he's trying to uproot this notion that underlies all of our programming languages. What is a von Neumann computer? When von Neumann and others conceived it over 30 years ago, this is in 1978, it was an elegant, practical, and unifying idea that simplified a number of engineering and programming problems that existed then. So it really was true. At that time, you just had vacuum tubes and you could arrange them in any kind of circuit that you wanted. And it was difficult to see how you could transition from a custom circuit that solved one problem to one that would run a program, software, so that it could be programmable. How do you conceptualize what's going on? How do you make this into some kind of architecture instead of just vacuum tubes and wires everywhere? Although the conditions that produced its architecture have changed radically, we nevertheless still identify the notion of computer with this 30 year old concept. And I think even today, in 2020, this is still the dominant model, it's what you learn in school, how a computer works, and it's how our computers are built. So there are three parts to a von Neumann computer. A central processing unit, a store, we call it RAM now, and a connecting tube that can transmit a single word between the CPU and the store. I propose to call this tube the von Neumann bottleneck. Okay, so this is the model. Of course, our buses these days are more complex. They have caches, they're not sending one single thing at a time. They can send big, you know, blocks of memory, but it's essentially the same thing. It's just kind of a practical optimization of this model. Ironically, a large part of the traffic in the bottleneck is not useful data, but merely names of data, as well as operations and data used only to compute such names. Before a word can be sent through the tube, its address must be in the CPU, hence, it must either be sent through the tube from the store or be generated by some CPU operation. So he's saying that this bottleneck is made worse by the fact that you have to compute all these pointers. And so you send the pointer to your CPU, and then you take the pointer and then you have to go fetch what the pointer is pointing to. And this is just making the bottleneck worse, because you have all sorts of data being transferred through the bottleneck that isn't the data you actually want. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck. And much of that traffic concerns not significant data itself, but where to find it. 
Okay, so this is the problem with the model. There are other problems with this model: it's a sequential model of computation, it doesn't include any way of doing parallel programming. But the sequentiality is made worse because you're waiting for data from RAM. And often that data is not even the data you want, it's a pointer to the data you want, and then you have to fetch again and again and again, right? And this is well understood even today. This is the problem we face. It's why we have caches and branch prediction and prefetching and all this stuff that we have in our CPUs today. It's to try to get over this problem of the bottleneck. Okay, and now he's talking about languages, which are languages that assume this von Neumann computer. Von Neumann programming languages use variables to imitate the computer's storage cells, control statements elaborate its jump and test instructions, and assignment statements imitate its fetching, storing, and arithmetic. The assignment statement is the von Neumann bottleneck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer's bottleneck does. Okay, so he's saying there's this direct link between the problems with the computer model and the languages that we use to program it: that we use this assignment statement, where an assignment to a variable is basically directly linked to a place in memory. So it's a pointer. So it does have this problem of the bottleneck, like you have to send the pointer and then get the data there and update it and store it, so that you're going back and forth through this bottleneck. The assignment statement splits programming into two worlds. The first world comprises the right sides of assignment statements. This is an orderly world of expressions, a world that has useful algebraic properties. It is the world in which most useful computation takes place. So if you go look at the right hand side of assignment statements, this is true. It's going to be very orderly. You could, you know, often they'll just be mathematical expressions, you know, stateless, purely functional expressions. It might be the result of a function call or something, but it's usually something very pure and clean and easily reasoned about. The second world of conventional programming languages is the world of statements. All the other statements of the language besides assignment statements exist in order to make it possible to perform a computation that must be based on this primitive construct, the assignment statement. Okay. So, you know, this is something that we know as functional programmers: in an imperative language, what we would call an imperative language today, the assignment statement makes it really hard to reason about programs, because the assignment, the variable that you're assigning to, is mutable, it's a mutable state. Applicative computing systems' lack of storage and history sensitivity is the basic reason they have not provided a foundation for computer design. Okay, remember, this applicative computing system, this is like lambda calculus, it's Lisp, what we would call functional languages today. Okay, so he's saying they're not history sensitive, they can't store state by themselves. Most applicative systems employ the substitution operation of the lambda calculus as their basic operation. 
So the substitution is like beta reduction, which is basically: when you apply a function to an argument, you can substitute the value that you're applying it to for the argument's name, right, inside the body of the function. So if you have a function f of x, so x is the variable, you would look in the body of the function f and replace x with whatever the argument is at the time you're applying it. So for f of one, you would replace x with one in the body, and that's how you evaluate a program. This operation is one of virtually unlimited power, but its complete and efficient realization presents great difficulties to the machine designer. Furthermore, in an effort to introduce storage and to improve their efficiency on von Neumann computers, applicative systems have tended to become engulfed in a large von Neumann system. For example, pure Lisp is often buried in large extensions with many von Neumann features. The resulting complex systems offer little guidance to the machine designer. Just a little note here, but first I'll explain what he's saying. He's saying that you might have some pure core of your functional language, but then you're going to have all this machinery about variables and other stuff so that you can actually use the machine that it's running on, right? You can actually store values and then later read them in, which is what you need to do in a practical system. So as an example, you have Haskell today, where everything is pure, everything is an expression, but then at the end of the day, you're going to use monads for getting the work done. So it has this huge machinery, you can use the I/O monad. It has this huge machinery to allow it to be what he calls history sensitive. So read in the current state, do something, and then store it back in. All right, so I just want to say again, before we go further, that he's really saying that these current languages, certainly the imperative languages and the OO languages that we have today, but even functional languages that we have today, even Haskell and the most pure languages we have, are still based on this von Neumann model of the computer and still have this problem. Okay, they have inefficient ways of applying the functions to arguments, and they still have to have all this machinery about state, and he's trying to do something about that. Okay, so now he's going to compare von Neumann and functional, what he calls functional programs. So he's developed a whole system which he'll present in this paper, and this is the beginning of that presentation, because he's kind of dug out this root of von Neumann that underlies all the languages we have. Okay, so section five, oh, excuse me, a von Neumann program for inner product. I'll read this code, it's not too long. So assign 0 to c, then for i equals 1, step 1 until n, do c equals c plus the i-th element of a times the i-th element of b. Okay, so it's just a standard for loop, updating c in each iteration. Okay, so now, you know, you would recognize this program, a functional program for inner product. So now he's showing what he's calling FP, not what we call FP, and this one is: def inner product equals insert plus, composed with apply-to-all times, composed with transpose, and then he has an abbreviated form. So composition (which has a dot), insert (a slash), and apply-to-all (alpha) are functional forms that combine existing functions to form new ones. 
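Here's a rough Clojure rendering of those two inner-product programs, just to make them concrete. This is my sketch using modern names like reduce and map, not Backus's FP notation:

```clojure
;; Von Neumann style: word-at-a-time, banging on an accumulator.
(defn inner-product-loop [a b]
  (loop [i 0, c 0]
    (if (< i (count a))
      (recur (inc i) (+ c (* (nth a i) (nth b i))))
      c)))

;; FP style: point-free, built only from combining forms.
;; Insert ~ reduce, ApplyToAll ~ map, Transpose ~ zipping the two vectors.
(def inner-product-fp
  (comp (partial reduce +)               ; Insert +
        (partial map (partial apply *))  ; ApplyToAll times
        (partial apply map vector)))     ; Transpose

;; (inner-product-loop [1 2 3] [6 5 4])   ;=> 28
;; (inner-product-fp  [[1 2 3] [6 5 4]])  ;=> 28
```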
All right, so just a few comments on this. You will probably recognize this if you do functional programming. You will recognize this today as what is now called point-free style. Point-free style basically means you don't name your arguments. You use higher order functions to build the definition of a new function without naming the arguments. And so you use compose. He's using it twice. Insert is what we would call fold or reduce, apply-to-all is like map, and transpose is zip. So taking two arrays and zipping, you know, the first elements together into a tuple, the second elements together into a tuple, the third, etc. So, you know, he's found these: composition, insert, and apply-to-all, and then a couple of built-in functions, plus, times, and transpose, and putting those all together, you have inner product, a one-liner, very concise. Okay, now here's the thing. I'm not a big fan of point-free style. To read this, it doesn't scream to me, oh, that's inner product. Okay, I understand all the parts. I use them all the time, but I find this style, it has its benefits, but one of them is not clarity. Okay, but let's see what he has to say about it. Its advantages. Okay, let us compare the properties of this program with those of the von Neumann program. A, it operates only on its arguments. No hidden state or complex transition rules. That's true. B, it is hierarchical, being built from three simpler functions and three functional forms. That's nice. So it's denotational, as we saw in the last one: the expressions are understood in terms of their sub-expressions. It is static and non-repetitive, in the sense that its structure is helpful in understanding it without mentally executing it. That is true. Okay, so you don't have to write a for loop and kind of execute that in your mind. You can just see that what he calls insert, which is like fold, fold plus, well, that's just sum. Right? That's just summing it. I don't have to actually do the sum. Right? D, it operates on whole conceptual units, not words. Okay, that's true. The idea is, the words, remember, are the bottleneck. Right? So if you look, you know, I guess you would have to be a little generous, because he's really saying that this von Neumann thing kind of infects programming languages in a way we don't realize. If you look back at this for loop, you can look at it as like, okay, go fetch the current value of c and go fetch the pointer to a. Now add i to it and fetch that value, and also fetch the pointer to b, this is another array, and then add i to that pointer, and now go fetch the result of that, multiply them together, and then add that to c and store that in c. Okay, that's the way he wants you to read this, that it is about the fetching. Okay, all right. It incorporates no data. This one is E. I think I haven't been counting, but this is E. It incorporates no data. It is completely general. It works for any pair of conformable vectors. I think what he's saying here by "no data" is that it's point-free and that there are no types. Right? It's not about a specific type. As long as the plus and times work on it, it's going to be fine. Okay, F, it does not name its arguments. It can be applied to any pair of vectors without any procedure declaration or complex substitution rules. That's interesting, because he's only using higher order functions, what he calls combining forms. It does not have to name its arguments. 
You don't need this beta reduction that a Lisp interpreter uses. Okay, G, it employs housekeeping forms and functions that are generally useful in many other programs. Okay, so this is interesting. He's saying that we're reusing these highly reusable forms, these combining forms. All right. Much work remains to be done before the above applicative style can become the basis for elegant and practical programming languages. That's just a hint, right? He's like, this is very preliminary work. He hasn't finished yet. Okay, but he still has, you know, a lot of good stuff to present. He's not presenting the language yet, though. It's still preliminary. Okay, section six, language frameworks versus changeable parts. All right, he's going to try to construct a language from first principles. Let us distinguish two parts of a programming language. First, its framework, which gives the overall rules of the system. And second, its changeable parts, whose existence is anticipated by the framework, but whose particular behavior is not specified by it. All right, so this is basically saying there's going to be some part that is built into the language that you cannot change. And then the changeable parts are the parts that help you write your program, right? You write all these helper functions and other things. And those are the changeable parts. Now, suppose a language had a small framework, which could accommodate a great variety of powerful features entirely as changeable parts. So it's an interesting idea that you might want a small framework, a small language, that allows for a large number of changeable parts. Then such a framework could support many different features and styles without being changed itself. In contrast to this pleasant possibility, von Neumann languages always seem to have an immense framework and very limited changeable parts. Okay, I don't know if this is a true characteristic of von Neumann languages. It's not essential to them, but it does seem like that's the case: a lot of what you might do with just functions, where you just need to be able to define and call functions, these languages do with tons of syntax. Thus, a von Neumann language must have a semantics closely coupled to the state, in which every detail of a computation changes the state. Remember, he's talking about the statement oriented thing, where every statement changes the state in some way, or is at least supporting that change. Every detail of every feature must be built into the state and its transition rules. Okay, and because this state transition is really complicated, it's really complex, then every feature has to be spelled out in stupefying detail in its framework. Okay, he's obviously not a fan. All right, I'm going to skip section seven. Not much important stuff there. Okay, eight. He can get repetitive, you know, he's just kind of digging this deeper and deeper. Now he has to make a statement about APL. What he's shown so far is kind of like APL. APL is not a von Neumann language. It is an applicative language in his rough category scheme. It operates on arrays of data and can get very terse and also has this higher order style. Okay, APL, unfortunately, still splits programming into a world of expressions and a world of statements. Thus, the effort to write one line programs is partly motivated by the desire to stay in the more orderly world of expressions. APL has exactly three functional forms, called inner product, outer product, and reduction. 
These are sometimes difficult to use. There are not enough of them, and their use is confined to the world of expressions. Okay, so he's basically saying there's a lot of good stuff in there, but there's only three combining forms, and he wants a much richer set of combining forms. And yeah, he wants to define the language. Basically, he wants to start defining stuff without this whole set of state transition stuff. Okay, nine: von Neumann languages lack useful mathematical properties. So far, we have discussed the gross size and inflexibility of von Neumann languages. Another important defect is their lack of useful mathematical properties. Okay, so it's really hard to use math to reason about them. In any case, proofs about programs use the language of logic, not the language of programming. Proofs talk about programs, but cannot involve them directly since the axioms of von Neumann languages are so unusable. So he's talking about how, like, you can have all these tools for analyzing von Neumann languages, for doing kind of static analysis and proving theorems about them. But you're always going to be using a different language from the language you're using. You have to learn this other tool and its logic to be able to use it. In contrast, many ordinary proofs are derived by algebraic methods. He's just talking about mathematical proofs: just use algebra to prove something. This is equivalent to that; let me show you through algebra. These methods require a language that has certain algebraic properties. Algebraic laws can then be used in a rather mechanical way to transform a problem into its solution. And he gives an example of solving an equation and solving for x. There's a bunch of unknowns, but you can solve for x regardless. And he just uses algebra. So he's trying to say that you're using the language itself. You're not using some other logic language with a tool. You're just using the language itself. And your tools can verify your algebraic manipulations and proofs. This algebra can be used to transform programs and to solve some equations whose unknowns are programs in much the same way one solves equations in high school algebra. Algebraic transformations and proofs use the language of the programs themselves rather than the language of logic which talks about programs. Okay, so this is an interesting idea, that you might have, let's say, an expression in your language with an unknown value, an unknown program, an unknown function, let's call it f. And by algebraic manipulations, solve for f and know how to define it. It's really interesting. Okay, say, I know the result has to be 100. I know the input is zero. And I know it has to do this, this, and this. And like, I can solve for f, right? That seems pretty cool. Or you could maybe say: for this input, this is the output; for this input, this is the output; for this input, this is the output. I know it's going to have to do a sum in there. And I know that these are the possible combining forms I have. Is there a way I can solve for f? It's interesting. It's interesting. Okay, this is 10: What are the alternatives to von Neumann languages? Okay, in seeking an alternative to conventional languages, we must first recognize that a system cannot be history sensitive unless the system has some kind of state. Thus, a history sensitive model of a computing system must have a state transition semantics. 
Okay, to illustrate some alternatives to von Neumann languages, I propose to sketch a class of history sensitive computing systems where each system (a) has a loosely coupled state transition semantics in which a state transition occurs only once in a major computation, (b) has a simply structured state and simple transition rules, and (c) depends heavily on an underlying applicative system, both to provide the basic programming language of the system and to describe its state transition. This approach involves four elements which can be summarized as follows. A, a functional style of programming without variables. Okay, so this is the point-free style, what we call point-free style, that he's got. B, an algebra of functional programs. C, a formal functional programming system. This is an FFP system, which he will talk about soon. And D, applicative state transition systems. Okay, so he's talking about a very simple model of state, where you have the initial state, and you have this big expression that results in a new value of the state. Okay, I know he's talking a lot around it, but that's basically what he's saying. The transition rules are very simple. The result of one expression is the next state. It's crazy how much language he has to use to describe it, but I think back then this was a very new idea: that you wouldn't have a function or a procedure that calculates something by doing these fetches and stores and stuff. That you would just have one fetch at the beginning, you'd calculate the answer, and then do a store at the end. That was your model of state. Okay, but we'll get to that. He will get to that in section 13, I think, whereas this is section 10. Now we're going to start section 11, functional programming systems. Okay, so finally, he is giving us what he's been talking about, the FP system. Okay, in this section, we give an informal description of a class of simple, applicative programming systems called functional programming systems, FP, in which programs are simply functions without variables. An FP system is founded on the use of a fixed set of combining forms called functional forms. We would call them higher order functions, but they take functions and return functions. They always do that. These plus simple definitions are the only means of building new functions from existing ones. They use no variables or substitution rules, and they become the operations of an associated algebra of programs. All the functions of an FP system are of one type. They map objects into objects and always take a single argument. Okay, I think I should mention this now. This paper was very influential on current functional programming languages. It didn't have the effect that I think he intended, which was to completely eliminate the applicative style, the lambda calculus style. He wanted something where you never defined a program referring to values. So you never said Fibonacci of n is equal to, you never did that, because n is a variable that refers to a value. Okay, he wanted you to define it in point-free style, because by using this limited set of higher order functions, what he calls combining forms, you get an algebra. You can actually have laws that relate this expression to that expression. This expression is equivalent to that expression, and it's by the definition of these combining forms. 
Okay, so in contrast, a lambda calculus based system is founded on the use of the lambda expression, with an associated set of substitution rules for variables, for building new functions. The lambda expression is capable of defining all possible computable functions of all possible types and of any number of arguments. This freedom and power has its disadvantages as well as its obvious advantages. It is analogous to the power of unrestricted control statements in conventional languages. Okay, so he's saying that there's problems with the lambda calculus. Okay, just like unrestricted control statements, he's saying goto. Right, so if you have goto in a language, that has problems, we know that. But he's saying that with the fact that you can define any type of function, from any number of arguments, to any type, with this unrestricted freedom comes chaos. If one constantly invents new combining forms to suit the occasion, as one can in the lambda calculus, one will not become familiar with the style or useful properties of the few combining forms that are adequate for all purposes. Just as structured programming eschews many control statements to obtain programs with simpler structure, better properties, and uniform methods for understanding their behavior, so functional programming eschews the lambda expression, substitution, and multiple function types. It thereby achieves programs built with familiar functional forms with known useful properties. These programs are so structured that their behavior can often be understood and proven by mechanical use of algebraic techniques similar to those used in solving high school algebra problems. Basically what he's saying is, if you restrict the ways that you have to express a program to this very carefully chosen set, you can more easily understand, read, and prove things about your programs. That unrestricted lambda calculus, which is what we use today, is much harder to reason about than a restricted set of combining forms. That is quite a statement. As a functional programmer myself, I'm intrigued, because I think of the biggest problems as what he was saying before, like complex state transitions and using variables and things. He's saying that no, you don't realize that if you just got rid of basically unrestricted functions and you restricted them to this very small set, then you would have so much more power, just like getting rid of goto gives you power. It gives you power to reason. It's not like your language is more powerful, but it gives you power to reason. One chooses only those functional forms that not only provide powerful programming constructs, but that also have attractive algebraic properties. One chooses them to maximize the strength and utility of the algebraic laws that relate them to other functional forms of the system. This is really intriguing. I do have to say that in other papers that I've read by him tackling similar things, I think in the 70s, people weren't doing so much higher order programming as we do today, just by the way he talks about it. He will say like, "Oh, well, in Lisp, this is how you would solve the problem," and he shows this recursive solution. He's like, "This is just like a for loop. It's got the same problems," which is true. Except when I look at it, as someone from 2020, I would never solve the problem that way. I would use reduce and map, and it would be done. I think a Haskeller would do the same. 
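To make that contrast concrete, here's a small Clojure sketch of the two styles being compared; this is my example, not one from the paper: an explicitly recursive solution that reads a lot like the loop he criticizes, next to the reduce version I'd reach for today.

```clojure
;; Explicit recursion: the "loop" and the accumulation are spelled out by hand.
(defn sum-recursive [xs]
  (if (empty? xs)
    0
    (+ (first xs) (sum-recursive (rest xs)))))

;; Higher-order version: the looping is folded into reduce.
(defn sum-with-reduce [xs]
  (reduce + 0 xs))
```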
I think that this paper probably caused that, at least had an influence on it, that because of this paper, people realized the importance, the power that you get from doing more higher order functions. If you look at something like Clojure or the prelude of Haskell, a lot of it is saying, "Look what we can do if we just have this stuff built in." If you look at the standard library of Clojure, it's got a bunch of these higher order functions, and Haskellers do a lot of point-free style. I think that this paper had that kind of effect, to say, "Hey, we should be thinking about higher order abstractions more than we do." What I disagree with him on is these point-free expressions, where you never reference a value, your variables can only talk about other functions, and there are no types. I think this is very hard to read. It might have useful properties, so if you can get your head into that space, where you're dealing with transpose, composed with insert plus, composed with apply-to-all times, I don't remember the exact code. If you can get your head into that space, and you write out this program, you see, "Yes, this does exactly what I needed it to, and wow, look at all these cool properties." When you come back two weeks later, six months later, and you read it, it's not clear what it's doing. It's not. I think a Haskeller would say, without the type system, that would get really hard to reason about. The type at least is there. If you haven't written it, you can at least ask the type inference, like, "What is the type of this thing?" The type inference is there to help you understand what this function will do. He's not using types. There are no types in this system. There are only functions of one argument, and that argument can be like an array of the actual arguments. It's a tuple. If you look at these programs, I would say that they're write-once. You can get it right, you can type it out, and then you read it again, and it's really hard to understand what it's doing. Now, it has all these algebraic properties. You can use the fact that something is associative and say, "Oh, I can move this around here, and do this because it's commutative, I can move that down here." That's all really nice, but it's so hard to understand what the program is going to do just by looking at it. Okay, that's an aside. Personal opinion. But I think it had this effect of making people think, "Huh, there probably are some really useful higher order functions that we can build into the language that, when it's appropriate, will be really useful." Okay, even though in our language, in Haskell, we're going to allow variables to refer to values and make functions a kind of value. This is another thing I didn't mention. In this FP system, there are, well, three classes of things. There's the objects, so those are your values. There's functions. Okay, so you never refer to the values. You have functions, which you can refer to, and then combining forms, which take a function and return a function. So, you never can talk about values in your program. It's not like first class functions where a function can be stored in a variable or something. They're separate. They're separate levels. They're not first class. They're three different classes. All right, so he gives a bunch of algebraic laws that are really pretty cool. The definitions of these functions read very algebraically. Oh, and I should mention something else. How do you refer to the third argument, for instance? 
So, a function has one argument, but it's a tuple. So, how do you get the third thing out of that tuple? Well, in his algebra, there are defined primitive functions called one, two, three, four, etc. So, one takes the first element of the tuple. It is a function that takes its argument, takes the first element of its argument, and returns it. Okay, so remember, you're only ever talking about functions and combining them. You're not applying the function to something. So, you could say this function is equivalent to, let's say, a function that squares its first argument. It'd be square composed with one. This "one" is the function that returns the first element of its argument. So, there you go. Like I said, it's hard to wrap your mind around, like, what would it look like? Just another thing while I'm here, what, we're already an hour in, an hour and six minutes, great. Just another example of something that is used today: in section 11, he starts using this square bracket notation, which basically says, make a tuple by applying the functions in this to the value. And if you use Clojure, you might be familiar with the function called juxt. And juxt does just that. It takes a number of functions, and it returns a tuple of applying each of those functions to its argument. So, you can see that it's there, like, the influence of this style, of this idea of: what if we could just operate at the function level? That's what he calls it, the function level. So, we're not at the value level, we're operating only on functions. It's there. The Haskell prelude, I mean, you can see how these very simple functions are defined, and the combining forms are defined very clearly, very concisely, and how having a set of these is all you need. You just need, like, a bunch to kick-start it, to bootstrap into a very useful programming system. I think that this paper had this effect. All right. So, I'm going to talk about a program where he's defining matrix multiplication. All right. I'm not going to read the definition. I don't know how to read it so you'd know. Okay. You should look it up. It's on page 622 of this. You know, this was published in a magazine, so it's, like, page 622. All right. There, I'm going to try to describe it at a high level. There are four functions composed together, in kind of like a chain. All right. The first function, the one on the left, is alpha alpha IP. Okay. So apply-to-all of apply-to-all of IP, inner product. Okay. The second function is alpha distribute-left, distl. Okay. And then there's distribute. That's the third one. And then there's what I'm going to call juxtapose: one, and trans composed with two. Okay. So you can see how terse this is. And, like, looking at it, I mean, you can sit there and break it down and figure out, how do you get matrix multiplication from this? It's four steps. Four functions composed together. And you can read it from right to left. Okay. Each is applied in turn. So one, which means the first argument, and then a tuple is made of the first argument and the transposition of the second argument. You know, to me, it's like, I do functional programming, and I'm just like, this is really hard to read. A name or two would be wonderful for understanding what is going on here. And I know that the names, you know, maybe make the algebra harder to deal with. 
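Since I keep referring to Clojure, here's a loose Clojure sketch of these pieces: the selector functions, juxt, and a matrix multiply in the same compositional spirit but with a name for each step. These are my names and my decomposition, not a transcription of his four-function chain:

```clojure
;; His selector function "one" is roughly `first`; "square composed with one":
(def square-of-first (comp #(* % %) first))
;; (square-of-first [3 7 9]) ;=> 9

;; His square-bracket construction is Clojure's juxt: apply each function to
;; the same argument and collect the results in a tuple.
;; ((juxt first last) [1 2 3]) ;=> [1 3]

;; A compositional matrix multiply, step by step.
(def transpose (partial apply map vector))   ; like his trans

(defn inner-product [row col]                ; like his IP
  (reduce + (map * row col)))

(defn matrix-multiply [a b]
  (let [bt (transpose b)]                    ; transposition of the second argument
    (mapv (fn [row] (mapv #(inner-product row %) bt)) a)))

;; (matrix-multiply [[1 2] [3 4]] [[5 6] [7 8]]) ;=> [[19 22] [43 50]]
```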
Uh, maybe he's showing this super extreme, like, oh, if you go all the way over here, look at all the great properties. And it's not great to be there. But, um, you can maybe learn something, that, you know, maybe from where we are, we can move a little bit closer to it and get some of those properties. We don't have to move all the way there, but he's taking this very extreme position. And it has extreme results. All right. So now let's talk about this program. This program, matrix multiply, does not name its arguments or any intermediate results, contains no variables, no loops, no control statements, nor procedure declarations, has no initialization instructions, is not word-at-a-time in nature, is hierarchically constructed from simpler components, uses generally applicable housekeeping forms and operators, is perfectly general, yields bottom whenever its argument is inappropriate in any way, does not constrain the order of evaluation unnecessarily. All applications of IP to row and column pairs can be done in parallel or in any order. And using algebraic laws, it can be transformed into more efficient or into more explanatory programs. None of these properties hold for the typical von Neumann matrix multiplication program. All right, there's some good stuff in here. We got to break this down a little bit. So it doesn't name its arguments or any intermediate results. That's kind of nice, because sometimes you think, like, what would I name this besides x, or t because it's temporary; sometimes you don't have a name for those things. But to not name its arguments, is that a benefit? Names provide a lot of clues, a lot of redundancy, to help you understand what something is doing. And so I often think that that is useful sometimes, but then, you know, you regret it later when you come back to read it. Okay, contains no variables, no loops, no control statements, nor procedure declarations. I think that's nice. Maybe the procedure declarations would be good. Again, another chance to add names. But the no loops, no control statements, that's good. It works without an if. I mean, that's awesome, right? It has no corner cases. Has no initialization instructions. That's good. Those are hard, hard, hard problems. Is not word-at-a-time in nature? Okay, good. Is hierarchically constructed from simpler components? Good. I'm a fan. Uses generally applicable housekeeping forms and operators. Okay, it's reusing useful parts. That's good. It's perfectly general. Okay, so basically it works for any matrices. Yields bottom whenever its argument is inappropriate in any way. So its behavior is defined for all arguments. But where it's not appropriate, meaning it's not a matrix, if you pass it something that doesn't really fit this form, it will yield bottom, right? That's good. Maybe someone who's more into Haskell would say, wouldn't it be nice if you could catch that at compile time? Okay. All right, does not constrain the order of evaluation unnecessarily? This is good, because a typical von Neumann program, an imperative program, let's call it, would use maybe for loops, nested for loops, and would basically bake in an order, when all of those multiplies could happen in parallel. There's no reason they couldn't. And then all the adds of all the parts could happen in parallel as well. Okay, and then this is the biggie, this is the big one. 
Using algebraic laws, it can be transformed into more efficient or into more explanatory programs. So that's interesting. A compiler can use the algebraic laws and figure out a more efficient way to calculate this. If you had some interpreter or some virtual machine or runtime that this was going to run on, maybe it knows something about the hardware and can transform the program algebraically, maintaining all of the values and the same semantics, and get a better program. Likewise, you might say, "Hey, let's expand this out into maybe a more verbose form that is better for explaining exactly how this does what we think of as matrix multiply." So, for clarity. So FP systems are so minimal that some readers may find it difficult to view them as programming languages. Viewed as such, a function F is a program, an object X is the contents of the store, and F applied to X is the contents of the store after program F is activated with X in the store. Okay, so I like this statement here, that some people might find it's so minimal that it's difficult to see that this could be a programming language. Like, is a programming language all about the syntax, and where are your semicolons and your curly braces? I think that this is a problem we face even today, where people look at some languages and they're like, "Oh, but how do I do what I'm used to?" Which is if statements and defining classes and stuff like that. I think this still rings true today. I should mention this because I've been making statements about how hard to read this is. You might get used to it. You might. I know that a lot of people look at my Clojure code, that I'm used to and that other Clojure programmers can read, and they look at it like, "How do I know what's going on?" I have no sort of mental hooks into understanding this kind of program. It takes a while, takes getting used to a program as an expression, whereas they're so used to this von Neumann style, this imperative style of defining a variable and banging on the variable, adding stuff to it, accumulating stuff in the variable over many iterations of a loop, that when they see something where you just do a (reduce + 0 array), they see that and they're like, "But where does the work happen? The work is adding to this variable and looping. Where is that?" Well, it does it for me. I don't have to think about all those details. There's a training that you have to go through. You have to train your mind to see that. I can imagine being able to, with a lot of time and practice, being able to look at these programs and say, "Oh, yeah, that's obvious. That's obviously matrix multiply." I can imagine that. Sometimes when I make these statements about, "Oh, it's hard to read. Wouldn't a name clarify things a lot?" I wonder if I'm not being fair, that this is foreign even to me as a functional programmer, that this style is so extreme that it's just hard, and it's not fair to say that. I'm not sure. I'm not sure. I think that I am used to doing higher order functions, and especially the big three, the map, filter, and reduce. A bunch of others, but those are the big ones. When I see them, I don't have trouble thinking about them. I do sometimes see, for instance, Clojure expressions that use maybe a nested reduce or something like that, where I'm like, "Oh, wow, this is too much." I can maybe understand what it's doing, but I won't be able to really figure out if it's correct or not. Can we please break this up? 
Maybe name this result, this intermediate result, so I can have some sense of the steps and what you're trying to do at each point. I do see code like that. Even as someone who programs in Clojure all the time, and I use reduce all the time, these things can be maybe too terse, too concise, that there's not enough information in the program for us to wrap our heads around what's going on. We need a little bit of redundancy. We need a little bit of like a pause, like, okay, and then I'm going to call this the sum. Oh, okay, sum. That's something I already understand. It's something that I can say, okay, well, these two parts, I only have to understand this first part up until where you assign it to the variable sum, and then I'll know what it is. I just have to figure out, is that a sum? Okay, yes it is. Now, I don't have to think about it anymore. I can move on to the second part. So that naming is like a pause. It's a break. It's a way of saying, I don't have to understand this as a whole. I can understand individual parts, and the person has broken them into parts for me that have a real meaning at the value level. That is something that, I mean, this is just my personal opinion, I think is necessary, because we're not all at our best all the time. We can reason a little bit at this higher level of abstraction, higher order functions. But if the whole program is like that, our brains cannot do it easily. We need breaks; we need to program for ourselves for before we've had our coffee, for when maybe we're not getting the best sleep. We need to be able to reason in those times too. And so, okay, that's my little rant about style. It doesn't take away from the sort of algebraic properties and the usefulness of that in here. Oh, and, you know, in Haskell, it's the same. The variables in Haskell refer to values. Even if you're using point-free style, you're creating a function that operates from values to values, and often immediately applying it to a value or something like that, where you're still thinking in terms of values, not functions. Okay, so he's got some remarks; he's talking about the limitations of FP systems. A given FP system is a fixed language. It is not history sensitive. Okay, there's no way to change it. Once you've written it, there's no way to change it. And it doesn't have any way of storing state. If the set of primitive functions and functional forms is weak, it may not be able to express every computable function. Okay, because it's fixed, you're not able to, like, expand it. It could be that there's something that you just can't express. An FP system cannot compute a program, since function expressions are not objects. That's another limitation: because you're operating at the function level, there's no way to kind of mix objects and functions and somehow generate a program, a new representation of a program. Okay, so that's interesting too. Here's another remark he's making, about the expressive power of FP systems. Like, can we compare two FP systems? You know, like, it's talking about, maybe this one is weak because it doesn't have the complete set of primitive functions and functional forms that you need. How can we talk about them in relation to each other? Okay, so suppose two FP systems, FP1 and FP2, both have the same set of objects and the same set of primitive functions. 
But the set of functional forms of FP1 properly includes that of FP2. So FP1 is a little bit more than FP2 in terms of functional forms. Suppose also that both systems can express all computable functions on objects. They're Turing complete. Nevertheless, we can say that FP1 is more expressive than FP2, since every function expression in FP2 can be duplicated in FP1. But by using a functional form not belonging to FP2, FP1 can express some functions more directly and easily than FP2. Huh, that's interesting. So using this model, he has a very precise and objective definition of expressivity. That's interesting. All right, now, advantages of FP systems. They avoid the complexities both of the naming systems of conventional languages and of the substitution rules of the lambda calculus. Okay, so this is something worth mentioning. I think I didn't read anything from this, but he talked about how there's often in functional languages, sorry, von Neumann languages, a lot of rules about naming. We talked about this also in The Next 700 Programming Languages, like pages and pages of rules about what constitutes a valid name and how to name a class versus how to name a variable versus how to name this. Like, there's all sorts of different rules. And he's saying, okay, we got rid of all that. We don't have any names. We don't have to name any variables. And the substitution rules also can be complex. Right, because, you know, you have shadowing of variables, and you have to worry about, like, what is this thing referring to? And you have free variables and stuff like that. You don't have any of that. So that is all gone. That's an advantage. Okay, so now in section 12, the algebra of programs for FP systems. If the average programmer is to prove his programs correct, he will need much simpler techniques than those the professionals have so far put forward. I think this is true even today. There are a lot of proof assistants and things to help you reason about your programs. But average programmers don't use that, for sure. And so one advantage of this algebra over other proof techniques is that the programmer can use his programming language as the language for deriving proofs. I pause there because it should be his or her. One advantage of this algebra over other proof techniques is that the programmer can use his programming language as the language for deriving proofs, rather than having to state proofs in a separate logical system that merely talks about his programs. That's nice. You want to be able to just manipulate the programs you have to prove something. At the heart of the algebra of programs are laws and theorems that state that one function expression is the same as another. Okay, here's a law: juxt of f and g, composed with h, is equivalent to juxt of f composed with h, and g composed with h. So the law says that the construction of f and g, composed with h, is the same function as the construction of f composed with h and g composed with h, no matter what the functions f, g, and h are. f, g, and h are all unknown, but it still holds true, which is really nice. Such laws are easy to understand, easy to justify, and easy and powerful to use. Okay, so this could be like saying that he's making the programs themselves harder to understand while making the laws easier to understand. Okay, the laws about these things are very clear, like, oh yeah, that makes sense. You read that and you're like, yes, true. I don't know what f, g, and h do. 
I don't have to know; I can believe the statement anyway. But then, if I did substitute something for f, g, and h, would I look at it and know what it did, or know that it was equivalent to matrix multiplication? That's the question. However, we also wish to use such laws to solve equations in which an unknown function appears on both sides of the equation. The problem is that if f satisfies some equation, it will often happen that some extension f prime of f will also satisfy the same equation. Thus, to give a unique meaning to solutions of such equations, we shall require a foundation for the algebra of programs to assure us that solutions obtained by algebraic manipulation are indeed least, and hence unique, solutions. Our goal is to develop a foundation for the algebra of programs that disposes of the theoretical issues, so that a programmer can use simple algebraic laws and one or two theorems from the foundations to solve problems. So he talks about these problems later in this section, and he admits he has not solved this problem. Now he goes through some laws in the algebra of programs. There are a lot of proofs in this, a lot of little algebraic laws, and they're usually stated in the form of equivalences, identities: this expression is equivalent to that expression, and sometimes he gives a little proof of it. I'm not going to go into all of those. They're too hard to read aloud, and they're very small and usually pretty obvious, which is nice. Okay, this is still section 12. I'm just going to turn these pages here, because there are pages and pages of this.

All right, so section 13. This is where he talks about formal systems for functional programming, FFP systems. Just before I read some of it: what he's basically doing is making an interpreter. He's taking an FP system, this pure, stateless system in which function expressions are not objects, and he's making an interpreter that is able to encode functions as objects. Right, so just like in Lisp, where you can define an interpreter that takes a Lisp form, a list of atoms, and interprets it as a program, he can interpret an object, a sequence, as function application. Okay, so this is cool. He's doing what Lisp did, sort of defining how it works in itself. Well, he's not doing it in itself exactly: he's using the FP system he just defined in section 11 to define an interpreter that allows you to define new functions, refer to objects, have variables, et cetera, in this applicative system built out of objects. Okay. FFP systems are these interpreted systems. In FFP systems, one can create new functional forms. Functional forms are represented by object sequences. The first element of a sequence determines which form it represents, while the remaining elements are the parameters of the form. Okay, this is just like how you would write a Lisp interpreter. You'd have a list, and the first element basically determines which branch of the conditional the interpreter is going to go down. If it's an if, a cond, a let, you know, that first element is going to determine what the expression is about, or it might be a function, right? It might be the name of a function. And then it's going to go down that branch and do the right thing. We have the same thing here. Now, he's defining it not in terms of eval and apply, as you would typically see in a Lisp interpreter. He's defining it in terms of two things. The first is a meaning function, which he calls mu.
So this is kind of like eval. Every FFP expression e has a meaning, mu of e, which is always an object. Okay, so the meaning, the value, of an expression is an object. Mu of e is found by repeatedly replacing each innermost application in e by its meaning. Okay, so it's recursive: for mu of e, you basically break e up into all of its subexpressions and apply mu to them. Okay, then there's another thing, which is kind of like apply, but not the same; just by analogy, you can see that they're kind of mutually defined in terms of each other. The association between objects and the functions they represent is given by the representation function rho of the FFP system. And I'm making a note here: both rho and mu belong to the description of the system, not the system itself. So they're not part of the system the way they are in Lisp, where eval is a function in the language, right? In his setup they're just used for the definition, so that you can understand how the thing is supposed to work. They don't exist in the language. Okay, so basically this is saying: let's say you have some atom NULL that represents the FP function null. So null is a thing that always returns null, a function that always returns null. Is that right? Or maybe it checks whether its argument is null. Whatever it does, there's an atom in your FFP system that represents the actual function null in FP, the implementation host language, and then rho of NULL is equal to null. So it's basically like a lookup, a function lookup, a name lookup, but it can be applied recursively to other things. And so by using this meaning function and this representation function alternately, you can determine how something evaluates to a value. I wanted to bring that up because I think it's pretty interesting that he does this. It's definitely calling back to Lisp being defined in itself, in the first paper that presented Lisp. And it made me think about this idea: what is the Lisp interpreter relative to this? How are they different? Really, the Lisp interpreter is a way of defining the lambda calculus, an applicative system, as he calls it, in terms of a von Neumann system. Whereas this is doing something different. It's defining an applicative system in terms of a non-von Neumann system, what he calls functional programming, the thing he was defining in the other section. This is a pretty interesting idea. And it has similar logical consequences. For instance, by being able to define the lambda calculus, you can prove that it's Turing complete, the same way Lisp did: look, this is a complete system, because it's equivalent, it can define itself, so it's universal. This is different. This is not saying it can be defined in terms of itself. It's saying that you can use this system that has very useful, very consistent algebraic properties, and define this other system that is maybe more practical to use, more user friendly. But because it's defined in the algebraic system, we can still get some of that algebraic goodness, because everything will kind of expand out into this base algebraic system. And he defines a bunch of stuff in this interpreter, like ways of storing things in cells.
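Here is a very loose sketch of that shape in Haskell, just to ground the mu and rho idea. This is my own toy, not Backus's actual FFP definitions: the Obj type, the atoms, and the single apply-style form are all made up for illustration.

```haskell
-- A loose sketch (not Backus's actual FFP definitions) of the shape of the
-- idea: objects are atoms or sequences, and the meaning function dispatches
-- on the first element of a sequence, the way a Lisp interpreter dispatches
-- on the head of a list.

data Obj
  = Atom String          -- e.g. Atom "NULL"
  | Seq  [Obj]           -- sequences are both plain data and "forms"
  deriving (Eq, Show)

-- A rho-like table: which host-language function an atom represents.
-- (In the paper, rho and mu describe the system; they are not part of it.)
rho :: String -> Maybe (Obj -> Obj)
rho "ID"   = Just id
rho "NULL" = Just (\x -> Atom (if x == Seq [] then "T" else "F"))
rho _      = Nothing

-- A mu-like meaning function for one made-up application form: a sequence
-- whose first element names the function and whose second is the argument.
mu :: Obj -> Obj
mu (Seq [Atom f, arg])
  | Just prim <- rho f = prim (mu arg)   -- take the argument's meaning first
mu other = other                         -- atoms and plain data mean themselves

-- mu (Seq [Atom "NULL", Seq []])  evaluates to  Atom "T"
```

The point is just the dispatch: sequences are ordinary data until mu is asked for their meaning, and the head of the sequence picks what happens, the same way the head of a list does in a Lisp interpreter.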
All right, so the last section I'm going to talk about is section 14. I just have a few things to read in this one. This is applicative state transition systems, AST systems. He's going to sketch an alternative to a von Neumann system, and he's saying that he's not suggesting this is practical. Let me read that. It must be emphasized again that these applicative state transition systems are put forward not as practical programming systems in their present form, but as examples of a class in which applicative style programming is made available in a history sensitive but non von Neumann system. Okay, so he's going to develop a system with state in terms of this thing he just did, the interpreter, which is built on the other thing he did, which has these really nice algebraic properties. This reminds me a lot of the stuff people do with monad transformers, where you say, oh, well, we need state, it's much cleaner with state, but we don't want to lose all of our FP properties, so we're going to define this monad called the state monad that lets us manipulate state and write an imperative-looking program, but really it's FP underneath. That's really what this seems like to me. I still want all my algebraic properties at the bottom. At the bottom it's all algebraic, all non-von Neumann, but we're going to try to get state, to get applicative style with state, on top of it. It all compiles down to this non-von Neumann system. It's an interesting idea. Okay, so AST systems provide one way of achieving the goal of having state. Their semantics has two protocols for getting information from the state: one, get from it the definition of a function to be applied, and two, get the whole state itself. So you can get function definitions that are stored in some way, or the whole state, just get everything. There is one protocol for changing the state: compute the new state by function application. So you apply the new state to the old state, or sorry, you apply a function to the old state and it gives you the new state. Yeah, so it's a much simpler semantics than you have in something like Algol, which has to have all these fetch and store rules. Okay, an AST system is made up of three elements. And remember, he's not saying this is practical; I think it's actually kind of practical. It's nice that it's very simple, but it probably needs a little bit more. Three elements. One, an applicative subsystem. So you've got your FFP system; that's the applicative subsystem. Two, a state D that is the set of definitions of the applicative subsystem. So you defined all these functions using this Lispy interpreter, right? You've got to store those in D. And three, a set of transition rules that describe how inputs are transformed into outputs and how the state D is changed. Okay, so that's it. Just some rules like: well, when I get this input, I have to output this, but also change the state to that. And like he said, it's not super practical, but he goes and defines some real programs using this, basic stuff that stores things in a file and lets you query the file later.
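As a rough sketch of those three pieces, again in Haskell and again my own toy rather than the paper's notation (the Input and Output types and the transition function here are invented):

```haskell
-- A rough sketch of the three pieces of an AST system: an applicative
-- subsystem, a state D of definitions, and a transition rule that maps an
-- input plus the current state to an output plus a new state.

type Name   = String
type Defn   = Int -> Int            -- stand-in for definitions written in the
                                    -- applicative subsystem
type StateD = [(Name, Defn)]        -- the state D: the stored definitions

data Input  = Define Name Defn | Call Name Int
data Output = Ok | Result Int | Unknown Name

-- The transition rule: the only way the state changes is that a whole new
-- state is computed from the old one by function application.
transition :: Input -> StateD -> (Output, StateD)
transition (Define n f) d = (Ok, (n, f) : d)          -- new state; old one untouched
transition (Call n x)   d =
  case lookup n d of
    Just f  -> (Result (f x), d)                      -- use a stored definition
    Nothing -> (Unknown n, d)
```

The "one protocol for changing the state" idea is that a transition never pokes at individual cells; it just computes the next D from the previous one.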
Okay, I've gone through all the sections that I'd like to. Because the paper is so long, there's a long summary section too, but it's just the stuff that I've highlighted; there's no more information in it. I think this paper was very influential. It really made the idea of point-free programming viable. I'm sure people were doing it before, but this is the one where people said: oh, I see, you don't even have to talk about values. You can simply talk about the functions and never name your arguments. This made it clear that you could do that. Now, that means you need a basic, complete set of what he calls functional forms, what we would call higher-order functions. But once you have that, you can operate and do a lot just at that level. And we still see that a lot in some functional languages, Haskell being a good example.

I think the characterization of von Neumann machines and their bottleneck is still talked about these days, because we still use that as our main architecture. There's probably a lot more resignation now, meaning a lot of people have kind of given up on the idea of a new architecture. People are very practical these days: we're stuck with this von Neumann system. We've got Intel, and there are a few alternatives, there's ARM, but they're all based on this von Neumann design, whatever they are. Even the GPUs seem to have the same model, where you stick a bunch of data into memory and run a program over it, and that program is probably doing a kind of von Neumann sequential loop over that data. I think people back in the 70s were still doing a lot more machine design, and he was trying to spark this idea that you don't have to. You can define languages; software itself is important and can be defined independently of the hardware it's going to run on. I think that's a really important idea, and I think we take it for granted these days. But most of us have decided we're never going to build our own hardware. Even in academia, we're kind of resigned to saying that computers are the way they are, and we're just going to try to make what we want out of them: interpret what I want to say, compile the language we want onto the computer we have. We're not going to revisit it, which is unfortunate. I think there's still plenty of room for figuring out a new programming model that could inform how our architecture should look. I think that would be pretty cool.

You know, this is just something I saw somewhere, and I'm not a computer architecture expert, I'm really bad at it actually, I did take a course in it and I was bad, but you can sort an array in a much faster way. The way we analyze algorithms these days is based on this von Neumann model: how many instructions, how many fetches do I have to do to do this? I count the number of instructions because they're sequential, and one of those instructions might be a fetch. The number of those grows as I increase the size of the array, right? That's how we think of sorting, and n log n is about as fast as we can get. But if you just laid out the circuitry for comparing and swapping values, you could do a bubble sort in, let's call it, linear time, if you just had enough of the logic units right by the memory. You could just take this array and fill it with numbers. It would have to be a finite size, obviously, but let's say you put 256 numbers in there and you turned it on, right?
You say go, and everything kind of just bubbles, or maybe it's not bubble sort but some other scheme, but it could just keep doing swaps on all the values and bubble them out from largest to smallest, in order, in linear time. So there's potential for architectures there, because what it's doing is a trade: it's doing the same number of operations, but you've got more hardware. You're getting a factor of n out of the fact that you've duplicated that compare-and-swap, where it used to be that the compare was one instruction the CPU would do, and then it would swap the two values, and that's another instruction. If you put those compare-and-swap units, the circuits, throughout the whole array, so you had n of them, 256 of them, then you get that n back in time, because, you know what I'm saying, you can divide it out: you can do 256 compares and swaps at the same time. All right. So what I'm saying is there's still a lot of room for rethinking architectures and having interesting programming models that take advantage of them. And this paper is kind of saying: let's think of how we want to program first, and back into how the computer should look. It's a good idea.

Anyway, very influential paper, very influential person, Turing Award winner in 1977. Well, I don't know if I have anything more to say. Don't forget to subscribe to the podcast. This has been a long one, almost two hours. I know you're out there, people who are interested in this stuff. I'm trying to go through a lot of important historical papers, and I would love to hear your suggestions. Someone that I respect a lot suggested just going through all the Turing Award lectures, or at least the Turing Award winners, and picking their important papers, because the Turing Award has done a pretty good job of highlighting important figures. Not everybody is that important, and maybe some of the decisions were wrong, but in general it's done a pretty good job. So I think I'm going to try that. Let me know what you think about that, because when I can't figure out what the next paper should be, I'll probably just look at the next Turing Award winner that had a cool paper and read that one. Okay, I'm going to sign off. My name is Eric Normand. These have been my thoughts on functional programming. Thank you for listening, and rock on.