Computers Then and Now

In this episode, we read excerpts of Maurice Wilkes' 1967 ACM Turing Award lecture titled 'Computers Then and Now'.

Short biography -- Turing Award Lecture

Transcript

Surveying the shifts of interest among computer scientists and the ever-expanding family of those who depend on computers in their work, one cannot help being struck by the power of the computer to bind together in a genuine community of interest, people whose motivations differ widely. It is to this that we owe the vitality and vigor of our association. If ever a change of name is thought necessary, I hope that the words "computing machinery" or some universally recognized synonym will remain. For what keeps us together is not some abstraction, such as the Turing machine or information, but the actual hardware that we work with every day. Good morning. My name is Eric Normand. And today I am reading from the Turing Award lecture from 1967 by Maurice Wilkes. Maurice Wilkes was a computer engineer, programmer, and computer scientist from England. He was born in 1913, so he saw, well, both wars, really he was a baby at the first one, but he saw the development of computers for calculating ballistics tables and doing code cracking and stuff, and so was there really at the beginning of the electronic computer, and the analog computers, and even some mechanical computers that predated it. I wasn't familiar with him, so I had to look up the biography which the ACM puts together for each award laureate, and they're very nice. They summarize things really well and talk about where they're born, their sort of academic career, how they learned what they learned, and then what was important about them. And so I'm going to read some of that biography. He is known as the author, with Wheeler and Gill, of a volume on the preparation of programs for electronic digital computers in 1951, in which program libraries were effectively introduced. So he's instrumental in the idea of libraries, software libraries, where you'd put your reusable code off to the side so it's not just in line with the rest of the code. That's amazing, that's awesome. It turns out he did a lot of awesome things that we basically take for granted today. He was perhaps the first person to recognize that what we now call software would prove to be a worthwhile academic pursuit. Back in the day, people would construct machines, and the machines were a lot of work to put together, and a lot of research would go into them, and then the programs, like what you actually did with them, were kind of an afterthought, and he changed that. And of course these days, we don't even think about the computer so much, you can just buy them for a dollar, and what's more important, what makes one computer different from another, is the software you run on them. Wilkes came up with a new design principle, which he called micro-programming, that greatly simplified the logical design of the new computer. Micro-programming was Wilkes' most important scientific contribution to computing, and had he done nothing else, he would be famous for that. In the early 1960s, IBM based its world-beating System/360 computer around the idea, and it remains a cornerstone of computer architecture. So I probably learned micro-programming in my sort of university years, but it's not something I thought about a lot, so I just forgot about it, I had to look it up. It means that the instructions that you give the computer, the machine code, what we consider the program, is not necessarily, it's not a one-to-one correspondence with what runs on the hardware. There's some level, some layer of indirection there, so a multiply can get translated into like shift and add operations and stuff like that. The programmer does not have to think in terms of shift and add, they can think in terms of multiply, which is much more convenient, and this is all happening in hardware.
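To make that concrete, here's a tiny sketch of the kind of decomposition a microprogram might do. This is my own illustration in Python, not anything from Wilkes and not real microcode: a single multiply expressed as nothing but shifts and adds.

```python
def multiply(a, b):
    """Multiply two non-negative integers using only shifts and adds,
    roughly the way a microprogram might decompose one 'multiply' instruction."""
    result = 0
    while b:
        if b & 1:        # low bit of b is set: add the current shifted copy of a
            result += a
        a <<= 1          # shift a left (double it)
        b >>= 1          # shift b right (move on to the next bit)
    return result

assert multiply(6, 7) == 42
```

The programmer just writes "multiply"; the loop of shifts and adds is the hardware's business.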
This is another thing I'll read: Wilkes was very good at keeping up with technology trends and preventing either himself or the laboratory getting locked into dying research fashions. Well, that theme is going to come up a lot in this lecture. He was good at looking forward and always being on the cutting edge and seeing where the trends were moving to, not trends as in fashion, but the trends of things are getting smaller here, that's not going to stop. We're going to go over that in the lecture. There is a lot of forward-looking in this lecture. It is different from the others I've read, it's a lot less technical, it is much more about reminiscing about what times were like, you know, 20 years before the lecture was taking place, and how that relates to now, and how we look into the future with that perspective. So again, this is the 1967 ACM Turing lecture called Computers Then and Now, by Maurice Wilkes from Cambridge University. It was published in January of 1968 in the Journal of the Association for Computing Machinery. And as usual, I'm not going to read the whole thing, I just excerpt things that I think are worth diving deeper into. There's a lot of stories, it's a worthwhile paper, you should read it, but I'll just excerpt it here. The computing field owes a very great debt to von Neumann. He appreciated at once the possibilities of what became known as logical design and the potentialities implicit in the stored program principle. That von Neumann should bring his great prestige and influence to bear was important, since the new ideas were too revolutionary for some, and powerful voices were being raised to say that the ultrasonic memory would not be reliable enough and that to mix instructions and numbers in the same memory was going against nature. So here we see this first inkling that he was always on the cutting edge. He wanted to build a stored program computer, and there were people who thought that no, the hardware should be configured for the problem and the memory should be reserved for data. And nowadays, of course, looking back, we all use software on our machines, it's stored in the memory, and it essentially is a different kind of data that we can, you know, send over the internet or what have you. So that's interesting, and he's thanking von Neumann for putting his prestige on the side of the idea, you know, putting his thumb on the scales to make sure that it happened. And it's difficult right now to imagine, but we need to, if we're really going to get into this paper, imagine a time when there were people programming computers or building computers, doing research, who were against the idea that software would be stored in the memory. And remember, memory was really expensive, and so it makes sense that you wouldn't want to waste it on software; basically that word didn't even exist back then, you'd say a program. And instead, you would want to hard code it. Imagine people thinking that, imagine it could have been you, that you were on the wrong side of history. That's the kind of perspective that these papers can give us, one of the reasons I like reading them. There was, however, a difficult period in the early 1950s.
The first operating stored program computers were naturally enough laboratory models. They were not fully engineered and they by no means exploited the full capability of the technology of the time. It took much longer than people had expected for the first of the more ambitious and fully engineered computers to be completed and prove themselves in practical operation. In retrospect, the period seems a short one. At the time, it was a period of much heart searching and even recrimination. Wow, so I think that this is an important paragraph because it shows that people have a kind of illusion in their memory. When they look back, the time period seems so short. You see the highlights and you think, "Oh, wow, nothing." It was obvious like step, step, step, we made it to where we are. But during the time, it just felt like heart-wrenching, heart-searching and recrimination, like they were blaming each other and pointing fingers. And that perspective, that looking back feels different from what it felt like at the time, is important because we're probably in a time right now where we're pointing fingers and we're really trying to figure out what's going on, and in 10 years, 20 years, we're going to look back and it'll all seem so obvious, like dominoes just pushing one domino after the other to where we arrived at. After setting up this perspective, he continues, "I have often felt during the past year that we are going through a very similar phase in relation to time-sharing. This is a development carrying with it many far-reaching implications concerning the relationship of computers to individual users and to communities and one that has stirred many people's imaginations. It is now several years since the pioneering systems were demonstrated. Once again, it is taking longer than people expected to pass from experimental systems to highly developed ones that fully exploit the technology that we have available. The result is a period of uncertainty and questioning that closely resembles the earlier period to which I referred. When it is all over, it will not take us long to forget the trials and tribulations that we are now going through." So this was 1967; time-sharing was on everybody's lips. Time-sharing was the idea that you would have a terminal hooked up to the computer and that you could use a little slice of the computer. Instead of, say, taking over the whole computer yourself, you could have a little time slice. You might have a little program to run, but it could run in parallel, kind of, to another program running, someone else logged in on their terminal. The idea that you don't have to batch up everything and do one program at a time. There were enough resources that you could just have it running all the time and have people logged in. This was, of course... today we think of this as a weird transition period, I would say. We all have our own computers, like multiple computers. I got my phone here. Even this mouse probably has a computer in it. There's the computer on my desk that's recording this. My headphones probably have computers in them, so there's been a huge shift. But at the time, a computer was this giant thing that was expensive. It took up the whole room, not just expensive to buy, but to maintain, and you needed technicians to work on it. There were enough resources in the computer that it started to make sense to kind of, in one sense, you could think of it as wasting them, switching between different tasks really quickly and having these terminals connected to it.
Instead of what happened before, which was you brought your punch cards in and you ran your program, you basically got the whole hardware to yourself, and then you would get a printout of the results of your program. So there's this theme of, like, wasting memory and wasting the processor, doing silly stuff like switching between processes or storing the program in the memory. And it's sort of a way that we've advanced the field, is to find new ways to use the computer inefficiently in a way that is maybe more convenient for the programmer, or the user. Alright, next paragraph. In ultrasonic memories, it was customary to store 32 words end to end in the same delay line. The pulse rate was fairly high, but people were much worried about the time spent in waiting for the right word to come around. Most delay line computers were therefore designed so that with the exercise of cunning, the programmer could place his instructions, his or her instructions, and numbers in the memory in such a way that the waiting time was minimized. Turing himself was a pioneer in this type of logical design. Similar methods were later applied to computers which used a magnetic drum as their memory, and altogether the subject of optimum coding, as it was called, was a flourishing one. I felt that this kind of human ingenuity was misplaced as a long-term investment, since sooner or later we would have a truly random access memory. We therefore did not have anything to do with optimum coding in Cambridge. I didn't know what ultrasonic memories were, so let me explain. Back in the day, people were using stuff that was available to them to build the computers out of, and one of the things that was available was this analog device called a delay line. What it was was a way of delaying an audio signal. You could put the signal into the device, into the wire, and it would amplify it and play it back, but over a long wire. This would delay it, and they would construct it as a memory, so that it would loop around. Every time it would loop around, it would play back, and you could read off the values in the memory. That's what he's talking about with the pulse rate. It was delayed, and you could store 32 words in there, and it would just cycle around, cycle around, and you could read it off. Of course, this isn't random access, because you have to play it back and decode it into your computer every time. If you wanted, say, the 15th word out of the 32 words, you would have to wait for number 15 to come back around, and then if you wanted the 16th one and you missed it the first time, you've got to wait for it to play again and get the 16th one. It wasn't random access, it was linear access, and so people were getting good at rearranging the memory and rearranging the order of their program, so that it would minimize the wait time, minimize that linear search through those 32 words of memory. I could imagine that would get pretty fun, that you could have orders of magnitude improvement in your speed by rearranging your algorithm, making it more clever. It's kind of like a puzzle, and you could have friendly competitions with other programmers to see who could make their thing the fastest. We still do stuff like that today, it's like a little puzzle, and it's fun.
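Just to get a feel for why placement mattered so much, here's a toy model of a 32-word delay line. This is my own sketch in Python, not anything from the lecture: only the word currently coming out of the line can be read, so the cost of a fetch is however long you wait for it to come around.

```python
class DelayLine:
    """Toy model of a 32-word delay-line memory: the words circulate in a fixed
    order and only the word at the read head is available on any given tick."""

    def __init__(self, words):
        self.words = list(words)  # 32 words stored end to end
        self.position = 0         # index of the word about to emerge

    def read(self, address):
        """Return (value, ticks_waited) for the word at `address`."""
        waited = (address - self.position) % len(self.words)
        self.position = (address + 1) % len(self.words)  # the line keeps circulating
        return self.words[address], waited

line = DelayLine(range(32))
_, w1 = line.read(15)  # 15 ticks: we had to wait for word 15 to come around
_, w2 = line.read(16)  # 0 ticks: word 16 was the very next one to emerge
_, w3 = line.read(16)  # 31 ticks: we just missed it, so we wait a whole revolution
```

Optimum coding was essentially arranging your instructions and data so that the next thing you need is always about to emerge, like the second read above rather than the third.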
He says that Alan Turing was really good at that, but he said that they didn't want anything to do with that, because they knew that random access memories would eventually get there, and so for the long-term view, this kind of optimization was a waste of human potential. Spending all your time optimizing code, but eventually we'll just have random access memory. It just shows the kind of far-reaching view that he had. We can all get very myopic, like, "I just need this program to be fast, what do I do?" And you forget that maybe this software is going to live for 10 years, and if you spent the time instead developing a better algorithm or a better data structure or something, it's going to pay off in those 10 years. Have a higher-level view of the problem. And also, I want to say that a lot of the developments in software and programming have come from this idea that the current technology is going to get better, it's going to get faster, and let's not waste our time now, might as well just wait 18 months. That was the old story, like, if your software is too slow, just wait 18 months, and the computers will have doubled in speed, and it won't be slow anymore. And Alan Kay talks about this when he talks about how they constructed the vision for the personal computer. They just looked out 30 years, and they said, "What has to happen in 10 years for us to get to that 30-year point? And can we buy that computer from 10 years out with, like, you know, a thousand times more money? We'll just spend the money and buy our way into the future, because the technology was available, it's just really expensive. So can we buy that computer? It'd be really big and really expensive, but eventually it'll be this big, it'll be hand-held." And we can start working on it today, and waste the memory, and waste the processing on stuff that will be useful to the user. A source of strength in the early days was that groups in various parts of the world were prepared to construct experimental computers without necessarily intending them to be the prototype for serial production. As a result, there became available a body of knowledge about what would work and what would not work, about what it was profitable to do and what it was not profitable to do. It is, I think, important that we should have similar diversity today when we are learning how to construct large, multiple-access, multi-programmed, multi-processor computer systems. I hope that money will be available to finance the construction of large systems intended for research only. This makes me think of a few things. He is recommending more basic research, research that doesn't have a practical intent, so it doesn't have to result in a product that's going to be launched. It's simply experiments to see what's possible. When you do that, there's going to be failures. It's just the nature of research. It's a lot of trial and error and a lot of guessing and what have you. He thinks that we should continue doing that. We can't just rest on our laurels. He didn't rest on his laurels, so he's a good example of how to do that. He could say, "Hey, I invented programming. I'm just going to sit back and ride this wave," but he didn't. He pushed forward. Much of the early engineering development of digital computers was done in universities. A few years ago, the view was commonly expressed that universities had played their part in computer design and that the matter could now safely be left to industry.
I am glad that some have remained active in the field. Apart from the obvious functions of universities in spreading knowledge and keeping in the public domain material that might otherwise be hidden, universities can make a special contribution by reason of their freedom from commercial considerations, in particular freedom from the need to follow the fashion. This reminds me a lot of the current state we're in. Now, I'm not a computer researcher. I don't know what's really going on, but it's clear to see in the industry that Intel has a huge monopoly, and they do a lot of brilliant research, and their chips continue to get better and better, and it's very expensive. They're paying for all this research. It also has me wondering what is not being explored, what kinds of architectures, what kinds of new innovations are out there that a monopoly is really not pursuing. Perhaps it's a de facto monopoly. They're not really obligated to share their research, and when one company is doing so much of it, there's often not enough diversity of ideas, especially since there's so much competition from Intel that something that might have merit wouldn't be able to see commercial success because it doesn't run Windows, basically. And I just wonder what is being missed. I might be wrong. Like I said, I'm not a computer researcher. I don't have my finger on the pulse of different architectures and things, but I just wonder how much research into basic computing architectures is being done. Gradually, controversies about the design of computers themselves died down and we all began to argue about the merits of sophisticated programming techniques; the battle for automatic programming, or as we should now say, for the use of higher level programming languages, had begun. That idea of automatic programming: it used to be considered a field of artificial intelligence to compile a program, to parse it, to understand it in a basic way and then turn it into machine code. It was such a hard problem. We hadn't come up with the basic ideas of parsers and compilers and optimization passes and all that, so it was actually really hard work, and it was considered so high level compared to the machine code that it was automatic, like, oh, you just type some, like, if statements and some high level English-looking words and it becomes software somehow. Of course, looking back, nowadays we think of programming as not automatic; it's actually very hard, and no matter how high level it feels like you go, it still feels low after a while, still feels like you want more. John Carr distinguished two groups of programmers. The first comprised the primitives, who believed that all instructions should be written in octal, hexadecimal, or some similar form, and who had no time for what they called fancy schemes, while the second comprised the space cadets, who saw themselves as the pioneers of a new age. Okay, so I'm bringing this up because he's setting up these two ideas: that there's primitives who like to be at the low level machine code, optimizing stuff for that hardware, and the high level space cadets who are constantly pushing the edge of programming and coming up with, basically, new fancy schemes, stuff like what we've got nowadays with garbage collection, a fancy scheme to have the computer automatically collect memory for you. The serious arguments advanced against automatic programming had to do with efficiency.
Not only was the running time of a compiled program longer than that of a hand coded program, but what was then more serious, it needed more memory. In other words, one needed a bigger computer to do the same work. We all know that these arguments, although valid, have not proved decisive and that people have found that it has paid them to make use of automatic programming. In fact, the spectacular expansion of the computing field during the last few years would otherwise have been impossible. We have now a very similar debate raging about time sharing and the arguments being raised against it are very similar to those raised earlier against automatic programming. The arguments are about efficiency, except everything is getting cheaper and faster and bigger. And our problems are getting more complicated. We have more people using them, we need some way of making it faster and easier to build these complex systems. So the primitives, in what he's presenting, well, their arguments are totally valid: it's inefficient, it's slow, it uses too much memory. But in the long run, we couldn't have done anything without the space cadets. So there's this tension always between them. Now this got me thinking, what debates are we having today where efficiency is the main argument? Because then maybe we can see this dichotomy today, and we would basically know who's going to win in the long run, because it's always the space cadets. I mean, that's the model that he's presenting. And I think for a lot of cases, it's right. I mean, if you look at like garbage collection, if you look at JIT compilers, you know, all this stuff that we take for granted today, it all was probably considered very wasteful at the time when it came out, inefficient, but we couldn't have built so many things without it. So I think about what are we talking about today? And you know, what popped into my mind was Rust. People talk about it being more efficient. And I don't know if this really applies to Rust, so, like, hear me out. Rust was designed, the story that they tell is it was designed to write the new browser, the new Firefox browser, the layout engine and everything. It was basically really hard to do it in C++, which is what it's currently written in. And they wanted a new language that would make it safer and faster and just better. Better for the programmers, better for the users. And so Rust is the answer. It is a statically typed language that uses the type system to manage the memory. Okay, so at static time, at compile time, you can know what scopes have access to every piece of memory. You can know when no scopes have access to it anymore, so it can be collected. You can know, hey, these two scopes can both access it at the same time, and that's bad because then they could overwrite each other's work. So that's bad. Like you can analyze all of this. And that is very cool technology. And so they're making this efficient. Well, compared to C++, Rust seems like a space cadet language, right? All these C++ people are like, no, we've been, we figured out these techniques. We know how to do it. Good enough. This is an established language. We've got all this other code written in it already. And then Rust comes along and they say, no, we've got something better, something new, and it takes longer to compile and there's a lot to learn, but it'll help you build better software. So they're like the space cadets.
But then if you look at it compared to something like Java, which was also the space cadet compared to C++... you know, Java had garbage collection. It has a very high level view of object references instead of pointers. And it has a JIT, you know, it has a bytecode representation, all these things. All these things make it less efficient, but also prevent errors, right? Because it's, you know, quote, managed, it doesn't segfault, or at least it's not designed to segfault, whereas a C++ program can segfault all day. And so Java is a space cadet language. But then when you compare Java to Rust, which one is the space cadet? Because Rust is obviously more efficient than Java, at least has the potential to be more efficient. It's much lower level. And I think we've got, you know, maybe a breakdown in the model here. And what I was thinking is that perhaps we have, because there's so many programmers these days compared to back in 1967, perhaps we've fragmented enough that there can be different targets, different use cases for your language, and that you can be a space cadet in one area and a primitive in another. So Rust is great for this kind of system level, low level, embedded programming, stuff like that, because things are analyzed statically, but you get the benefits of what was traditionally done at runtime. So in a way it's a primitive, right? But then you've got other stuff like, you know, line of business applications and back end systems; web programming is like a huge industry, which has totally different concerns. And now it looks like machine learning is a thing, is a new industry, and data science is another industry. And so all these different industries, sub-industries, let's call them, have their own concerns. And we can sort of fragment and say each one can push off into different directions and do their own thing and have space cadets that would look kind of incomprehensible to other people. And well, that's interesting that that's happened, you know, that the field is big enough that we've got these sub-industries. So another thing that it makes me think of is this debate that we're having of static versus dynamic typing, a long debate; it started actually before programming, so it's not going to be resolved anytime soon. But does this primitives versus space cadets apply? Because for a long time, the main argument for types was that, so like in C or in Java or C++, the type system was telling you at compile time how you could represent your data. So it could be compiled into native code with no boxing, right, no checks. You could say this is an integer, I know it's an integer. So just use the integer add operation in the machine code and it will work, I promise, right? Of course, if you made a mistake or you, like, found a hole in the type system, it wouldn't work. But that was the idea: you're telling the compiler what type it was, and then the compiler was giving you a little help by sometimes checking if you, like, used the types wrong. Then there's dynamic typing, which was saying, let's waste memory and processing time: we're going to store a little bit of information, maybe a couple of bits, not that much, just to remember what type everything is. And then when you do a plus, it's not going to compile directly down into a single add instruction. It's going to dispatch to some table that looks up the different types and figures out what operation to actually call, and maybe it will have to do some coercion and stuff.
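Here's a little sketch of that tag-and-dispatch idea. This is my own illustration in Python, not how any particular runtime actually implements it: every value carries a type tag, and a plus goes through a lookup table instead of compiling straight down to one add instruction.

```python
# Every runtime value carries a type tag alongside its bits.
def tag(type_name, value):
    return (type_name, value)

# A dispatch table: what '+' means depends on the pair of tags.
ADD_TABLE = {
    ("int", "int"):     lambda a, b: tag("int", a + b),
    ("int", "float"):   lambda a, b: tag("float", float(a) + b),   # coercion
    ("float", "int"):   lambda a, b: tag("float", a + float(b)),
    ("float", "float"): lambda a, b: tag("float", a + b),
    ("str", "str"):     lambda a, b: tag("str", a + b),
}

def plus(x, y):
    """Dynamically typed '+': inspect both tags, look up the operation, run it."""
    op = ADD_TABLE.get((x[0], y[0]))
    if op is None:
        raise TypeError(f"cannot add {x[0]} and {y[0]}")
    return op(x[1], y[1])

print(plus(tag("int", 2), tag("float", 1.5)))   # ('float', 3.5)
```

All that tag-checking and table lookup is exactly the kind of "waste" the primitives were objecting to.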
And that was considered the space cadet approach for the longest time. And it still is, in a lot of places, kind of controversial, right? Because you want the control, you want to know: this is an integer, this is not an integer. But then you have something like Haskell where they kind of reverse it and say, well, let's make the type system first. I mean, that's kind of a caricature of what they did. But it's not this "we're going to compile it to an integer operation, so we need that type, and then maybe we'll write into the compiler a little bit of help just to prevent errors." The Haskell type system is saying, like, we can do a lot of math and analysis and cool stuff in the types, like abstract algebra, category theory; we can do all that stuff in the types. And then somehow that will trickle down into cool stuff that we can do, you know, on the value level with practical stuff. And so in a way, that is space cadet-y, right? Haskell people, I was part of the Haskell community, I programmed in Haskell professionally, people considered themselves like the future. This is it. This is where it's going. All these types doing stuff in this cool abstract language that we can do inference in. And well, I just wonder, because does this model not work anymore? Or is it a fracturing where there's still subgroups? Is it all going to converge, maybe, the dynamic and the static? Oh, I don't know, it's cool. Alright, I've talked enough about my own ideas. I'm going to go back to this paper. Incidentally, I fear that in that automatic programming debate, Turing would have been definitely on the side of the primitives. The programming system that he devised for the pioneering computer at Manchester University was bizarre in the extreme. He had a very nimble brain himself and saw no need to make concessions to those less well endowed. I remember that he had decided that the proper way to write binary numbers was backwards, with the least significant digit on the left. I well remember that once, during a lecture, when he was multiplying some decimal numbers together on the blackboard to illustrate a point about checking a program, we were all unable to follow his working until we realized that he had written the numbers backwards. This is in decimal, right? Imagine how confusing that would be. I do not think that he was being funny or trying to score off us. It was simply that he could not appreciate that a trivial matter of that kind could affect anybody's understanding one way or the other. Hmm, just a little interesting anecdote that Turing was a primitive and probably didn't appreciate the need for higher level programming languages. I believe that in 20 years, people will look back on the period in which we are now living as one in which the principles underlying the design of programming languages were just beginning to be understood. I am sorry when I hear well-meaning people suggest that the time has come to standardize on one or two languages. We need temporary standards, it is true, to guide us on our way, but we must not expect to reach stability for some time yet. Can you imagine in 1967, if they had just stopped developing new programming languages? Like, oh, we're done. Programming is done. Let's just work on, you know, more interesting problems. Wow, I'm sure glad that they didn't do that.
People have now begun to realize that not all problems are linguistic in character and that it is high time that we paid more attention to the way in which data are stored in the computer, that is, to data structures. In his Turing lecture given last year, Alan Perlis drew attention to this subject. Parenthetically, the last episode last week was this very lecture that he's just mentioning. Alan Perlis talked about data structure as the primary concern, like, how do we store data as opposed to, say, like, how do we write if blocks, you know? At the present time, choosing a programming language is equivalent to choosing a data structure. And if that data structure does not fit the data you want to manipulate, then it is too bad. It would, in a sense, be more logical first to choose a data structure appropriate to the problem and then look around for, or construct with a kit of tools provided, a language suitable for manipulating that data structure. Now, this one, I have a lot of trouble saying with certainty what he is talking about. Um, I think one of the main reasons is I have never programmed with a language from this period, never programmed in ALGOL, never programmed in, uh, COBOL. I don't know why this idea of data structure is, you know, even worth mentioning, I guess is the way to put it. Like, why is this a problem? And it makes me think about one quote from Alan Kay, uh, in the Smalltalk paper I read, The Early History of Smalltalk. One of his goals in developing Smalltalk was to get rid of data structures. Now, what does that mean? Like back then it must have meant something different from what we think of today. So let me try to explain. I hope I get it right, I can only hope, because, um, I wasn't alive back then and it's hard to get context in how these terms were used. But this is how I understand it. Imagine a data structure like a, um, a binary tree, very simple data structure, and you're storing your data in memory as a binary tree. Your program would have to know how that tree was laid out in memory. It's going to have nodes, and some of the fields, some of the, like, bytes in that node, which is probably somewhere like a little array in memory, some of those bytes are pointing to other nodes. And some of them are going to store some data. And so your program has to know how to walk those pointers and search for the data that it's looking for, like do a depth first walk, do a breadth first walk. You know, maybe the tree has to be maintained in balance so it doesn't get too lopsided, so that it stays, um, you know, it gives you good logarithmic, um, access to it instead of linear access. So there's all these concerns about the data structure that your program has to be concerned with, but at the same time you're trying to get useful work done. Your program has to do something with that data that it's finding in all those nodes. And so interleaved with the code that walks the data and the code that balances the tree and follows the pointers and all that is going to be some of the, let's call it the business logic, the business of your program. It's going to be like smeared all over in the code. So there's going to be a mix of pointer following and business logic. And this is what Alan Kay was trying to fix. Why are we following pointers and doing this? And then why is our code all mixed together? And so he developed, along with his team, the ideas in Smalltalk, where you have an interface and you pass an intention as a message to an object.
And that object could be a binary tree. And so this binary tree, it maintains the behavior, which would be how do you walk the pointers and how do you, you know, find data inside of it? How do you add a new thing and stuff like that? And that could be delegated to this other system. And then the code using that data structure doesn't have to worry about it at all. You can just call these few, you know, handful, just a handful of methods on it that define the intention. Oh, I intend to add this to the data structure. I want to remove this. Oh, I want to find something that looks like this. And you can easily define your business logic in terms of those methods and not mix the two. Not mix the one that walks the pointers with the thing that decides whether someone needs to get a raise, you know, as an example. Okay, so that was Smalltalk, and Smalltalk had a huge influence on the industry. And when I was in school and university, we were taught in Java, and Java shares a lot of ideas with Smalltalk, this idea of an interface and separating out the, you know, concerns, this modularity. And I think that the industry has gone through such a major shift, such a paradigm shift, that we read this and, like, it doesn't make any sense to us. Why would choosing a programming language define the data structure that you're going to use? Like, that shouldn't be the case. Even in, you know, in a language like Java, you can choose any data structure. There's, you know, GitHub repos full of different data structures if you want to choose them. But then mostly we don't do that. Mostly we just use some generic data structure. We just use a hash table because it's efficient enough. So we're not even concerned that much with choosing the exact correct data structure. In some cases, yes, it still matters, right? Like you don't want a linear data structure when you need to do repeated access, because then you get quadratic behavior; you know, you want to use constant time access stuff, but a hash table is constant time access, and it makes it really easy. And you know, if you needed sorted data, you'd use something else that has, you know, fast sorted access. But you know, it's just a handful of things, and it seems kind of like a solved problem. And it doesn't seem to really matter what language you use. So I think this is one of the things where maybe he was right, he was prescient, and the stuff he's saying just seems like it doesn't even make any sense anymore, because things have changed in the way that he hoped, right? So what he was hoping is that you choose a data structure and then you can construct a language on top of it, and he'll go more into it in the next excerpt. We would have in fact two languages, one inside the other, an outer language that is concerned with the flow of control and an inner language which operates on the data. There might be a case for having a standard outer language, or a small number to choose from, and a number of inner languages which could be, as it were, plugged in. So the idea would be you could have some modularity in your languages. One describes data and how to access the data, and the other is like control flow: if statements, loops, you know, all that kind of stuff. And they could somehow interface, so that you wouldn't need so many control flow languages. Like I said, I think we've kind of passed this point, like we've figured it out.
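To make the binary tree example from a minute ago concrete, here's a rough sketch, my own, in Python, not Kay's code or anything from the lecture, of what it looks like when the pointer-walking lives behind a handful of intention-revealing methods. Those few methods are, in a sense, the "inner language" for that data structure, and the business logic only speaks in terms of them.

```python
class _Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

class BinarySearchTree:
    """The data structure owns the pointer-walking; callers never see a node."""

    def __init__(self):
        self._root = None

    def insert(self, key):
        """Intention: add this key to the structure."""
        if self._root is None:
            self._root = _Node(key)
            return
        node = self._root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = _Node(key)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = _Node(key)
                    return
                node = node.right

    def contains(self, key):
        """Intention: is this key in the structure?"""
        node = self._root
        while node is not None:
            if key == node.key:
                return True
            node = node.left if key < node.key else node.right
        return False

# The business logic never touches a pointer, only the methods.
due_for_raise = BinarySearchTree()
due_for_raise.insert("ada")
if due_for_raise.contains("ada"):
    print("give ada a raise")
```

Swap the tree out for a hash table and the business logic doesn't change, which is roughly where the industry ended up.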
I want to say, like, even if you look at what people were saying about Perl, like, I learned some Perl back in the 90s. And one thing that they would say is, like, okay, it's a scripting language, but you're calling C libraries that do the dirty work. Okay. Already you have this, like, two language system. You have a language like Perl where you get nice for loops and if statements and control flow. And then the dirty data processing work is done in C, where you can have low level control, a lot of efficiency, and the for loop is, you know, it's interpreted, let's say, but it's interpreted into some really tight loop. And then the inside of the loop is just C code. So that's as tight as it could be. And so we kind of have this, I mean, we think about this in Clojure as well, where if something written in Clojure is not efficient enough, because it does a lot of low level processing or something, you can write the bulk of the algorithm in Java, where you do get that lower level control, and then just call it from Clojure. So all your control flow is done in this higher level language. So I think that this might have come to pass, you know, and we don't even appreciate that this was a weird idea back in the day. Let me continue. The fundamental importance of data structures may be illustrated by considering the problem of designing a single language that would be the preferred language either for a purely arithmetic job or for a job in symbol manipulation. Attempts to produce such a language have been disappointing. The difficulty is that the data structures required for efficient implementation in the two cases are entirely different. Perhaps we should recognize this difficulty as a fundamental one and abandon the quest for an omnibus language which will be all things to all people. Okay, this is just kind of reiterating that, you know, maybe we shouldn't come up with one unifying language. I even heard someone on a podcast saying that they thought the secret sauce that they had found was Rust for the low level programming and then Erlang or Elixir for the high level programming. So you somehow use a foreign function invocation: you compile your stuff in Rust, and then from your Elixir code, you get to call that, and it does all the, you know, it's really efficient. But then your control and your, you know, the high level stuff that doesn't need to be so efficient can be done in a language like Elixir, where you get the parallelism and the nice error handling, you know, the self-healing stuff that you get from that platform. It's interesting. Maybe we should do that more, or maybe we do it so much we don't even notice. There is one development in the software area which is perhaps not receiving the notice that it deserves. This is the increasing mobility of language systems from one computer to another. There is reason to hope that the newfound mobility will extend itself to operating systems. The increasing use of internal filing systems in which information can be held within the system in alphanumeric, and hence in essentially machine independent, form will accentuate the trend. Information so held can be transformed by algorithm to any other form in which it may be required. We must get used to regarding the machine independent form as the basic one. I believe that in the large systems of the future, the processors will not necessarily be all out of the same stable.
Okay, so this is another thing that I think he was prescient on: that we're going to have these heterogeneous networks, computers with different kinds of processors and different operating systems, and we're going to need to store our data in a way that can be read by any of them. Now in 1967, this probably looked like a daunting task. People would store the data in the way that they just needed for their one problem that they were solving. But nowadays, we appreciate that we have different file formats that are cross platform. And they might not be the most efficient for any one particular platform, but we have standards and we just use them. So we have PNGs and JPEGs and MP4s and, you know, even something like a Word doc. It's all cross platform. It's fun to think about a time before we had that. This was before people were emailing files to each other from their systems, you know. Okay, now he's talking about robots. We're getting to the end here. I believe that computer controlled mechanical handling devices have a great future in factories and elsewhere. The production of engineering components has been automated to a remarkable extent. By contrast, much less progress has been made in automating the assembly of components to form complete articles. Okay, so at this point in 1967, they had computerized a lot of the production of components, okay, using digital tools to get more precision and stuff. But the assembly was still done by hand. The artificial intelligence approach may not be altogether the right one to make to the problem of designing automatic assembly devices. When engineers have tried to draw inspiration from a study of the way animals work, they have usually been misled. The history of early attempts to construct flying machines with flapping wings illustrates this very clearly. I believe that these handling machines will resemble machine tools rather than fingers and thumbs, although they will be lighter in construction and will rely heavily on feedback from sensing elements of various kinds. Okay, I bring this up because, you know, we have made a lot of progress in assembly, robotic assembly, and so I thought, I'm not an expert in it at all, but looking back, we can kind of see what he was talking about and whether he was right. So the first thing is he said, "The artificial intelligence approach may not be altogether the right one to make." So what does he mean by the artificial intelligence approach? Because, I mean, when I think of the term artificial intelligence, like, robotics is up there; not just, like, humanoid robotics, but any kind of robotics was done in the artificial intelligence field. The control, the planning, the situational awareness, all of that, interpreting sensor data, all of that belongs to the field of artificial intelligence. And he must have known that. But I think what he was talking about was we won't be constructing humanoid robots with humanoid intelligence. We're not going to make an artificial general intelligence that we will then teach how to assemble these parts. We are going to do it in a much more direct way, and he says, "We shouldn't look at the way animals work because that can mislead you." Now I think he was wrong. I think that the way animals work informs a lot of robotics, and different kinds of joints and things are often inspired by animals.
Sometimes they don't work out, but very often it is looking into how... well, we can't figure out how humans do it, but maybe we can figure out how a roach does it, because they have a much simpler nervous system, and they're successful, or like, "How about a frog? How does a frog do it?" This is a common approach in artificial intelligence, and I think he must not mean that, or he's totally wrong. But just to be generous, I think that he was talking about this kind of artificial general intelligence, as we call it these days, like a human-level intelligence, to construct this stuff. Because he says that he thinks that they'll resemble machine tools rather than fingers and thumbs. So don't just construct a human hand, and have it controlled like a human hand is controlled, because you could just make something that's much more efficient for the job at hand. Okay. Just wanted to bring that up. I thought that was a cute little thing that he brought up. In general, I think he's probably right, that they do resemble machine tools with sensors more than this intelligent thing. But it's also the curse of the artificial intelligence field that every breakthrough it makes just becomes like normal programming practice, and so what was once like a hard problem that only the artificial intelligence people were interested in then just becomes how everybody does it, and you always look at artificial intelligence as this thing that never achieves anything. That's the history of the artificial intelligence field. You know, spam filtering comes from the artificial intelligence field. You know, it's things that we take for granted these days, and people still say, "Oh, when are we going to get AI?" Like, it's everywhere. All this stuff we do is AI. Garbage collection came from artificial intelligence, anyway. Okay, the next excerpt, this is the last page: "I suppose that we are all asking ourselves whether the computer, as we know it, is here to stay or whether there will be radical innovations." This is in 1967. I am still wondering whether we're going to see some radical innovations in my time. Acceptance of the idea that a processor does one thing at a time, at any rate as the programmer sees it, made programming conceptually very simple, and paved the way for the layer upon layer of sophistication that we have seen develop. Having watched people try to program early computers, in which multiplications and other operations went on in parallel, I believe that the importance of this principle can hardly be exaggerated. From the hardware point of view, the same principle led to the development of systems in which a high factor of hardware utilization could be maintained over a very wide range of problems. In other words, to the development of computers that are truly general purpose. The ENIAC, by contrast, contained a great deal of hardware, some of it for computing, and some of it for programming, and yet on the average problem, only a fraction of this hardware was in use at any given time. There's quite a lot in this paragraph. He's referring to his micro-coding, micro-programming, where there's that layer of indirection between what the programmer enters and what the computer actually executes.
Remember, you could, say, multiply these two numbers, that's your machine instruction, but it gets turned into bit shifts and adds, so you just have these adder units all over the chip, and it gets sent to some of them, and they could be happening in parallel: two adds and a bit shift could all happen at the same time, and it gets recombined. Like, there's a ton of stuff that's happening under the hood. He's saying that it was important for the conception, for the mental capacity of programmers, to make it so that it looks like one thing is happening at a time, one instruction after another, and that's what's let us build up layer upon layer. He talks about this idea of utilization. Utilization is... I don't know enough about computer hardware to say this definitively, but I have studied utilization in other systems, mostly in factory production, and in other work, assembly line kind of work, specifically how software development is done, and this idea of utilization as a metric of how efficient your system is. The truth is, in 1967, the way that people measured how efficient their factory was: they basically broke down the factory into units, okay, this machine and this machine and this machine, how much of the time is this machine being used, and then you can just add it up or average it somehow, aggregate all those numbers, and that's how efficient the factory is. I think it was called cost accounting, and the idea was to try to see how well you're doing and how do you improve these metrics, right? And so you could improve a metric: you can notice a machine wasn't being used enough, let's say a milling machine, and you can have people mill more stuff, somehow rearrange things so that that thing gets used more. And that's sort of what he's saying, like you have all these transistors on the chip, like we should be using them, they're just staying idle, like shouldn't we give them work to do while stuff is going on. The thing, though, is that there was this big revolution in production where they stopped thinking about cost accounting, and they started thinking about throughput accounting. Like the real metric is how many cars you're producing per minute, how many cars actually leave the factory in working order, ready to be sold, every minute, or every hour, whatever your time unit is, not how much grinding is this particular grinder doing, right? It seems kind of weird when you think about it, like you're trying to maximize the use of every single piece of machinery; no, you just want cars coming out the end, that's what you should be maximizing. And so I don't know, I wonder about this statement of his, and how it relates to computer programming. I mean, obviously utilization, you know, even if it's not the main metric, it's just kind of a proxy metric to see if there are parts of your machine that maybe you don't even need to have, because, oh, this piece of the hardware, this corner of the chip, is only used 10% of the time; maybe we can make the chip cheaper by not having that part and reproducing its work in some other part of the chip, like, you know, funneling work to some other part that we use a lot. And so now, you know, we go from 70% utilization of that part to 80%, and, you know, we're all good, we've eliminated a cost in our chip. You know, that kind of thing might be useful, but the idea of the goal being to use all the transistors as much as possible, I think, is just flawed.
Now, compared to the ENIAC, which was very special purpose, where they had, like, oh, if you want to do a lot of additions in a row, use this part of the machine, if you need to do multiplications, use this part of the machine; I think that, you know, comparing those two, the modern processors of his time used much more of the machine all the time versus the special purpose ones. So he's talking about general purpose versus special purpose. I think that that was a big success, right? And that's probably what he's referring to; you know, I'm just riffing on what he's saying. But he thinks that the fact that we can have a few repeated low level elements, adders and bit shifters and, you know, very small things that get microcoded... So when you do a multiply, there's not a multiplier chip or a multiplier, you know, circuit; it gets changed into shifts and adds. That lets you make it more general purpose, because now you just need this smaller number of reusable components. That is a big, a big revolutionary shift and very important. Okay, I'll continue. Revolutionary advances, if they come, must come by the exploitation of the high degree of parallelism that the use of integrated circuits will make possible. The problem is to secure a satisfactorily high factor of hardware utilization, since without this, parallelism will not give us greater power. Highly parallel systems tend to be efficient only on the problems that the designer had in his mind, his or her mind. On other problems, the hardware utilization factor tends to fall to such an extent that conventional computers are in the long run more efficient. I think that it is inevitable that in highly parallel systems, we shall have to accept a greater degree of specialization towards particular problem areas than we are used to now. Okay, so it's very true that if you try to parallelize a problem, there is a law, I can't remember the name of the law, but you have a problem and you apply two computers to the problem at the same time. You don't get twice the speed. You get some other factor, which is less than two, and then you add a third one and a fourth one, and there's just diminishing returns and it starts to flatten out, because there's overhead in the coordination between the machines. You can't simply turn regular code, general purpose code, into parallel code and get increasing returns of speed. That factor is different for every problem and every coding system. But if you specialize the hardware... so I think of something like a GPU, which is a specialized multi-core processor that has thousands of cores, much more than your regular computer does, your CPU, and it is made to process an array of data at a time. So each processor is doing the same operation on that array, just over and over, right? So you're getting thousands of things working on the same algorithm over and over in lock step, and that turns out to be a very good way of getting high utilization, high throughput as well. And a lot of, you know, graphics are like that, because you're painting billions of triangles on the screen, and each one has the same math that you do, so it can do it very quickly and in parallel. And because there's no coordination, because each triangle is mostly independent from the others, the math is independent, you can do it without coordinating, without stopping and waiting for the other processor.
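The law he's reaching for here is, I believe, Amdahl's law: if only a fraction p of the work can be spread across processors, then n processors give you a speedup of 1 / ((1 - p) + p / n), which flattens out no matter how many processors you add, even before you count coordination overhead. A quick Python illustration, with numbers I made up just to show the shape of the curve:

```python
def amdahl_speedup(p, n):
    """Amdahl's law: ideal speedup on n processors when a fraction p of the
    work is parallelizable (coordination overhead ignored entirely)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelizable, returns diminish quickly:
for n in (2, 4, 16, 256):
    print(n, round(amdahl_speedup(0.95, n), 2))
# 2 -> 1.9, 4 -> 3.48, 16 -> 9.14, 256 -> 18.62 ... it can never exceed 20
```

Which is part of why the lock-step, no-coordination style of a GPU works so well: the specialized hardware keeps the serial, coordinating fraction tiny for the problems it was designed for.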
So I think that this is what he's talking about, that we have to specialize; so we specialize into these, like, GPUs that do parallel array processing, because otherwise we wouldn't be able to get the factor that we want from the parallelization. One area in which I feel that we must pin our hopes on a high degree of parallelism is that of pattern recognition in two dimensions. I would not exclude the possibility that there may be some big conceptual breakthrough in pattern recognition, which will revolutionize the whole subject of computing. Okay, this is, you know, I don't think we've had this breakthrough, like the clear breakthrough that he's talking about, but if I was going to point at a breakthrough that maybe we have had, it's convolutional neural networks. So, okay, let's talk about the difference between a linear search in one dimension and a quadratic search in two dimensions. In one dimension, let's say, a simple case: you're listening on a wire for bytes and you're looking for packets, right? So the packet starts with a certain sequence of bytes and then you can start parsing that packet up into its components. Well, it's actually quite easy to just look for the sequence of bytes, right? It's like a little state machine: yeah, I read the first one; oh no, the second one was wrong, so I'll start over; okay, I read the first one, here's the second one, here's the third one, okay, I found a packet, right? It's very simple. Now to do that in two dimensions is much harder. So imagine you had to do that in every row. You had to look for it. But then also it could be in every column. So you have to look down every column to see if you see that sequence of bytes. And I mean, you could even say, well, what if it's diagonal? You know, we're talking like word search, like it's much harder to find the thing. And what convolutional neural networks do is look for the same pattern, but in any location, right? So it's kind of pattern matching all along, like in every rotation and every location, looking for things like lines, you know, and shapes, you know, very basic stuff, and then building it up. And, you know, every layer of the neural network is kind of looking for something different from the data in the layer before. Of course, it's not that efficient. It still takes a lot of multiplies and adds. But going back to what he was saying, there is now specialized hardware that is made expressly to train and run these neural networks. And, because it's specialized, it can have this high level of utilization, this high level of parallelism, going on on these particular kinds of problems. So, you know, this could be interpreted as the kind of breakthrough that would match what he's talking about. Okay, I'm going to finish this up. This is in the summary section. This is actually the intro paragraph that I read right at the beginning of this reading. And I think the summary section often gives the best intro paragraphs because it's like, you know, big ideas and stuff and you get to start off with a bang. But now that I've read all the other stuff, it might put it into more perspective.
And I'll get to comment on it. Surveying the shifts of interest among computer scientists and the ever expanding family of those who depend on computers in their work, one cannot help being struck by the power of the computer to bind together in a genuine community of interest people whose motivations differ widely. It is to this that we owe the vitality and vigor of our association. If ever a change of name is thought necessary, I hope that the words "computing machinery" or some universally recognized synonym will remain. For what keeps us together is not some abstraction, such as the Turing machine or information, but the actual hardware that we work with every day. So he's obviously referring to the Association for Computing Machinery, which hosts the Turing Award, and he's saying that it's the computer, the machine itself, that binds all these different fields of interest. I'm just extrapolating. I presume he's talking about there's the electrical engineers, there's computer engineers, there's mathematicians, there's software people, there's data people, and all of these people are pulled together because of the computer hardware; they would not be together if it were just, say, talking about software, or just talking about algorithms, or just talking about electrical engineering and transistors. Like, if you're just talking about NAND gates, transistors and voltages and stuff, the data people wouldn't be that interested in it, let's say the data scientists. It's the machine itself that is the nexus for all of this. I don't know how relevant that is. Let me tell a personal story. I was an undergraduate in the late 2000s, early... I mean, sorry, late 90s, early 2000s. The ACM did not seem that relevant to me, especially when I learned what ACM stood for. Computing machinery: it sounded like some kind of Baroque steam punk era organization that was more interested in mainframes than in what I was interested in, which was programming software. It seemed like it was from another age, when hardware was much more central. In an age where you can click a couple of buttons and spin up a virtual machine in the cloud, you don't even know where it is, it's in some data center somewhere. In that age, how relevant is the ACM? How relevant is the name, first of all, because he does bring up the name as important. But how relevant is this focus on the machine? I'm not sure, but I will say this: as I've gotten older, I'm actually 39 now, so many years out of college, the ACM has been much more important than I thought at the time I was in college, regardless of the name. They do quite a lot, they host a lot of events, and of course, they're less relevant than they were before, because I think the industry is a lot bigger, and so there are, like I said, all these subgroups that can have their own associations. Like, it's so big that you can have a Ruby conference, and a Python conference, and a Java conference, and a Rust conference, and everybody is in a separate world, and they're not unified by this one association. I can imagine that without this association, there wouldn't have been some of the early conferences that led to people meeting each other and sharing ideas, and sharing their research. Now it's much more fragmented, so it has relatively less relevance, but I still think that it serves a very important function, and all of these papers that it has, all of these old papers that kind of document the history of the field, I think that that is so important these days.
So I am a member of the ACM, and I think you should be too. That's all I want to say about that. Okay, now, if you liked this lecture, if you liked this reading of the lecture, and you like my comments, you can find more: go to lispcast.com/podcast, and you will find all the old episodes of this podcast in audio and video and text transcripts. I want to start text transcripts again; I kind of stopped doing them just as a test, but I think they're very important, so I'm going to start doing them again. You can also subscribe: you can subscribe on YouTube, where I host the video (this is in video form, it's recorded in video), but there's also audio as a podcast in case you want to listen in your car. And you can find links to get in touch with me, and I love to talk, I love discussions and hearing your comments on the episode and your ideas. And you can also find a link to this episode, and there'll be a link to the lecture itself so you can read it. Okay, I'm going to sign off. My name is Eric Normand, this has been my reading of Computers Then and Now, by Maurice Wilkes. Thank you for listening, and as always, rock on.