Generality in Artificial Intelligence

In this episode, I read and comment on excerpts from John McCarthy's 1971 Turing Award Lecture.

Transcript

It was obvious in 1971 and even in 1958 that AI programs suffered from a lack of generality. It is still obvious, and now there are many more details. Hello, my name is Eric Normand. Welcome to my podcast. Today we are reading from the 1971 Turing Award lecture by John McCarthy. It's a little more complicated than that: he won in 1971 and gave a lecture, but it wasn't published in 1972. Usually they're published the next year. I guess he didn't like how the talk turned out, so it was never published, and then he finally did publish something 15 years later. That's quite a long time afterward, so in this piece he's trying to give a feel for what he was trying to say in 1971, plus some updates. It's kind of an oddball; we haven't seen one like that yet, where it was actually written much later.

Okay, we're going to get into it in a moment, but before we do, I want to talk about my new book, Grokking Simplicity. Look, I've even got the t-shirt on, if you're watching the video. Grokking Simplicity is all about functional programming. It is for people who know at least one programming language and have a couple years, maybe three years, of experience working on commercial software and know some of the pitfalls in it. I hope it is a book that starts a conversation, a discussion in the literature, about functional programming in commercial software. There are a lot of academic books on functional programming, but I think we need to start talking about how it applies in the industry. So please go check it out. It's available on manning.com and also amazon.com, and probably other stores, but I'm not aware of where else it is. I'd love to hear what you think about it. Please leave a review if you like it, tell your friends, and thank you for buying it.

Okay, I usually read from the biography. I'll read a couple things. I didn't find the biography to be that insightful, but there are some facts that I like to look at. He was born in 1927 in Boston, so he's about eight years younger than the last Turing laureate, who was born in 1919. Sort of the next generation, right? Like he could have had that person as his PhD advisor or something. He was a mathematician, and I look at him mainly as a logician, trying to find a logic that was appropriate for using on a machine to reason about the world, in the way that we humans can make judgments about how the world must be, or can plan some set of actions to make something happen. For instance, he uses the example of stacking blocks: if you want to get block B on top of block C, first you have to take block A off of it, then move this over here, take the thing out of the box, and put it on top. There are all these actions you have to take in order to make the situation the way you want it, and computers are bad at reasoning about that. So his talk is mostly about that. It's about all these systems of logic trying to solve this problem of generality. It's easy to construct a very small logical system that is consistent, that can solve tiny little problems, but then you start introducing new ideas into it and it quickly explodes. It's not a scalable solution. He worked at the Massachusetts Institute of Technology and at Stanford. He worked with Marvin Minsky. He's the inventor of the term "artificial intelligence," very early in the field.
And he also worked with, did I say Marvin Minsky? He also is the inventor of Lisp. And the biography says something interesting that I didn't know: 16 Turing Awards have been given to people who have been affiliated with the Stanford AI Lab, which is the project that McCarthy started. It talks a lot about all the students that came under him, and that's kind of a big part of his legacy. He was one of the first people to write a computer chess program, very early in artificial intelligence, a very important figure in computer science. Another thing that's not mentioned in there is that he was instrumental in getting the if-then-else statement put into Algol. So thank you for doing that. Before that, they would have used some kind of goto, and he was in favor of something more structured. And of course, now if-then-else is common in basically all programming languages.

Okay, so let's get into the Turing Award lecture itself. Like I said, at the top of the page it says "the 1971 Turing Award lecture," but he actually didn't write this until 1986. I'm inferring, presuming, that he gave the talk in 1971, that he was trying to summarize this idea of generality in artificial intelligence, couldn't really do it, and didn't like the way his lecture turned out. And then, 10 or 15 years later, he decided, oh, I guess I'd better finally write that up, and they allowed him to publish it 15 years later. It tries to summarize the approaches that had been attempted in the past to deal with this problem of generality. The problem of generality was not well understood. They knew that it was a problem, but they did not understand it.

So this is the thing I read at the beginning. I'll just start: "It was obvious in 1971, and even in 1958, that AI programs suffered from a lack of generality. It is still obvious, and now there are many more details." Okay, remember, this was written in 1986, so basically for 30 years they had been trying to solve this problem.

Also, I want to say, I have so much to say on this. I have a master's degree, and I studied artificial intelligence, so I have thought about a lot of these problems, even decades after this was published. And there's also been this resurgence in the last 20, no, not even 20 years, let's say 10 years, that's less classical artificial intelligence and more of a neural net, machine learning approach. So there's just a lot more going on now than what was happening back then. There's a lot to say, and I'm going to try to bring my understanding to the field.

Unfortunately, a lot of the stuff that he's going to talk about, if you've done any reading on it at all, is going to seem naive, at least from our perspective now. But the reason it seems naive is that by doing this investigation, trying to do stuff in a computer that normally only people can do, the field of artificial intelligence brought up a lot of problems with the way we perceived ourselves and how we thought we made sense of the world. And this is an ongoing process. The effect that artificial intelligence has had on cognitive science, on psychology, on neuroscience is profound. Looking back at it, it often seems like, wow, people really had no idea about how we thought.
And it's true, that's what they discovered. They had all these challenges. Like, we think chess is this really hard thing, but it's actually one of the easiest things we do, for a computer to solve. Okay, I hope to interject with lots of little stories and stuff. The problem is I might forget them, because I've lived in it so long, I often forget that it's not common knowledge. Okay, I'm just going to get started. Let's go.

All right: "The first gross symptom of this lack of generality is that a small addition to the idea of a program often involves a complete rewrite beginning with the data structures. Some progress has been made in modularizing data structures, but small modifications of the search strategies are even less likely to be accomplished without rewriting."

Okay, so he's talking about how, when we're dealing with stuff in the world, we're kind of always learning new little facts, little things about what's going on. These are small changes, but in software they often require you to just start over from the data structures, the basic ideas that you've got encoded in your program. So we need to modularize, so that we don't have to keep changing our data structures, at least. But then, of course, once you've got different data structures, you're going to need different search strategies, because a search strategy that works on one data structure is not efficient on another. So you need one that works on the new data structure. It's not an easy problem.

"Another symptom is that no one knows how to make a general database of common sense knowledge that could be used by any program that needed the knowledge." We don't know how to take the facts of common sense knowledge, you know, birds fly, objects fall when you release them, all that little stuff that even five-year-olds know, and put it into a database and make it useful.

"When we take the logic approach to AI, lack of generality shows up in that the axioms we devise to express common sense knowledge are too restricted in their applicability for a general common sense database. In my opinion, getting a language for expressing general common sense knowledge for inclusion in a general database is the key problem of generality in AI."

Okay, so this is 1986, remember, and he's making this strong claim: we need a language to express general common sense knowledge, and we want to put that into a database, and that is the key problem. If we solve that problem, we'll unlock the next wave of problems, the next challenges on the way to generality in AI. Now, what is it, 35 years later, we do have some common sense databases. There's a thing called Cyc, C-Y-C, where they hired a team and were just writing common sense statements into a database, with the idea that when you hit a certain quantity of them, there would be a qualitative change in the kinds of reasoning you can do. It's unclear at this point whether that's actually helpful. There are also attempts at reading basically everything, books, Wikipedia, web pages, and trying to get these facts out automatically, without a human involved in extracting them. And it's just not clear that that's going to lead anywhere.
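Just to make the "database of facts plus an inference engine" picture concrete before we go on, here's a minimal sketch in Python. This is entirely my own toy illustration, not Cyc's actual representation or anything from McCarthy's paper; the facts, the rule format, and the naive forward-chaining loop are all made up for the example:

```python
# A toy "common sense database": ground facts plus simple if-then rules,
# with a naive forward-chaining engine. Purely illustrative -- this is not
# how Cyc or any real system represents knowledge.

facts = {("bird", "tweety"), ("object", "rock"), ("released", "rock")}

# Each rule: (list of premise patterns, conclusion pattern).
# Strings starting with "?" are variables.
rules = [
    ([("bird", "?x")], ("can-fly", "?x")),                      # birds can fly
    ([("object", "?x"), ("released", "?x")], ("falls", "?x")),  # released objects fall
]

def unify(pattern, fact, binding):
    """Match one pattern against one ground fact, extending the binding."""
    if len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

def matches(premises, facts, binding=None):
    """Yield every binding that satisfies all the premises against the facts."""
    binding = binding or {}
    if not premises:
        yield binding
        return
    for fact in facts:
        extended = unify(premises[0], fact, binding)
        if extended is not None:
            yield from matches(premises[1:], facts, extended)

def forward_chain(facts, rules):
    """Apply every rule until no new facts appear."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            for b in list(matches(premises, facts)):
                new = tuple(b.get(term, term) for term in conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

print(sorted(forward_chain(facts, rules)))
# Derives ("can-fly", "tweety") and ("falls", "rock") alongside the originals.
```

Even this toy shows the shape of the problem he's describing: the facts have to be written in some format convenient for matching, and the engine has to search over them.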
Because once you have the facts, you have to represent them. That's what he's talking about here: a language, a format that you can write them down in that's convenient and efficient for search. And then you do this big search with inference and stuff, so the engine that infers over these statements would also need to be devised, right? And it's not clear, at this point in 2021, that that's going to help. Okay, but he's going to go over the problems, and I think it's very useful to understand this problem more deeply.

"Friedberg discussed a completely general way of representing behavior and provided a way of learning to improve it. Namely, the behavior is represented by a computer program, and learning is accomplished by making random modifications to the program and testing the modified program. The Friedberg approach was successful in learning only how to move a single bit from one memory cell to another."

Okay, so just imagine, this was back in 1958, '59. You have this program that can solve some problem, right? And you want it to learn, you want to make it better. So you just make random changes to the program. Imagine this is in machine code: you just change some bytes, then run it and see if it's better. A very naive approach, but you've got to try those, just in case they work.

"It was shown by Simon to be inferior to testing each program thoroughly and completely scrapping any program that wasn't perfect. No one seems to have attempted to follow up the idea of learning by modifying whole programs." So it didn't work. "The defect of the Friedberg approach is that while representing behaviors by programs is entirely general," because we know that software is Turing complete, "modifying behaviors by small modifications to the programs is very special."

Okay, so you're trying to solve, in general, the problem of learning any new behavior just by random mutation and selection. But you want some particular thing. You want it to learn some particular thing, and random and particular don't really go well together. "A small conceptual modification to a behavior is usually not represented by a small modification to the program, especially if machine language programs are used and any one small modification to the text of a program is considered as likely as any other. While Friedberg's problem was learning from experience, all schemes for representing knowledge by programs suffer from similar difficulties when the object is to combine disparate knowledge or to make programs that modify knowledge."

Okay, it didn't work. We needed to try it, but there was, at the time, this notion that, hey, we have this new thing called programs that seems to have at least the potential for solving any problem, right? It requires human ingenuity to craft the solution as a program, to write the program. But so far, any problem we've attempted has either been solved, or we could see how we could do it if we had more resources, a bigger RAM, faster processing, or more time to write the program.
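To give a flavor of how tempting-but-naive this is, here's a toy sketch of the learn-by-random-modification loop. The instruction set, the task, and the scoring are all invented for illustration; this is the spirit of the idea, not a reconstruction of Friedberg's actual 1958 system:

```python
import random

# Friedberg-flavored learning: behavior is a program; "learning" is random
# modification plus testing. A toy illustration only.

OPS = ["load", "store", "noop"]   # load acc from mem[arg]; store acc to mem[arg]
PROGRAM_LEN = 6
MEM_SIZE = 4

def run(program, memory):
    acc, mem = 0, list(memory)
    for op, arg in program:
        if op == "load":
            acc = mem[arg]
        elif op == "store":
            mem[arg] = acc
    return mem

def score(program):
    """The kind of task Friedberg's system managed: move a bit between cells."""
    return 1 if run(program, [1, 0, 0, 0])[3] == 1 else 0

def mutate(program):
    """Random modification: overwrite one instruction with a random one."""
    p = list(program)
    p[random.randrange(len(p))] = (random.choice(OPS), random.randrange(MEM_SIZE))
    return p

program = [("noop", 0)] * PROGRAM_LEN
for _ in range(10_000):
    candidate = mutate(program)
    if score(candidate) >= score(program):   # keep mutants that aren't worse
        program = candidate

print(score(program), program)   # usually stumbles onto a load-0 ... store-3 sequence
```

It works here only because the program space is microscopic. The lecture's point is exactly that this doesn't scale: a small conceptual change to a behavior is almost never a small edit to the program text.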
Combine that with the Church-Turing thesis, you know, the universality of a Turing machine, this sense that this was all we needed to do any kind of computation, that we have this thing that's universal now. Can't we just make small changes to this program and it'll learn new behavior? Turns out that representation matters, right? He talked about this before: it's about the language you need to express the statement, and machine code, even though it's universal, is not the best way to express this kind of knowledge.

Okay. "Allen Newell, Herbert Simon, and their colleagues first proposed the General Problem Solver in 1957. The initial idea was to represent problems of some general class as problems of transforming one expression into another by means of a set of allowed rules. In my opinion, GPS was unsuccessful as a general problem solver because problems don't take this form in general, and because most of the knowledge needed for problem solving and achieving goals is not simply representable in the form of rules for transforming expressions. However, GPS was the first system to separate the problem solving structure of goals and subgoals from the particular domain."

Okay, so GPS was an interesting attempt. Herb Simon, a very interesting guy: he won the Turing Award and the Nobel Prize. He was the one who came up with the notion of satisficing, that humans don't optimize, they satisfice, and he won the Nobel Prize in economics for that. We'll get to him when we get to his award. But he was attacking this problem of generality by taking expressions and having transformation rules that apply to expressions. So you'd kind of pattern match on an expression and say, well, if this is true, then these other things must also be true. And you have this huge set of allowed transformation rules, and you just keep applying them recursively, and given enough time you should be able to generate all the true statements that derive from the rules. And the problem was that these transformations are not enough. Most problems are not solved by simple transformations. You need more ability, well, he'll get into this, but you need the ability to have variables and talk about things in general and not just specifics. But he says GPS is important because it separated the engine for problem solving from the domain.

All right, now he's going to talk about production systems. He's kind of summarizing all these attempts and where they failed. "Production systems represent knowledge in the form of facts and rules. Unlike logic based systems, these facts contain no variables or quantifiers. New facts are produced by inference, observation, and user input. The result of a production system pattern match is a substitution of constants for variables in the pattern part of the rule. Consequently, production systems do not infer general propositions."

Again, we're talking about the generality problem. They can't learn these general rules; they have to work on particulars. So for instance, you could have it generate "move block C," because C is on top of B and you want to move B, so you have to move C first, move it off of B. But it can't figure out the rule in general: if you want to move a thing, and "a thing" is a variable, and there's something on top of it, move the thing that's on top first. That's a general rule, and it can't ever deduce that.
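Here's a tiny sketch of that limitation in Python. The block-world facts, the rule, and the matcher are all hypothetical, but the shape is the point: firing a rule only ever substitutes constants for the variables. The general rule itself exists only because a human wrote it down; it is never something the system derives:

```python
# Production-system flavor: rules contain variables, but a match only ever
# substitutes constants for them. Toy illustration, not any real system.

facts = {("want-to-move", "B"), ("on", "C", "B")}

def match(pattern, fact, binding):
    """Match one pattern tuple against one ground fact."""
    if len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

# "If you want to move ?x and ?y is on ?x, then move ?y off first."
premises = [("want-to-move", "?x"), ("on", "?y", "?x")]
conclusion = ("move-off", "?y", "?x")

def fire(premises, conclusion, facts):
    bindings = [{}]
    for pattern in premises:
        bindings = [b2 for b in bindings for fact in facts
                    if (b2 := match(pattern, fact, b)) is not None]
    return [tuple(b.get(t, t) for t in conclusion) for b in bindings]

print(fire(premises, conclusion, facts))   # [('move-off', 'C', 'B')]
# Note what comes back: a *ground* action, constants only. The system never
# produces the general proposition as a new fact it can reason with.
```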
So he has another example: "For example, consider the definition that a container is sterile if it is sealed against entry by bacteria and all the bacteria in it are dead. A production system or a logic program can only use this fact by substituting particular bacteria for the variables. Thus it cannot reason that heating a sealed container will sterilize it, given that a heated bacterium dies, because it cannot reason about the unenumerated set of bacteria in the container."

He's going to bring up the same example later. Just to explain it again: the system can only reason about one bacterium at a time. Heating X will kill it if X is a bacterium. Okay, but it can't say, oh, therefore, if I heat this dish, all the bacteria in it are going to die, so I should heat the whole dish. It can't make that leap to the set of bacteria contained in it. Something would have to list out all the bacteria in it, and then it could decide: ah, yes, there are no more live bacteria, because I've heated each one.

Representing knowledge in logic. "It seemed to me in 1958 that small modifications in behavior are most often representable as small modifications in beliefs about the world, and this requires a system that represents beliefs explicitly. The 1960 idea for increasing generality was to use logic to express facts in a way independent of the way the facts might subsequently be used."

Okay, I want to pause here. Remember, he's a mathematician, a logician. So he's approaching this as: we will have some statement of fact about the world, and we'll use it in different ways. We're not making a production rule, which was the last section we talked about, which has a very specific use, like to move this block from here to there. It's much more general: moving a block changes its location. That's a declarative statement about the world. Or heating: for all A, if A is a bacterium, then heating it will kill it. That is a statement about the world, not a statement about any particular bacterium.

"It seemed then and still seems that humans communicate mainly in declarative sentences rather than in programming languages for good objective reasons that will apply whether the communicator is a human, a creature from Alpha Centauri, or a computer program."

So he's making a claim about the universality of the usefulness of this: that speaking in declarative sentences is more useful than describing a program. Just as an example, I know when I was a kid, we had this exercise of having to describe how to make a peanut butter and jelly sandwich. You had to give all these instructions: place the jar on the table, open the jar by twisting the lid counterclockwise, open the drawer, find a knife, take it out, close the drawer. And then the teacher would enact it, and if you forgot a step, your program would have a bug and the teacher could not finish the sandwich. So you could describe making a peanut butter sandwich that way.
Or you could make a declarative statement: a peanut butter sandwich is two slices of bread with a layer of peanut butter between them. That might not be the best way to represent it, but if you say it like that, the person on the other side hearing the statement has an intelligence and can figure out all the steps of how to make the sandwich themselves.

It reminds me of the difference between American-style recipes, which are very step by step, versus French-style recipes, which are like a little paragraph about what's in the dish. I think it's really telling, because they are very different. You have to know a lot more about cooking to follow the French-style recipe, whereas the American style is made for someone who knows how to turn on a stove but doesn't know how to saute an onion. There are things you can assume the American reader knows, but a lot you can't, whereas the French style assumes much more knowledge about what's going on. It'll say something like: cook sauteed onions, carrots, potatoes, etc., together in a pot until tender; serve in a bowl. That general a statement of a recipe. Whereas the American version would be a whole page, and it would say: one medium onion, finely chopped. And then: heat the pan on medium heat, add oil to the pan, put the onion in the pan, and stir occasionally until the onion is translucent. The details you have to write in there are crazy. But if you can assume knowledge on the other side, that the agent on the other side can do some deductive reasoning and has a similar set of knowledge, experience, and skills in the world, then declarative sentences are much better for communication and for getting things done, because they work in multiple situations.

Okay, let me continue. "The advantage of declarative information is one of generality. The fact that when two objects collide they make a noise may be used in particular situations to make a noise, to avoid making a noise, to explain a noise, or to explain the absence of noise." So in theory, the idea is that I can write one statement that can be used in all these different situations. It is a true statement, generally true. And so I save time as the programmer, because I'm writing just one statement, and the computer can use it in multiple situations.

"Once one has decided to build an AI system that represents information declaratively, one still has to decide what kind of declarative language to allow." Okay, so now you have this new problem of designing a language that can express all this. "Every increase in expressive power carries a price in the required complexity of the reasoning and problem solving programs." So you go from the simplest systems that are just constant symbols, and then you add some predicate symbols and some variables, and now you're in first-order logic, and it's just really hard. You need a much better engine for running these programs. So he talks about Prolog as kind of a local optimum on this continuum.
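Before we get to Prolog, let me make that colliding-objects fact concrete. Here's a sketch of "one declarative fact, many uses": the fact is stated once, and four tiny reasoners use it in four directions. The reasoners and the representation are my own hypothetical illustration, not anything from the paper:

```python
# One declarative fact -- "colliding objects make a noise" -- used in several
# directions. A toy illustration of McCarthy's generality point.

# The fact, stated once, independent of how it will be used:
FACT = {"cause": "collide", "effect": "noise"}

def predict(events):
    """Forward use: will there be a noise?"""
    return FACT["effect"] if FACT["cause"] in events else None

def explain(observation):
    """Abductive use: what could explain a noise?"""
    return FACT["cause"] if observation == FACT["effect"] else None

def plan(goal):
    """Planning use: how do I make a noise? How do I avoid one?"""
    if goal == ("achieve", FACT["effect"]):
        return "do: " + FACT["cause"]
    if goal == ("avoid", FACT["effect"]):
        return "avoid: " + FACT["cause"]

print(predict({"collide"}))          # noise
print(explain("noise"))              # collide
print(plan(("achieve", "noise")))    # do: collide
print(plan(("avoid", "noise")))      # avoid: collide
```

A production rule would have baked one of those four uses into the rule itself; the declarative fact leaves the use up to whatever engine consults it.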
About Prolog: it's interesting, because in 1971 Prolog didn't exist, but by 1986 it did, so this is something he could look back on. All right, so he says Prolog "represents a local optimum in this continuum, because Horn clauses are medium expressive, but can be interpreted directly by a logical problem solver. One major limitation that is usually accepted is to limit the derivation of new facts to formulas without variables, that is, to substitute constants for variables and then do propositional reasoning. It appears that most human daily activity involves only such reasoning."

So people aren't doing science all the time when they're going about their day. They're not inferring "what goes up must come down," which is a universal statement with a universally quantified variable. They're just saying: if I drop this apple, it's going to hit the ground. Everything is instantiated to constants: this apple, that ground, it's going to fall.

"A Prolog program can sterilize a container only by killing each bacterium individually and would require that some other part of the program successfully generate the names of the bacteria. It cannot be used to discover or rationalize canning, sealing the container and then heating it to kill all the bacteria at once."

Okay, so it has the same problem: it cannot generalize. And this is something we do all the time. It's not science exactly, but we can do these quantifications over specific sets. You can say "all the people in the room," and there might be 20 people in the room, and we don't have to list them all; we can infer things from that, in a way that might not be well expressed in predicate logic. In logic, with a universal quantifier, you would have to say: for all P, if P is a person and P is in the room, then P heard me talk. That's how you would express it. Notice that you have this variable quantified over all people, and then there's a conditional in there. But when we say "all the people in the room heard me," we might not be doing that. That might just be a shortcut, and then we have some heuristic, like: were they in the room, yes or no? It's not this thing that we're applying to all people and then narrowing with an if statement. Okay, that's my interjection; this isn't what he's saying.

Now I'll read from his paper. "My own opinion is that reasoning and problem solving programs will eventually have to allow the full use of quantifiers and sets and have strong enough control methods to use them without combinatorial explosion." Okay. "While the 1958 idea was well received, few attempts were made to embody it in programs in the immediately following years. I spent most of my time on what I regarded as preliminary projects, mainly Lisp. My main reason for not attempting an implementation was that I wanted to learn how to express common sense knowledge in logic first."

So he wanted to do the expression first, before making the logic engine. And you can see why: as a logician himself, he could sit there and work out simple problems himself and know whether a language was expressive enough. Why should you write the engine first,
and then later learn, oh, this doesn't really do what I needed it to do? It's much better to work out a few problems on a piece of paper, learn that there's some missing gap, that you can't figure out why these rules don't work, oh, we need some universal quantifier here, that kind of thing. You can work all that out before you build an engine to do the deduction.

"McCarthy and Hayes made the distinction between epistemological and heuristic aspects of the AI problem and asserted that generality is more easily studied epistemologically. The distinction is that the epistemology is completed when the facts available have as a consequence that a certain strategy is appropriate to achieve the goal, whereas the heuristic problem involves the search that finds the appropriate strategy."

Okay, so this is kind of what I was just saying: there are two problems. One is having the information encoded in the right way; that's the epistemology. And then there's the heuristic aspect, which is actually taking all that and doing a practical search to generate the desired knowledge. That's what I was talking about, where you could work on paper just on the epistemology of it.

Okay: "The common sense information possessed by humans would be written as logical sentences and included in the database." He's talking about his own work here, his own approach. "Any goal-seeking program could consult the database for the facts needed to decide how to achieve its goal, especially facts about the effects of actions. The much studied example is the set of facts about the effects of a robot trying to move objects from one location to another. This led in the 1960s to the situation calculus, which was intended to provide a way of expressing the consequences of actions independent of the problem."

Okay, so you have a database, and you put a bunch of facts about the world in it, but you're especially putting in facts about the consequences of actions. This might be something like: if you paint something, it changes its color. And he has this formula, it's not code:

s' = result(e, s)

You have a situation s, the current situation; some event e happens; and the result is a new situation s'.

"Notice that the situation calculus applies only when it is reasonable to reason about discrete events, each of which results in a new total situation. Continuous events and concurrent events are not covered."

I'm not going to read all the axioms he gives here, but one thing that's clear is that you have to state quite a lot of stuff. Like: the result of moving a thing that's at position X to position Y is that it is now at Y, things like that. "The facts that were included in the axioms had to be delicately chosen in order to avoid the introduction of contradictions arising from the failure to delete a sentence that wouldn't be true in the situation that resulted from an action." So it's very tedious. You have to say everything that's still true, all the things that are not true anymore, anything that has changed; you kind of have to rewrite the whole description of the situation in every rule. And it gets worse when you add a new piece of information to be tracked.
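Here's a minimal sketch of that s' = result(e, s) shape, with a made-up two-block world. The actions and features are my own invention for illustration; the thing to notice is that the function has to carry the whole situation forward, and every action has to be explicit about what it leaves alone. That's the tedium he's describing:

```python
# Situation-calculus shape: s' = result(e, s). Each event maps a total
# situation to a new total situation. Toy block-world illustration.

# A situation: a complete assignment of every tracked feature of every block.
s0 = {"A": {"location": "table", "color": "red"},
      "B": {"location": "A",     "color": "blue"}}

def result(event, s):
    """Return the new situation after the event. Everything untouched must
    still be restated (here, copied) -- this is where the tedium comes from."""
    action, block, value = event
    s1 = {b: dict(feats) for b, feats in s.items()}   # restate the whole world
    if action == "move":
        s1[block]["location"] = value   # moving changes location...
        # ...and implicitly asserts it does NOT change color. In logic, that
        # "does not change" part has to be written as its own axiom.
    elif action == "paint":
        s1[block]["color"] = value      # painting changes color, not location
    return s1

s1 = result(("move", "B", "table"), s0)
s2 = result(("paint", "B", "green"), s1)
print(s2["B"])   # {'location': 'table', 'color': 'green'}

# Now add a new feature, say "orientation": in the logic version, every
# existing action needs a new axiom ("moving doesn't change orientation",
# "painting doesn't change orientation", ...).
```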
Say you had location and you had color, those two things, and you've described all the rules: moving something doesn't change its color, painting something doesn't change its location. You've said all that. But now you add a new feature, like rotation, so you can flip the block. Okay, now you have to say that flipping the block doesn't change its location or its color, and that painting changes its color but not its location and not its orientation, and that moving doesn't change its orientation either. You have to start describing everything that changes and everything that doesn't. It's very hard.

"A problem with the situation calculus axioms is that they were again not general enough. This was the qualification problem": putting an axiom in a common sense database asserting that birds can fly, "clearly the axiom must be qualified in some way, since penguins, dead birds, and birds whose feet are encased in concrete can't fly."

All right, so he's describing this new problem that he's calling the qualification problem. You can make a general statement like "birds can fly," and if you said that to a person, they'd be like, yeah, sure, that sounds right. But then someone says, ah, but what about dead birds? Did you think of that? And what about if I cut their wings off? And you're like, well, but you didn't say that. You didn't say this bird was tied down, or in a cage. You can always come up with more exceptions, so he needed to invent this new thing.

"Formalized nonmonotonic reasoning provides a formal way of saying that a bird can fly unless there is an abnormal circumstance, and of reasoning that only the abnormal circumstances whose existence follows from the facts being taken into account will be considered."

So he's trying to formalize this thing that people do. If I say "birds can fly," you say, yes, that's right. And then I say, ah, but what about a penguin? And you say, well, you didn't say penguins. Penguins are kind of special. Most birds can fly; penguins are an exception. So you don't have to list all of the exceptions up front. If you're describing a situation, it has to be possible to leave out details, because you can't describe everything. And if you leave out a detail like "it's a penguin," that's okay: the reasoning will just assume it's not a penguin, assume the bird can fly. You don't have to know all the details of its wings, whether it's healthy, whether it's alive. You can leave all that out, and the reasoning can continue and infer the general case without the exception.
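To give the flavor of that move, here's a toy sketch with a hypothetical ab ("abnormal") predicate. Be warned that real circumscription is a logical minimization over models, not a set lookup; this only shows the shape of default reasoning with exceptions:

```python
# Default-reasoning flavor: "birds fly unless something abnormal is known."
# Real circumscription minimizes the ab predicate logically; this lookup
# version is only meant to give the flavor.

known_abnormal = {("penguin", "opus"), ("dead", "fred")}

def ab(bird):
    """Is any abnormal circumstance *known* for this bird?"""
    return any(b == bird for _, b in known_abnormal)

def can_fly(bird):
    # The default applies unless an abnormality follows from the known facts.
    return not ab(bird)

print(can_fly("tweety"))   # True  -- nothing abnormal known, assume it flies
print(can_fly("opus"))     # False -- a known exception blocks the default

# The win: a brand-new kind of exception (feet encased in concrete, say)
# is one added fact, not a rewrite of the "birds fly" axiom.
known_abnormal.add(("concrete-feet", "bigbird"))
print(can_fly("bigbird"))  # False
```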
The frame problem is another problem he describes. Other researchers have defined it differently, and theirs are kind of similar, but he was first, so he gets to define it how he defined it. "The frame problem occurs when there are several actions available, each of which changes certain features of the situation. Somehow it is necessary to say that an action changes only the features of the situation to which it directly refers. When there is a fixed set of actions and features, it can be explicitly stated which features are unchanged by an action, even though it may take a lot of axioms. If additional features of situations and additional actions may be added to the database, we face the problem that the axiomatization of an action is never completed."

Okay, so let me explain this problem. You want to make this database, and we talked about this before: if you put in "painting a block changes its color," you also have to describe that it does not change its location, it does not change its orientation. Say you have a complete set of all that, all well described. Now you add another variable that the system keeps track of. You have to go through all your rules and update them, and of course your rules are also growing, because you're adding a new variable that can change. So you're increasing the amount of stuff you have to put in the database with each addition. Obviously not scalable; you'd never finish. So you need a kind of meta rule that says: if I don't mention location, it doesn't change; just use the same location as before. If I mention paint and the paint changes, then it changes, but if I don't mention the color, it stays the same. We need some way of solving this frame problem.

Okay, so he takes the situation calculus and shows that your axioms become a lot less wordy, because you have this "ab aspect" predicate, a way of saying there's a bunch of stuff there that's not changing. "This treats the qualification problem, because any number of conditions that may be imagined as preventing moving or painting can be added later and asserted to imply the corresponding ab aspect. It treats the frame problem in that we don't have to say that moving doesn't affect colors and painting doesn't affect locations." So now when you add a new axiom, a new thing that can change about the situation, you don't have to change all your existing rules. That's very important.

All right. "But even with formalized nonmonotonic reasoning, the general common sense database still seems elusive. The problem is writing axioms that satisfy our notions of incorporating the general facts about a phenomenon. Whenever we tentatively decide on some axioms, we are able to think of situations in which they don't apply, and a generalization is called for. Moreover, the difficulties that are thought of are often ad hoc, like that of the bird with its feet encased in concrete."

Ad hoc meaning you don't have a rule for it. If you say all birds can fly, and then you say bird A has its feet encased in concrete, does the system have a way of representing that that means it's stuck and can't fly? Now you need a whole bunch more statements and axioms in your database, and the size of the database you need just explodes.

I want to interject at this point, because I think we've gotten enough of a sense. We're almost at the end now, but we have a real sense of the difficulty of this logical approach. Back when AI started, there was a real thought, and this is a caricature of what was thought at the time, that seems naive now.
It's hard to describe without it seeming cartoonish. But the idea was sort of: smart people reason through situations, and logic is something that smart people have learned how to do, and logic seems to be able to describe all these situations. Therefore machines can do this kind of basic logic, because it's very formal, and that might be a good way to make a computer think. Thinking is basically reason. I think that characterizes it well: thinking is reason.

Another caricature was: if we're trying to make intelligence, well, intelligent people can play good chess, you have to be pretty intelligent to play chess. So let's make a computer that can play chess, and that way we will be simulating intelligence, right? On the flip side, they thought that stuff like walking, or manipulating blocks with a robot using a camera, was kind of a simple problem. It must be easy, because even a two-year-old can do it. And it turns out that that was exactly backwards. The stuff that a two-year-old can do turns out to still be pretty hard. We're making good strides these days, just because we have a lot more experience with it and faster computers, but it turns out that the stuff two-year-olds can do is actually the hardest stuff. Two-year-olds can recognize shapes and people, can talk about them and make simple statements about the world. They can run around, jump, and climb in unknown situations. All of that is really hard.

And this is the kind of thing AI was discovering: hey, wait a second, we thought chess was hard because it's hard for people. It takes a lot of study and practice to become good at chess, but anybody can learn to walk. It turns out it's the opposite. The current best explanation is that walking is something that took millions of years to evolve, and the kind of reasoning we do with our forebrain is actually a recent invention, evolutionarily speaking, a recent emergent phenomenon. It's hard for us because it's so recent; it hasn't had enough time to get well developed. We also seem to be one of the only species that can do it. Dogs can solve problems, monkeys can solve problems, dolphins can solve problems, but it's a handful of species. And then we have this other ability to reason in more general terms: we can come up with systems of physical laws that have predictive power, that we can use to build spaceships and such. All of that is what people pointed to and said, oh, that's the hard thing, that's intelligence. But actually that stuff is fairly easy for the machine, because it is so formal; the best logic we do is very formal. At least at the time, that's what was thought. And the easy stuff, things that even unintelligent creatures can do, animals do it, a bird can stand on two legs and walk, that doesn't seem hard. But it is actually very hard to do with that kind of mechanical approach. You're not using logic, let's put it that way. You're not using logic to walk down the street. So this is very similar to the chess story.
And we're now at a place where we do have walking robots, and they are not doing logic; we know that. And we have computer programs that can recognize faces, and they're not doing logic either. So there's this monkey wrench that's been thrown into the whole approach: maybe we're not even doing logic when we are doing logic. We might use logic to write down a proof so that we can communicate it and have it formally verified by somebody else, but that's very hard for us. Even the best logicians need a lot of concentration. They need the door closed, and quiet, and they need to really focus. It takes all of our brain power. Meanwhile, we can plan a trip to the grocery store, we can plan a vacation, and it does not seem to take anything like that effort. So there must be something else we're doing besides logic.

Okay, so this is the perspective of a grad student in the early 2000s looking back on this time: we're just not doing logic. It seems like there's logic in it, but all these problems he's talking about, the frame problem, the qualification problem, these are things you deal with in logic. They're not problems that we're having, right? When we're talking, we know what we mean. In 1986, they were still very optimistic about logic and these databases: we just need more facts, more facts. And I think we know now that that's not the case. Obviously, you need the system to know about the world, but it's not going to be in logic. If you have a two-year-old and you tell them, make it so that B is on top of A, they're not doing a math problem. They have some other system, maybe even a special purpose object-configuration engine, that just knows: move this, move that, done. A special purpose piece of hardware in our brains that knows how to reason about that, and it feels intuitive because it's not conscious. We are not working out: well, if A is on B and B is on C, then we have to move C first. We're not doing that.

Okay. So: reification. "Reasoning about knowledge, belief, or goals requires extensions to the domain of objects reasoned about," sentences like precedes(on(Block2, Block3), on(Block1, Block2)). So we're trying to make a statement about what has to be done first: block 2 being on block 3 has to come before block 1 being on block 2. You're trying to stack blocks in a certain order, so you have to place them in the right order, and you make a logical statement that says this situation has to precede that situation. For that, on(Block1, Block2) "has to be regarded as an object in the first order language. This process of making objects out of sentences and other entities is called reification." So he's realizing that for certain situations, you need to start talking about knowledge itself. You need axioms about the axioms. That's reification.

Okay, now we're going to talk about context, and making it formal. "Whenever we write an axiom, a critic can say that the axiom is true only in a certain context. With a little ingenuity, the critic can usually devise a more general context in which the precise form of the axiom doesn't hold." Consider the sentence, "the book is on the table."
"The critic may propose to haggle about the precise meaning of 'on,' inventing difficulties about what can be between the book and the table, or about how much gravity there has to be in a spacecraft in order to use the word 'on.' Thus we encounter Socratic puzzles over what the concepts mean in complete generality, and encounter examples that never arise in life. There simply isn't a most general context."

All right, let me describe this problem a little better, and then I'll talk about my own personal experience with reasoning. You have this sentence, "the book is on the table," and you can really start to nitpick. What does "on" mean? What if the book is on a piece of paper that's on the table? Does that still count, and how do you represent it? What if the book is on a box that's on the table? Is the book still on the table? What if the book is floating above the table, suspended from the ceiling somehow? Is that on the table? What's the difference between a column of air between the book and the table, and the box? So there's a real problem. And you can go to a higher context: what if you don't have gravity? The book is touching the table, but there's no up or down, so you could say the table is on the book, and that should be acceptable too, right? This is a problem you get in logic, where you're trying to define a context, and of course universal statements. Then you move to a new context, and your universal statements don't apply so well anymore, or you get weird results, weird statements that the system says must be true. It's a problem.

From my personal experience: if I've been doing a lot of programming, say I woke up early and programmed all through the day, then by the afternoon, because I've been dealing with a computer that requires such pedantic specificity, I cannot reason on a human level anymore. A friend will say something and I will nitpick it: well, do you mean this or do you mean that? That's not really the precise way to say it. And I have to apologize: wait, I'm sorry, I know what you mean, don't mind me being so precise, I've just been dealing with a computer all day. So there must be something else happening in our brains, in our minds, that allows us to reason without being quite so precise, and somehow short-circuit all these logical puzzles, what he calls Socratic puzzles, about the world. Even if our friends make precision errors in their sentences, we know what they mean; we can work around it. And I think this shows that what we're doing truly is not pure logic.

There are approaches that try something like Marvin Minsky's society of mind, where instead of having one, let's call it monolithic, logic engine and a huge database of facts that the engine can use to infer things,
maybe we have a large number of small special purpose engines, each with a different approach to the problem, and then systems that choose between those subsystems, deciding which one would have a useful answer to this problem at this time. And they're activated at different times by different, you know, hormonal states: if we're stressed, we might be using this one; if we're more relaxed and focused, we'll be using that one. This approach kind of kicks the problem down the road, but it shows that a single big logic engine is not the only possible solution. Maybe you could have a small logic engine that runs on a small number of facts, but that's expensive and slow, so we try some heuristics first, in a satisficing kind of way. Like I mentioned before, maybe we have a special purpose spatial reasoning subsystem in our brains that can solve these little cube-on-cube problems, how to move the cubes so that they're properly stacked in the right order, without resorting to a general purpose problem solver.

And nowadays there are doubts, and good arguments, that we aren't general at all: we can't even imagine the purposes and the contexts in which we don't operate all the time. This is kind of how scientific revolutions happen, with a total paradigm shift: it requires thinking in a new context that no one had imagined before, where no one knew how to do the reasoning. So this is one of those cases where we assume that we're better than we are, that we're more intelligent than we are, and we're trying to make the computer as intelligent as we imagine we are. But we're not that intelligent, we're not that good. Or maybe our definition of intelligence is wrong: we don't need to be so smart in everything we do. We can rely on habit, we can rely on culture, to give us pat answers to problems that have no optimal solution. We just pick an arbitrary one that seems to work, that everyone agrees on, and we work with that. We don't need to solve a logic problem, with all those Socratic puzzles, all the time.

Okay, but he's still trying to formalize this idea of context. "Humans find it useful to say 'the book is on the table,' omitting reference to time and precise identifications of what book and what table. This problem of how general to be arises whether the general common sense knowledge is expressed in logic, in program, or in some other formalism. A possible way out involves formalizing the notion of context and combining it with the circumscription method of nonmonotonic reasoning."

So this is kind of where he leaves it, and I assume that means this is what he was working on in 1986, or at least the approach he thought was most promising. He doesn't really conclude; the paper just stops, and there you go, that's the end.

I think I've hashed it out pretty well: this problem of generality in AI has caused us to think a lot about ourselves and what we do when we're thinking. There's a much better understanding of our psychology now, of all the biases that we have, and of how those biases served an evolutionary purpose.
Whenever I read about a bias that I hadn't heard of before, there's always an explanation of why it would be useful. It's obviously not optimal, but why is it useful? It's probably a case where evolution optimized for something like energy conservation, so that we would be more likely to survive to procreate, energy being a scarce resource. Take loss aversion. Why would we want loss aversion? Well, it makes you hold on to something you have longer than is wise, because most of the time you're going to get to keep it. There are weird situations where you'd be better off dropping the thing, you might even lose your life if you don't, but you still want to hold on to it. Those situations are so rare that natural selection found some balance between dropping what you have to conserve something else, versus holding on to it in the hope that you'll get to keep it. People often point out situations, certainly in the modern world, where this loss aversion doesn't make sense. For instance, to an economist, earning $10 is kind of the same as saving $10: you spend the $10 and then earn $10 back, or you just save the $10. To an economist, the end situation is the same. But to a person, losing $10 that you spent time earning feels a lot different from gaining another $10. In a world of scarcity, where a new $10 might not come, where you might not find a new tree full of berries, "hold on to the one you have" makes a lot of sense.

Anyway, what I'm trying to get at is: our reasoning is not optimal. It might not even be logical. It's not logic with finely tuned parameters. It's not logic at all. It's stuff like: don't let go, hold on to stuff, defend what you have. "Defend what you have" is not a printout of a calculation that detects a 70% chance that you will lose. It's not like that. You feel territorial. That's a hormonal flush; it's not reason. Later, you might make up a reason why you did it: I thought I would be able to win the fight if I defended it. You made that up. You just had a flush of hormones that made you want to keep this thing and defend it against intruders.

Now, I don't want to say that AI hasn't been successful. It has been successful. It has posed questions about reasoning and about intelligence that have been fruitful. It's led to a lot of the development of computer science in general; we could list programming languages, data structures, databases. Some of that work, because it looked like logic, was once considered AI, but now that it works and it's open source software that people take for granted, they forget that it comes from the AI world. But I think even more fruitful is that AI has forced us to ask questions about ourselves. We've learned so much about who we are as a species, about what intelligence is, what the nature of intelligence is, and how we could possibly work. How do we do what we do? I think that this is the real contribution of artificial intelligence: asking these really profound, deep questions.
You look back at 1986 and it seems naive, but it's only because of the questions asked back then that we have our current understanding. It's very important. Okay, thank you so much for listening. Please tell your friends if you liked this episode. You can always subscribe and you'll get the new ones. I'm going to continue with these Turing Award winners. I find it very, I don't know, I'm learning a lot. Edifying, let's put it that way. All right, my name is Eric Normand. This has been a Turing Award lecture reading: John McCarthy. Thank you for being there, and as always, rock on.