There's a camera that watches you guys here; that's pretty neat, I guess. This is how you roll these days, the awkward silence before the lecture begins. So David, if you're ready, I'll get started. We're ready to go. Okay. Welcome, guys. My name is Reza Shadmehr and this course is called Learning Theory. This is the fifth or sixth year that I've been teaching it. The course led to a book that is your textbook, which I hope at some point you will get. What we're going to do today is have a little introduction about what the course is going to be about, and we're going to have a homework at the end of this lecture. So it won't be just a fluffy kind of introduction; we're going to have some real things in it. But before we get started, I want to tell you a little bit about myself and why I was interested in the question of learning, in particular with regard to biological systems and how biological systems learn. When I was a graduate student, I studied robotics; I was studying how to move machines, basically. And I couldn't help but be impressed by the fact that humans are so much better than machines in the way we move. For example, there's this chess program that now easily beats some of the very best chess players in the world and occasionally beats Garry Kasparov, the best chess player in the world. The program, however, does not move the chess pieces by itself; a human being moves the chess pieces for it. So artificial intelligence has figured out how to play chess extremely well, better than nearly every human that exists, but it hasn't figured out how to move the chess pieces as well as a child. That gives you a sense that learning certain things is a lot harder than learning others, and that things like control of objects and movements are something that we do extremely well.
So when I was studying robotics, I came to appreciate the fact that control is difficult, and learning control is difficult. And in biology, the fundamental problem is to survive. You survive by learning to control things around you: you learn from your environment, you adapt to it, and you improve your status from day to day so you can survive. This learning process is interesting both mathematically (how do we build better robots?) and biologically (what's going on in the brain?). That is the background from which I began asking questions about learning. Now, this course puts together some of the fundamental ideas in learning theory, not to tell us how to build better machines that can play chess, but to begin to build a framework for understanding biological systems and how they learn. So it is a mixture of some fundamental ideas in learning theory, in control theory, in estimation, in system identification, because these are all pieces of the puzzle that I think our brains need to solve in order to learn control and to interact with the objects around us. So it's biological learning and control, which is what this course is about. It's an evolving course, because this is the material that I care deeply about, the material that I research; the students in my lab use this material to study how systems learn. So in some ways you're lucky if you get a chance to hear ideas from somebody at the forefront of what he knows, and this is certainly the case here. And I hope that we'll have fun. So let's begin with the details of the class; at the beginning of today's lecture I just want to go over the usual stuff about courses.
Our teaching assistant is David Herzfeld; David is in the checkered shirt back there. This year we're going to record this thing and put it on YouTube, so if you miss a class, you won't really miss a class. We meet on Mondays and Wednesdays, three to four o'clock or so. The grading is as follows: you're going to have homeworks, which account for 40% of your grade, and then you're going to have two midterms, one right before spring break and one in the usual final period. The midterms are not cumulative, meaning you're going to be tested on the first half of the course right before spring break, and on the second half at the end of the course. Your textbook, as I said, is the book that I wrote a few years ago. David's details are here; he's got his email address there. Now, the course is highly dependent on your homework, and I find that you learn a lot by doing homework; rather than just looking at equations, it's more fun to actually try the equations and code them up. So most of your grade is going to depend on your homework. Homeworks are given pretty much every class. Today you'll actually have two homeworks, one regular homework and one for extra credit that's a little more challenging, if you feel like spending some time on it. Questions that you have on your homework should be received by 8 p.m. the day before it is due. The way we're going to do the homeworks is as follows: every class has a homework, but it's due the next Monday. So you have a homework on Monday and a homework on Wednesday, and both of them are due the following Monday. So once a week you turn in your homework, okay? You have the weekend to work on it and things like that.
Here are some more details about the way the grading works. All right, because so much depends on homework, I need to remind you of what is acceptable behavior in doing your homework. Most of what you're going to learn will be through your homework, and I expect you to do it without copying from others. You're encouraged to work with others, but you have to write your own code. You can get help, you can discuss with your friends, and that's perfectly fine, but you're not allowed to copy code from others. How to turn in your homework is written here: you will email David your code before class time, and you'll also turn in a printed version. There are no late homeworks. If you just don't turn one in, that's all right, no big deal; you just won't get credit for it. But I understand there are times when things come up, many of you may have other things going on, so you're allowed to drop two homeworks. If you're sick, if you're traveling, whatever, that's all right; you don't have to sweat it. Questions? Okay. All right. I'd like you to read this on your own, just to remind you what I expect of you in terms of integrity when you turn in your work. You are expected to be honest and put in honest work, and if you don't, there are going to be consequences. I hope that we won't get there, but be aware that I expect honesty from you, and I hope that you will not disappoint me. All right. There are two exams in this course, each covering half the material, and you're allowed to bring notes to the exams. The way it works is this: I found in my own classes, years ago when I was a student, that it's useful to summarize what you've learned on a piece of paper. So you're allowed to bring a single piece of paper on your exam day with whatever you want on it.
It's a good way to have on one piece of paper everything you think is important to know about this class. It's just a single sheet; you can write on both sides with whatever you like. So you don't have to memorize things, because you'll have an opportunity to write down the things you think you might need to know. I'm not interested in you learning how to memorize; I'm interested in you learning how to think. And I have found the process useful: having a document that says, this is what I learned in this half of the course, and bringing it with you. I still go back to some of my own notes from the times when I was sitting where you are. Now, if you miss an exam, you'll have to document why, and then you'll have an oral exam with me. Many of you in this class are undergraduates, and I think that's wonderful. I'm glad you're here and I want to encourage it, but you're going to be competing with graduate students, so, to level the field for you, I give you extra points depending on where you are in your education. If you're a fourth-year student, you get a little bit, five percentage points. If you're a third-year student, you get a little bit more. And if you're a second-year student, and occasionally we have students in their second year who take the course, you get a lot more. That makes it so that everybody has a chance to get an excellent grade. But in the end, you're going to be tested on the same material, you have to turn in the same homework, and I'm going to use the same curve for all of you. Any questions? All right. There's no exodus yet; you're still happy with the course? All right, let's get started then. Let's begin with supervised learning. This will probably be the last time I use slides; almost all of the course is going to be me at the board, writing for you.
Today I want to teach you a simple idea: a very simple learning algorithm that you're then going to use in your homework to learn a simple mapping. Where we're going with your homework is a data set consisting of the following. A person's choices about movies on Netflix were recorded. Imagine a movie, and suppose we parameterize that movie with 10 different features, however you like to describe what a movie is like. And then, when that person watches that movie, they're going to say, I liked this movie, or I didn't like this movie, okay? What Netflix is going to do is take that data from this person, maybe 500 movies they've watched, what they liked and what they didn't like, and then guess what they would like in the future. This is an example of supervised learning. Why? Because the data set we're starting from is labeled. It says: this is the input, 10 features that describe this particular movie, and this is the output, this person didn't like it. Now what I'm going to do is try to build a relationship between this input and this output. The simplest of all algorithms that can be used to learn this is called the perceptron learning rule. It's very simple. The perceptron learning rule says: let's take the features, call them x, a vector x, 10-dimensional, let's say. Each one of those dimensions is going to have a value for this particular movie. I'm going to assign a weight to each one of these features, like you see here. So the input pattern is going to be this 10-dimensional vector describing this movie, and there's going to be a set of weights associated with it: how important is this feature versus that feature? Then I'm going to add it all up and take the sign of that sum. So what does that mean?
My output y hat, my estimate of whether this person is going to like the movie or not, is going to be the sign of the sum of w i times x i, where there are n features. Now, what is sign? What does the sign function look like? When I say sign of x, what do I mean? Excellent, there it is; that's the sign of x, thank you. So if the sign of the sum is plus one, that means they liked it, liked this movie. If it's minus one, they didn't like it. So our problem is to figure out how to change these weights so that we get better at estimating whether they will like a movie or not; in general, to build a better model of this person's choices. That's the learning algorithm we're going to learn today: how to change these weights so that we better match the behavior of this particular person. Any questions? All right, here's an example of something related to this business of features and classification. In this case it's not a movie but a number. Suppose you have the matrix you see here, where some of the data bits are on and some are off. In this case, you can see that these are sort of like threes: the top one is a three, the middle one looks like a three, the bottom looks like a three. And maybe the one on the right there, is it a three? Perhaps, perhaps not. You can imagine that somebody has written something down and you've digitized it, and now, based on this data, you build a model to classify what you see and decide whether this is the number three or not. For your extra credit today, you have a data set made up of a thousand examples of somebody writing a zero, somebody writing a one, somebody writing a two, and so forth. People at the post office are very interested in being able to classify this writing: what number is it? In this case you don't have a 10-dimensional vector; you have, I don't know, a 30-dimensional vector.
In some places something is on, in some places it's not. And there, our problem becomes a little more complicated. Because if I have to learn something based on many hundreds of dimensions, associated with a single number, maybe it's going to take me a heck of a lot of data to learn this mapping. So I need to be a little more creative in what I mean by a feature. Maybe what I mean by a feature now isn't that this particular pixel is on; maybe it's how many total pixels are on. What's the center of mass of these pixels? Is it symmetric? Does the left side have the same number as the right side? So now features are a little more complicated than just on or off, there or not there. You'd have to add some heuristics. For example: what's the center of mass of the pixels that are on? Is it to the left, to the right, in the center? Is there symmetry? Is the symmetry this way or that way? These are things you would add in recognizing what the features of the system are. But the problem remains the same. Whatever features you describe, we're going to multiply them by some weights, add them up, and ask: what's the sign of that? And we're going to classify it: is it a 3 or not a 3? Here's our estimate; we're going to call it y hat. This is what we think; it's going to be the output of our labeling. It's either going to be plus 1 or minus 1, and it's going to be the sign of the sum. Our problem is: how do we find these w's, these weights, that allow us to learn this mapping? What we're really building is called a linear classifier. We have a bunch of features, in this case d features; these features may be binary or real-valued numbers. Then we assign weights to them, and based on these weights we count the votes.
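As a rough illustration of this idea, here is a minimal sketch of turning a binarized digit image into a few heuristic features of the kind just described (total pixels on, center of mass, left-right symmetry). The specific feature choices, the function name, and the tiny 5x5 "three" are my own assumptions for illustration, not the actual homework data.

```python
import numpy as np

def digit_features(img):
    """img: 2-D numpy array of 0/1 pixels. Returns a small feature vector."""
    total_on = img.sum()                      # how many pixels are on
    rows, cols = np.nonzero(img)
    # center of mass of the "on" pixels (0 if the image is blank)
    com_row = rows.mean() if total_on else 0.0
    com_col = cols.mean() if total_on else 0.0
    # left-right symmetry: fraction of pixels that match their mirror image
    symmetry = (img == img[:, ::-1]).mean()
    return np.array([total_on, com_row, com_col, symmetry])

# a crude 5x5 "three"
three = np.array([[1, 1, 1, 1, 1],
                  [0, 0, 0, 0, 1],
                  [0, 1, 1, 1, 1],
                  [0, 0, 0, 0, 1],
                  [1, 1, 1, 1, 1]])
print(digit_features(three))
```

The point is only that a handful of such summary numbers can replace hundreds of raw pixel values as the input x to the classifier.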
If the votes, after they're weighted, are positive, we say yes; if they're negative, we say no. It's basically majority rule, weighted by these w's. So how do we change these w's? Let me build the basic idea for you. You have a vector x; let's make it easy, two-dimensional. And you also have a vector w. Let me see if I can get the red one. Here's a vector w. For every component of x, we have a component of w, and what I'm doing here is multiplying each element of w by each element of x. What does this mean? It means that the vector x equals (x1, x2), the vector w equals (w1, w2), and I'm plotting the vector x and the vector w. So what does this do? This is a dot product, which basically asks: what's the projection of w onto x? If it's positive, the sign is going to be positive. So what happens if I've made a mistake? Say my features multiplied by my weights said the person liked the movie, but in fact they didn't like it, so I should have guessed negative. What I want to do is change my weights by some amount so that afterwards, after I have learned, it looks like this. This is w before my guess; this is w after. I should change it so that if I now project w onto x, I get a negative number. So if my guess was wrong, I should change my w. By how much? By this amount. And what is this amount? It's something proportional to x: in the opposite direction of x, but proportional to it. So: if y hat does not equal the real y, if I made an estimate and I was wrong, then my new w should be whatever it was before, minus something proportional to x.
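The rule we just wrote down can be sketched in a few lines; the function name and the learning rate alpha are assumptions for illustration. The (y - y hat) factor is 0 when the guess was right and plus or minus 2 when it was wrong, so the weights move only on errors.

```python
import numpy as np

def perceptron_update(w, x, y, alpha=0.1):
    """One perceptron step. w: weights, x: features, y: true label (+1/-1).
    If the guess sign(w.x) is correct, w is unchanged; if wrong, w moves
    opposite (or toward) x. alpha is an assumed learning rate."""
    y_hat = 1 if np.dot(w, x) >= 0 else -1     # our guess: sign of weighted sum
    # w_i <- w_i + alpha * (y - y_hat) * x_i; zero when correct, +/-2 when wrong
    return w + alpha * (y - y_hat) * x

w = np.array([0.5, -0.2])
x = np.array([1.0, 1.0])
# true label is -1 but sign(w.x) = sign(0.3) = +1, so the guess is wrong:
w_new = perceptron_update(w, x, y=-1)
print(w_new)   # w moved opposite x: [0.3, -0.4]
```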
That's a very simple learning rule, and let me show you why it makes sense. What you have here is the algorithm: whenever the guess was wrong, move the voting parameter toward the correct answer. Here's what I did: my w i changes by some amount proportional to the difference between the reality and my guess. What is this difference? Well, if I made a correct guess, the difference is 0, because y equals y hat, and nothing changes. If, on the other hand, I said plus 1 but it's actually minus 1, this difference becomes minus 2; and similarly if I made the guess the other way. So you get that learning rule there: a negative sign times some constant alpha. The 2 comes from the fact that the difference between plus 1 and minus 1 equals 2, and it gets absorbed into the proportionality constant. But it's a very simple learning rule, which says: change your weights so that they move opposite the direction of the feature vector. This is called the perceptron learning rule. If you made a guess and it was correct, you leave the weights unchanged; if you made a guess and it was incorrect, you change them with this rule. Let me stop for a second and see if there are questions. Okay, this is a very simple linear classifier, and you're going to use it tonight when you look at your data set. I have drawn that there to show you what we just did: this is w minus alpha times x. [Student asks how big alpha should be.] Yeah, that's right. A small correction may not flip the prediction at all, that's correct; depending on your alpha, you may need many iterations. You'll get better, but you may not get it correct on the very next trial. Good question. Okay, I ran some simulations for the data set that you're going to look at tonight.
What I did is look at the average error over the data set. Say there are 1,000 data points, meaning you have 1,000 examples of this person's choices: 1,000 movies they watched, and for each one, whether they liked it or not. You have your w's; for every case, you make a guess, and at the end you ask: how many did I get wrong? Out of 1,000 guesses, how many were wrong? That's your average error rate. Then you learn from your errors, change your weights, and try again; you get another average error rate. And you do it again. What happens is that slowly the thing gets better, so you get something that looks like this. This is how you would evaluate your algorithm: by finding its average error rate as a function of the number of examples you've seen. Of course, the more examples you see, the more you go through the data, the better it gets. All right. Now let's talk a little about the limitations of this simple algorithm. Where does it fail? What kinds of problems can't we solve this way? So far, what we have are features that are linearly associated with my output: I take x and I weight it with these w's, linearly. Well, it's easy to come up with problems that this kind of approach cannot solve, and one of them is the classic problem that in the 1960s brought the field of learning theory to a complete standstill. In the early 60s there was a great deal of excitement about perceptrons, this algorithm I just showed you. The thought was: this is going to help us understand how the brain works, how we can build all kinds of machines that learn wonderful things. But then a book came out by Marvin Minsky and Seymour Papert, in which they gave a counterexample: a very simple mapping that could not be learned by these linear classifiers.
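The whole train-and-evaluate loop just described might be sketched as follows. The synthetic data here is a hypothetical stand-in for the real Netflix set (I am not assuming its format): labels are produced by a hidden linear rule, so the data is linearly separable and the error rate should fall across passes.

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed stand-in data: 1,000 "movies" with 10 features each,
# labels produced by a hidden linear rule
X = rng.normal(size=(1000, 10))
w_true = rng.normal(size=10)
y = np.sign(X @ w_true)

w = np.zeros(10)      # start knowing nothing
alpha = 0.01
rates = []
for epoch in range(20):
    errors = 0
    for x_i, y_i in zip(X, y):
        y_hat = 1.0 if w @ x_i >= 0 else -1.0
        if y_hat != y_i:
            errors += 1
            w = w + alpha * (y_i - y_hat) * x_i   # perceptron update
    rates.append(errors / len(X))                  # average error rate this pass
    print(f"pass {epoch}: average error rate = {rates[-1]:.3f}")
```

Plotting `rates` against the pass number gives exactly the kind of decreasing error curve described above.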
It's called the exclusive or. Exclusive or is a condition where, when the two inputs are the same, you output one class, and when they're different, you output the other. Our feature space is two-dimensional; it looks like this. Here's x1, here's x2. If x1 and x2 are both 0, my output is minus 1. If x1 and x2 are both 1, my output is minus 1. So in the condition where the two are the same, I output negative 1, and in the condition where they are different, like (1, 0) and (0, 1), I output plus 1. It turns out that exclusive or is a kind of function that this classifier cannot learn: there is no line, no linear function, that I can draw that will separate the plus 1s from the minus 1s. Just to be clear, the problem is this: there's a minus 1 here, a minus 1 here, a plus 1 here, a plus 1 here, and there is no line that separates the minus 1s from the plus 1s. This was the fundamental problem that linear classifiers faced, and in the late 60s the field of learning theory essentially died, because here was a problem the linear classifiers couldn't solve. What happened was that in the 80s another idea came around: let's use nonlinear functions to code the space, and use algorithms that can handle these nonlinear functions. That became things like backpropagation and multi-layered networks, and that was the field of neural networks. Neural networks then gradually solidified into mathematics and statistics, and that became learning theory. So for about 20 years, the perceptron rule, because it had been shown not to work for complicated problems, really stopped the field from growing. But then people came up with other things, and now it's a mature field. One of the simple things you can do to tackle the problem is: let's not have just linear terms.
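You can see the XOR problem numerically. Here's a small sketch that brute-forces over a grid of candidate lines sign(w1 x1 + w2 x2 + b) and confirms that none of them classifies all four XOR points correctly; the grid ranges are arbitrary assumptions, but the conclusion holds for any line.

```python
import numpy as np

# the XOR truth table from the board: same inputs -> -1, different -> +1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])

# search many candidate lines; none separates the four points
found = False
for w1 in np.linspace(-2, 2, 21):
    for w2 in np.linspace(-2, 2, 21):
        for b in np.linspace(-2, 2, 21):
            y_hat = np.sign(X @ np.array([w1, w2]) + b)
            if np.all(y_hat == y):
                found = True
print("separating line found:", found)   # False: XOR is not linearly separable
```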
Let's also add some interactions, quadratic terms, for example. What I can also do is take my feature space x and, instead of coding it linearly, put a Gaussian on top of it, a nonlinear function. We'll do that as well. But for now, for your homework, let's just consider the simple condition where x is coded linearly. You're going to look for the set of weights that best approximates your data using the perceptron learning rule, and we'll see how well you do. The data set is available to you online. As I said, it's a Netflix movie-selection data set, and the goal is to predict whether the person is going to like a movie or not using the perceptron rule. All right, any questions on the perceptron? Yes, sir? [Student asks whether there is a separate test set.] No, it's the same data set; you just learn from it. Exactly. You don't have a separate set. You can make one if you want to, but there's no need; you don't need to divide the data, you can learn from all of it. Yeah? Where online is this going to be? Because Blackboard isn't quite right. No, that's just my own fault; I don't know anything about Blackboard. Maybe David can figure that out. But there's a course website that has all the lecture notes; everything's on there. It's at shadmehrlab.org, and if you go to courses, I think it's Learning Theory, you'll find the entire course there: all the lectures, all the homeworks, and more. All right, let me now give you an overview of where we're going with the course. The first part of the course is going to be regression. Regression is the problem of taking real-valued functions and finding the best set of parameters for estimating them. We're going to begin with the idea of a loss function: what are we trying to minimize, and how well do we want our function to fit the data?
Based on that, we'll come up with what's called the normal equation. An earlier question was asked about that learning rate: how sensitive should I be to error, and what should that sensitivity be? We'll see that when you do regression using a method called the Newton-Raphson method, you can set your learning rule such that you find the solution in a single iteration. If your cost function has a particular shape, if your problem is a linear regression problem, you can solve for your weights in a single step; that's the Newton-Raphson method, and I'll show you how to do it. Then I'll teach you a little about sensitivity to error: when you make a prediction and there's an error, how much should you change your belief? We're going to look at some biological data: when humans make predictions and then have an error in their prediction, how much do they learn from that error? We're going to see that there's a sensitivity to this error. Associated with that learning is something called generalization. People and animals, when they make predictions, not only learn something about the particular prediction they made; they also generalize their learning. If they make a prediction, they don't just learn something about the function at the point they predicted; they learn something about its neighborhood. That neighborhood is interesting, because it says the building blocks of learning have a signature associated with how you generalize. Generalization is really a fundamental idea in trying to understand how biological systems learn: they don't just find a relationship between their prediction and reality; they take that error and change their beliefs in a particular way. We're going to see how to estimate generalization, and what it says about various kinds of learning.
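To preview the one-step idea: for squared-error loss, the normal equation w = (X^T X)^{-1} X^T y gives the answer directly, and for a quadratic cost this coincides with a single Newton-Raphson step. A minimal sketch, with assumed synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)

# assumed linear-regression data: y = X w + small noise
X = rng.normal(size=(100, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

# normal equation: solve (X^T X) w = X^T y in one shot,
# no iterative learning rule needed
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)   # close to w_true
```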
We'll see some examples from visual learning, where individuals are learning to understand things about the visual system, and the way they generalize tells us something about which part of the brain is being used to learn that function. Okay. At this point, I'll introduce the statistical idea behind maximum likelihood. Maximum likelihood is the idea that I'm going to build a formal model that relates what I'm observing to the true things that are generating it. It goes like this. Maximum likelihood forces me to say: there are some things out there that I can measure, features. They're associated with things that I can predict, y; this is the thing I'm making a prediction about, and I can also measure it. And then there are some things that I don't know: the w's, these weights. In maximum likelihood, when I write the probabilistic relationship between these variables, I'm defining how likely it is that the data I'm seeing came from the model I've proposed. So maximum likelihood is really about finding the things you don't know, the w's, in a way that makes it most likely that the data you have was actually generated by the model you assumed. The key idea in maximum likelihood is that you have to have a probabilistic model that says: these are the elements that make up the way this data was generated. It's our first example of what's called a generative model. A generative model is a model you propose that says: this is how the world generated this data. In this model, you have certain things you don't know and certain things you can observe, and you find the things you don't know in a way that maximizes the likelihood of observing the things you actually did observe. You want to maximize the likelihood of the data you actually have by finding the parameters you need to estimate. So maximum likelihood is the beginning of describing generative models.
Generative models are models that say: in this probabilistic way, the data I've been given was actually generated. Okay, at this point we're going to switch to the book. The first four lectures are the introduction, and there will not be associated text for them in your textbook; but from there on, the textbook takes over. So: sensorimotor estimation, state estimation. Biological systems, when they learn, need to estimate not just the parameters of the world but also the state of the world. Let me give you an example. Say I feel something in my hand and I see it as well. What I need to know is: I have these two sensors, touch and vision, both giving me information. How do I combine them to know reality? To give you another example, suppose you go walking on some trail, and you have with you a GPS that was made in America and a GPS that uses satellites from the European Union. These two GPS units give you information about where you are. What is the truth? You're only in one location, but you have two sources of information. How do you combine those two pieces of information? That's the problem of state estimation: you have observations, and now you have to estimate the state of the world. We're going to go back to this model here, and now my problem is as follows. There is some particular location I'm at, x; I don't know where I am, but I have sensors that tell me things. How do I estimate the truth based on the two sensors I have? I'm going to have to combine this data. State estimation is the problem of determining the state of a system. For example, when engineers launch a rocket into space, they have sent certain commands to that rocket.
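The two-GPS question has a clean answer worth previewing. The minimum-variance way to combine two noisy readings of the same quantity is to weight each by the inverse of its variance; this is the static, one-shot special case of what the Kalman filter does over time. The numbers here are assumed for illustration.

```python
# two noisy readings of the same true position (assumed numbers)
x1, var1 = 10.2, 4.0   # GPS A: noisier reading
x2, var2 = 10.8, 1.0   # GPS B: more precise reading

# weight each sensor by its inverse variance (its reliability)
w1 = (1 / var1) / (1 / var1 + 1 / var2)
w2 = (1 / var2) / (1 / var1 + 1 / var2)
x_hat = w1 * x1 + w2 * x2              # combined estimate, leans toward GPS B
var_hat = 1 / (1 / var1 + 1 / var2)    # its variance: smaller than either sensor's

print(x_hat, var_hat)
```

Note that the combined variance is smaller than either sensor's alone: two mediocre sensors beat one good one.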
And by telemetry, they make certain estimates of where the rocket is. The two pieces of information have to be combined: the commands that have been sent make a prediction about where the rocket is, and the sensors tell the engineer the rocket is actually in this location. I'm going to combine these two pieces of information to estimate where it really is. So state estimation is, again, a problem based on a generative model: I have a model of the real world, I've given it some kind of input, and I've measured some kind of state. Where is the true state? I have two sensors that measure something; what's the best estimate? The way this is done is via a very important algorithm called the Kalman filter, which is basically a way to optimally combine pieces of information and make estimates about the underlying mechanism that generated them. It's particularly close to my heart, because Rudolf Kalman did his work while living just a few miles from here, near Homewood. So it's good for you to know that here was a person who did amazing work in the late 1950s, producing what many say is the most important algorithm in engineering. We're going to learn the Kalman filter algorithm and use it to estimate things when you have multiple sensors and when you have something called signal-dependent noise. What's signal-dependent noise? It's the kind of noise that looks like this: if the value of the signal is small, maybe this is what the noise looks like; as the value increases, maybe this is what the noise looks like; if the value really increases, maybe this is what the noise looks like. Look at the noise here: it has grown larger as the mean of the signal has changed. This is called signal-dependent noise.
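The blackboard picture can be simulated. A common way to model signal-dependent noise (an assumption for illustration, not a claim about the textbook's exact formulation) is to let the standard deviation grow in proportion to the mean, sigma = k * mean:

```python
import numpy as np

rng = np.random.default_rng(2)

# signal-dependent noise: sigma = k * mean, with an assumed constant k.
# Contrast with additive Gaussian noise, where sigma is fixed.
k = 0.2
for mean in [1.0, 5.0, 25.0]:
    samples = mean + k * mean * rng.normal(size=100_000)
    print(f"mean {mean:5.1f}: empirical std = {samples.std():.2f}")
```

The printed standard deviations grow with the mean, which is exactly the picture drawn on the board, and the property that additive-Gaussian Kalman theory does not capture.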
When Kalman did his work, he did it with what's called additive Gaussian noise, where the standard deviation of the noise doesn't depend on the mean. But most biological signals, most real signals, look like this. So we're gonna learn the basic Kalman filter and then extend it by adding this kind of noise. Now, why are we doing this? Because the nervous system deals with multiple sensors. You have multiple sensors in your body: you have vision, you have proprioception. And in the nervous system, you have noise, and the noise is signal dependent. So when you sense the world, you have to decide: how do I integrate multiple things? Let me give you an example. When you hear thunder and you see lightning, your brain estimates that they came from the same source. If I heard the thunder a long time after I saw the lightning, they probably didn't come from the same source. But if the two came temporally close to each other, it was probably the same source. So you estimate the source based on the proximity of these two things. And it's a way for us to think about state estimation in the nervous system. All right. Now, as we learn the Kalman filter, we'll be able to make our next move to the concept of Bayesian integration. In Bayesian integration, what you have is the concept of a prior and a likelihood, and from them you form a posterior belief. The idea is just like the Kalman filter, except that now, instead of having two sensors, one of the things is your prior belief and the other is your observation. And then you make a prediction based on these two things. Okay, as we learn Bayesian integration, what we're gonna do is build a model of learning, and we're gonna apply it to biological systems. It's gonna have certain characteristics.
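For Gaussian beliefs, the prior-plus-likelihood combination just described is the same precision-weighted average as combining two sensors: the posterior mean sits between the prior mean and the observation, pulled toward whichever is more reliable. A minimal sketch with made-up numbers:

```python
def posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior belief with a Gaussian observation.
    The posterior mean shifts from the prior toward the observation
    by an amount that depends on their relative uncertainties."""
    w = prior_var / (prior_var + obs_var)        # weight on the observation
    post_mean = prior_mean + w * (obs - prior_mean)
    post_var = (prior_var * obs_var) / (prior_var + obs_var)
    return post_mean, post_var

# Equally uncertain prior and observation: the posterior lands halfway,
# and the posterior belief is sharper than either input.
m, v = posterior(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=1.0)
```

The weight `w` plays the same role the Kalman gain will play later in the course: it says how much to trust the new observation relative to the current belief.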
And it comes from the idea that potentially the way you learn is related to the way your body changes throughout your lifetime. Your brain has to learn to control your body, but your body isn't always the same. When you wake up in the morning, you're fresh; you get fatigued during the day. As you grow older, your body changes. When you were a kid, you had a different body than you have now. And throughout this process, your brain has been controlling it. When you walk up the stairs, your muscles get fatigued, but your brain still knows how to control that system despite the fact that the system is changing. So the idea is that the way your body changes is a form of generative model that the brain has learned to control and estimate, and that the way it learns other things is based on these temporal dynamics associated with the changes in your body, which are then generalized. So we're gonna talk about how you might learn to control a joystick, or how you might learn to move your body in some novel environment. And the dynamics of that memory are gonna be used for us to understand this idea of time scales of memory. And we're gonna see how we can attack problems like learning to control, and so forth. It begins, again, with the idea of a generative model: me describing to you the way the data is generated, and asking what's the best way for me to learn, given that I've hypothesized that the data I'm seeing comes from this particular model. So the next part of the course is gonna ask the question: in things like the Kalman filter and Bayesian integration, I begin with a model. Like today, I told you that I'm gonna start with a linear model, right? But who gives you that model? As a biological system, who gives you the model that says this is the model you should use to learn, and these are the parameters to find? That's called structural learning.
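One common way to sketch "time scales of memory" is a toy learner with two internal states: a fast one that learns and forgets quickly, and a slow one that learns and forgets slowly, with their sum driving behavior. The function name, the specific update rule, and every parameter value below are illustrative assumptions, not fitted to data or taken from the textbook.

```python
def two_state_learner(perturbation, trials, a_fast=0.6, b_fast=0.4,
                      a_slow=0.99, b_slow=0.05):
    """Toy model with two memories adapting to a constant perturbation.
    Each trial: observe the error, then update both states, where
    a_* sets retention (forgetting) and b_* sets learning rate."""
    x_fast = x_slow = 0.0
    history = []
    for _ in range(trials):
        error = perturbation - (x_fast + x_slow)   # what remains unlearned
        x_fast = a_fast * x_fast + b_fast * error  # quick to learn, quick to forget
        x_slow = a_slow * x_slow + b_slow * error  # slow to learn, slow to forget
        history.append(x_fast + x_slow)            # net adaptation
    return history

adaptation = two_state_learner(1.0, 100)
```

Early in training the fast state dominates; over many trials the slow state takes over, so the net adaptation climbs toward the perturbation without ever fully reaching it.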
Meaning: how do you know the structure of the system that you're trying to learn? How do you identify that system? About 10 years ago, there was a really beautiful mathematical result called subspace analysis that I wanna teach you, which shows how you can, in principle, identify the dynamical system that you're trying to learn to control. The basic idea is that in the first part of the course, if I give you a model, you can find the parameters for it. But now I'm not gonna give you a model; you're gonna have to figure out the model to start with. And we're gonna do it in the context of linear dynamical systems and finding the parameters for those linear dynamical systems. Now, why is this important? It's the idea that when the nervous system is asked to ride a bicycle, what kind of dynamical system is it learning to control? How does it figure out that bicycle? And if you've learned to control a small bicycle, you also know how to control a large bicycle; maybe you'd also know how to ride a unicycle. So structural learning is about identifying the structure of the system, which is different than finding the parameters of that system. All right, and then we're gonna end with optimal control. Optimal control now brings the whole system together, because as a human being, your objective isn't just to learn. Your objective is to excel, get reward, get a great job, be happy in life. That requires learning, but it has an objective other than learning. Learning is useful for acquiring a goal. You come to class today in order to learn something from Reza and do well in your course, but this is only a small stepping stone toward the ultimate goal that you have in your life, which is to retire in the Bahamas. How do you get to retire in the Bahamas? Well, that's your goal.
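To make "identify the dynamical system from data" concrete, here is a deliberately simplified scalar stand-in: given input and state sequences, recover the parameters of x[t+1] = a*x[t] + b*u[t] by least squares. The real subspace-identification machinery covered later handles multivariate systems with hidden states; this sketch (function name and values my own) only shows the flavor of recovering a system from its input-output data.

```python
import random

def identify_lds(x, u):
    """Least-squares estimate of (a, b) in x[t+1] = a*x[t] + b*u[t],
    solved via the 2x2 normal equations."""
    Sxx = sum(xi * xi for xi in x[:-1])
    Suu = sum(ui * ui for ui in u)
    Sxu = sum(xi * ui for xi, ui in zip(x[:-1], u))
    Syx = sum(y * xi for y, xi in zip(x[1:], x[:-1]))
    Syu = sum(y * ui for y, ui in zip(x[1:], u))
    det = Sxx * Suu - Sxu * Sxu
    a_hat = (Syx * Suu - Syu * Sxu) / det
    b_hat = (Syu * Sxx - Syx * Sxu) / det
    return a_hat, b_hat

# Simulate a known system (no noise) and recover its parameters:
rng = random.Random(1)
a_true, b_true = 0.8, 0.5
u = [rng.uniform(-1, 1) for _ in range(200)]
x = [0.0]
for t in range(200):
    x.append(a_true * x[t] + b_true * u[t])
a_hat, b_hat = identify_lds(x, u)
```

With noiseless data the recovery is exact; with noise, or with states you cannot observe directly, you need the subspace methods the course will develop.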
Today, you have a small policy that says, well, I think for me to succeed in life, I need to take these courses, and maybe a useful course will be learning theory. Biological systems have long-term goals. They involve rewards, and they involve costs associated with performing actions. To perform an action, you need to consider these costs and rewards, and we're gonna talk about how to do that. How do we build mathematical representations of these processes of decision-making, where you consider the costs and rewards and you perform some action? Learning is useful for us because it allows us to build a model of the world so we can make predictions. But by itself, who cares about predictions unless I'm gonna do something with them, right? I'm gonna make a good prediction because if I do it well, somebody's gonna give me money for it, or I'm gonna become a happier person in some way. The goal is not to make a good prediction; the goal is to achieve whatever the long-term effect is that you're going after. So we're gonna describe optimal control from the point of view of having costs and rewards. And now the question becomes: how do I build a policy for how to respond to feedback? Given this long-term goal, what is the best action that I can take today? This is a little bit related to our chess problem. The goal of chess is to win the game. That's the goal that you have. And now you find yourself at a particular state of the game. What is the best action that you can perform now in order to maximize your chances of winning the game down the road? We're gonna do it in the framework of linear dynamical systems, where we have feedback, where we reevaluate our costs, and so forth.
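The "best action today, given a long-term goal" idea has a classic closed form for linear dynamical systems with quadratic costs: the linear quadratic regulator. A minimal scalar sketch, assuming parameter values I've chosen arbitrarily; the course will develop the general multivariate version:

```python
def lqr_policy(a, b, q, r, horizon):
    """Backward Riccati recursion for the scalar system x' = a*x + b*u
    with cost sum(q*x**2 + r*u**2). Returns one feedback gain per time
    step, so the policy at step t is simply u = -gains[t] * x."""
    P = q                       # cost-to-go at the final step
    gains = []
    for _ in range(horizon):
        K = (b * P * a) / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        gains.append(K)
    gains.reverse()             # gains[0] is the action for today
    return gains

K = lqr_policy(a=1.0, b=1.0, q=1.0, r=1.0, horizon=50)

# Apply the feedback policy: the state is driven toward zero while
# keeping the control effort small at every step.
x = 5.0
for k in K:
    x = 1.0 * x + 1.0 * (-k * x)
```

Notice the structure: the gains are computed backward from the final goal, but they are applied forward in time, which is exactly the chess intuition of choosing today's move by reasoning back from winning the game.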
So most of the end of our course will be associated with this process of building mathematics that allows us to minimize long-term costs and to form what are called feedback control policies, which say, basically: what is the action that I should produce today in order to respond to a particular feedback? All right. So that's the roadmap for this course. I hope that those of you who decide to stay enjoy it. David and I are at your service to help you succeed in it. Any questions? All right. So good luck with the homework. As I said, it's due next Monday. See you Wednesday.