 On Tuesday, when I explained this stuff, I motivated by the geometric aspects of directional derivatives and graphs and tangent planes and tangent lines. So now I wanna start with sort of a slightly different end, which is one of the applications of this technique. In other words, what is a real life situation where we might be interested in finding directional derivatives? And I explained last time that directional derivative is just a fancy term for the rate of change. And so here is a typical example, which you may have already seen in the book or in your section work, which is imagine a mountain, imagine a mountain, okay? And then there is the ocean somewhere next to the beach, but we'll talk about this later. Focus on the mountain for now, okay? This is a mountain. Now, if I draw it like this, it's not clear whether it is a mountain or just a curve, right? So in order to give it an illusion of 3D, of three-dimensional picture, what I usually do as a reflect on what you would normally do or when you read the book what you see there is that you draw some curves on it. You're kind of trying to give it a three-dimensional feel, right? So my first point is what are these curves? These are the level curves. So that's the first thing I wanna say, which of course is sort of, I've said it before, but I just wanna emphasize it one more time. Even to visualize this three-dimensional object on the plane, we find it very convenient, very useful to imagine not just the contour, the general contour of this object, but this collection of curves which are kind of parallel to each other. What are they? Where they're just obtained by taking the section of the mountain by parallel planes, by parallel horizontal planes which are parallel to the floor, to the ground, okay? That's what they are. Well, in fact, I have drawn just the visible part, the visible parts of those curves, right? There is also a backside of for each curve. 
For example, this one also has a backside, but we don't see it unless it's a transparent object. We don't see it. So that's why usually we indicate like this. So each of them is actually has sort of the second half, which is behind, which we don't see. And this we also don't see when we look at the actual mountain, but when we try to draw it, draw the picture, then we draw this. Okay, so that's the first point. So the second point is what does it have to do with directional derivatives? Directional derivative is the rate of change. So change of what? So in this particular setting, there is a very good example of this, which is that suppose that there is somebody here on this mountain who is climbing it, okay? So there's a climber. And so the climber wants to decide which way they should go to, and depending on which way they go, what will be the rate of the, you know, how steep will be the climb, will the climb be? That's the question. So the rate of change will be here, the rate of its altitude change. Or here's her altitude change, right? And the higher the rate, the steeper is the climb in case we are going towards the top of the mountain or the steeper, the descent, if we are going down. And likewise, the smaller the rate of change, the smaller is that the steepness. So how do we measure this? The point is that there isn't a single number. There isn't a single number to give us the steepness of the climb because the climber could go in many different directions. And for each direction there is a particular steepness rate. For instance, the climber may be tired and doesn't want to climb anymore. So in that case, the climber could just go along the level curve. So then the altitude of the climber, the height of this point over the sea level will not change at all. So then the rate of change or the steepness level is zero. Steepness rate is zero, right? So that's one possibility. That would correspond to going in this direction. 
But on the other hand, the climber could choose the steepest path, which, if you think about it, you can guess intuitively should be something perpendicular to the level curve. And in fact, this is something that we have confirmed by calculation. I will go over it one more time in a couple of minutes. So in that case, the rate of change is the highest possible. So in this direction, along the level curve, the rate of change is zero. In this direction, the rate of change is maximal. And if we go in a direction which is perpendicular to the level curve, but we go down, then it's going to be minimal, as small as possible. Well, its absolute value is still going to be the highest possible, but it's going to have a negative sign. So as a number, it will be the smallest possible value. So here, the rate is minimal, okay? So in order to talk about the rate of change, you have to choose a direction. That's why it's a directional derivative. There isn't a single derivative for a function of two variables, but there is a whole variety of derivatives. The derivative is determined by the choice of the direction. Okay, so what are the variables here? Which variables am I talking about? Well, the variables are somewhere here. Let's do it on this board because I don't want to mess up the picture. So I want to draw the coordinate system. And let's say this plane, the X, Y plane, is the sea level. And Z then will correspond to the height, to the altitude above sea level. Now our point has a projection onto the plane. And so on the plane it has coordinates, X and Y, or maybe X0, Y0, to emphasize that these are some fixed numbers. So these correspond to the position of the climber. You see, the point is the climber is in space, right? So a priori the climber has three coordinates, X, Y and Z. 
But because the climber is not flying, you know, he's not jumping with a parachute, he is on the mountain. Because he's on the mountain, as soon as we know the X, Y coordinates, we know the Z coordinate, unless the mountain has a shape which sort of comes back, right, but normally mountain, normal mountain, for normal mountain, it's not going to happen, right? So then the Z coordinate is actually determined by the X, Y coordinates. That's why the only parameters here are X and Y and not X, Y and Z. In fact, you can think of this, of the surface as a graph of a function. Graph represents the mountain. Okay. And so now when we talk about the direction, we can think about the direction as being the direction on the mountain, but we can also think about the direction as being the direction on the X, Y plane. And so in other words, what we can do is we can drop the level curve down here and it's not going to look exactly the same because I kind of magnified, I have kind of magnified the picture compared to the picture there. This is not, this is bigger, I have used a different scale for the bottom picture as opposed to the top picture. So the level curve will look like an ellipse but I have magnified it. I have rescaled it to make it bigger so that it's easier to draw. So then the directions which we have talked about here are the following. This one is both parallel to the level curve. That's the direction along the X, Y plane or in the X, Y plane, which would correspond to the movement on the mountain parallel to the level curve. This is the direction on the X, Y plane which will correspond to the path of steepest ascent for which the rate is maximal. And this will be the direction for which for which the rate will be minimal. So this is what we call the rate of steepest descent. Now I would like to draw these vectors actually in such a way that they are of the same length because they are supposed to be unit vectors. This is a convention. 
We agree from the beginning that we will measure directions by unit vectors. Okay, and the point is that these are perpendicular. So what are these vectors? This again is a vector parallel to the level curve. And this vector is a vector perpendicular to the level curve. And this is what we discussed last time. I mean, both of these vectors are perpendicular to the level curve. And the one which corresponds to the steepest ascent is the one which points along the gradient vector. So this is actually the gradient vector. So this is nabla f. Because to get the steepest ascent, you have to go inside this level curve, right? Because you go towards the center of the mountain. And this one will be negative nabla f. And the point is that I explained last time why this gradient vector is actually perpendicular to the tangent vector. Or in other words, perpendicular to the tangent line to the level curve. This is some calculation which involved the knowledge of equations for lines on the plane, okay? So that's the picture. But in principle, there are many other directions. We can also draw a direction like this, say. Again, some unit vector, u, which would have two coordinates, a and b. And if the climber goes in this direction, then her path would go along that and it will correspond to some path on the mountain. Like this. Which is neither the steepest ascent nor the steepest descent. Nor is it parallel to the level curve. You see, in this case, it goes down because the vector points outward. So this direction, or more precisely, this line which contains this vector, will correspond on the mountain to some specific path which starts at this point and then goes somewhere, okay? And what we are calculating is just the slope of that curve, of that path. So the directional derivative, D sub u of f at x zero, y zero, with respect to this vector u, is the slope of the path on the mountain, or on the graph, corresponding to the line containing u. 
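To make the "slope of the path" idea concrete, here is a small numerical sketch. The mountain function `height`, the point (1, 1) and the directions are made-up choices for illustration, not the ones on the board; the slope is estimated by a central difference along the line through (x0, y0) with direction u.

```python
import math


def height(x, y):
    # Hypothetical mountain: a smooth hill with its top above the origin.
    # This particular formula is an assumption made for illustration only.
    return 1000.0 - x**2 - 2.0 * y**2


def slope_along(f, x0, y0, a, b, h=1e-6):
    """Estimate the slope at t = 0 of the path t -> f(x0 + t*a, y0 + t*b).

    The direction (a, b) is rescaled to unit length first, following the
    convention that directions are measured by unit vectors.
    """
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    ahead = f(x0 + h * a, y0 + h * b)
    behind = f(x0 - h * a, y0 - h * b)
    return (ahead - behind) / (2.0 * h)


if __name__ == "__main__":
    # A direction pointing away from the top: the climber goes down.
    print(slope_along(height, 1.0, 1.0, 3.0, 4.0))
    # A direction along the level curve through (1, 1): altitude does not change.
    print(slope_along(height, 1.0, 1.0, -2.0, 1.0))
```

Choosing a direction turns the two-variable function into a one-variable function of t, and the number returned is just the ordinary slope of that one-variable function.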
You see, this is the line I'm talking about. This yellow line, I didn't draw it very well. Maybe it's more like this. So if I look at this line, this line will give me that path. What do I mean by give me? The line is a path on the xy plane, but it has a unique lift to the graph. This is the yellow path on the graph. In other words, this line, or this half line, is the projection of that path. There's a unique path on the mountain which starts at that point and whose projection onto the xy plane is the half line directed by u. You see what I mean? Are there any questions about this? Okay. So the point is that the graph is two dimensional. It's a surface, but once you choose a direction, you cut a path or a curve on that surface. So you are back to the one dimensional case. And in the one dimensional case, you can actually talk about the slope because you get the graph of a function of one variable, namely the variable along this line, and you can talk about its slope. That slope is the rate of change along that path. That's what we call the directional derivative. And finally, we have a formula for it which involves the gradient vector: it is the dot product of nabla f with u. And this formula tells us when this directional derivative takes some particular values. For example, the maximum value. The maximum value corresponds to u pointing in the direction of nabla f. The minimum value corresponds to the opposite direction. And the zero value corresponds to the u which is tangent to the level curve. Okay. But this we already knew just from analyzing this picture, just on the grounds of common sense. We didn't need to do any calculation to figure this out. In fact, when you are climbing the mountain, you are not pulling out a paper pad and a pen and starting to calculate what is the best way to reach the top of the mountain. You kind of follow your intuition. 
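Here is a hedged numerical check of the three special directions, using the dot product formula (directional derivative equals nabla f dot u). The function `height` and the point (1, 1) are again illustrative assumptions, and the partial derivatives are approximated by central differences rather than computed by hand.

```python
import math


def height(x, y):
    # Hypothetical mountain, chosen only for illustration.
    return 1000.0 - x**2 - 2.0 * y**2


def gradient(f, x0, y0, h=1e-6):
    """Approximate (f_x, f_y) at (x0, y0) by central differences."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2.0 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2.0 * h)
    return fx, fy


def directional_derivative(f, x0, y0, u):
    """Dot product of the gradient with the unit vector u = (a, b)."""
    fx, fy = gradient(f, x0, y0)
    return fx * u[0] + fy * u[1]


def special_directions(f, x0, y0):
    """Unit vectors for steepest ascent, steepest descent, and the level curve."""
    fx, fy = gradient(f, x0, y0)
    norm = math.hypot(fx, fy)  # assumes the gradient is not zero at this point
    up = (fx / norm, fy / norm)       # along nabla f
    down = (-up[0], -up[1])           # opposite to nabla f
    level = (-up[1], up[0])           # nabla f rotated by 90 degrees
    return up, down, level


if __name__ == "__main__":
    for u in special_directions(height, 1.0, 1.0):
        print(u, directional_derivative(height, 1.0, 1.0, u))
```

The largest value is the length of the gradient, the smallest is its negative, and the direction tangent to the level curve gives zero, which matches the picture on the mountain.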
And what your intuition will always tell you is that if you wanna reach the top in the fastest possible way, you have to go in the direction perpendicular to the level curve. And likewise, if you wanna go down the fastest way, you also go perpendicular to the level curve. Is that clear? Okay, so intuitively it's clear, but now we have proved it because we've found the formula for the rate of change. And from this formula, which is written in terms of the dot product, it's plainly obvious when it takes the maximum value, the minimum value or the zero value. And that was one of the main conclusions last time. But now I have illustrated it in this way. Okay. So now, one more, oh, that's odd. There is a problem. That's cool, catch 22. All right. So I have a, this is a small inconvenience. But that's very clever. I will not try to get it out of there because I don't wanna put the second one. So yes, what is that? What is that symbol? It's called nabla. It looks like the Greek letter delta written upside down. We're using it for the gradient. I'm sorry? That's right. This is a notation for the gradient. Okay. So one other thing which I wanted to mention in this regard is that we have talked about equations of tangent lines and tangent planes. And I know that this could be confusing because you have many different objects at the same time for which you can look at tangent lines and tangent planes. It seems like there are many different discussions going on at the same time, okay? Let's focus. So the tangent lines and tangent planes. I just wanted to summarize this stuff once more so that there's no ambiguity, there's no confusion. So the first is the two-variable case. In the two-variable case, we look at the level curve of a function f of x, y. So it is given by this equation, f of x, y equals k. So that's this level curve, that's this curve of equal height or equal altitude which I drew over there. 
Okay, and so then we can look at the tangent line to this curve. And the equation of this tangent line at the point x0, y0 is like this. It's f sub x at x0, y0 times x minus x0, plus f sub y at x0, y0 times y minus y0, equals 0, okay? But in fact, we can now look at the case of three variables as well, right? So actually, let me do it here. The three-variable case. In the three-variable case, instead of a function in two variables f of x, y, we would want to take a function, maybe capital F, a function of three variables x, y, z. And by analogy, we would have to look at the level, but now not curve, the level surface of this function. And so that would be capital F of x, y, z equals k. Maybe I should say that k is some number. So for example, in this discussion, k would be the height. So I don't know, 1,000 feet. Whereas x and y and z are variables. So that's the difference. It's one equation for three variables. Because in this equation, this is some number, like 1,000. So then you can ask, what is the equation of the tangent, but now not line, but plane, to this surface at the point x zero, y zero, z zero. And you see, the point is that the answer is given by something which looks exactly the same, except now we have three variables. So we have to add one more term, which involves the third variable z. So the answer is the following. You have to take the partial derivative with respect to x, multiply by x minus x zero, plus the partial derivative with respect to y times y minus y zero, plus now also the partial derivative with respect to z times z minus z zero, and set this equal to zero. You see, the difference between the two variable case and the three variable case is that we now have an extra variable. So everything gets dimension one higher, the dimensions get bumped by one. We had a curve, now we have a surface, we had a line, now we have a plane. 
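Written out side by side, the two equations just described look as follows. This is only a transcription of the spoken formulas into symbols; the shorthand P_0 for the point (x_0, y_0, z_0) is introduced here for brevity and is not from the board.

```latex
\begin{align*}
\text{Level curve } f(x,y) = k:\quad
  & f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0) = 0, \\
\text{Level surface } F(x,y,z) = k:\quad
  & F_x(P_0)\,(x-x_0) + F_y(P_0)\,(y-y_0) + F_z(P_0)\,(z-z_0) = 0,
  \qquad P_0 = (x_0,y_0,z_0).
\end{align*}
```

In both cases the coefficients are the components of the gradient at the point, so each equation says that the gradient is perpendicular to every vector lying in the tangent line or tangent plane.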
The equation here involves two partial derivatives and has this very simple form, and now the equation involves all three partial derivatives, but has the same form. So I will not derive this formula. It is derived in the same way as in the case of two variables. But I hope it looks convincing to you because you can clearly see the analogy. And in fact, if you wanna prove it, you can prove it in exactly the same way. Now, what is slightly confusing in this is that there is a special case of this three variable picture. And the special case is when capital F of x, y, z is small f of x, y minus z. So you can ask, why would we even bother to look at this special case? And the reason is very simple, because in this special case, if I look at the equation capital F of x, y, z equals zero, which is a special case of a level surface, namely the case when k is zero, right? This is just the equation z equals f of x, y. And this equation defines a graph, the graph of the function in two variables. So it's kind of funny that a function in two variables shows up in two different contexts. It shows up here in the context of level curves. But it also can show up here for functions in three variables, because even when we have a function in two variables and we think about graphs, we automatically go to the three dimensional situation, right? And so the graph of a function in two variables can be thought of as a level surface for a function in three variables. Which function? Well, this function, small f of x, y minus z. It's kind of the simplest concoction you can make out of f and the new variable z. So we can apply this general formula for the tangent plane to this special case, and what do we get? Let's observe that capital F sub x is just small f sub x, because when you take the partial derivative of this function, big F, with respect to x, you have to differentiate this one, that will be just small f sub x, and you differentiate this one, but this one is independent of x. 
So this doesn't change anything. So the partial derivative of this whole function with respect to x is just the partial derivative of this part. So it's small f sub x. The partial derivative with respect to y is small f sub y. The partial derivative with respect to z is what? Negative one. It's negative one because that's the derivative of this function, negative z, with respect to z. This guy doesn't depend on z, so its partial derivative with respect to z is zero, but the partial derivative of this term is negative one. So we get these three partial derivatives, which we substitute into this formula, and what do we get? We get f sub x of x zero, y zero times x minus x zero, plus f sub y of x zero, y zero times y minus y zero, minus z minus z zero, okay? Equals zero. And now we recognize the equation of the tangent plane to the graph which we have known already. This is the one which we got already two weeks ago when we talked about differentials and linear approximation. The only difference is that now I put negative z minus z zero on the left-hand side, but in our old discussion we would write equals z minus z zero. And then we would switch the left and right-hand sides too, but that's a minor issue, right? So this is just a slightly different form of writing the same equation, just putting everything on one side. And now you see that the case of tangent planes to graphs of functions in two variables can be thought of in two different ways, okay? Namely, you can think that you started with a function in two variables and you just look at the graph and you look at the tangent plane. But you can also think of it as a special case of the more general case of functions in three variables, except you take as a function in three variables this very special form, f of x, y minus z. Either way you approach it, you get the same answer. But now you can appreciate more the connection between this answer and this. 
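A quick numerical sanity check of this special case, under made-up assumptions: the function `f` and the point (1, 2) below are illustrative choices, and the partial derivatives of `F` are approximated by central differences. The check is that the z partial comes out as negative one and that solving the level surface tangent plane equation for z gives the familiar linear approximation of the graph.

```python
def f(x, y):
    # Hypothetical function of two variables, chosen only for illustration.
    return x**2 * y + 3.0 * y


def F(x, y, z):
    # The special three-variable function from the lecture: F(x, y, z) = f(x, y) - z.
    return f(x, y) - z


def partials_F(x0, y0, z0, h=1e-6):
    """Central-difference approximations of F_x, F_y, F_z at (x0, y0, z0)."""
    Fx = (F(x0 + h, y0, z0) - F(x0 - h, y0, z0)) / (2.0 * h)
    Fy = (F(x0, y0 + h, z0) - F(x0, y0 - h, z0)) / (2.0 * h)
    Fz = (F(x0, y0, z0 + h) - F(x0, y0, z0 - h)) / (2.0 * h)
    return Fx, Fy, Fz


def tangent_plane_z(x, y, x0, y0):
    """Solve Fx*(x - x0) + Fy*(y - y0) + Fz*(z - z0) = 0 for z,
    where z0 = f(x0, y0), so that the point lies on the graph."""
    z0 = f(x0, y0)
    Fx, Fy, Fz = partials_F(x0, y0, z0)
    return z0 - (Fx * (x - x0) + Fy * (y - y0)) / Fz


if __name__ == "__main__":
    print(partials_F(1.0, 2.0, f(1.0, 2.0)))
    # Height on the tangent plane versus the true height of the graph nearby.
    print(tangent_plane_z(1.1, 2.1, 1.0, 2.0), f(1.1, 2.1))
```

Because the z partial is negative one, dividing by it just moves z minus z zero to the other side, which is the step described above as switching the left-hand and right-hand sides.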
Many people asked me after the last lecture why, when we look at the equation of the tangent line to a level curve of a function in two variables, it's as though we are dropping this term z minus z zero. So there's this negative one which we just dropped. Well, geometrically it's clear. In fact, the board has stayed since Tuesday. So I guess nobody likes this small board except for me, which is good. So this is the tangent line, and along this tangent line, which corresponds to the level curve, we have the same value of z, z equals z zero. So that's why we drop this term to go from this equation to this. So you can get this equation from this by dropping z minus z zero, because z is equal to z zero along the level curve. But also you can now understand that this formula is a special case of the formula for the tangent plane to a general level surface, which actually looks like this one, except we have a third variable. In the special case when the function is like this, this third term becomes extremely simple. It just gets a coefficient negative one. So you just get minus z minus z zero. And to make the analogy complete, let's fill in this square. It's like when you do IQ tests. I've never done it, by the way. But it's easy to find them online. And I think the problem is often like fill in the square. So this is exactly the kind of question here, what should be here, right? In other words, this is a case of three variables and this is a special case of that, right? Now, what's the analogous special case for functions in two variables? That's the case when this function of two variables, f of x, y, is some function in one variable, let's call it g of x, minus y. And so you see in this case, in fact, you know what I'm going to do? To make it look more like an analogy, let's reposition the boards. There we go. So now I think it's more clear what I mean by filling in the square. 
I wanna find a special case of this, which is analogous to how we found the special case of three variables. And that's the case when our function in two variables is equal to another function in one variable, g of x, minus y. Okay? In this case, the equation f of x, y equals zero means y equals g of x. And this is a graph, the graph of the function g of x. So a level curve for a function of two variables can become the graph of a function in one variable when this function in two variables has this special form. Did you have a question? Good question. So what would happen if we put some k, right? If we put some k, then here I would put minus k, right? So then I could just absorb k into the definition of the function g of x. If I redefine my function g of x by subtracting k, then I would get back the level zero. So that's why we don't lose any generality by looking at the case of level zero as opposed to the general case. So let's just look at the case of level zero. Okay, so that's a graph. And so now in this formula for the tangent line, note that f sub x of x, y now is g prime of x. Just like here, the partial derivatives of the big function F with respect to x and y were just the derivatives of the small f. And now the role of the small f is played by g. So the partial derivative like this is just g prime. And the partial derivative with respect to the second variable is negative one again. Because this minus y now plays the same role as minus z played here. So when we take the derivative, we get negative one. So now this formula for the tangent line becomes g prime of x zero times x minus x zero, minus y minus y zero, equals zero. Let me rewrite this. This is equivalent to saying y is equal to f prime of x zero times x minus x zero plus y zero. We recover the old formula for the equation of the tangent line to the graph of a function of one variable. That formula is exactly this one, right? 
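For reference, the same computation in symbols, written entirely in terms of g (the one-variable function whose graph is the level curve). Nothing here goes beyond the steps just described; y_0 stands for g(x_0).

```latex
\begin{align*}
f(x,y) &= g(x) - y, \qquad f_x(x,y) = g'(x), \qquad f_y(x,y) = -1, \\
0 &= f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0)
   = g'(x_0)\,(x-x_0) - (y-y_0), \\
y &= g'(x_0)\,(x-x_0) + y_0, \qquad y_0 = g(x_0).
\end{align*}
```

The last line is the usual tangent line to the graph of a function of one variable, with slope g'(x_0).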
The slope is f prime, you multiply x minus x zero and you add the value of the function at the point x zero which is y zero, right? So there is nothing mysterious in these formulas. In this special case, we get back the old formula we've known all along. And also this now shed some new light on this coefficient negative one which many of you have found mysterious. It's not mysterious, it's as mysterious as this coefficient negative one which shows up in the old formula for the tangent line to the graph. We were not surprised to write the formula for the graph of the function is y equals f prime times in this form. But if you have it in this form, it is, you can rewrite it like this. When you rewrite it like this, you find the coefficient negative one. That's exactly, the reason it appears is the same reason why this negative one appears. Okay? Any questions about this? Yes. Why do we choose a special case? That's a very good question. Why do we even choose a special case? Well, from the point of view, let's talk about this case. From the point of view of functions in two variables, this sounds strange. Why would you write it like this and not f of x minus x y or something, right? So from point of view of functions of two variables, it doesn't make any sense. It makes a lot of sense, however. From the point of view of the theory of functions in one variable, when we study functions in one variable, we would like to visualize them by graphs, right? When we draw a graph of function one variable, we introduce one more variable and we look at the graph, which is y equals f of x. What I'm saying now is that within this formalism that we are developing, we can think of the graph of g of x, which normally we would write as y equals g of x, just in this form. And when we write it in this form, we never say the word level curve or anything like this. We just say graph. But we have to realize, it's important to realize, to see the connection between different formulas. 
It's important to realize that this graph actually can be thought of as a level curve for a function in two variables. And that function just happens to be this function, even though there's no reason a priori to study such functions. We have introduced them because we started from the point of view of functions in one variable. And then these naturally fell out once we started to look at the graphs. So that's likewise in this case. Okay, yes? That's right. It could be a point or finitely many points. And for a good reason, right? The dimension of the level curve, or level surface and so on, will be the number of variables involved minus one. If you have two variables, it's a level curve, so dimension one. If there are three variables, it's a level surface, dimension two. If it's a function in one variable, it will be dimension zero. And zero dimensional objects are just collections of points. And the way it works is just like this. Let's look at, for example, a parabola: a level curve consists of two points. But if you have a cubic parabola like this, there will be three points. And if you have a cosine, you will actually have infinitely many points. Infinitely many points if the level is between negative one and one. And if the level is higher than one or lower than negative one, then it will be empty. A level curve could be empty, or a level surface. In this case, it's sort of a level point. We don't have a good word for a collection of points. So it's like a level zero dimensional object, a manifold as a mathematician would call it. Any other questions about this? Yes? Oh yes, I'm sorry. Thank you. That was just a mistake. Thank you. Yeah, it's g prime, of course. I mean, I'm just saying that this formula becomes this formula, but I called it g, right. Sorry, yeah, I completely messed it up. Good job. 
Okay, so that will do it for us in this topic. And actually we are running out of time. So we need to talk about something else today also. I really wanted to go over this slowly to emphasize the connection between these different objects, because I think that there are different dimensions and different numbers of variables at play and it could be very confusing. But I think that if you put this in this picture where you have these four squares, the two variable case, the three variable case, the special case in two variables and the special case in three variables, then I think it becomes much more clear. All right. The next topic we'll discuss concerns finding maxima and minima of functions. And as is always the case, it's actually instructive to look at this question already in the one variable case. Because we already can gain some insights into the problem by looking at this very special, the simplest possible case. If you have a function in one variable, it's a natural question to ask where this function attains maximum and minimum values. That's important because this function could correspond to something in real life and you may want to maximize that or minimize that. And so the first point I want to emphasize is that there are two different types of maximum and minimum. The local and the global. The global ones are also called absolute. But I like the local, global terminology a little bit better. So what do I mean by local? So let me draw this. For a function in one variable, it is very convenient to analyze everything by using graphs of functions. And graphs again are curves on the plane. So we introduce the new variable y and we write a graph given by the equation y equals f of x. So let's look at this kind of function. That's a very typical example. So I want to focus on this point. 
So clearly this point, the value of the function at this point, this would be the point x zero and that's the value of the function. This is f of x zero. The value of this function at this point is greater than the value at nearby points. So that's an example of a local maximum. A point is a local maximum if there is a small neighborhood of this point such that if you restrict your function to this neighborhood, which is this little interval in this case, then this function will, this will be the maximum value on the interval, okay? But is it a global maximum? Clearly not because I have a point here, for example, x one for which the value is higher. So that's not a global maximum. That's not a global maximum either. In fact, in this example, there is no global maximum because I'm assuming that the function keeps growing, keeps increasing as x is increasing, okay? If that's the case, there is no global maximum. So global maximum is a completely different, finding global maximum is a completely different game than finding a local maximum. Finding local maximum just involves analyzing the function on a very small interval around this point. Finding global one sort of involves analyzing all points in your domain. The way I phrase the question so far, I have phrased the question so far is as though we were studying global maximum on the entire line, on the entire x line, okay? And you see clearly that that question often doesn't have an answer. In other words, there is no global maximum. Simply because for any point, there will be another point which will have a higher value, higher value, higher value and so on, okay? So the question of finding global maximum is better to phrase on domains which are bounded. Not on the entire line, but on bounded domains. Bounded means that it's finite. So it's better to say what is the maximum of this function on this interval, okay? This is an example of a closed bounded domain in the following sense. 
First of all, it's bounded because it's finite. It doesn't go to infinity. Second, it is closed because it contains the endpoints. And these are the kind of domains that we should look at if you want to ask questions about global maximum or absolute maximum and minimum. So let's look at this question in this particular case. In this particular case, we see that the maximum value is actually taken at this point. This is a maximum. So now you can appreciate why you have to include the endpoint. If we did not include the endpoint, there wouldn't be a maximum, because no matter how close you are to this point, there will be another point even closer for which the value would be even higher. So therefore, there will be no maximum, right? So in order to guarantee that you have a positive answer to the question of existence of a maximum, or a minimum for that matter, you should really look at closed and bounded intervals. And then what happens is that the maximum can be attained either at the boundary, which is the case here, or it could be some local maximum which lies in the interior of this interval. In this particular case, you do have a candidate for a maximum, this one, because it is a local maximum and it is within this interval. But it's not a global maximum on this interval because the value of the function at the endpoint is just bigger. But if I were to take a different interval, if I were to take an interval like this, for example, okay? Then this guy would win because at the boundary, the value would be smaller. You see, at the boundary, the value would be smaller. So this guy would have the highest possible value on this interval. So the bottom line, the upshot of all this, is that the absolute maximum can be found in a finite set of points. And those points are, first of all, the endpoints, and then all the points where you have potentially a local maximum or minimum. 
So the global maximum of some function f on a closed, bounded interval, let's call it a, b, can be found at one of the following points: the endpoints, which are a and b, and the points of potential local maximum. And here it's important to emphasize the word potential. Those are the points for which f prime of x is zero. Because certainly if it's a point of local maximum, then the slope of the tangent line at this point is going to be zero, right? Because if you have a non-zero slope, just move away from this point and you'll get a bigger value on one side and a smaller value on the other. So the only way you could have a local maximum is to have slope zero, and the slope is the derivative. So that's why these are the points for which the derivative is equal to zero, okay? So this is the first statement that I want you to remember, or perhaps recall from one-variable calculus: if you're looking for a global maximum, what you need to do is simply evaluate the function at the endpoints. Evaluate at this point, evaluate at this point. Next, find all the points where the derivative is zero and evaluate the function there. So you get a finite list, and then you just pick the one, or the ones, where the value is largest. That is the maximum of this function on this interval. In other words, you don't have to look through all the points on the interval, of which there are infinitely many; you only look at the endpoints and the points where f prime is equal to zero. That's the algorithm for finding the maximum of a function. Likewise for minima, just replace the word maximum by minimum. So it's exactly the same. Now, before I go and generalize it to the case of two variables, I want to explain what I mean by the word potentially, potential local maximum.
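The recipe just described (endpoints plus points where f prime is zero, then compare values) can be sketched in a few lines of Python. This is only an illustration: the function name `global_max_on_interval` and the example f(x) = x^3 - 3x are assumptions made here, not something from the lecture, and the critical points are supplied by hand rather than solved for.

```python
def global_max_on_interval(f, a, b, critical_points):
    """Return (x, f(x)) with the largest value among the endpoints a, b
    and the supplied critical points that lie strictly inside (a, b)."""
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    best = max(candidates, key=f)
    return best, f(best)


# Hypothetical example: f(x) = x^3 - 3x, where f'(x) = 3x^2 - 3
# vanishes at x = -1 (a local maximum) and x = 1 (a local minimum).
def f(x):
    return x**3 - 3 * x


# On [-2, 3] the endpoint x = 3 wins over the local maximum at x = -1.
print(global_max_on_interval(f, -2.0, 3.0, [-1.0, 1.0]))   # (3.0, 18.0)

# On the shorter interval [-2, 1.5] the interior local maximum wins.
print(global_max_on_interval(f, -2.0, 1.5, [-1.0, 1.0]))   # (-1.0, 2.0)
```

The two calls mirror the two pictures on the board: the same local maximum loses on one interval and wins on another, depending on the values at the boundary.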
In other words, if the point x is a local maximum or minimum, then the derivative is zero. I already explained this: the slope has to be zero. If the slope is non-zero, it means you can increase or decrease the value by moving a little bit away from the point. So this implication is true, but the converse is not true. In other words, if the derivative of your function at your point is zero, it doesn't mean it's a local maximum or minimum. And there's a very simple counterexample, namely the function x cubed. So f of x is x cubed, f prime is three x squared. And so f prime of zero is zero, which we see, right? We do see that the slope at zero is zero. But is it a point of local maximum or minimum? It's not, because if you go this way, the function increases, and if you go that way, it decreases. And in fact, if you think in terms of monomials, the same thing will happen if you have x to the n where n is odd, like three, five, seven and so on. Because the derivative of x to the n is n times x to the n minus one, so it's always zero at x equals zero. But if n is odd, then for positive x the value is positive, and for negative x it's negative. So it's going to look like this. But if you have y equals x to the n where n is even, so two, four and so on, then it's going to look like this instead. And so in that case, it is actually a point of local minimum. And if the coefficient were negative, so the graph were flipped like this, it would be a local maximum. So in other words, there are many possible scenarios where the derivative equals zero and it is a maximum or minimum. And there are many scenarios where the derivative is zero, but it's not a maximum or minimum. So it only goes this way: if it is a maximum or minimum, then the derivative is zero.
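To see the odd versus even behaviour of x to the n concretely, here is a small Python sketch. The helper `classify_by_sampling` is a hypothetical name introduced for this illustration; it only compares the value at the point with the values one small step to each side, so it is a numerical sanity check under that assumption and not a proof.

```python
def classify_by_sampling(f, x0, h=1e-3):
    """Compare f(x0) with the values a small step h to the left and right.
    A heuristic check only: it looks at a single step size."""
    left, center, right = f(x0 - h), f(x0), f(x0 + h)
    if left > center and right > center:
        return "local minimum"
    if left < center and right < center:
        return "local maximum"
    return "neither"


# The derivative of x**n vanishes at 0 for every n >= 2,
# but only even n gives an extremum there.
for n in (2, 3, 4, 5):
    print(n, classify_by_sampling(lambda x: x**n, 0.0))

# Flipping the sign turns the minimum into a maximum.
print(classify_by_sampling(lambda x: -x**2, 0.0))
```

For n = 2 and n = 4 the check reports a local minimum, for n = 3 and n = 5 it reports neither, and for minus x squared it reports a local maximum.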
That's why I said those are the points which potentially could be maximum or minimum. So in principle, you could rule some of them out from the outset by saying, well, at these points f prime is zero, but the point is not a maximum or minimum. So then it cannot possibly contribute to the list of suspicious points, or candidates, for global maximum or minimum. But I think it's much easier to just take all of them, because there are going to be finitely many, and evaluate your function f at all of them and then compare. Where do you get the largest value, or where do you get the smallest value? You guys following this? Okay, good. So that's why, the way I formulated this, I didn't want, at this level, to try to differentiate between the ones which are actually maxima and minima and the ones which are not. I just said, let's look at all which are potentially maximum or minimum. Okay, so that's the one-dimensional case. And now, in some sense, we already know everything we need to know, because in the two-dimensional case it's going to look exactly the same. The criterion will be slightly more complicated. Maybe I'll say one more thing, which is that there is a criterion to see whether the point is a local maximum or minimum in this case. Let me emphasize that it's a particular point, which was the point zero in my previous example. Let's call it x zero. Suppose that f prime of x zero is zero and f double prime of x zero is positive. Then it's a local minimum. And if f prime of x zero is zero and f double prime of x zero is less than zero, then it's a local maximum. In other words, if you think about this in terms of Taylor series, oftentimes you can approximate a smooth function by its Taylor series.
And the first terms of the Taylor series are going to be given by the value of the function, the first derivative, and then the second derivative. So the point is that if the first derivative vanishes, that's the necessary condition to have a local maximum or minimum. But then it depends on which term in the Taylor series is non-zero next. So for example, if the second-order term is non-zero, that means your function looks like x minus x zero squared times some coefficient, right? What I'm trying to explain is the following. Let me do it more slowly. The Taylor series looks like this. So the first term is just the value of the function at x zero. Let's assume without loss of generality that it is equal to zero. I mean, after all, we can just subtract this value from the function. It's not going to change anything. So let's just assume that it is zero. Next comes this term, which is the first derivative times x minus x zero. And the first derivative has to vanish, otherwise it can't be a maximum or minimum, as we just discussed. So this also vanishes. So the next term is one half of the second derivative times x minus x zero squared. And then there are some additional terms. But the additional terms are negligible compared to this term when x is very close to x zero. So you might as well replace your function by this function. But the graph of this function is just a parabola. And the parabola, we know, would be like this if the coefficient is positive, and it would be like this if the coefficient is negative. So clearly this is a local minimum for this one, and for this one it's a local maximum. And the other terms don't matter. So that's the reason why we get this criterion. But if this term is also zero, then we can't really tell, because we don't know what comes next. If the next non-zero term is a cubic term, we know it's not going to be a maximum or minimum, because we looked at the cubic parabola. And it's like this.
It doesn't have a maximum or minimum, right? But if the cubic one vanishes and the quartic one is non-zero, then again it's a good shape. It's a U shape, right? So there is no telling. We would really have to look at higher terms in the expansion, and that's much more difficult. So that's why we just stop here. And we say, well, here is a criterion. If the first derivative is zero and the second derivative is positive, it's a local minimum. And if the second derivative is negative, it's a local maximum. And we just stop right there. In other words, it does not exhaust all possible cases, but it settles the cases when the second derivative is non-zero. And there is a similar criterion for functions in two variables. So now we switch to functions in two variables. I wrote what? On the top of what? Oh, on the board they both say local minimum. Wow, that's kind of pessimistic. Thank you. We definitely should correct that; otherwise it looks like we never reach a maximum. Okay, I think now it's good. Right, because if the second derivative is negative, it's a shape like this, so it is a maximum. So now we switch to functions in two variables. Again, we have local things, local maximum and minimum, and global ones. And searching for them is two different games. For local maximum and minimum, the first step, step one, is to check that the two partial derivatives are zero. Just like for functions in one variable, the first step is to look at the first derivative. Well, now we have a function in two variables, so there are two different partial derivatives. Both of them have to vanish in order for us to have a local maximum or minimum. Well, I'm assuming now that both of them exist. There is another possibility, which is that, say, one of them may not exist. And that's also a possible case for a local maximum or minimum. But let's assume in this discussion that the partial derivatives always exist. So then we don't have to worry about this.
If they do exist, then a given point x zero, y zero will be a local maximum or minimum only if both partial derivatives vanish there. So to narrow down your search, you first throw everything else away. You just keep the points for which both partial derivatives are zero. But this does not guarantee it is a maximum or minimum, just like in the one-variable case. The best we can do is to have a criterion involving second partial derivatives. We would like to say something like, if the second derivative is positive, it's a minimum; if it's negative, it's a maximum. But there are now three different second partial derivatives. We have f sub xx, f sub xy, and f sub yy. So in fact, the rule is as follows. We have to calculate the following expression. Remember, when we did cross products, we used determinants. So let's make a determinant of this two by two matrix, which is very easy to memorize. The rows will correspond to the first index, so the first index here is x and here is y. And the columns will correspond to the second index, which here is x and here is y. So you put the four possible second partial derivatives in this matrix: f sub xx and f sub xy in the first row, f sub yx and f sub yy in the second. Then of course, we know by Clairaut's theorem that f sub xy is the same as f sub yx. But let's not yet worry about this. This is just an easy way to remember. Okay, and then we take the determinant of this. So what's the determinant? It is f sub xx times f sub yy, minus f sub xy times f sub yx. But f sub yx is equal to f sub xy. Okay, now we remember it, and we just put a square: f sub xx times f sub yy, minus f sub xy squared. So let's call this D. So the criterion is that if both partial derivatives are zero and D is greater than zero, then it's a maximum. That's number one. Number two, if both partial derivatives are zero and D is negative, then it's a minimum. And finally, I'm sorry, I'm not saying it correctly.
It's maximum, no, no, no, sorry. It's worse than that. If D is positive, it's a maximum or a minimum. If D is negative, I don't have enough space, so let's just write not: not a maximum, not a minimum. And if D is zero, it's inconclusive. Don't know. So first point, think of this as an analog of the one-variable rule. Because in the case of one variable, there is also a rule which involves the second derivative. However, this rule is much more complicated, because there are three different partial derivatives of second order, and we make some complicated combination of them. Whereas there we just took the second derivative and looked at whether it's positive or negative. But there is clearly an analogy between the two, because that one involves the second derivative and this one involves second partial derivatives. Okay, but now it looks very mysterious. Why do I look at this combination and not other combinations? To understand this, think of the analog of the parabola. Because I explained to you how the one-variable rule came about by looking at parabolas: by the Taylor expansion argument, parabolas approximate your graph near a point where the first derivative vanishes. So think about the parabolas. First of all, in two variables the parabola becomes a paraboloid. But there are two types of paraboloids. There is an elliptic paraboloid and there's a hyperbolic paraboloid, okay? And just look at the examples and you will see that for an elliptic paraboloid the first condition, D positive, will be satisfied. And for a hyperbolic paraboloid the second condition, D negative, will be satisfied.
So let's say f of x, y is x squared plus y squared, or, you can just write a x squared plus b y squared, okay? So what are the derivatives in this case? f sub xx is 2a, right? f sub yy is 2b, right? And f sub xy is 0. So there is a simplification in this case, that there is no cross term, okay? So this matrix looks like this: 2a and 2b on the diagonal, zeros elsewhere, and the determinant is 4ab. That's the D in this case. So to say that D is positive means that a and b are both positive or both negative. If both a and b are positive, it's going to look like this, right? If both a and b are negative, it's going to look like this. So in the first case it's a local minimum, in the second case a local maximum. But in both cases, a and b both positive or a and b both negative, the combination 4 times a times b is positive. So that's why we are in the situation of the first condition, where D is positive. So in this case, we can say for sure it's a maximum or a minimum, but D alone does not say which one. So we have to look at it more closely. Okay, and what if D is negative? If D is negative, that means that a and b have different signs. A good example of this would be x squared minus y squared. And that's a hyperbolic paraboloid. I drew this picture before, it looks like a saddle. And on a saddle, there is a point from which you can either increase the function, if you go along one of the parabolas, which opens upward, or decrease the function, by traveling on a different parabola, the perpendicular one, which opens downward. So this point on a saddle is clearly not a point of maximum or minimum. So that is the explanation of this criterion in the case of quadratic functions which are combinations of x squared and y squared. And the point is that all other functions can be reduced to these ones by a certain procedure. And that's how you get this rule, okay?
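As a quick check of this rule on the quadratic examples, here is a short Python sketch. The function names are hypothetical, chosen for this illustration. One step goes beyond what was said above: when D is positive, the sketch uses the sign of f sub xx to tell a minimum from a maximum, which is the standard refinement of the test and agrees with the a x squared plus b y squared picture (a and b positive gives a minimum, a and b negative gives a maximum).

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point from the second partials evaluated there.
    D = fxx * fyy - fxy**2 is the determinant of the 2x2 matrix."""
    d = fxx * fyy - fxy**2
    if d > 0:
        # Assumption beyond the lecture text: the sign of fxx separates
        # the two cases (when D > 0, fxx and fyy share the same sign).
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "neither (saddle)"
    return "inconclusive"


def classify_quadratic(a, b):
    """f(x, y) = a*x**2 + b*y**2 at the origin: fxx = 2a, fyy = 2b, fxy = 0."""
    return second_derivative_test(2 * a, 2 * b, 0)


print(classify_quadratic(1, 2))    # D = 16 > 0, opens upward
print(classify_quadratic(-1, -2))  # D = 16 > 0, opens downward
print(classify_quadratic(1, -1))   # x^2 - y^2, D = -4 < 0
print(classify_quadratic(1, 0))    # D = 0, the test says nothing
```

The four calls give a local minimum, a local maximum, a saddle, and an inconclusive result, in that order.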
So that's how we get this rule for local maximum and minimum. And that takes care of that issue. And the last remaining topic then is how to find the absolute maximum and minimum on particular domains. And this I will illustrate very quickly by a concrete example. This was step one, and this is step two. I have just enough time to explain this. So let's say you have a function f of x, y, which is x squared plus y squared plus x squared y plus four. Find the global, or absolute, maximum and minimum on the domain of points x, y where the absolute value of x is less than or equal to one and the absolute value of y is less than or equal to one. So the first step is to sketch the domain, which is very easy, right? This is just a square. The sides are on lines parallel to the x and y axes, at one and negative one. Step two is to identify the boundary. This is the boundary. And now we are going to make a list of suspicious points, points which are candidates for being maximum or minimum, okay? And this list will include three kinds of points. First are points in the interior where both partial derivatives are zero. What do I mean by interior? Interior means everything except the boundary, okay? So I have to calculate f sub x and f sub y. f sub x is two x plus two x y. And f sub y is two y plus x squared, right? So we have to set this equal to zero and this equal to zero. Since I'm running out of time, let me just go to the next step. So you solve this system of equations. It's very easy, right? So this is the first group of points that you get on your list. The second group of points are points on the boundary, but which belong to the smooth part of the boundary. What is the smooth part of the boundary? What I mean by this, well, maybe it's not a good idea to say smooth part.
Maybe that's not the right word. Let's just call them components of the boundary. What I mean by components is these four intervals. So break your boundary into pieces which can be represented by a nice equation, like here. Here it is x equal to one, with y between negative one and one. Then restrict your function to this component. It will effectively become a function in one variable. Solve the problem for this one-variable function, okay? One minute left, okay? So let me give you an example. Say one of the components is y equals one with x between negative one and one. So I substitute y equals one into this formula and I get f of x, one is x squared plus one plus x squared plus four. So that's two x squared plus five. I got a function in one variable on this interval. Find the absolute maximum and minimum of this function on that interval, okay? And then do the same for each other component. And that's not all. It would have been all if you didn't have the corners. But because you have corners, you have to include them, because in principle it could happen that the maximum or minimum is attained at a corner. So include the corners. So now you compile the list, you evaluate the function at every point on it, and you choose the one where the value is largest for the maximum, and the one where it is smallest for the minimum. So that's how you do it. All right, have a good weekend.
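The lecture ran out of time before the list was compiled, so here is one way the recipe could be carried through for this example, as a Python sketch. The candidate points were worked out by hand from the formulas above (the comments show the reasoning); the variable names are illustrative, and the final values are a result of this sketch rather than something stated in the lecture.

```python
def f(x, y):
    return x**2 + y**2 + x**2 * y + 4


candidates = [
    # Interior: f_x = 2x(1 + y) = 0 and f_y = 2y + x^2 = 0.
    # x = 0 forces y = 0; y = -1 would force x^2 = 2, outside the square.
    (0.0, 0.0),
    # Edge y = 1: f = 2x^2 + 5, whose derivative 4x vanishes at x = 0.
    (0.0, 1.0),
    # Edges x = 1 and x = -1: f = y^2 + y + 5, derivative 2y + 1 = 0 at y = -1/2.
    (1.0, -0.5), (-1.0, -0.5),
    # Edge y = -1: f = 5 is constant there, so the corners below represent it.
    # Corners of the square.
    (1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0),
]

# Evaluate f on the finite list and compare.
values = {p: f(*p) for p in candidates}
max_value = max(values.values())
min_value = min(values.values())
max_points = [p for p, v in values.items() if v == max_value]
min_points = [p for p, v in values.items() if v == min_value]

print("max", max_value, "at", max_points)
print("min", min_value, "at", min_points)
```

Under these assumptions the largest value is 7, attained at the two corners (1, 1) and (-1, 1), and the smallest value is 4, attained at the interior critical point (0, 0). So in this example the corners matter for the maximum, which is why they have to be on the list.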