I'm sorry I was a little late putting the homeworks up. They are up now, though I did make some changes after first posting them, so let me know if the timing is a problem and we can postpone the deadline.

So, to reset the stage: for a function from R^n to R^m we have the derivative matrix of the function, the matrix of partial derivatives. Going across the rows we have the gradient vectors of each of the coordinate functions; going down the columns we get tangent vectors. As an example, suppose I have a function from R^3 to R^3, say

f(x, y, z) = (x^2 + e^y, x + y sin z, x + y).

The book uses the notation f' here. I don't really like the f' notation, because people tend to think this is a one-dimensional thing; some books use a capital D to make it clearer that it's a derivative matrix. Of course, in all of these cases just about everything in sight is a vector. Anyway, I'm going to continue to use the Df notation rather than f'. In this case the derivative matrix, going across: the first row is 2x, e^y, 0, the derivatives of the first component with respect to x, y, and z. For the second component, the derivative with respect to x gives me 1, with respect to y I get sin z, and with respect to z I get y cos z. And for the third component I get 1, 1, and 0. At the point (1, 1, pi) we see that f(1, 1, pi) = (1 + e, 1, 2), and since sin pi = 0 and cos pi = -1, the derivative matrix there is

Df(1, 1, pi) = [ 2   e   0 ]
               [ 1   0  -1 ]
               [ 1   1   0 ]

So this is the linear map that approximates f near the point (1, 1, pi).
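As a quick sanity check, the derivative matrix above can be verified numerically with finite differences. This is just a sketch using the function and point from the example:

```python
import math

def f(x, y, z):
    # the example map from R^3 to R^3
    return (x**2 + math.exp(y), x + y * math.sin(z), x + y)

def jacobian_fd(func, p, h=1e-6):
    # approximate the derivative matrix by central differences
    n = len(p)
    cols = []
    for j in range(n):
        up = list(p); up[j] += h
        dn = list(p); dn[j] -= h
        fu, fd = func(*up), func(*dn)
        cols.append([(fu[i] - fd[i]) / (2 * h) for i in range(len(fu))])
    # transpose so rows are gradients of the coordinate functions
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

Df = jacobian_fd(f, (1.0, 1.0, math.pi))
expected = [[2.0, math.e, 0.0],
            [1.0, 0.0, -1.0],
            [1.0, 1.0, 0.0]]
```

The numerical entries agree with the hand computation to within finite-difference error.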
And you can check, if you want, that the gradient of the first component at this point is (2, e, 0), the gradient of the second is (1, 0, -1), and the gradient of the third is (1, 1, 0). So everything's consistent. That's just to remind you how all this stuff works. Any questions about that? Everybody's good with it, right? I'm just resetting the stage.

Where I left off at the end of last class was the tangent approximation. The tangent approximation to f, which is essentially the first terms of the Taylor series, says: evaluate your function at some point a, and then add on the derivative matrix evaluated at that point, times the displacement, thought of as a column vector:

f(x) ≈ f(a) + Df(a)(x - a)

for x close to a. In one-variable calculus this is the tangent line approximation, except here it's not a line. A student asks whether we should say "tangent plane": right, it's really a tangent space; since the target here is three-dimensional, it's a higher-dimensional tangent object, not a two-dimensional plane.

Same example: say I want to estimate f(1.1, 0.9, pi + 0.01). That is approximately f(1, 1, pi) plus the matrix above times the column vector of differences: 0.1 in the x-coordinate, -0.1 in the y-coordinate, and 0.01 in the z-coordinate. So I get (1 + e, 1, 2) plus, in the first coordinate, 2(0.1) + e(-0.1) + 0 = 0.2 - 0.1e; in the second coordinate, 0.1 - 0.01 = 0.09; and in the third, 0.1 - 0.1 + 0 = 0. That gives the estimate (1.2 + 0.9e, 1.09, 2).
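This arithmetic is easy to check in code. Here is a small sketch of the tangent approximation at (1, 1, pi), compared against the true value:

```python
import math

def f(x, y, z):
    return (x**2 + math.exp(y), x + y * math.sin(z), x + y)

fa = f(1.0, 1.0, math.pi)            # f(1, 1, pi) = (1 + e, 1, 2)
Df = [[2.0, math.e, 0.0],            # derivative matrix at (1, 1, pi)
      [1.0, 0.0, -1.0],
      [1.0, 1.0, 0.0]]
dx = (0.1, -0.1, 0.01)               # displacement from the base point

# tangent approximation: f(a) + Df(a) * dx
approx = tuple(fa[i] + sum(Df[i][j] * dx[j] for j in range(3))
               for i in range(3))
exact = f(1.1, 0.9, math.pi + 0.01)
# approx works out to (1.2 + 0.9e, 1.09, 2), close to the exact value
```

The first-order estimate and the exact value agree to within a few hundredths, as expected for a displacement of this size.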
That's the first coordinate; the others were 0.09 and 0. So this is sort of a silly example, because I still have to approximate e, and if I'm approximating e on a calculator anyway, I can certainly just compute the actual value, which comes out to about (3.669, 1.09, 2). But again, this is just the same stuff you did in one-variable calculus, where you approximate, say, cos(0.1) by the value of cosine at 0 plus 0.1 times the derivative there. It's exactly the same, except here we're using the tangent approximation in several variables.

Let me do one more example. This example is in the book, except not really, because they say it's something that it isn't. Take

g(u, v) = (u cos v, u sin v, v).

The book does the example with a u in the third slot and then claims the picture is what I'm about to draw, but it actually isn't: with a u there you get a cone, not a ramp. It does everything else right, so it doesn't matter. So g takes the uv-plane and maps it into xyz-space. For a fixed value of v I get a straight line at height v pointing radially, and for a fixed value of u I get something that spirals up, a helix. So what I get is a ramp-like structure, a spiral ramp. Can you even see what I drew? It has a problem at u = 0, but other than that we're okay. Now let's look at what dg tells us. I have two variables in and three variables out, so, keeping straight who's a row and who's a column, I get a 3-by-2 matrix.
If I take the partial of u cos v with respect to u, I get cos v; with respect to v, I get -u sin v. The partial of u sin v with respect to u is sin v, and with respect to v it's u cos v. And the partials of the third component, v, are 0 and 1. So I get the 3-by-2 matrix

dg = [ cos v   -u sin v ]
     [ sin v    u cos v ]
     [   0         1    ]

Now let's evaluate at the point (u, v) = (2, pi/2), whose image is a quarter turn around. cos(pi/2) = 0 and sin(pi/2) = 1, so

dg(2, pi/2) = [ 0  -2 ]
              [ 1   0 ]
              [ 0   1 ]

And what is that telling me? Certainly I can do the same approximation business as before, but I want to interpret it slightly differently: let's think about how to see the tangent plane from this. The image point is g(2, pi/2): x = 0, y = 2, and z = 2. Did I make a mistake? A student catches the slip in the last value: right, sorry, z = pi/2. So near (0, 2, pi/2), what does my tangent plane look like? Think of it as attached at the point (0, 2, pi/2). The columns of the matrix give me a parametric representation of the tangent plane: start at (0, 2, pi/2); one direction vector is (0, 1, 0), times some parameter, and that's the direction of increasing u; the other is (-2, 0, 1), times another parameter, and that's the direction of increasing v.
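Here is a quick numerical sketch of this tangent plane, using the same g and base point, checking that the parametric plane matches g to first order near (2, pi/2):

```python
import math

def g(u, v):
    # the spiral-ramp parametrization from the example
    return (u * math.cos(v), u * math.sin(v), v)

u0, v0 = 2.0, math.pi / 2
p = g(u0, v0)                                          # base point (0, 2, pi/2)
col_u = (math.cos(v0), math.sin(v0), 0.0)              # d/du column: (0, 1, 0)
col_v = (-u0 * math.sin(v0), u0 * math.cos(v0), 1.0)   # d/dv column: (-2, 0, 1)

def tangent_plane(s, t):
    # parametric tangent plane: p + s * col_u + t * col_v
    return tuple(p[i] + s * col_u[i] + t * col_v[i] for i in range(3))
```

Evaluating `tangent_plane(0.01, 0.01)` lands very close to `g(2.01, pi/2 + 0.01)`, which is exactly the first-order approximation at work.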
It's the same statement as the approximation, just interpreted in a different way. It's telling me that near the point (0, 2, pi/2) I have a tangent plane containing one vector that points only in the y direction, and that is the image of the u direction: if I increase u and hold v fixed, I move in the y direction, at exactly the rate at which I increased u. So imagine the axes: there's z, here's y, here's x. A student asks about the base point in the picture; right, the base point is (0, 2, pi/2), I mislabeled it. And if I move a little in the other direction, fixing u and moving in the v direction, what happens? The x decreases a lot, and the z increases; so that column is a vector tilted up. So I can see this tangent plane sitting there. I don't know if you can see it, but I can see it. It's tilted so that x decreases by 2 times however much I move, y doesn't change, and z increases by exactly the amount I moved. And if I did this at another point, turned a little further around, I would get some combination of these things, rotated a little more. Is this clear to everybody? Am I beating dead horses? I'm not sure. Okay, so I was talking to another professor who was telling me that she can't always read people's faces.
When she says something and people just sort of look at her, she doesn't know whether she's going too slowly, and they're all thinking "yeah, whatever, get on with it," or too fast, and they're thinking "I have no idea what you're talking about, just hurry up and get to something I understand." I usually think I can tell, but right now I can't, so unless somebody tells me otherwise I'm going to assume it's the "yeah, whatever, get on with it" case.

So that's the tangent plane. A student asks: what about the gradient, does it follow the spiral up? No, and that's exactly what I want to talk about next; great lead-in. What about the gradient? Well, here I don't really have a gradient; I have several gradients. This is not a function into R. It's a function from R^2 to R^3, so there are three coordinate functions involved, and the gradient only works for a function into R: a function where I put in a bunch of numbers and out comes one number. Here I'm putting in two numbers and out come three. So there are actually three gradients: the gradient of each coordinate function, and those are the rows of the matrix. There are three gradient vectors lying around here: the vector (0, -2), the vector (1, 0), and the vector (0, 1). Exactly what those mean, I want to put off for just a few minutes. But then there are also the two tangent vectors, the columns, which give me the tangent plane, interpreted the way we just did. So depending on whether I pay attention to the rows or the columns of this matrix, I'm looking at gradients or at tangent vectors.
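The rows-versus-columns point can be made concrete in a couple of lines, using the matrix dg(2, pi/2) computed earlier:

```python
# the derivative matrix dg(2, pi/2) from the example
Dg = [[0, -2],
      [1,  0],
      [0,  1]]

# reading across: rows are the gradients of the three coordinate functions
gradients = [tuple(row) for row in Dg]

# reading down: columns are the two tangent vectors spanning the tangent plane
tangents = [tuple(Dg[i][j] for i in range(3)) for j in range(2)]
```

Same matrix, two readings: `gradients` recovers (0, -2), (1, 0), (0, 1), and `tangents` recovers (0, 1, 0) and (-2, 0, 1).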
It's a little harder to see this in the 3-by-3 example from before. The tangent vectors are there, but they're tangent to the image of all of three-dimensional space; there's no surface, so there's nothing two-dimensional to picture as tangent. Okay? I'll say more about gradient vectors and such real soon. Are we okay with all this?

So now I'm going to start new stuff. This was review, although really I stopped about here last time, so it was review of what I didn't talk about, as opposed to review of what I did. Now I want to specialize once again to a specific class of functions. Suppose that my input and output spaces are the same size, so I have a function from R^n to R^n; I'll mostly think of n = 2 or maybe n = 3. Let me use an example function I've been dragging around:

f(x, y) = (2x, y/2).

One way we can think of this, and this is the way we've been thinking about it so far, is as a transformation: I take a box, apply f to it, and out comes some shape. In this case, since I'm doubling the x-coordinate, the box gets twice as long, and since I'm halving the y-coordinate, it gets squished in the y direction. I could do more complicated things that mix up the x's and the y's and all sorts of stuff, but maybe this is enough, because it's easy to see what this function does. So that's one way to think of a function of two variables.
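For concreteness, here's a tiny sketch of this transformation view, pushing the corners of the unit box through f:

```python
def f(x, y):
    # stretch x by 2, squish y by 1/2
    return (2 * x, y / 2)

# corners of the unit box [0, 1] x [0, 1], counterclockwise
box = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [f(x, y) for x, y in box]
# the image box is twice as long and half as tall
```

The image corners come out to (0, 0), (2, 0), (2, 0.5), (0, 0.5): stretched in x, squished in y, just as described.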
But there's another way to think of it that is very useful. (This is in fact a linear function of two variables, because I wrote down a linear one.) When I first did this, somebody asked, "isn't that a vector field?" So yes, now I want to tell you what a vector field is. The first way is a mapping: it takes some domain, does stuff to it, and I get something out. But another way I can think of this, at least in the case where the input and the output are the same dimension, is this: rather than drawing a picture of the input space and a separate picture of the output space, to each point (x, y) I attach the vector f(x, y). The concept works no matter what dimension I'm in, whether one dimension or eight, but I'm going to draw it in two, because two is easiest to draw. So in terms of the transformation, I'm not drawing an arrow from the source point to the output point; instead I'm thinking of f as giving a sort of infinitesimal motion of each point. Again with the same f(x, y) = (2x, y/2): at the origin I attach the zero vector; that's pretty boring. But as I move along the x-axis, at the point (1, 0) I attach a vector of length two pointing outward; at (-1, 0) I attach the vector (-2, 0), also pointing outward; at (1/2, 0) I attach a vector of length one; and so on. As I move out, the vectors point away from zero and get longer. Similarly, along the y-axis I attach vectors which get bigger and bigger, but not as fast as in the x direction, so they're a little shorter. And because of the nice fact that this is a linear function, these vectors are very simple; they're just combinations. So out at a point like (1, 1), the vector that I attach is the
vector (2, 1/2), so that'll be something like that; and in general I get a bunch of vectors that, as I move further out in the x direction and up, sort of fall over and point more in the x direction.

Now, this vector field is really describing the derivative of some function. It's a different object from the transformation picture, but eventually I want to relate these things. Vector fields are very important in differential equations and in modeling, and we'll come back to them a lot in this course, because they're an important part of what we're dealing with. Let me just say this now, though it's not what we're going to focus on yet: you can imagine that, given some initial point, I can find a curve to which the vectors of the field are tangent. That's something one often does, and finding such a curve is solving a differential equation; it's also called integrating the vector field. Has everyone seen a vector field before? Who hasn't? Okay, so we'll come back to general vector fields in due course. In particular, I want to talk about special vector fields: gradient vector fields. Here I start with some function f into R and take its gradient. In this example, f defines a surface for us: a bowl elongated in one direction and shorter in the other, a sort of paraboloid, and its level curves are concentric ellipses, which get closer together as you move out. (Those are supposed to be ellipses; it doesn't look very good.)
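A small numerical sketch tying the pictures together: sample the field f(x, y) = (2x, y/2) at the points from the lecture, and check that it is the gradient of a potential. The lecture doesn't write the potential on the board, so the choice g(x, y) = x^2 + y^2/4 below is my assumption; its gradient works out to exactly (2x, y/2), and its level curves are concentric ellipses as described.

```python
def field(x, y):
    # the vector attached at the point (x, y)
    return (2 * x, y / 2)

def potential(x, y):
    # candidate potential (an assumption, not from the lecture): x^2 + y^2/4
    return x**2 + y**2 / 4

def grad_fd(fn, x, y, h=1e-6):
    # numerical gradient by central differences
    return ((fn(x + h, y) - fn(x - h, y)) / (2 * h),
            (fn(x, y + h) - fn(x, y - h)) / (2 * h))

# sample the field at points mentioned in the lecture
samples = {p: field(*p) for p in [(1, 0), (-1, 0), (0.5, 0), (1, 1)]}

# the numerical gradient of the potential matches the field at (1, 1)
gx, gy = grad_fd(potential, 1.0, 1.0)
```

If this choice of potential is right, the field drawn on the board is exactly a gradient vector field.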
Now let's consider, instead of the function itself, its gradient. The gradient of f is a function from R^2 to R^2, a vector field, and in fact it's the field we had over here. Are we okay with this? There should be some very clear relationship between the two pictures. Is it clear what it is? Of course these are on different scales, but if I put the field on top of the level curves, notice the arrows along the x-axis. Yes, exactly: the arrows are perpendicular to the level curves. That's what this picture should show, and it's true in general. (At a point where the gradient is the zero vector the statement is empty; it's still orthogonal, trivially, but I want to interpret something here, so assume it's nonzero.)

Let me set this up. Take f on some open set U. Make it a curly U; you can call it D like the book does, I don't like that letter; it's just a blob that is the domain. Suppose f is differentiable near a point of U, so you can take the derivative around that point; then you have a tangent plane there, and the claim is: the gradient points in the direction in which the tangent plane goes up the most, and furthermore its length is the slope in that direction, the amount of increase.

A student offers a proof: think about the tangent plane; it's given by z = f(a, b) + f_x (x - a) + f_y (y - b), so a normal vector to it is (f_x, f_y, -1). The gradient is the x and y part of the opposite vector (-f_x, -f_y, 1); and since the downward normal points in the direction of greatest decrease, the gradient points in the direction of greatest increase.

Let me do it a different way. It's actually exactly the same as what you said; it just looks different. What he said was right: you look at the tangent plane, write its equation, see that it corresponds to the gradient stuck up into the z-component, project, and check which way it goes. Everything's cool. Here's my version. Let u be a unit vector, and compute the directional derivative of f in the direction u at some point: just pick any old direction you want to go in and take a unit vector pointing that way. We saw before that

D_u f = grad f . u,

the gradient evaluated at the point, dotted into u; that's the formula we came up with for the derivative in the u direction. Now, the dot product of the gradient with a unit vector u gives me the piece of the gradient in the u direction:

grad f . u = |grad f| |u| cos(theta) = |grad f| cos(theta),

where theta is the angle between them. This is biggest when cos(theta) = 1, that is, when u is parallel to the gradient. So u pointing along the gradient gives the maximal increase, and the maximal value, the slope in that direction, is |grad f|, the length of the gradient. (When cos(theta) is negative, you're looking at directions of decrease.) The proof looks like almost nothing: take the directional derivative, write it as a dot product. But in some sense there's a lot going on in it. In particular, it also tells us that these arrows are orthogonal to the level curves; I'll come back to that. And they point the steepest way up: if I look at this surface and take some point, the gradient points the steepest way up from there.

A student asks: what if the function doesn't happen to be differentiable; couldn't there be, say, two steepest directions? Right. At a single point at the bottom of a crease, like the valley I drew, there are exactly two steepest ways up, this way and this way, and they're both best; the function isn't differentiable there, because you'd have tangent planes pointing in opposite ways. And notice that I said: if the gradient is nonzero, then we can do this. If it's zero, there is no unique steepest way up. If instead of this function I take, let's call it p(x, y) = x^2 + y^2, then at zero the gradient is zero, and every direction is the best way up, because it's round; once I get away from zero, there's a unique way. It's like standing at the north pole: no matter which way you move, it's south. There is no unique north, and there's no east or west at the north pole either, because all directions are the direction of maximal increase of south. There's a similar statement at a point where the gradient vanishes. So it's very important that we have a nonzero gradient vector; otherwise we might have multiple steepest ways up.

Okay. Now let's use this for some kind of a chain rule. Part of our goal, which I won't get to until next time, is the general chain rule, but let's do a chain rule in a deliberately simple case. Suppose I have a function f taking R into R^3; let me draw it by a picture: t goes to f(t), some curve in space.
You can imagine that this is a path you're going to follow, say f(t) = (t, t^2, t^3), some path in space. And then I have a function g which takes a point in space and gives me a number; I'll use the one I did:

g(x, y, z) = x^2 y + x y z

(not 2xy + yz; that's its x-derivative, sorry, I can't remember it long enough to discuss it). The composition, first doing f and then plugging the resulting x, y, z into g, gives me a function just of t: I get t^2 times t^2, which is t^4, and then t times t^2 times t^3, which is t^6. So the composition, call it h, is h(t) = t^4 + t^6, a much simpler function than either of the input functions, and I can easily compute

h'(t) = 4t^3 + 6t^5.

But let's compute it another way, by a chain-rule kind of thing. The usual chain rule says: take the derivative of the outside function, evaluate it at the inside function, and multiply by the derivative of the inside. If I have a function r(s(t)) and I want d/dt of that, it's r'(s(t)) times s'(t); that's for functions of one variable. Here, with more than one variable, we want something similar. The only kind of derivative we have for g, a function from R^3 to R, is the gradient; so we might try to take the gradient of g, evaluate it at f(t), and then multiply by the derivative of f. Now, f is a vector-valued function, so its derivative, which I'll write f'(t), is a vector; and the only way we know to combine the two vectors grad g(f(t)) and f'(t) into a number is a dot product. So that's what we're going to do. Let me just turn the page and calculate. The gradient of g = x^2 y + x y z: the x-derivative is 2xy + yz, the y-derivative is x^2 + xz, and the z-derivative is xy. So

grad g = (2xy + yz, x^2 + xz, xy).

Evaluating at f(t), which means plugging in x = t, y = t^2, z = t^3, gives

grad g(f(t)) = (2t^3 + t^5, t^2 + t^4, t^3),

and f'(t) = (1, 2t, 3t^2). Taking the dot product: 2t^3 + t^5 from the first component, plus (t^2 + t^4)(2t) = 2t^3 + 2t^5 from the second, plus (t^3)(3t^2) = 3t^5 from the third. Adding everything up, 2 + 2 = 4 on the t^3 and 1 + 2 + 3 = 6 on the t^5, so we get

4t^3 + 6t^5,

which is exactly h'(t), what we wanted. So that worked, and it suggests a general principle. Let's state it as a theorem. Suppose g takes R^n into R and is continuously differentiable on some open domain U, and suppose f takes R into R^n and is differentiable (continuous isn't enough) on some interval (a, b); that's just to make sure that where we're looking, everything is nice. It also has to satisfy that the image lands in U: f(t) is in U for t in my interval, where
things are nice. So now let h(t) be the composition g(f(t)). Then the claim is: h is differentiable on the interval in question, with

h'(t) = grad g(f(t)) . f'(t),

just what it was in the example: the gradient of g evaluated at the image of f, dotted with the vector derivative of f. That's my theorem. We already saw it in an example; let's see how the proof works. The proof just works by writing down how stuff goes. It says: h'(t) is, by definition, the limit of the difference quotient. But I shouldn't have used the letter h twice, so let me change the function to a capital H (I made the same mistake when I did this in class; that's okay). So

H'(t) = lim_{h -> 0} [H(t + h) - H(t)] / h,

the standard difference quotient for the derivative. Now let's forget about H and write it in terms of f and g:

H'(t) = lim_{h -> 0} [g(f(t + h)) - g(f(t))] / h,

if the limit exists; if it doesn't exist, well, then there's no derivative. Now, just for notation, let me call the point f(t + h) by the name y; it's a vector in R^n, and it really depends on h. And let x be f(t). So I have these two vectors x and y: here in R^n (or R^3) is f(t), which I'm calling x, and here's y = f(t + h), just a little bit away, and they both lie in my domain U where everything's nice; that was the point. So now, if I consider the segment between them, I can use the mean value theorem, which we talked about before: everything is inside the domain where things are nice; it's
continuously differentiable there. So the mean value theorem says that

g(y) - g(x) = grad g(x_0) . (y - x)

for some point x_0 on the segment between x and y: not x itself, some point in the middle. (I started to write it as a difference quotient over the distance |y - x|; it's cleaner stated this way.) Now let's divide both sides by h, just for the heck of it:

[g(y) - g(x)] / h = grad g(x_0) . [(y - x) / h].

I just divided both sides of the equation by h; that's okay, we can do that. And now we're almost there. Let's rewrite it back in terms of f, because that's what x and y are:

[g(f(t + h)) - g(f(t))] / h = grad g(x_0) . [(f(t + h) - f(t)) / h].

Now take the limit of both sides as h goes to zero. The left side is the limit defining H'(t), again if it exists. The right side we can break into two pieces: the limit of grad g(x_0), dotted with the limit of [f(t + h) - f(t)] / h. Since f is differentiable, the second limit is just f'(t). And for the first: in our picture, the point x_0 lives on the segment between f(t) and f(t + h); as h goes to zero, since f is continuous (as well as differentiable), this segment is going to shrink to zero, so
that means the point x_0 converges to the point f(t), and since the gradient is continuous, grad g(x_0) converges to grad g(f(t)), which is a vector. So there we have it: the derivative is just what it was in the example,

H'(t) = grad g(f(t)) . f'(t).

That gives us the chain rule for this situation (pretty much useless; well, not quite useless), where we have a function H(t) which is a composition of a function taking R into R^n followed by a function taking R^n back down to R. In this simple case the chain rule is what we'd suspect it would be: the derivative of the outside function evaluated at the inside function, times, in this case interpreted as a dot product with, the derivative of the inside function.

I think I didn't get further than this in class, so this is a good place to stop. The two things I will do on Wednesday, in the next class: I will do this in more generality, and I will also prove that the gradient vector is perpendicular to a level set. That is, if I have some function f from R^n to R, it has level sets, which are contours, and the gradient vector is always perpendicular to the contours. This is intuitively quite clear, because the contours are the directions in which the function doesn't increase at all, and if the function is nice, then the gradient, pointing in the direction of maximal increase, will of course be getting away from the contours as quickly as it can, that is, moving perpendicular to the tangent of the contour. So that's a good place to stop, because that's where I stopped last time.
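To close, the day's three computational claims can be sanity-checked in a few lines. This is a sketch, not part of the lecture: the chain-rule example h'(t) = grad g(f(t)) . f'(t), the steepest-ascent property of the gradient (the gradient (3, 4) is an arbitrary illustrative value), and the orthogonality of the gradient to level curves for the round bowl p(x, y) = x^2 + y^2.

```python
import math

# 1. chain rule example: g(x, y, z) = x^2 y + xyz along f(t) = (t, t^2, t^3)
def grad_g(x, y, z):
    return (2 * x * y + y * z, x**2 + x * z, x * y)

def chain_rule_h_prime(t):
    # grad g evaluated at f(t), dotted with f'(t) = (1, 2t, 3t^2)
    gx, gy, gz = grad_g(t, t**2, t**3)
    return gx * 1 + gy * (2 * t) + gz * (3 * t**2)

def direct_h_prime(t):
    # h(t) = t^4 + t^6 computed directly, so h'(t) = 4t^3 + 6t^5
    return 4 * t**3 + 6 * t**5

# 2. steepest ascent: D_u f = grad . u is maximized when u points along grad,
#    and the maximum value is |grad|
def max_directional(grad, steps=3600):
    return max(grad[0] * math.cos(2 * math.pi * k / steps)
               + grad[1] * math.sin(2 * math.pi * k / steps)
               for k in range(steps))

# 3. gradient perpendicular to level curves of p(x, y) = x^2 + y^2:
#    the level set p = r^2 is a circle with tangent (-sin t, cos t)
def contour_dot(r, t):
    gx, gy = 2 * r * math.cos(t), 2 * r * math.sin(t)   # gradient of p
    return gx * (-math.sin(t)) + gy * math.cos(t)       # dot with the tangent
```

The chain-rule values agree with the direct derivative, the scanned maximum of the directional derivative comes out to the length of the gradient, and the contour dot products vanish, previewing Wednesday's theorem.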