Welcome back to our lecture series Math 1210, Calculus 1 for students at Southern Utah University. As usual, I'll be your professor today, Dr. Andrew Misseldine. In lecture 22, we're going to introduce probably the most important, certainly my favorite of all of the derivative rules we're going to learn from chapter 3, and that is the chain rule. The chain rule is so nice, we're going to do it twice. That is to say, lecture 22 is actually the first part of a two-part sequence, lectures 22 and 23, dedicated to covering the many possible examples we could see with the chain rule. First of all, what is the chain rule all about? Well, it's a very natural question to ask: is there an effective way to take the derivative of a composition of functions? After all, our other derivative techniques have taught us how to take derivatives if we have a sum of functions, difference of functions, product of functions, and quotient of functions. Another operation we should be concerned about with functions is function composition. What happens if you have F composed with G and you want to take the derivative? This is exactly the scenario where the chain rule comes into play. Now, the chain rule is going to be a little bit more complicated at first, but this is something you're going to get used to very quickly. The chain rule says the following. If G of X and F of X are differentiable functions on their respective domains, or more precisely, if F and G are differentiable functions such that the range of G is contained inside the domain of F. This is mostly just a compatibility issue as we start composing G with F. We have to make sure that those things match up. But assume there are no composition worries with the domains and ranges and such. If F and G are both differentiable functions, then consider F composed with G. Remember, that circle symbol means you put G of X inside of F of X.
If you take the derivative of a composition, so F composed with G, prime, evaluated at X, that is the derivative of F of G of X. This is equal to F prime evaluated at G of X times G prime at X. The chain rule tells us that when you take the derivative of a composition of two functions, you're going to get a product of two derivatives. Now, these derivatives are often given the following names. When we look at, for example, F prime of G of X, this is often referred to as the so-called outer derivative. By contrast, the function G prime of X is often referred to, quite naturally, as the inner derivative. The reason behind that is that when we look at the original composition, there are two functions composed together. You have this outer function F and you have this inner function G. G is often referred to as the inner function, and F right here is often referred to as the outer function. Because if we think of this as like some type of nesting toy or nesting doll, we're putting G inside of F, so F as a function is this outer shell, and then G is this function inside of the other one. Well, when you take the derivative of this composition, you end up with two factors. The first factor is essentially the derivative of the outer function, and then the other factor is essentially the derivative of the inner function. The only caveat is that with the outer derivative, we don't take F prime of X, we actually take F prime of G of X. So the inner function still goes inside of the outer derivative. Now using slightly different notation, let's instead use this Leibniz notation, dy over dx. We'll make a substitution and introduce a new variable for the sake of example here. If U is equal to the function G of X, that means F of G of X would then become just F of U, so Y equals F of U.
If we take the derivative dy over dx, that is, the derivative of Y with respect to X, this then factors as the derivative of Y with respect to U times the derivative of U with respect to X, for which we still have this outer derivative, inner derivative pattern going on here. Or here's one other way we could write this from a perspective of notation. If you take the derivative of F of U, this will be the derivative of F with respect to U, and then you multiply that by the derivative of U. So you have to multiply by this inner derivative. Now when you're using the chain rule, this is the part that people need to remember. People, students, viewers, nearly always remember to take the outer derivative. It's the inner derivative that's often forgotten, and we'll see that in examples in this video and in subsequent videos for lectures 22 and 23. Let's look at the proof behind the chain rule. In a nutshell, the proof is basically this statement right here, believe it or not, because if you were to wipe away everything on the screen and only show a college algebra student what they see here in the box, they would be like, let's see here, dy over dx is equal to dy over du times du over dx. Let me see. Oh, the du's cancel? Yep, it's true. Because in terms of fraction simplification, if you have like two fifths and you times that by five sevenths, then the product is gonna be two sevenths. That's how fractions work, and derivatives kind of look like they're fractions. Well, the good news is they're limits of fractions, right? Derivatives are limits of difference quotients. Quotient means a fraction here. And so derivatives behave like fractions, and that's essentially the proof of the chain rule. In a little bit more detail, let's see the following. So let's consider the expression delta u. Remember this delta symbol represents the change of the variable u. So there's a difference that's happening in some regard.
So let delta u be the change of u corresponding to the change in this variable x right here, where delta x is the change of x. This then tells us that delta u will equal g of x plus delta x minus g of x, right? In other situations before this, we often saw things like f of x plus h going on here. This symbol h is interchangeable with delta x because delta x is the difference between these two different x values, right? So it's a change of x. And we used h synonymously with delta x previously. So when you see this x plus delta x, that just means it's the number x, but it's been moved by some incremental amount known as delta x here. So this is the function g evaluated a little bit away from x and this is the function evaluated at x. And so their difference would be the change of our variable u. Remember u in this context is g of x. u is the inner function. Now in contrast, I mean it's gonna look very similar here, but let delta y denote the change in y corresponding to a change in u. That is, delta y here is gonna be f of u plus delta u minus f of u. So same basic idea here. Delta u is a small change in the variable u. So u plus delta u is just a little bit perturbed from u, it's just a little bit to the side of u, and then you have u itself, and their difference gives us delta y. So we have this change of the outer function y and we have this change in the inner function u. Okay, so keep that in mind as we go forward. So now look at this calculation here. We have dy over dx. Well, by the definition of the derivative, this is gonna be the limit as delta x goes to zero of delta y over delta x. So this quotient right here is just the slope of the secant line. If we take the limit as the small step to the left or right becomes infinitesimally small, then the limit of the secant slope will become the tangent slope, aka the derivative. So we see something like that.
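The setup so far can be written compactly. This is only a restatement of the definitions above in symbols, with u = g(x) and y = f(u):

```latex
\[
\Delta u = g(x + \Delta x) - g(x), \qquad
\Delta y = f(u + \Delta u) - f(u), \qquad
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}
\]
```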
Next, what we're gonna do is exactly what the college algebra student suggested we do. We're gonna multiply delta y over delta x by the number one, but not just any one, a strategic number one. We're gonna multiply it by delta u over delta u, which is essentially just a one right there. Now the reason we're multiplying by delta u over delta u is we're gonna rearrange it. Multiplication is commutative, so we can rearrange this as delta y over delta u. I should do that one in red because that's gonna be the outer derivative, and then the inner derivative is gonna be this delta u over delta x. You see right there. So slightly rearranged, you get delta u over delta x, the inner derivative, and delta y over delta u, the outer derivative. Now when you take the limit of a product, this becomes a product of limits. So we can segregate these into two. We're gonna get the limit as delta x approaches zero of delta y over delta u, which sure looks like a derivative. And then we also have the limit as delta x approaches zero of delta u over delta x. Now the next step is pretty critical because after all, if we just take the limit as delta x approaches zero of delta u over delta x, that's exactly going to be du over dx, the derivative of u with respect to x. In order to get to dy over du, we're not quite there yet. What we have to do is make the following observation. Note that as delta x approaches zero, this would imply something about delta u, which remember is g of x plus delta x minus g of x, right? What happens to delta x here? It's getting closer and closer to zero. So delta u will be approaching g of x plus zero minus g of x right here, which is gonna give us g of x minus g of x, which is equal to zero. So I should note here that we're using the continuity of u as a function of x, that is, the continuity of g, in this situation.
But as we are assuming it's differentiable, that implies continuous, so we can make this transition. So therefore, in short, delta x approaching zero implies that delta u will approach zero. So we can make that substitution right here. So take the limit as delta u approaches zero of delta y over delta u. Then that is this outer derivative. And so we get dy over du times du over dx, giving us the chain rule. Let's put this into practice. Let's show what the chain rule actually is useful for. So consider the function h of x equals five x cubed plus two, quantity squared. So in order to apply the chain rule, we should recognize that this is a chain of two functions, a chain here just being a composition of two functions. We have an inner function, that is, a function inside of another one. We have this five x cubed plus two. We have this polynomial function sitting inside of the outer function, squaring. That is, we have this power function written slightly differently. We could decompose this as u squared composed with the inner function five x cubed plus two. So we take five x cubed plus two and we put it inside of u squared. So by the chain rule, we see that the derivative of h with respect to x will come with two parts. The first part is we're gonna take the derivative of u squared with respect to u. Then you're gonna multiply that by the derivative of five x cubed plus two with respect to x. Now this little prime notation can be a little bit misleading because it doesn't tell you what variable you're taking the derivative with respect to. That's why statements like du over dx or maybe like dy over du are typically better notation, because they tell us exactly what the variable in play is. But here you can see there's only a u and only an x, so the context is clear. The derivative of u squared with respect to u will be a two u. And then the derivative of five x cubed plus two, that's gonna be a 15x squared, like so.
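In symbols, the decomposition and the two derivatives just computed look like this. It is a restatement in Leibniz notation of what was said above, taking y = u squared as the outer function:

```latex
\[
h(x) = (5x^3 + 2)^2, \qquad u = 5x^3 + 2, \qquad y = u^2
\]
\[
\frac{dy}{du} = 2u, \qquad \frac{du}{dx} = 15x^2
\]
```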
Now u was just this artificial variable we inserted into the problem, right? Remember u is equal to the inner function five x cubed plus two, so this needs to be inserted in here. And so we end up with the derivative being two times five x cubed plus two, times 15x squared, for which it's good to keep a derivative factored. I mean, if you want to multiply it out, you could. I honestly like it factored as like a 30x squared times five x cubed plus two. That one I think is really great. An advantage of the chain rule is that it'll keep derivatives factored. But if you did wanna multiply it out, you can distribute the 30x squared through and you end up with 150x to the fifth plus 60x squared. So that would be the derivative here using the chain rule. And I do wanna emphasize that here we illustrated the chain rule, but we did it sort of in a long, drawn-out process. When people do the chain rule, they generally don't do this stuff right here. You generally see it in a manner similar to what you'll see in just a second. Take h of x, and remember you can't see it on the screen now, it's five x cubed plus two, quantity squared. Typically you're like, okay, I'm gonna take the derivative of the outside function. I'm gonna get two times five x cubed plus two. And then you're gonna times it by the inner derivative, which is going to be, in this case, 15x squared. So that's all it is. You just do the derivative in line, not this drawn-out thing. I did the color coding and used the variable u to emphasize that we have our outer derivative, which we see right here, and our inner derivative, which we see right here. The two ingredients are there. Inner derivative and outer derivative. They're all there. But this example might not be the most motivating one because you're like, do I really need to have the chain rule in this situation? Cause after all, h of x is something we might have been able to do before. Five x cubed plus two, quantity squared. You could just foil that out.
You're gonna get a 25x to the sixth plus a 10x cubed plus a 10x cubed plus a four. For which, if you wanna take the derivative, admittedly you should probably combine these like terms to get a 20x cubed. But if you take the derivative now, you're gonna end up with a 25 times six x to the fifth plus a 20 times three x squared, and the derivative of a constant is equal to zero, so that'll just go away. And notice of course that 25 times six is exactly 150, so that's 150x to the fifth, and then three times 20 is 60, so that's 60x squared. So you see that this result agrees with what we saw before. So the chain rule wasn't really necessary in this situation. But on the other hand, what if we changed up the function a little bit? What if we changed it to be h of x equals five x cubed plus two to the 100th power? Even using the binomial theorem, which would be an appropriate tool right here, and combining this with the power rule and such, that'll be a lot to do, right? You would have to multiply this thing out. You get 101 terms. I mean, that's a lot of Dalmatians right there. A lot of monomials. And you have to take the derivative of them one by one by one by one. And then again, when it comes to functions, we really like derivatives to be factored. So think about what you'd have to do: multiply that out, take the derivative, possibly factor it. The chain rule can handle this problem with no more difficulty than the previous one, because like before we have this inner function five x cubed plus two. We have the outer function. It's a power function, a little bit bigger power this time, but still a power function. The outer derivative would be 100 times the inner function five x cubed plus two raised to the 99th power. By the power rule, you lower the power by one. And then the inner derivative would still be 15x squared. That part would be no different. And look, it's already factored.
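Written out, the derivative just described, outer derivative times inner derivative, is:

```latex
\[
h(x) = (5x^3 + 2)^{100}
\quad\Longrightarrow\quad
h'(x) = 100\,(5x^3 + 2)^{99} \cdot 15x^2
\]
```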
If you wanna multiply the 15 by the 100, you're gonna get 1500x squared times five x cubed plus two to the 99th power. So we found the derivative and it's already factored. So it's like the best of both worlds. The chain rule is definitely king on a question like this. But let me give you one last example before we end this video here. Notice, if you take a function like y equals three x squared minus five x, all to the one half power, this is very similar to the last example. You have an inner function, a polynomial in this case, three x squared minus five x. This sits inside of a power function, that is, the one half power. But if you try to take the derivative here, take y prime, you really can't foil out the one half power. The one half power is the square root. And so foiling this out doesn't really make much sense, but the chain rule is still equally applicable here. If we factor this as u to the one half power composed with the inner function three x squared minus five x, we take the derivative of the outer function, which by the power rule will look like one half times the inner function three x squared minus five x, and then the exponent lowers to negative one half by the power rule. And then we take the derivative of the inner function three x squared minus five x. That's gonna give us a six x minus five, like so. For which then we can rewrite this if we need to. The numerator is gonna look like six x minus five. The denominator will look like two times the square root, or the one half power, whichever you prefer, of three x squared minus five x, like so. And so the chain rule handles this super easily. And so the chain rule is an extremely powerful tool to use when you have a composition of two functions.
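If you'd like to double-check the three derivatives from this video, here is a minimal sketch in Python. It compares each chain rule answer against a secant slope with a very small step. The helper names `central_difference` and `relative_error`, the step size, and the sample points are illustrative assumptions, not something from the lecture.

```python
# Numerical sanity check of the three chain-rule derivatives from the lecture.
# Helper names, step size, and sample points are illustrative assumptions.

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the secant slope through x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def relative_error(approx, exact):
    """Size of the mismatch, scaled by the exact value when that value is large."""
    return abs(approx - exact) / max(1.0, abs(exact))

# Each entry: (function, derivative found with the chain rule, sample point).
examples = [
    # h(x) = (5x^3 + 2)^2, h'(x) = 2(5x^3 + 2) * 15x^2
    (lambda x: (5 * x**3 + 2) ** 2,
     lambda x: 2 * (5 * x**3 + 2) * 15 * x**2,
     1.3),
    # h(x) = (5x^3 + 2)^100, h'(x) = 100(5x^3 + 2)^99 * 15x^2
    (lambda x: (5 * x**3 + 2) ** 100,
     lambda x: 100 * (5 * x**3 + 2) ** 99 * 15 * x**2,
     0.1),
    # y = (3x^2 - 5x)^(1/2), y' = (6x - 5) / (2 * sqrt(3x^2 - 5x))
    (lambda x: (3 * x**2 - 5 * x) ** 0.5,
     lambda x: (6 * x - 5) / (2 * (3 * x**2 - 5 * x) ** 0.5),
     2.0),
]

errors = [relative_error(central_difference(f, x), df(x)) for f, df, x in examples]

for number, err in enumerate(errors, start=1):
    print(f"example {number}: relative error = {err:.2e}")
```

The printed errors should be tiny if the chain rule answers are right. Leaving off the inner derivative, say by deleting the `15 * x**2` factor, makes the mismatch obvious, which is a quick way to catch the most common chain rule mistake.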