Our next step is to regroup these so that all of the parcels with respect to a particular input value are gathered together. These top two lines include all of the parcels with respect to x sub j — it's the same small expressions from the previous set of equations, just reordered. Now we're starting to gather up all of the contributions to the partial of y with respect to an individual input element.

And if you look at them, you can represent them with a pattern: the partial of y sub i plus k with respect to x sub i is w sub k. This is a shorthand way to represent all of the equations here, and you can see that the pattern holds. For any input element x sub i, if we gather up all of the parcels with respect to it, we can represent all of those expressions with this shorthand: the partial of the output y sub i plus k with respect to the input x sub i is equal to w sub k. That's a neat little way to condense it, and something we're going to make good use of.

Now we can plug this back into our chain rule, where the input gradient is the summation of the output gradient times each of these parcels. Substituting in this expression, the input gradient becomes the sum over k of the output gradient at i plus k times w sub k. So this is a fairly slick way to do our backpropagation.

There's one more step we can do. If we take w sub k and flip it left to right, which we represent with a left-pointing arrow above it, then everything that was at index minus k is now at index plus k. That lets us change the sign on the k index in our output gradient instead, and everything else stays the same. We just did a little trick by pre-flipping w sub k. What's left is a sliding dot product: we have an array, our output gradient, and a kernel, our flipped w sub k, and for each input position x sub i we sum their products over the full length of the kernel.

So we can represent that even more concisely: our input gradient is our output gradient convolved with the reversed version of our kernel. This is a really slick little result. It says that the derivative of a convolution is a convolution with the kernel flipped. There's a pleasing symmetry to that. Math is beautiful, exhibit 673. Very, very slick.
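To see that result in action, here is a minimal NumPy sketch — not code from this lesson, just an illustration under a few assumptions: the forward pass is a plain "valid" one-dimensional convolution via np.convolve, the loss is a toy sum of the outputs so the output gradient is all ones, and the names x, w, grad_y, and grad_x are made up for the example. It computes the input gradient as the output gradient fully convolved with the flipped kernel, then checks it against a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # input signal, the x_i
w = rng.normal(size=3)   # kernel, the w_k

def forward(x, w):
    # "Valid" 1-D convolution: a sliding dot product with the kernel.
    return np.convolve(x, w, mode="valid")

y = forward(x, w)

# Toy loss: sum of the outputs, so the output gradient dL/dy is all ones.
grad_y = np.ones_like(y)

# The result from the derivation: the input gradient is the output gradient
# convolved with the left-right flipped kernel (full mode, so every x_i
# receives all of its parcels).
grad_x = np.convolve(grad_y, w[::-1], mode="full")

# Finite-difference check of the same gradient.
eps = 1e-6
grad_x_numeric = np.zeros_like(x)
for i in range(len(x)):
    x_plus, x_minus = x.copy(), x.copy()
    x_plus[i] += eps
    x_minus[i] -= eps
    grad_x_numeric[i] = (forward(x_plus, w).sum()
                         - forward(x_minus, w).sum()) / (2 * eps)

print(np.allclose(grad_x, grad_x_numeric, atol=1e-5))  # True
```

The finite-difference loop is only a sanity check; the single np.convolve call with the reversed kernel is the entire backward pass through the convolution's input.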