Most of the time, it doesn't pay to optimize your code unless you have an operation or a function that's going to be called a lot. This is the case with convolution. In a one-dimensional convolutional neural network, the convolution operation is run a lot: for every kernel, for every channel, many times through the network, for however many iterations of training. This goes up dramatically when we go to two-dimensional convolutions. Instead of convolving along one axis, we're convolving along two, so however many convolutions we were doing before, we're doing roughly the square of that in a two-dimensional convolution, and roughly the cube of that number in a three-dimensional convolution. It gets very big, very fast, so it's worth being careful and not doing anything really dumb or really slow.

The natural way to implement a convolution is with a for loop. We're doing the same thing over and over again in a slightly different position each time, which is the ideal case for a for loop. The only problem is that for loops in native Python are pretty slow. The most common solution is to vectorize your code: turn everything into arrays, and then use NumPy, the most common tool for this, to multiply those one-, two-, and three-dimensional arrays by each other. NumPy does this very efficiently; it's very good at it.

Convolution introduces some kinks into this. If we want more control over how the convolution takes place, it's not always straightforward to vectorize the code. In order to keep maximum flexibility, we'll step back from relying on NumPy for the convolution itself and write a bit more of a hand-rolled solution. Our favorite tool for this is Numba. It's closely related to NumPy: it's often packaged alongside NumPy, it's distributed with Anaconda, and both projects are supported by NumFOCUS. Numba takes native Python code and compiles it down to machine code that runs very fast. It's a just-in-time compiler; that's where the "jit" in njit comes from. The first time Python encounters this code, it stops what it's doing, takes the whole chunk of Numba-decorated code, compiles it to a block of machine code, and then executes it. Each time it comes back, it can run that pre-compiled chunk, which tends to be very much faster than interpreting the Python code each time through.

The n in njit is shorthand for the argument nopython=True. Usually, if Numba tries to compile a block of code and can't, it falls back to interpreting it as normal Python. That's not what we want here. Here we want to make sure the compilation works, and if it doesn't, we want to get an error. We want this to run fast or not at all, and if it stops running, we can fix it. Putting the njit decorator before a function makes sure that's the case.

So here we have our custom-written convolve1d function. We put @njit above it. That's a Python construction called a decorator, and it tells Python to do the just-in-time compilation on this function when it runs. The function takes in a signal and a kernel. As we've discussed before, by convention the signal is the longer of the two, but convolution doesn't really care which is which; it's symmetric with respect to the two arguments. Then we take the signal size and the kernel size and calculate the total number of convolutions. In this case, we're restricting ourselves to just the valid convolutions, the positions where the kernel completely overlaps with the signal.
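Here is a sketch of that function, assuming NumPy arrays as inputs. The variable names are illustrative rather than the exact code being described, but the steps are the same:

```python
import numpy as np
from numba import njit


@njit
def convolve1d(signal, kernel):
    """One-dimensional valid convolution, compiled just in time by Numba."""
    n_signal = signal.size
    n_kernel = kernel.size
    # Only the positions where the kernel fully overlaps the signal.
    n_conv = n_signal - n_kernel + 1

    # Convolution slides a flipped copy of the kernel along the signal.
    flipped_kernel = kernel[::-1]

    result = np.zeros(n_conv)
    for i_conv in range(n_conv):
        # Sliding dot product: the signal window starting at i_conv,
        # multiplied element by element with the reversed kernel and summed.
        window = signal[i_conv: i_conv + n_kernel]
        result[i_conv] = np.sum(window * flipped_kernel)
    return result
```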
Because of that, the number of valid convolutions is the length of the signal minus the length of the kernel plus one. Then we flip the kernel around. The colon-colon-minus-one indexing construct, [::-1], is shorthand for start at the end, step back one at a time, and stop at the beginning; in other words, reverse the kernel. Then we initialize a result: an array of zeros that's the right length for the final output. Finally, we use our for loop to step through each position, calculate the sliding dot product of the signal with the reversed kernel, put each of those dot products into the correct position of the final convolution result, and return it.
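As a quick sanity check, this hand-rolled version can be compared against NumPy's built-in convolution in "valid" mode. This is just a usage sketch that assumes the convolve1d definition above:

```python
import numpy as np

# A small signal and kernel, just to exercise the function.
signal = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0])
kernel = np.array([-1.0, 0.0, 1.0])

# The first call pauses to do the just-in-time compilation;
# later calls reuse the compiled code and run much faster.
result = convolve1d(signal, kernel)

print(result)
print(np.convolve(signal, kernel, mode="valid"))  # should match
```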