Now we are ready to build a one-dimensional convolution block for our neural network. If you followed along through our 310 course series, especially 312, 313, and 314, where we built Cottonwood, the machine learning framework, from the ground up, you'll be familiar with the terminology of layers. That's very common terminology in neural networks everywhere. Sequential layers stack together, kind of like a stack of pancakes or like a train with cars attached together, all in a row. This is a good approximation of a lot of neural network architectures, but it turns out that there are a lot of things it won't let you do, especially more modern architectures like long short-term memory networks, recurrent neural networks, or even ResNets, residual networks. These have more than just a pancake stack, more than just forward and back connections; they also have a lot of side loops. They're an example of something more general called a graph, where you have nodes, these chunks, these blocks, that are attached to each other, but each block can be attached to more than one other block ahead and behind.

In order to represent these types of networks more generally, I have updated Cottonwood. To reflect this, we no longer refer to these as layers, but as blocks. You can think of them like blocks of wood that can be stacked to build a tower. One block might be resting on top of one other block, or it might be resting on top of two or three others. Similarly, it might have one block resting on top of it, or it might have two or three others resting on top of it. To build these structures of blocks, we now create the blocks individually and then connect them explicitly, and we'll walk through this a little bit later when we build this example. The code supporting blocks and structures I'll walk through later in this course. It's optional. For now, I'll tell you everything you need to use it, and if you want to see how it works later, please feel free to dive into that.

The important part when creating a block is that it has a forward pass and a backward pass, just like we had before with layers. The only difference is in the way these blocks connect to other blocks: a block might take in information from more than one block, and it might pass its result to more than one block. But it still has the basic forward pass and backward pass. These are the functions we derived in the last section, and here is where we'll be implementing them in code.

So we start by creating our block as a class. We'll call it Conv1D, using a nice class naming convention with capital letters and no underscores. You can tell that this was my first time implementing this, because I went through very cautiously and documented every step in detail, being very explicit about the size of each array, the number of dimensions, and the size of each dimension being passed into and returned from each function. This helps me think it through and helps me not trip myself up. I hope it also helps you as you're getting used to these ideas for the first time. To start out, we'll initialize the block. What we need is some kind of an initializer for the weights, just like we needed with dense blocks. We need to specify the size of the kernel, the number of elements in a kernel from side to side, and also the number of kernels; we can choose an arbitrary number of kernels to learn for any given convolution block. We can choose an optimizer as well.
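To make the interface concrete, here's a minimal sketch of the overall shape this block takes. The class name Conv1D, the argument names, and the method names forward_pass and backward_pass are my assumptions for illustration and may differ from the actual Cottonwood code:

```python
class Conv1D:
    """A one-dimensional convolution block (sketch).

    Like every block, it exposes a forward pass and a backward pass.
    How it connects to other blocks is handled outside the block itself.
    """

    def __init__(self, initializer=None, kernel_size=3, n_kernels=1, optimizer=None):
        # An initializer for the kernel weights, the number of elements
        # in each kernel from side to side, how many kernels to learn,
        # and an optimizer for adjusting the weights over iterations.
        self.initializer = initializer
        self.kernel_size = kernel_size
        self.n_kernels = n_kernels
        self.optimizer = optimizer

    def forward_pass(self, inputs):
        # Convolve each kernel with the inputs. Filled in later in this section.
        raise NotImplementedError

    def backward_pass(self, de_dy):
        # Propagate the error gradient back through the convolution.
        # Also filled in later in this section.
        raise NotImplementedError
```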
The optimizer is just like before: the tool that will be used to tweak the weights on those kernels over iterations.

Then, when we go to initialize the block, it's convenient for the implementation to have the kernel be odd-sized, so that it has a central value with the same number of values on either side: 3, 5, 7. It makes the indexing convenient. So here we'll ensure that this is the case: if the block happens to get passed a kernel size that's even, it'll bump it up to the next whole number and make it odd. We initialize all of the parameters that will be important for our block to have.

Notice that for now, the weights are initialized to None, and the bias values are initialized to None. This is because we want our block to be able to look at the input it gets on the first iteration and, based on that input, infer what the number of channels should be and what the number of elements should be in each of those signals; then, using the combination of the input size and the kernel size, it knows what the number of outputs should be. The only thing we need to tell it is the kernel size and the number of kernels to create.

Also, unlike in a dense block, we're going to handle our weights and biases separately, and we're even going to optimize them separately. This is because the weights and biases are no longer exactly the same type of thing. The weights here act on individual inputs, whereas the biases act on combinations of outputs. We'll see how that plays out in a moment. The optimizer is set up to work on an n-dimensional array, and because our weights work across all of the channels while our biases work just across our outputs, they're different-sized arrays. To handle that, we'll create two different optimizers to work on them, but the principles are still the same. Finally, we'll initialize our result value so we can refer to it later.
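Filling in those details, the constructor from the sketch above might look something like this. Again, the attribute names, the default argument values, and the use of copy.deepcopy to give the weights and biases independent optimizer instances are my assumptions for illustration, not necessarily the exact Cottonwood implementation:

```python
import copy


class Conv1D:
    def __init__(self, initializer=None, kernel_size=3, n_kernels=1, optimizer=None):
        # Make sure the kernel is odd-sized so it has a central element
        # with the same number of elements on either side (3, 5, 7, ...).
        # If an even size gets passed in, bump it up to the next odd number.
        if kernel_size % 2 == 0:
            kernel_size += 1
        self.kernel_size = kernel_size
        self.n_kernels = n_kernels
        self.initializer = initializer

        # Defer creating the weights and biases until the first forward pass,
        # when the block can see how many channels the input has and how many
        # elements are in each signal. Together with the kernel size, that
        # determines the number of outputs.
        self.weights = None
        self.bias = None

        # Weights and biases are different-sized arrays and get adjusted
        # separately, so each gets its own copy of the optimizer.
        self.weights_optimizer = copy.deepcopy(optimizer)
        self.bias_optimizer = copy.deepcopy(optimizer)

        # Placeholder for the block's output, so it can be referred to later.
        self.result = None
```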