So, it's time to talk a little bit about autoencoders. What's the setting? We have an input x, and that input gets encoded by an encoder, which potentially has multiple layers. When we talk about linear autoencoders there is of course little point in having multiple layers, but in general we have an encoding into a code z, and then a decoding back into x. Usually the code is smaller than the input, so at best we can hope for a lossy compression. That's the basic architecture.

Now of course we want to train it so that it produces a good code. As we discussed before, there are lots of definitions of a good code; you might first think of variance here, but a very easy one is the mean squared error. The loss is the sum over all components i of the squared difference between the output x_i' and the input x_i, which we can also write as (x' - x)^T (x' - x). We want to minimize that. If we minimize it, we have the code that best allows us to reconstruct the input while only keeping a small vector in between.

Now what's a linear autoencoder? We take the code z in between and say that it is an encoding weight matrix times the input vector x, plus a bias of course: z = W_E x + b_E. The reconstruction is again a linear function of z: we multiply the decoding weights W_D with z and add the decoding bias, so x' = W_D z + b_D. What happens if we have no non-linearity? Let's look into that. We have the linear encoder and the linear decoder, and now it's your turn to train a linear autoencoder in PyTorch. Have different members of your pod try different k, where k is the dimensionality of the code, and let's compare what that does to the results.
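As a starting point for the exercise, here is a minimal sketch of such a linear autoencoder in PyTorch. The input dimension of 784 (flattened 28x28 images), the data loader, and the training hyperparameters are assumptions you will adapt to your own setup; k is the code size to vary across your pod.

```python
# Minimal sketch, assuming 784-dimensional inputs (e.g. flattened MNIST);
# the data loader and hyperparameters are placeholders to adapt.
import torch
import torch.nn as nn

class LinearAutoencoder(nn.Module):
    def __init__(self, input_dim=784, k=32):
        super().__init__()
        # z = W_E x + b_E (encoder), x' = W_D z + b_D (decoder), no non-linearity
        self.encoder = nn.Linear(input_dim, k)
        self.decoder = nn.Linear(k, input_dim)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

def train(model, loader, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # mean squared error between reconstruction and input
    for epoch in range(epochs):
        for x, _ in loader:                # labels are ignored
            x = x.view(x.size(0), -1)      # flatten images to vectors
            x_prime = model(x)             # reconstruction
            loss = loss_fn(x_prime, x)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.4f}")

# Each pod member tries a different code size, e.g.:
# model = LinearAutoencoder(input_dim=784, k=2)   # very lossy
# model = LinearAutoencoder(input_dim=784, k=64)  # much closer reconstruction
```

Note that nn.MSELoss averages over components by default, whereas the loss above was written as a sum; the minimizer is the same, it only rescales the gradients.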