Last time I spoke about GANs, and a lot of people came up afterwards and asked me about super resolution, so I thought tonight I'd do super resolution. For me, though, the real theme of tonight is CNNs: as Martin said, abusing CNNs, hacking CNNs, getting CNNs to do things other than what most people think they're for. Just before I start, can I ask quickly: how many people would consider themselves advanced in deep learning? A few people, okay. How many are total beginners? Okay. And how many are intermediate? I guess we've got roughly half and half of those two. Okay, good. Some of the things I'm going to explain will be really basic for the intermediate people, but I think it's sometimes really good to go through them and remind ourselves of what we're actually doing. The first thing I wanted to show is the concept of a simple neural network, a traditional classification network. We've got our input, which might be a batch of images, going into a bunch of hidden layers, which could be convolutional layers, MLPs, a whole bunch of different things. Then we use a loss function, which takes in the desired result and the result produced by the network, and from those we calculate our loss and optimize our network. Everyone's clear with this, yes? Then, when it becomes an inference model, we just take the raw input, run it through the hidden layers with our saved weights, and come up with an output; there's no training left to do. That's essentially what the TPU announcement was about: the first TPUs were built just for inference. I want you to constantly remind yourself of that picture, because once you start thinking about building a network that way, you can apply it to so many different things, as long as you've got those key concepts. Classification CNNs generally follow a pattern. There are certainly variations, but generally you have your conv blocks, which are convolution layers with some pooling in there and an activation function like a ReLU. The way they work is that as you go deeper there's a reduction in height and width but an increase in the number of filters, so the depth grows. Everyone understands how a CNN works, right? Martin explained that last time. Then at the end we've got these fully connected layers, or dense layers, and we make a prediction. So what we're going to do today is talk about hacking that. The big thing I want you to understand is that CNNs are not just for classification. All right? CNNs really have a lot of magic to them. Don't fall into the trap of thinking of CNNs as just a precursor to dense layers, logits and your predictions. If we actually chop off those fully connected layers, those dense layers, and just keep the conv blocks, then the last layer of the conv blocks, or even an earlier one, gives us all this wonderful information that describes what we're putting in in a whole different way. And I think many of you have a sense of that from Martin's lesson last time and this time, of how CNNs pick up relationships between the pixels.
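To make that "chop off the dense layers" idea concrete, here's a minimal Keras sketch. This isn't the model from the talk; the input size, filter counts and the ten-class output are purely illustrative.

```python
# A plain classification CNN, and the same conv stack with the dense
# layers chopped off so it returns features instead of class predictions.
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

inp = Input(shape=(176, 176, 3))                                  # image in
x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
x = MaxPooling2D()(x)                                             # height and width shrink...
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D()(x)                                             # ...while the filter depth grows
features = Conv2D(128, (3, 3), activation='relu', padding='same')(x)

# Classic classifier: conv blocks, then dense layers, then a prediction
y = Flatten()(features)
y = Dense(256, activation='relu')(y)
out = Dense(10, activation='softmax')(y)                          # 10 classes, purely illustrative
classifier = Model(inp, out)

# The "hacked" version: stop at the conv features and use them directly
feature_extractor = Model(inp, features)
```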
What Martin was talking about tonight (are we back on? yes, we're back on) was the original style transfer paper. At the start, a lot of the ideas for style transfer were built around optimizing things on a pixel basis: how do we take a loss function over pixels? And then people realised, hence that paper, that CNNs can actually help this in a big way. The big thing I want you to take away from tonight is that generative CNN models can be used for almost anything where you want to put an image in and get an image out. I'll talk about that more later on, but constantly be thinking to yourself: if you've got some task where you need to put an image in and get an image out, then generally you're going to want something like this kind of model. So, as I said, an old-style CNN is basically a stack of conv blocks, typically starting with three-by-three convolutions; in fact, I think it's very similar to Dennis's structure for predicting a steering output. A modified CNN is where we cut the dense stuff off and just deal with the layers that come out at the end. Okay, so the paper I'm going through tonight is called "Perceptual Losses for Real-Time Style Transfer and Super-Resolution", and I'm going to focus on the super resolution part of it. This paper is now about a year old; it was done at Stanford, and you may recognise some of the names, such as Fei-Fei Li, who now works for Google. Okay, so this is the model we're going to build tonight. We've got a whole bunch of different things going on. We've got a low resolution image and a normal resolution image; this is our training model. We're going to put the low resolution image into one network, because there are actually three different networks going on here. Our first network, what I've called the SR network, is basically upsampling the image: we put in a certain number of pixels and it generates more pixels than were there. Then we take that and stick it into a CNN, and we take the normal image and stick it into a CNN also. Both those CNNs are constructed exactly the same, and I happen to be using VGG16, which is just a very simple convolutional neural network, with the ends chopped off. We then take the outputs of those two CNNs and use them to work out our loss, and we do not optimize those two CNNs. The VGG networks are non-trainable: we're not trying to train their weights, because they've already been trained on ImageNet, so they understand a lot of things about images. That's what we're hacking, the same idea as style transfer: we're leveraging the fact that those networks already have an understanding of how images work, how lines work, how shapes work. We basically take the output of those, get what we call our perceptual loss, and then use that to optimize the super resolution part, our upscaling network. So it's actually a very simple concept.
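In code, the heart of it really is tiny. Here's a minimal sketch of the perceptual loss, assuming `vgg_features_hr` and `vgg_features_sr` are the same-shaped feature tensors coming out of the two frozen VGG branches; those names are mine, not the talk's:

```python
# Perceptual (feature) loss: mean squared error between the VGG feature
# maps of the real image and of the upscaled image.
from keras import backend as K

def perceptual_loss(vgg_features_hr, vgg_features_sr):
    return K.mean(K.square(vgg_features_hr - vgg_features_sr))
```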
When you see it laid out like this and start thinking about it like this, it can get complicated in code, but just constantly ask yourself: which of the three CNNs are we working with at this point in time? And understand that we're only trying to train and optimize one of them. Okay, the other big thing that's really important is to understand that the two VGG networks are actually part of our loss function. Most people don't think of a CNN as being part of your loss function, but here we're using the VGG networks as exactly that. Okay, so perceptual loss, which I described before: you take the output features, not some prediction of a class or something like that, of both the normal resolution image and the low resolution image that's been upscaled to the equivalent resolution, you work out the difference between those two tensors, and from that you get your perceptual loss, a mean squared error loss in this case. Okay, this is our upsampling CNN. Going from the bottom up, it starts off with a convolution with a nine-by-nine filter at stride one. This is something a little bit unusual, and people ask why. The reason is that with a nine-by-nine filter we're trying to get a very big receptive field going right from the start, so it allows the network to look at big chunks of the image and get a sense of how one part of the image relates to another part. Okay, then we come up into ResNet blocks, residual network blocks; I'll go through what the blocks actually are and give you an example of them, but we've got a couple of those. Then we basically increase the number of pixels by upsampling, and I'll go through that a little bit as well. Then finally we end up with another convolution for the output, and we're using a tanh activation rather than a regular activation, because we actually want the final activation to be between minus one and one, which a tanh function gives us. Okay, so this is the concept of a receptive field. If I walk over a bit, can I do that? You can see that if you stack a three-by-three filter on top of a three-by-three filter, the output actually sees a five-by-five receptive field. So imagine if we're starting with nine by nine: our receptive field is going to be quite big for the amount of inputs. That means that each filter, as we stack up the filters, has the ability to sense something that's going on in that particular part of the image. Okay, let's look at the blocks themselves. The conv block is very simple and very standard: you've basically just got your input going into a Conv2D layer, I'm doing some batch normalization once we come out of that, and then putting it into a ReLU activation function. The res block is a little bit different. I don't think we've talked about residual networks before; it's a different sort of architecture, and I won't go into the maths of it, but basically think of it this way: a res block has two convolutional blocks, and the second one doesn't have an activation function.
Then what we do is we add the first bit, the bit we call the identity, back in at the top. That allows the network to store what it understands at one layer, do some more convolutions, and then join the two together, so it's got the new convolutions, the new understanding, and the old understanding together. The latest and greatest in CNNs at the moment would be what we call DenseNets and U-Nets, which do this on a much bigger scale: they basically pass things around the network so that even at the last layer you're often still receiving things that were seen in one of the earlier layers. That makes them very good for things like segmentation, where you want the network to really understand what's going on with each pixel. But this one is actually a very simple residual block. Okay, so I've talked about this; everything I was talking about there is just in our SR net, the super resolution network. Let's look at code. Okay, so what I've done is I've put this in Keras. How many people have used Keras before? I think when you're starting out, and even when you're just trying to sketch out ideas, Keras can be really good. (Thank you, yes, I need to make it bigger. How's that? Better?) Okay, so I've basically just got my imports, and the dataset I'm using is the celeb faces dataset (CelebA), which is just a big set of pictures of celebrity faces. One of the fun things you can do with this dataset is take the mean of all the faces and get a picture of what an average celebrity looks like, if you want to do that; I'll say some more about that later on. Okay, so basically all I'm doing here is loading in some arrays that I've already pre-built with this information in them, which just makes it faster for training and access, and also easier to pass around. I'm bringing in a few different versions: the originals, which are 176 by 176 pixels; what I call the low res, which is one quarter of that, so 44 by 44; and a set that I call the extra low res, which isn't used in this notebook (it's in the eight-times one, which I'll show you in a minute) and is basically one eighth of the original. So in this network I'm doing four-times super resolution, and I'll show you a network after this that does eight-times super resolution. Okay, so because we're using VGG and it was trained on ImageNet, it already comes with its weights and biases worked out for us, for free. But to use it, we have to subtract the mean of the ImageNet data it was trained on and do our preprocessing, and that's all this preprocessing step is doing. Then we set up our network. So here we've got our conv block and our res block; these are basically just like I described before. We pass in a few things: the input, the number of filters, the filter size, the number of strides, and also whether we're using an activation or not, because remember, on the second conv layer of our res block, as you can see down here, we don't always use an activation.
We use it on the first conv block of a residual module, a residual block, but not in the second one. And you can see exactly what I was saying before: in the res block, all I'm doing is processing the input and then merging it back in, merging by sum; I'm just adding those two tensors together at the end to get our output. Okay, here's where the magic happens: the upsampling block, also known as deconvolution, transposed convolution, or fractionally strided convolution; there are lots of ways to say basically the same thing. Here what I'm doing is just using the Keras upsampling layer. Where a normal convolution might stride two by two and shrink the image, this is effectively striding at a half, so what you get out of it is expanded pixels. And you can see here is the upsampling network: I've got my convolutional block, then res block, res block, res block, res block, then two of the upsampling blocks, and then the last convolution to basically get it down to three colour channels. We use a tanh activation on it, although it turns out the author of the paper has since said that you don't have to; you can get away without it. Because we're using a tanh activation, though, and because my output is going to be an image that I want to be able to show, I need to convert it back into an image. I do that by, because we're between negative one and one, adding one and multiplying by 127.5, to get us back to the zero-to-255 range for a pixel. Okay. One of the cool things about Keras is that it's very simple to make models: you can see here I've basically just told it what I want as my input and what I want as my output, and I've made that into a model. Then straight away I can just do model.summary and it prints out all this wonderful information for me. Here I can see that the shape I'm putting in is 44 by 44 by three channels, so this is our low res image: 44 pixels by 44 pixels by three channels deep. And you can see when it comes out, it comes out at 176 by 176 by three. Now I'm not going to walk through each layer, but you can go through and read what each layer is doing quite easily; this is one of the good things about Keras, that it allows you to visualise a model very quickly. We can see the amount of parameters that we've got going on; it's all there. But the most important thing is that we're going in with a low res image and coming out with a normal resolution one. Okay, now we've got our VGG networks. So we're going to have two networks here, but we're not going to use the entire network. Just as Martin chopped off a bunch of stuff in his example to get access to one of the convolutional blocks, we're doing exactly the same thing, and the block we're going to chop at is one of the early ones. Because if you think about a network trained on ImageNet, the further along you go, the more it's going to start to understand big objects.
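As a rough guide, here's what those building blocks might look like in Keras 2. This is a sketch following the talk's description, not the exact code, and the default filter counts are illustrative.

```python
from keras.layers import Conv2D, BatchNormalization, Activation, Add, UpSampling2D

def conv_block(x, filters, size, strides=(1, 1), activation=True):
    x = Conv2D(filters, size, strides=strides, padding='same')(x)
    x = BatchNormalization()(x)
    if activation:
        x = Activation('relu')(x)                 # ReLU, skipped on the second half of a res block
    return x

def res_block(x, filters=64):
    identity = x                                  # keep the "old understanding"
    x = conv_block(x, filters, (3, 3))
    x = conv_block(x, filters, (3, 3), activation=False)
    return Add()([identity, x])                   # merge by sum (x must already have `filters` channels)

def up_block(x, filters=64):
    x = UpSampling2D()(x)                         # doubles the height and width
    return conv_block(x, filters, (3, 3))

def deprocess(pixels):
    # the tanh output lives in [-1, 1]; map it back to [0, 255] for display
    return (pixels + 1) * 127.5
```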
The point about those later layers, the last few before the dense layers, is that the network knows what a cat looks like, what a face looks like, what an eyeball looks like; it's got a good sense of all those things. We don't really want that. Maybe if we were trying to train on all of ImageNet for our thing we could use it, but really what we want are lines, corners, edges, the small shapes that the early layers pick up. Then the network will be able to find those in both a full resolution image and a low resolution image and work out, ah, this should be that, and it will get a sense that a curved line in a low resolution image looks totally different from a curved line in a normal resolution image. Okay, so as you can see, the VGG network is very small after we chop that stuff off; when I do the summary, we don't have a lot there at all, basically just a bit of convolution and a bit of max pooling. But you can see we've gone from 176 by 176, three channels deep, and we come out, okay, not as wide and not as high, but with 128 filters. And those 128 filters are where it's storing all this magical information, and that's what we're going to use to calculate our loss. Going through Keras: Keras has a really cool thing where anything you want can be turned into a layer using a Lambda, and then you can basically just take that and stick it in a model. That's what I'm doing here: I'm starting to assemble it all together, so that when I do a pass, I'm going through all three networks in the order that I want to go through them. Okay, here's our loss function. And then the final full model that we put together: we're going to be feeding in all these things, we're going to be using Adam, and we're going to be using mean squared error for our loss. We then start training: saving weights, loading weights, and then finally some predictions. So on the left you can see our low res image, and on the right is the 4x prediction from it. And we can see that the model has clearly learned a lot. Now, the interesting thing here, let me just check something... yes, okay. So this one, we can see, has really learned quite a lot, and I haven't done thousands of epochs of training either. I've deliberately limited the amount of training to just enough for it to start learning. If I were going to do this as a production model, I would get as much data as I could and train it for as long as I could. But here we can see that even without a lot of training, it's already starting to pick up stuff. Now this is still on our training data, so you would expect that it should understand it reasonably well. And we can see that it's starting to get the concepts: if you look at the girl's hair here, it's starting to understand that hair is a continuous line, not blocky, and it starts to fill those things out. Here's a good one, all the curves; it's really got a good sense of that curve there on her hair. It's starting to fill in the eyes and to work out that, okay, these blobs of black and grey over here actually should be turned into something.
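To make that wiring concrete, here's a rough sketch of how the three networks could be put together in Keras. It isn't the talk's exact code: `sr_model` is assumed to be the upsampling generator built from the blocks above, and 'block2_conv2' is my guess at the "early VGG block with 128 filters".

```python
from keras.models import Model
from keras.layers import Input, Lambda
from keras.applications.vgg16 import VGG16
from keras.optimizers import Adam
from keras import backend as K
import numpy as np

# ImageNet-style preprocessing for VGG: subtract the per-channel mean
# and flip RGB to BGR (written so it also works inside a Lambda layer)
imagenet_mean = np.array([123.68, 116.779, 103.939], dtype='float32')
def vgg_preprocess(x):
    return (x - imagenet_mean)[..., ::-1]

# Truncated, frozen VGG16: chop everything after an early conv block
vgg = VGG16(include_top=False, input_shape=(176, 176, 3))
loss_net = Model(vgg.input, vgg.get_layer('block2_conv2').output)
for layer in loss_net.layers:
    layer.trainable = False                        # loss network only, never trained

lr_input = Input(shape=(44, 44, 3))                # low res image in
hr_input = Input(shape=(176, 176, 3))              # normal res image in
sr_output = sr_model(lr_input)                     # generated 176 x 176 image

sr_feats = loss_net(Lambda(vgg_preprocess)(sr_output))
hr_feats = loss_net(Lambda(vgg_preprocess)(hr_input))

# Collapse the squared feature difference to one value per filter; training
# pushes these towards zero, i.e. makes the two feature sets match.
feat_diff = Lambda(lambda t: K.mean(K.square(t[0] - t[1]), axis=[1, 2]))([sr_feats, hr_feats])

train_model = Model([lr_input, hr_input], feat_diff)
train_model.compile(optimizer=Adam(), loss='mse')
# train_model.fit([low_res_arr, high_res_arr],
#                 np.zeros((len(low_res_arr), 128)), batch_size=16, epochs=2)
```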
Back to the results. It's worked out that this light colour here, which really is almost the same colour as parts of her skin, is actually her teeth over here. That's comparing to the original: this is the low res input, this is our prediction on the right, again our prediction on the right, and this is the ground truth image here. It's not bad, right? That's still just predicting on training data, so let's predict on test data. So here's an example; does anyone know who this is? Yes, the CEO of Baidu. So this is an image that the network hasn't seen at all, and you can see it's done a pretty good job of it. It's worked out that all these blobs here really are a mouth. Perhaps it hasn't done very well with the light on the side. Here's the ground truth, and this is the prediction. Okay, so here's one of the interesting things about this dataset: if you want to lose wrinkles, if you want to lose anything as a celebrity, just train on enough celebrities and it will just wipe them all away. You will see this more; look at this one. So here's a lady: on the left is the low res input, on the right is the prediction, the 4x prediction, and this is the ground truth for comparison. You can see that it's done a pretty nice job of smoothing out her skin, let's say. But here's the cool thing about CNNs, another one of the things that makes them so magical: they don't care about the size of the input or the size of the output. With a CNN, the size of the input and the size of the output will vary, obviously, if you put a different sized input in, but you can basically mess with them. So what I've done here is I've built a new model; in Keras we have to do this by building a new model, and basically all we're doing is telling it a new pixel size for the input. And what I've done is I'm now predicting on our prediction. So I've taken the image that was a quarter size, blown it up by four times to the normal size, and now we're blowing that up four times again, so we're actually doing a 16x prediction here. And you can see it's not bad. For such little training, it's really got a good sense of hair here; this is our new one on the left, our old one on the right. We can see that it's got a strong sense of hair, a strong sense of curves. What it doesn't have is a strong sense of text: notice all the text in the background, it just doesn't get any of it right, because this dataset had no text in it, so we wouldn't expect it to have learned anything about text. Now if we trained it on all of ImageNet or something like that, it probably would pick up those sorts of things, and if we threw in a lot of, say, commercial images with text and fonts, it would definitely pick them up. So again, this is the original low res image that we started out with, and this is 16 times the resolution. Not bad. Let's look at 8x. So, exactly the same network. All I've done now, and I've done this in a very lazy way to show that you can try these ideas out very quickly, is thrown in one extra block of upsampling, so that we start with 22 and we end with 176. In theory, we should definitely train this model longer than we trained the first one, but you can see this one's done pretty nicely.
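As an aside, that "build a new model with a new input size" trick from a moment ago might look roughly like this. It's a sketch: `build_sr_model` is an assumed helper that stacks the generator's layers for a given input, and any input scaling the generator expects would need to be reapplied between the two passes.

```python
from keras.layers import Input

# Rebuild the same generator at a bigger input size and copy the trained
# weights across; convolution weights don't depend on the image size.
big_input = Input(shape=(176, 176, 3))
sr_model_big = build_sr_model(big_input)             # assumed helper: same layers, new input size
sr_model_big.set_weights(sr_model.get_weights())

first_pass = sr_model.predict(low_res_images)        # 44 -> 176, i.e. 4x
second_pass = sr_model_big.predict(first_pass)       # 176 -> 704, i.e. 16x overall
```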
Back to the 8x results. On the training data, it's really now starting to get her eyes; you're going from that to this. This is where I say we're starting to reach CSI territory. I showed this to my wife and my wife was like, oh, so what? They do that on CSI all the time, right? And I had to explain to her that actually, up until recently, it's been bullshit whenever they did it on CSI or whatever. But we are getting into the territory now where we can take something that really is totally pixelated junk and turn it into something that the model renders as a proper face. Look at that one. That's nice. And this is a perfect example of what I mean by the celebrity face. Remember her first one? She's got a little bit of blemishes. Don't worry, just stick you into the eight-times celebrity dataset; we can fix that. When I showed this to Martin, Martin came up with a great idea: why don't you train on a whole bunch of really old faces, then we could make an app where you just take a selfie and it shows what you're going to look like when you're old. You could turn this into an app and you'd have a celebrity selfie app. But you can see that it really has a very strong understanding of a face here. Okay, that was predicting on training data; now let's look at predicting on test data. Look at this one. Now, remember last time I showed you GANs, and we talked about how in a generative adversarial network the generator is constantly trying to trick the discriminator network? This is a good example of almost a GAN-like image: if I just showed you that, and you'd never seen this lady's face before, you probably would think, wow, that's amazing. And it is amazing, because it looks like a lady and it looks like someone's real face, but it's not the same lady. You look at this one: at a certain point, as humans, we can say, well, that's someone different; we're a little bit in the uncanny valley, where there's something that as humans we detect very quickly. But this would certainly pass a discriminator, I would say, because it certainly looks like a real human face. And it's interesting to look at this one to see what the network has learned. It really has a sense that hair should have a shine on it; if you're a celebrity, you want shiny hair. And it's also learned to fix up errant hairs that stick out. This lady's hair is maybe a little bit messy on the side; no problem, we'll put you through the celebrity treatment, and it smooths it all out. Okay, so let's look now at doing a prediction on the prediction for the eight-times model. We're now getting into 64x territory. This shouldn't work at all, right? If you think about even just your iPhone zoom or something like that, when you're doing digital zoom and you zoom in, it looks like crap. In future, I'm sure you'll start to see these techniques used in phones for things like zoom, for sure. Look at it. It's not right, but it is interesting. It's definitely got a little bit more sense of the lips, it's certainly played with the colour, but there are definitely some quite serious artifacts happening now. But don't forget what we're putting in: we put in a 22 by 22 image, and now we're up to, what is it, 1,400 or so.
So we started with 22 by 22 pixels, and we're now up to 1,408 by 1,408 pixels. It shouldn't work at all when you think about it. So that gives you an example of the code; I will put the code up online so that you can go through it. Where's my presentation? So here's what I was getting at before. If you start to think, well, if I just put a CNN between any image in and any image out, maybe I can train this CNN to do something, then you can, right? You totally can. Colorization: I actually started coding this up, I didn't get it finished to show you, but it's doing exactly the same thing. Basically what we do is we make a dataset: we take a set of images, we write a script that makes a grayscale version of them, we train going from the grayscale to the colour, and we calculate the perceptual loss. The perceptual loss is the really important thing here; it's not a pixel loss. This is the power that makes this stuff work, this concept of perceptual loss. And then we can do colorization. We can do segmentation. Now, for segmentation we'd probably use a slightly different network: we'd probably go down first and then go back up within the same network. The reason we downsample first, on the first half of the network, is so that we get that ability to see as much of the image as possible in one filter. And probably the best architectures for doing this now would be DenseNets, and before that U-Nets; I think there's going to be a big new paper coming out about this soon. But this is very easy to do once you've got this concept going. Other ones: depth perception. If you can train up a network to understand that certain things are at a certain depth, you can use that. Denoising, taking noise out of images; any sort of visual filter that you wanted to create, you could use this. Audio clarity, super resolution for audio, is one of the things that Martin and I have talked a lot about, and audio filters too. If you can take, say, a recording of an electric guitar being played, and then run that exact same recording through a beautiful, nice, big Marshall stack amplifier and record that, you would eventually be able to train something that could kind of emulate it. The challenge there, and someone asked before about audio stuff, is what you use as your loss function and what kind of CNN, because with audio we generally tend to convert it into some sort of spectrogram or something like that, and that's probably not the best way of doing it. I think, going forward over this year, next year, the next couple of years, we'll probably work out better ways to calculate a loss between two pieces of audio, and that will bring a massive advance in things like the Tacotron model and a whole bunch of different things. But anyway, summary. I want you to understand that generative models don't just mean, like, let's make artsy stuff. I happen to think that neural style transfer is really one of the coolest things to play around with, and you can come up with amazing images, but the real point is that you can use these models in your work a lot, any time you need something that goes image in, image out. Again, the power of CNNs goes way beyond classification, and the perceptual loss comes from two CNNs.
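Before wrapping up, here's a rough sketch of the colorization data idea from a minute ago: script a grayscale copy of every colour image and train grayscale-in, colour-out with the same perceptual loss. The Pillow library, the folder argument and the 176-pixel size are assumptions, not anything from the talk's code.

```python
import os
import numpy as np
from PIL import Image

def make_colourisation_pairs(image_dir, size=(176, 176)):
    gray, colour = [], []
    for name in os.listdir(image_dir):
        img = Image.open(os.path.join(image_dir, name)).convert('RGB').resize(size)
        colour.append(np.array(img))                  # target: the original colour image
        g = np.array(img.convert('L'))
        # repeat the single grey channel three times so the generator's
        # three-channel input shape doesn't have to change
        gray.append(np.stack([g, g, g], axis=-1))
    return np.array(gray), np.array(colour)
```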
And again, it's the whole concept of image in, image out. So I challenge you to go home and come up with an idea for this, and ideally come up with an idea that someone hasn't done, because there are probably still hundreds of different things that people haven't done with this kind of model, and you can certainly do it. That's it. I'll put the code up on GitHub later tonight so you can download it, train it, play with it, abuse it, whatever you want. Any questions? So, yes, you can also use MAE as well as MSE, but when I was testing it out, MSE just seemed to work much more easily. Given more time, I would probably play with MAE as well, because certain new papers have shown that there are probably lots of different ways you can do it. For me, the key thing is that you're using these CNNs as kind of a filter to build all these cool feature sets for you, which you then compare against each other, and the loss it gets comes from the things it sees in the image. That's one of the reasons why it doesn't get the text right: in the training data it doesn't see enough text, so it just doesn't think text is important and basically cancels it out. With that dataset, that's totally expected. It would be very cool to do, and here's my challenge to someone, and this would probably be pretty easy to make too: make a network exactly like this that just takes fonts and scales them up. You could probably come up with a really cool way of scaling fonts; that would be something reasonably easy to do and to try out. Other questions? Right, let me look at my preprocessing. Okay, so here I've flipped the channels with a negative-one step, so yes, it is getting back to RGB. Other questions? Yeah, okay, so one of the things that I'm sure people are probably working on already, but here's a startup for you: take this model, take pictures of people walking by in high res and low res, train up a model for that, and sell it to the NSA, sell it to some sort of security force, so that they have the ability to take camera footage and zoom in on it. It's totally doable. And anything where things are being censored with pixelation, you could do this for as well; there's another startup idea for you. You could certainly use it for any sort of thing, you just want different training data. And the cool thing about this is that it's very easy to make training data: you just go and take a bunch of high resolution images and then you use a script to basically downsample them. That's true for all those things I showed you; the segmentation is not quite as easy, but for things like colorization and denoising, it's very easy to make mountains of training data to train on. Sorry, explain to me what you mean... ah, make a high resolution image of the eye and put it in front of the camera? There's your startup idea. Not that I know of, but I would not put it past people. If you could build a training set of enough eyes, close up, you could definitely do this.
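On that point about how easy the training data is to make, a downsampling script really is about all it takes. A sketch, where Pillow, the folder layout and the 4x factor are all assumptions:

```python
import os
from PIL import Image

def downsample_folder(src_dir, dst_dir, factor=4):
    # write a low-res copy of every high-res image; each (low, high) pair
    # then becomes one training example for the super resolution network
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        img = Image.open(os.path.join(src_dir, name)).convert('RGB')
        small = img.resize((img.width // factor, img.height // factor), Image.BICUBIC)
        small.save(os.path.join(dst_dir, name))
```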
So some serious real-world applications of this would be things like microscopes: going from medium resolution to high resolution digitally in a microscope. That would be really useful for things like studying cells, studying blood, all those sorts of things. Again, this gets back to why I say don't think of generative models as purely being artistic models; that's not what it's about. Yeah, look, you could do that with the biometric stuff, you could. I did laugh at something I saw on TV the other night, where they took a picture of a glass that happened to be in the shot, then zoomed in and took the fingerprints off the picture. I don't think you can do that, right? But who knows in the future. I really don't think you can do that. Any other questions? Oh, not much at all, just a GPU. Yes, I bet you could; you could just take a training set and train it like that. There may be better ways. So for example, style transfer for video: the first way of doing style transfer doesn't work so well for video, basically because you get lots of alternative solutions that are not coherent when you string the frames together, if that makes sense. You get lots of alternatives that each make sense when you're just looking at a single still picture: oh yes, this could be a Van Gogh picture and this could be a Van Gogh picture, but this Van Gogh picture has the house being blue and that one has the house being yellow, that sort of thing. But there are a number of papers on how to fix that, basically by stabilizing the model. So yeah, you could do this for video quite easily; it shouldn't be that difficult. Any other questions? One here and then, yeah. If you create enough training data, and that video is reasonably similar to your training data, it'll work. Obviously, and it would be interesting to try, one of the things Martin was curious about was, like, put a cat image in and see what happens. If you're putting different sorts of data in there, it won't work as well, that's for sure. That's where you come to training on all of ImageNet or something like that; then it will be able to work on a whole bunch of different things. But yeah, you could certainly do that. No, it's not fully connected there; the last layer is a convolutional layer. The papers use tanh to basically get it to be between negative one and positive one. That said, like I mentioned before, the author of the paper has since said that they went back and tried it without tanh and it worked pretty much the same. You could, yeah, you can certainly try something like that. Mess with it, right? That's the whole purpose of putting the code up: mess with it, try it, see what happens. Any other questions? All done? Okay, I know we've gone quite late tonight. Thank you for coming. Yes, do you want to come up? So, a few announcements. The next event will be on the 25th, and what we're starting to do now is pick a topic for each night, so like last month and tonight, which was definitely sort of generative.