If you've ever watched the TV show CSI, you may have noticed that while its entertainment value is pretty good, its accuracy in portraying what computers can actually do is not as good. I found this great episode where they say, "We need to find the killer's IP address," and someone responds, "I know, I'll code a GUI in Visual Basic to do that." I have no words. Well, part of me wants to say: have you heard of React Native?

But anyway, the most egregious and consistent offense of this type on TV shows is, of course, the infamous "zoom and enhance," where you have a low-resolution photo of a license plate or the criminal's face, and you need to magically upscale it into a perfectly detailed image you can use to solve the crime. Now, as people who know something about computers, you probably see this and think it's completely impossible. But it turns out that with recent developments in machine learning, it's actually kind of possible. So I'm going to show roughly how that works.

Imagine you're an engineer in the tech department at the police department in CSI. You're sitting in your office questioning your coworkers' sanity, and your boss walks in and says, "Hey, we have this image of a face, and it looks like that. Can you please just enhance it like we do all the time?" And you say, "Well, it doesn't really work that way... okay, we'll give it a shot."

So where do we even start? I know what you're all thinking. No, the answer is not Excel. That was yesterday; we've had enough Excel for one weekend. Seriously, though, what are we going to use? I guess Photoshop. In Photoshop, you can make an image bigger by just changing the number of pixels in it, and it uses a thing called bicubic interpolation, which is a really simple algorithm. All it does is insert new blank pixels in between the pixels you already have.
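To make that concrete, here's a rough numpy sketch of how bicubic upscaling works: each output pixel is a weighted sum of the four nearest original pixels along each axis, with weights from a cubic kernel. This is a simplified illustration, not Photoshop's actual implementation.

```python
import numpy as np

def cubic_kernel(x, a=-0.5):
    # Keys cubic convolution kernel, the standard bicubic weighting
    x = np.abs(x)
    out = np.zeros_like(x)
    near, far = x <= 1, (x > 1) & (x < 2)
    out[near] = (a + 2) * x[near]**3 - (a + 3) * x[near]**2 + 1
    out[far] = a * x[far]**3 - 5*a * x[far]**2 + 8*a * x[far] - 4*a
    return out

def upscale_rows(img, factor):
    # Interpolate along each row: every new pixel is a weighted
    # sum of the 4 nearest original pixels in that row.
    h, w = img.shape
    out = np.empty((h, w * factor))
    padded = np.pad(img, ((0, 0), (2, 2)), mode='edge')
    for i in range(w * factor):
        src = i / factor                      # position in original coords
        base = int(np.floor(src))
        offsets = np.arange(base - 1, base + 3)
        weights = cubic_kernel(src - offsets)
        out[:, i] = padded[:, offsets + 2] @ weights
    return out

def bicubic_upscale(img, factor):
    # Bicubic resampling is separable: interpolate rows, then columns.
    return upscale_rows(upscale_rows(img, factor).T, factor).T
```

Note that at positions that line up with original pixels, the kernel weights collapse to [0, 1, 0, 0], so the original pixels survive unchanged; only the in-between pixels are invented.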
Then, for each new pixel, you fill it in based on the pixels around it. Do that for every pixel and you get a new, bigger image. So we do that, and we get something that looks like that. It's not great, but it's kind of the best we have in the real world. We show it to our boss: nope, it's not going to cut it.

All right, it's 2017, so we have other things we can try, like maybe a neural network. So what is a neural network? All a neural network is is a function that takes some sort of input and turns it into some sort of output, except instead of actually writing the code for the function, we train the function on a bunch of examples. In this case, what we want is a neural network that takes a low-resolution photo and turns it into the high-resolution version of the same photo. And all we need to train that function is a bunch of pairs of (low-resolution face, high-resolution face).

We can actually get those online: there's this great dataset of about 200,000 photos of celebrity faces. That gives us our high-res versions, and for each one we can just downscale it to get the low-res version. That gives us a pretty nice training set.

So we feed each of these pairs into our neural network. For each training pair, we give the low-res version to the network, and it takes its current best guess at what the upscaled version should look like. But because we're in the training process, we also have the real high-res version available. What we want to do on each iteration of training is make the neural network a little bit better at producing the real answer. So we can do a per-pixel subtraction between these two images and see where they're different. You'll notice there's a bright red pixel here: that's because the neural network guessed that the bottom-left pixel is black, but it's actually white in the real answer.
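In code, building that training set is just a downscaling loop. Here's a sketch where simple average pooling stands in for whatever downscaling you'd actually use, and random arrays stand in for the real celebrity photos (the shapes and counts here are made up for illustration):

```python
import numpy as np

def downscale(img, factor):
    # Average each factor-by-factor block of pixels into one pixel
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Stand-in for the ~200,000 celebrity photos (hypothetical sizes)
high_res_faces = [np.random.rand(32, 32) for _ in range(1000)]

# Each training pair: (network input, the answer we want it to produce)
training_pairs = [(downscale(face, 4), face) for face in high_res_faces]
```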
It was way off on that pixel, so we want to tell it, "Hey, fix that next time." If we take the total per-pixel difference over the entire image, that gives us what's called our loss function, which is just the quantity we tell the neural network to minimize as we train it. That makes sense, right? Because if that difference goes to zero, it means our neural network is perfect at reproducing the real high-resolution image every single time. It's never actually going to reach zero, but we train for a while with that loss function, just telling it to minimize the total per-pixel difference, and it can get pretty low.

Great, that was pretty easy, so let's see how it worked out. We give it our low-res image and... hmm, not quite the silver bullet that neural networks were supposed to be. Interesting. We compare it to the bicubic interpolation result, and it doesn't really look that much better. But whatever, we used a neural network, it was cool, so let's show it to the boss. Nope. Not good enough. This is not going to cut it for TV quality.

So we need to figure out what's going on: why didn't the neural network work that well? It actually comes back to our choice of loss function. To see what's happening, it's illustrative to use an example where there's a sharp edge in the image. In the low-resolution image, you can kind of make out that edge. If our neural network goes out on a limb and draws a sharp edge in the high-res image, you can see that it got the edge's position a little bit off from where the real edge is. When that happens, our loss function gives a really high loss, because the sharp edge it drew wasn't exactly where the real sharp edge is. So whenever the network goes out on a limb and tries to draw a sharp image, it's never going to get it exactly right, and it gets heavily penalized for that.
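You can see this effect with actual numbers. In this toy 1-D sketch, using mean squared error as the per-pixel loss, a sharp edge drawn one pixel off scores worse than a cautious blur, so the loss rewards blurriness:

```python
import numpy as np

def pixel_loss(guess, target):
    # Total per-pixel difference (mean squared error here)
    return float(np.mean((guess - target) ** 2))

target  = np.array([0.0, 0.0, 1.0, 1.0])    # a sharp edge in 1-D
shifted = np.array([0.0, 1.0, 1.0, 1.0])    # sharp edge, one pixel off
blurry  = np.array([0.25, 0.4, 0.6, 0.75])  # a hedging blurry ramp

sharp_loss = pixel_loss(shifted, target)  # 0.25
blur_loss  = pixel_loss(blurry, target)   # about 0.11, so the blur wins
```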
So what we end up incentivizing the network to do, with the loss function we chose, is to generate a blurry blob that will never be too far off from the real answer; the blurriness limits how far off it can be. But that's not what we want for our TV-quality perfect face. We want it to generate a realistic, sharp image. So how can we do that? We want a new loss function that still minimizes the total per-pixel difference, but also makes the guess look sharp and realistic. In other words, we want to add something else to the loss.

And that's where we come to generative adversarial networks, or GANs. This is a cool idea that was invented in 2014 by a grad student named Ian Goodfellow, and it was inspired by adversarial games. As an example of an adversarial game, imagine people counterfeiting money. The counterfeiters make some fake bills; they look okay and they work for a while, but eventually the police catch on and develop better techniques for catching those counterfeit bills. Then the counterfeiters catch on to that and develop better counterfeit bills, and the police catch on to that, and so on. You see this sort of adversarial dynamic in lots of different places, like online fraud or DRM in music.

So what if there were a way to apply this idea to our neural network? It turns out there is. We take the same low-res/high-res training examples, and we take the neural network we had before and call it the generator; it's like the counterfeiter. It's still trying to produce that high-res image, and we still have our real high-res image. But the difference is that instead of just doing the per-pixel difference, we add a new neural network into the system called the discriminator, which is like the police.
The discriminator gets an image as input, and its output is the probability that the image is a real high-res image versus a fake high-res image generated by the generator. We train it to get really good at that; its goal is to get it right every time: is this real or is this fake? We just randomly give it either a real or a fake image each time. The generator, on the other hand, now has the goal of minimizing the accuracy of the discriminator, of fooling that other neural network. And what that ends up doing is incentivizing the generator to produce images that look realistic and indistinguishable from real ones.

The one last step is that every time we train, the discriminator gives feedback back to the generator: "Hey, I detected that this image was fake, and here's how you can make it a little bit better next time so it's harder for me to tell." It's as if the police were telling the counterfeiters what gave them away, and the counterfeiters were learning from that.

So we set that up and train for a while, and you can see that as we train, the results get better and better. They start out looking pretty weird, but as we go they get sharper and converge on something that looks pretty realistic. And now let's see how it works. We pull out just our generator network, which has been trained in the GAN structure, and... isn't that pretty cool? We've essentially imagined this face out of just those few pixels. If you compare it to the result from the earlier neural network, it's way better. All right, the boss is happy.

We can see what this looks like with a couple of other faces, and it produces results that look pretty realistic. But if you look at what the original high-res image actually was, you can see that it doesn't quite match up. And that's kind of the catch here, right?
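Here's a tiny runnable caricature of that training loop. Everything is shrunk down for illustration: the "real images" are just the number 1.0, the generator's entire output is a single parameter theta, and the discriminator is one logistic unit D(x) = sigmoid(a*x + b). None of this is a real architecture; it just shows the alternating updates, and the gradient feedback flowing from the discriminator back into the generator:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy stand-ins (all hypothetical): real data is the constant 1.0,
# the "generator" is one number theta, starting far from the truth.
theta, a, b = -1.0, 0.5, 0.0
lr = 0.1

for step in range(100):
    # --- discriminator update: push D(real) toward 1, D(fake) toward 0 ---
    p_real = sigmoid(a * 1.0 + b)
    p_fake = sigmoid(a * theta + b)
    grad_a = -(1 - p_real) * 1.0 + p_fake * theta   # d(disc loss)/da
    grad_b = -(1 - p_real) + p_fake                 # d(disc loss)/db
    a -= lr * grad_a
    b -= lr * grad_b

    # --- generator update: change theta so D(fake) goes up ---
    p_fake = sigmoid(a * theta + b)
    # The gradient flows *through* the discriminator (the factor a):
    # this is the "police telling the counterfeiters" feedback.
    grad_theta = -(1 - p_fake) * a
    theta -= lr * grad_theta

# theta gets pulled away from -1.0 toward the "real" value 1.0
```

Real GAN training works the same way structurally, just with deep networks, image tensors, and batches, and it's famously much touchier to get to converge than this toy suggests.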
The information needed to reconstruct the face isn't actually in the image, so we're just making it up. But we don't really care about that; it's just TV. It doesn't really matter if we catch the wrong person. So it's not CSI-ready in the real world yet.

I also tried it on my own face, and found out that it doesn't always work that well. It gave me a really long nose and no mouth; I don't know why. The other catch is that we only trained it on faces. So if we try to upscale, say, the !!Con logo, it's going to try to make a face out of it. I think I see a face in there somewhere; I'm not really sure.

People have done all sorts of crazy stuff with this technique; it's blowing up in ML these days. You can dump a bunch of photos of zebras and horses into a GAN and it'll learn how to convert between them. That doesn't always work either. Nice try, I guess. What if the input was a sketch instead of another photo? You can generate a realistic photo of a cat; you may have seen this demo floating around online, and it uses GANs. I tried drawing a cat myself and it actually worked pretty well. And one other crazy example out of many: what if you take captioned images and try to go from text describing a bird straight to a photo of that bird? People have actually gotten that to work.

This is just scratching the surface of what people are doing with GANs these days. It's really cool, and I'd encourage you to go on Google and check out some of the other stuff that's happening. That's it. Thanks.