Today, we will learn how we can use generative adversarial networks, GANs, to produce much more convincing examples when it comes to images. We will talk about the ways in which we otherwise fall short of producing good probability distributions, about specialized settings in which GANs are particularly useful, and about the many tricks that are needed to get GANs to actually work, and why that is generally difficult.

So let's first talk about the history of the field. It has improved dramatically. In 2014, the state of the art was simple grayscale images like this, and that was hard; that was the state of the art. Over time, we got to really, really realistic images like these. It's truly impressive to see. I should also mention Ian Goodfellow here. He's a great person to follow: he started this line of work in Montreal, has been driving the progress of GANs for a very long time, and is a big inspiration for me.

Let's also talk a little bit about GANs in the news, because I want to get across how important they are. MIT Technology Review did a nice feature on the "GANfather", which is Ian Goodfellow. Even in the New York Times you see headlines like "Internet companies prepare to fight the deepfake future." So it's something that matters in the news, and you may have seen the videos where people put rather funny words into the mouth of our former president.

So how do we know how good GANs are? Well, Photoshop now has GAN filters that allow you to do things like change the facial expression on someone's face. And not only that: Adobe also now provides a tool to do the opposite, to prove to the world that you actually took a photograph rather than generated it. Isn't it funny? In the past, no one would have asked you to prove that a photograph is real; a photograph was obviously real, and you would instead have had to prove that you drew something by hand. Now everything has changed.
Fake text is being used for customer reviews. We have Snapchat filters that use GANs, and countless more applications. In fact, the prospect of GANs being too convincing has gotten so big that deepfake detection was a challenge on Kaggle with prize money of $1 million. So it's literally the million-dollar question how to detect that something is fake.

Now let's briefly put this into a broader context. On one side, there are discriminative models. This is standard machine learning, standard deep learning: we put in features, maybe an image, and out comes a class. We have dogs on one side and cats on the other side, and we use the features to classify them. In a sense, we are interested in a conditional probability distribution: the probability of the class given the features.

Then there is the other possibility, generative models, where we could say: I give you some noise and a class, and I ask you to produce the features. I give you some noise and you have to draw the picture of a cat. In that case, we are interested in the opposite conditional: the probability of the image given that we tell our system it should be a cat. There are lots of different cats, and ideally a generative model should produce every possible image with the corresponding probability, which we sketched here as a Gaussian. Obviously, cats do not fall into a neat, simple Gaussian distribution; the distributions we encounter in the real world are very complicated.

Now, what is the idea of GANs? The idea is that we have a generator that draws images: look, here's a nice image. The generator will try to produce images that look like the real images. That image is then shown to the discriminator, and the discriminator might say: that's so fake, I don't believe this, this isn't a real image.
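To make the discriminative-versus-generative distinction concrete, here is a minimal numpy sketch. It is a toy stand-in for the cats-and-dogs picture in the lecture, not anything from the slides: each class is modeled by a simple one-dimensional Gaussian over a single "feature", the discriminative direction computes p(class | feature), and the generative direction samples features given a class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: one feature per animal, class 0 = cat, class 1 = dog,
# each class modeled as a unit-variance Gaussian around its own mean.
mu = {0: -2.0, 1: 2.0}
x_cat = rng.normal(mu[0], 1.0, 500)
x_dog = rng.normal(mu[1], 1.0, 500)

def discriminative(x):
    """p(class=dog | feature x): compare class-conditional densities (equal priors)."""
    p_cat = np.exp(-0.5 * (x - mu[0]) ** 2)
    p_dog = np.exp(-0.5 * (x - mu[1]) ** 2)
    return p_dog / (p_cat + p_dog)

def generative(cls, n=1):
    """Sample features given a class label, i.e. draw from p(feature | class)."""
    return rng.normal(mu[cls], 1.0, n)
```

A feature near the dog mean gets a high dog probability from the discriminative model, while the generative model, asked for dogs, produces features scattered around that same mean, one sample per draw.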
Now, there's something subtle hidden here: the generator might want to produce good images, but that's not enough. If the generator always produced the same image, the discriminator could still figure out that that image must be fake, because it has seen that same image so many times. So it's a really interesting game: the generator shouldn't just produce a single convincing image, which is part of it, but a whole distribution of images that matches reality.

Now, let's talk about how it works. We have a generator, and the generator constructs an image. Its reward, in a way, is to not get caught: the generator wins if the discriminator cannot distinguish its output from a real image. So the output of the generator is a fake image; that fake image is then handed to the discriminator along with a set of real images, and the discriminator is asked: is this a real image?

The discriminator's job looks like this. The generator hands out fake images, so the discriminator sees a mix of fake and real images, and for every one of them it has to say: is this real or is this fake? The reward for the discriminator is not getting fooled.

Now let's be more specific about the model. The generator G takes as input a vector z. That gives it randomness, which is necessary because the real data form a whole distribution, not a single image. The discriminator then has to tell the difference between the fake images and the real images. At first, when we have a really bad generator, this should be easy: a real image looks very different from, say, white noise. Then, through training, the generator draws progressively better images, and it gets harder.
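Putting the two reward functions together, this game is usually written as one minimax objective; this is the standard formulation from the original GAN paper, not something on the slide:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator D pushes this value up by being right about both real and fake inputs, while the generator G pushes it down by making D(G(z)) large, i.e. by not getting caught.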
And this forces, in a way, the generator to produce progressively more realistic images, and the discriminator to get better and better at figuring out the many ways in which a real image might differ from a fake one. And that is, in a way, the crux here: there are lots of different real images, and a good generative model would understand the whole manifold of images that is out there.

Now, what is the discriminator's loss function? The loss for the discriminator starts with a factor of 1/m; this is just because we want an average. Then we take the log of what the discriminator says for the stimuli that are real: y_i is 1 if sample i is real, so the first sum collects the log probabilities for the samples that are actually real. Then there is the second sum, where 1 - y_i is 1 if the sample is fake and 0 otherwise, so it is effectively a sum over the fake samples of log(1 - D). So basically it means we want the discriminator to assign high probability to the real images and low probability to the fake images.

Now, let us build a discriminator. We start with the discriminator and give it a couple of animal faces, hooray week 3, and some random images, and let's train it so it can distinguish the two. We will use just a very simple multi-layer perceptron, and somehow this should be possible; after all, white noise images look very dissimilar to real images.
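Here is a minimal numpy sketch of exactly this step: the discriminator loss just described, plus a tiny one-hidden-layer MLP discriminator trained with plain gradient descent. The data is a synthetic stand-in I made up for the animal faces, "real" samples are smooth gradients with a little noise, "fake" samples are white noise, so this is an illustration of the setup, not the lecture's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator_loss(p, y):
    """L_D = -(1/m) * sum_i [ y_i*log(p_i) + (1 - y_i)*log(1 - p_i) ]."""
    eps = 1e-8  # guard against log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Toy data: "real" images are smooth 16-pixel gradients plus small noise,
# "fake" images are pure white noise. Labels: real = 1, fake = 0.
n, d = 200, 16
real = np.tile(np.linspace(0.0, 1.0, d), (n, 1)) + 0.05 * rng.normal(size=(n, d))
fake = rng.uniform(0.0, 1.0, size=(n, d))
X = np.vstack([real, fake])
y = np.concatenate([np.ones(n), np.zeros(n)])

# One-hidden-layer MLP discriminator with a sigmoid output.
W1 = 0.1 * rng.normal(size=(d, 8)); b1 = np.zeros(8)
W2 = 0.1 * rng.normal(size=8);      b2 = 0.0

def predict(X):
    h = np.tanh(X @ W1 + b1)
    return h, 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

# Full-batch gradient descent on the discriminator loss (manual backprop).
for _ in range(2000):
    h, p = predict(X)
    g = (p - y) / len(y)                 # dL/dlogit for sigmoid + this loss
    gW2, gb2 = h.T @ g, g.sum()
    gh = np.outer(g, W2) * (1.0 - h ** 2)
    gW1, gb1 = X.T @ gh, gh.sum(axis=0)
    W1 -= 1.0 * gW1; b1 -= 1.0 * gb1
    W2 -= 1.0 * gW2; b2 -= 1.0 * gb2
```

After training, the MLP should separate the two sets almost perfectly, which is the point of the lecture's remark: white noise is so dissimilar to structured images that even a very simple discriminator suffices.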