Hey guys, so I'm making this video because of a tweet that I saw by Ian Goodfellow. Now Ian Goodfellow is an AI researcher at Google Brain, and he's responsible for creating something called a generative adversarial network, back in 2014. Now GANs are responsible for creating those fake pictures of people who don't exist. Over the past year we've also seen the rise of deepfakes. Now deepfakes are videos where someone's face is plastered onto some other video, so it creates a video of a person doing something that they didn't actually do in real life. It's quite eerie to see how we're able to doctor images and videos in this way, but at the same time it's also incredible to see how far the technology has advanced in just five years. But how did we get here? We're gonna talk about just that, and this video won't be too technical, so if you're curious about AI but don't really know the main mechanics behind many applications in deep learning, then don't worry about it. You'll still be able to follow along. I'll probably throw in some extra tech jargon in there for the extra curious viewers. This is Code Emporium, so let's get started. In this video, we're going to take a look at how to create images of people who don't exist. Now these face generators are built on a fundamental structure called a neural network. These networks take information in at one end, perform some processing in the middle, and spit out a result. If you want to solve simple problems, then simple networks are good enough. A simple problem could be: you throw an email into a network, and it determines whether the email is spam or not. Pretty simple. But the problem of face generation that we're talking about is complex. It's complex because the network first needs to read an image as input, then needs to determine the features of your face, like the position of your eyes, the nose, the mouth.
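To make that "information in, processing in the middle, result out" idea concrete, here's a deliberately tiny sketch of the spam example, just one artificial neuron in plain Python. The features and weights are completely made up for illustration; a real spam filter would learn its weights from thousands of emails.

```python
import math

def spam_score(features, weights, bias):
    """One artificial neuron: a weighted sum of the inputs, squashed
    by a sigmoid into a score between 0 and 1."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid: maps any number into (0, 1)

# Hypothetical features for one email: [mentions money, has many links, sender is known]
email = [1.0, 1.0, 0.0]
# Hypothetical weights a trained network might have ended up with
weights = [2.0, 1.5, -3.0]
score = spam_score(email, weights, bias=-1.0)
label = "spam" if score > 0.5 else "not spam"
```

A real network stacks many of these neurons into layers, but each one is doing exactly this: weigh the inputs, sum them up, squash the result.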
It needs to determine the texture of facial features, determine the contours of these features, and so much more. The neural network will only be able to generate its own faces once it understands how faces work. As you can imagine, this could take a lot of time, a lot of processing power, a lot of data, and a complex neural network. So complex neural networks are required to solve complex problems. Now neural networks aren't anything new. They've been around since the 1940s, but we never had hardware with lots of processing power until fairly recently. I'm 23, and I grew up in an age where we stored stuff on floppy disks. And come on, a floppy disk is a piece of plastic that holds about 1.44 megabytes of data. And this was a thing 15 years ago. So yeah, the processing power of computers grew exponentially over the last decade. And this fact is important because neural networks need a ton of data and a ton of processing power. Without those, they were virtually useless. It wasn't until around 2010 to 2012 that neural networks exploded in use. Before that, they were only good in theory, but then they started outperforming everything. Language translation: you feed in an input English sentence and the system spits out an output French sentence. The state of the art is a neural network. Google Translate uses a neural network for its translations. Security and defense systems: the state of the art is a neural network. Image captioning: feed an image to the system, and it spits out a description of the image. The state of the art is once again a neural network. And the list of applications just goes on. In all of these applications, neural networks outperform everything else. This is also the case with the application I said we would be talking about: face generation. Like you would guess, neural networks are the state of the art for this problem too.
The neural net that made the major breakthrough by generating decent faces was the generative adversarial network, or GAN. These GANs are a type of neural network, and it's really interesting to note how they work. A typical neural network takes some information, processes it, and spits out a result. GANs make this interesting by turning it into a game of cops and robbers. So a GAN is a neural network that consists of two sub-networks. One of them is the robber and the other is the cop. Now when I say robber, I'm not talking about a guy who steals stuff; he's more of a counterfeiter instead. So in this context of face generation, the counterfeiter network creates fake faces, and it is the job of the cop to catch them. So the cop looks at some face image, and it should try to tell which images are counterfeits and which ones are the real thing. The cop and counterfeiter play a game where they take turns. It's round one. The counterfeiter starts by generating a face image, and it puts this image into a pile containing two types of images: images of real people found online, and the images it generated, the fake images. Now it's the cop's turn. The cop takes an image from the top of the pile and answers a question: is this image real or fake? If the cop answers the question correctly, then he wins. Otherwise the counterfeiter wins. After the round, the loser will tweak itself ever so slightly to improve its performance. So if the cop network loses, then it tweaks its network to become slightly better at detecting fakes. And if the counterfeiter network loses, then it tweaks its network to become slightly better at generating fakes. So after thousands of rounds, the cop becomes better and better at spotting fakes, and the counterfeiter becomes better and better at generating fakes. And once the game is done, we just ask the generator, or counterfeiter, to generate an image. And this image should look just like a realistic face.
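The cop-and-counterfeiter game above can be sketched in a deliberately silly one-number version. This is not a real GAN — here a "real face" is just the number 1.0, each network is a single knob, and the loser's "tweak" is a tiny nudge — but the turn-taking dynamic is the same idea.

```python
import random

random.seed(0)

REAL = 1.0   # pretend every "real face" is just the number 1.0
g = 0.0      # counterfeiter's knob: the fake value it produces
t = 0.5      # cop's knob: it calls anything above t "real"
LR = 0.01    # how much the round's loser tweaks itself

for _ in range(5000):
    # the cop draws either a real image or the counterfeiter's fake
    image, is_real = random.choice([(REAL, True), (g, False)])
    cop_says_real = image > t
    if cop_says_real == is_real:
        # cop wins: the counterfeiter nudges its fake toward the real data
        g += LR * (REAL - g)
    elif is_real:
        # a real image was called fake: the cop lowers its bar a little
        t -= LR
    else:
        # a fake slipped past as "real": the cop raises its bar a little
        t += LR

# after thousands of rounds, the counterfeiter's output sits near the real data
```

In an actual GAN, each knob is millions of network weights and the "nudge" is a gradient-descent step, but this is the whole adversarial loop in miniature.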
Now this is the main idea behind generative adversarial networks. If you understood this, you understood the main point behind the landmark paper of 2014. So yay. All along, I have addressed the cop and the counterfeiter as boxes. But what exactly are these boxes? Like I said before, they are neural networks. But not just any neural networks; they are simple neural networks called multilayer perceptrons. These simple networks are typically used to solve simpler problems, not complex problems like the face generation we're doing now. If that's the case, then how are we able to get such results? It's because we use multiple simple networks to create a complex network. This isn't too efficient, and we can still see the image quality isn't that great. So in 2015, the researchers Alec Radford and Luke Metz thought: instead of using simple networks to make a complex network, let's use slightly more complex networks to make an even more complex network. So instead of being simple multilayer perceptrons, the cop and counterfeiter now become more complex convolutional neural networks. And this entire network was appropriately named the deep convolutional GAN, or DCGAN. This was a straightforward advancement. We have the problem of dealing with images. Convolutional networks work well with images. So let's use convolutional neural networks. And this showed more promising results. Soon after, we saw the introduction of another GAN, called the coupled GAN, or CoGAN. Instead of using one counterfeiter and one cop learning to create fake images, we use two cops and two counterfeiters. So we have two simultaneous games going on in every round. Here's the idea: the counterfeiter networks share information with each other, but each also needs to slightly tweak itself to fool its corresponding cop. The result is that we end up with counterfeiters that learn to counterfeit images with slightly different features.
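Why do convolutional networks work well with images? Because their basic operation, convolution, slides a small filter over the image and reacts to local patterns like edges, wherever they appear. Here's a minimal from-scratch sketch (valid padding, stride 1, made-up filter values) — real DCGANs stack many learned filters, but each one works like this:

```python
def convolve2d(image, kernel):
    """Slide a small filter over an image; each output pixel is the
    weighted sum of the patch currently under the filter."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# A tiny "image" with a vertical edge, and a hypothetical edge-detecting filter
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]
edges = convolve2d(image, kernel)  # lights up only where the edge is
```

The same small filter is reused at every position, which is exactly the kind of structure faces have: an eye looks like an eye no matter where it sits in the frame.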
So it can simultaneously generate a person with blonde hair and the same person with brown hair. It can also generate a person without glasses and a version of the same person wearing glasses. That's pretty neat. Now, over time, we've seen different types of GANs learning to generate faces. But all of them have the same problem: the images aren't exactly high quality. This is because the cop would easily tell that an image is fake if the counterfeiter tried to generate high resolution images. Since the counterfeiter knows it's going to be too easy for the cop to tell the difference, the counterfeiter just makes sure the quality of the image stays on the lower side. Clearly, this has its disadvantages: we cannot get high quality images from our GAN. However, this all changed in 2018 with NVIDIA. Usually the cop and counterfeiter play thousands of rounds using images of similar quality throughout the game. But now, we start the cop and counterfeiter out as simple networks for, say, just the first 100 rounds. Because the networks are simple, the generator will only be able to generate low resolution images, and the cop won't really be that good at telling the generated images apart from the real images. We're also using real images of low resolution at this stage, too. After the first 100 rounds or so, we make both networks slightly more complex, probably by just adding an additional layer, and use slightly higher resolution images. So progressively, as the rounds go on, the generator generates higher and higher quality images. And finally, we get results that are difficult even for humans to discern. All the images you see here are synthesized faces. Everyone is fake. None of these people exist. You need to look really hard to tell them apart from real people. And this progressive growing addressed a fundamental problem with GANs: the lack of image quality. So clearly, this is pretty cool. But the researchers didn't stop here.
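The progressive growing schedule described above can be sketched as a simple loop: each stage pretends to add one layer to both networks and doubles the image resolution. The stage length and resolutions here are illustrative placeholders — NVIDIA's actual method also fades new layers in smoothly rather than adding them abruptly.

```python
def progressive_schedule(start_res=4, final_res=1024, rounds_per_stage=100):
    """Each stage: one more layer for both networks, double the resolution.
    A toy illustration of the progressive-growing training plan."""
    stages = []
    res, layers = start_res, 1
    while res <= final_res:
        stages.append({"layers": layers,
                       "resolution": res,      # images are res x res pixels
                       "rounds": rounds_per_stage})
        res *= 2
        layers += 1
    return stages

schedule = progressive_schedule()
# stages run 4x4 -> 8x8 -> 16x16 -> ... -> 1024x1024, one extra layer each time
```

Both networks grow on the same schedule, so the cop is never so far ahead that the counterfeiter gives up on high resolution — which was exactly the quality problem described above.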
It's amazing to generate these images, but it would be even more amazing to have control over the images being generated. Say we want a face with brown hair that's smiling; then we should be able to pass this into the network, and the network should spit out a counterfeit face with those same features. This transfer of style to an image was done by using a slightly more complex counterfeiter. Before, the counterfeiter was just a typical convolutional neural network that generates an image, but now the network has more components that allow us to define the kind of image we want to generate. The result: we have so much more control over generating high quality images, so much so that we can even create a database of these images. Since their inception in 2014, it's really amazing to see how far we've progressed in face generation with GANs. And note, these same networks can be used to generate all kinds of data. They can learn to generate literature like Shakespeare. They can learn to paint like Van Gogh. They can even learn to compose any kind of music to your taste. If you want to know more technical details about the neural networks I've discussed, then check out the other videos on my channel. But that's all I've got for you now, and I hope you learned something from this video. Show us some love with a like, and subscribe for more deep learning videos. Bye bye!