Okay, the next presentation is from Wenxin Zhao and Yao Zhao. They bring us high-frequency targeted attacks, the method they used to win the CAAD CTF yesterday. Thank you, Hai Bing.

Hi, my name is Yao Zhao, and this is my friend Wenxin Zhao. We are both NLP researchers, and in our spare time we work on adversarial attacks and defenses for neural networks. Today we are going to talk about our method, high-frequency targeted attacks, which we used in yesterday's CAAD competition. In the first half I will introduce some basic concepts of adversarial attacks and defenses, and in the second part we will talk about our techniques in the competition.

Neural networks are becoming a lot more popular in image classification and are deployed in many commercial systems. When an image is given to a neural network, the network takes in the raw pixels, computes activations through many hidden layers, and outputs a final label for the image. In a popular setting like ImageNet, there can be a thousand possible labels. An adversarial attack against a neural network applies a small perturbation to the input image so that the network's prediction flips to another class. In this example, we changed the correct label from Snail to Fox.

There are generally two types of adversarial attacks. The first is the non-targeted attack, which simply changes the correct label to any incorrect label, without a specific target. The other is the targeted attack: given a target class, we perturb the image so that it is misclassified as that specific target.

The most popular method of constructing adversarial images is the gradient-based attack. Given an input image and the neural network, we compute the loss through the network and backpropagate the gradients to the image. If we then perturb the image in the direction opposite to the gradient, we obtain an adversarial image that can fool the original neural network. A more powerful variant is the iterative attack, which applies the same gradient step again and again over many iterations; as you can see in the curve, the more iterations we apply, the higher the attack's success rate. (A sketch of this iterative attack follows below.)

In realistic systems there are white-box attacks and black-box attacks. In a white-box attack, the attacker has access to the model weights, so the gradients can be computed exactly and the gradient attack applies directly; the success rate is usually very, very high. In the black-box case, the model weights are not accessible to the attacker, so to attack successfully we need to either guess which network the defender is using, or ensemble many neural networks and attack them all at the same time. An ensemble attack works like a single-network attack: we add up the loss functions of many different neural networks, backpropagate the gradients through all of them at once, and apply the same gradient-based attack as before (see the second sketch below).

In this competition we focused on the targeted attack, and targeted attacks have a specific property: an attack crafted against one model usually does not transfer to a different model. On this slide, the rows and columns are different source and defense models; the adversarial images only succeed against defenders that use the same model, and rarely transfer to new defenders.
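To make the gradient-based and iterative attacks concrete, here is a minimal sketch of a targeted iterative attack in modern TensorFlow. This is our illustration, not the speakers' code: `model` is assumed to be any classifier returning logits, and the epsilon and step sizes are typical choices, not values from the talk.

```python
import tensorflow as tf

def targeted_iterative_attack(model, image, target_class,
                              eps=16 / 255., step=1 / 255., steps=20):
    """Targeted iterative gradient attack: repeatedly step *down* the loss
    toward the target class (i.e. opposite to the gradient), keeping the
    perturbation inside an eps-ball around the original image."""
    x_orig = tf.identity(image)                 # image batch scaled to [0, 1]
    x = tf.identity(image)
    target = tf.fill([tf.shape(image)[0]], target_class)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x)
            logits = model(x)
            # Cross-entropy toward the *target* label; minimizing it pulls
            # the prediction to the target class.
            loss = tf.keras.losses.sparse_categorical_crossentropy(
                target, logits, from_logits=True)
        grad = tape.gradient(loss, x)
        x = x - step * tf.sign(grad)                         # move toward target
        x = tf.clip_by_value(x, x_orig - eps, x_orig + eps)  # small perturbation
        x = tf.clip_by_value(x, 0.0, 1.0)                    # stay a valid image
    return x
```

With `steps=1` and `step=eps` this reduces to the single-step gradient attack; more iterations give the higher success rate shown on the curve.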
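The ensemble attack can be sketched the same way: sum the weighted losses of several models so that the gradient flows back through all of them at once. Again a hedged illustration building on the sketch above; `models` and `weights` are assumed inputs, not names from the talk.

```python
def ensemble_target_loss(models, weights, x, target):
    """Weighted sum of each model's loss toward the target class. Using this
    in place of the single-model loss in the loop above backpropagates
    through every network in the ensemble simultaneously."""
    total = 0.0
    for model, w in zip(models, weights):
        logits = model(x)
        total = total + w * tf.keras.losses.sparse_categorical_crossentropy(
            target, logits, from_logits=True)
    return total
```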
Now Wenxin is going to talk about the competition and the method and system we used.

Thank you for the introduction. Something I want to add, for example here: you can see that especially for a targeted attack, the adversarial image is not transferable, which means you have to guess the defender's model. The first observation is that it is really expensive to train a new model, so we think that in practice, people working on ImageNet will use existing pre-trained models instead of training their own. There are only a couple of dozen such models out there. So the question is whether we can attack them all; if we can, then with high probability we can attack any system, because we assume people are defending with some ensemble or combination of those models. That was our assumption, and it basically turned out to be the case. The other point is non-transferability: if somebody is using Inception V3 and we don't have that model, it is really hard for us to build an adversarial image that attacks it without using that model to generate the image.

So what matters in this competition? We are allowed to submit an attack every six seconds; that is our budget. The competition runs for 30 minutes, which means we can try about 300 times, or maybe 200 in practice. So the key is that we want to try many different ensemble combinations to generate images, and we want to do that really fast.

How? Basically it is quite simple: we run a multi-threaded program. One thread controls submission. Generated images are put into a double-ended queue, fed by both an automatic generator and a manual generator of adversarial images. The automatic generator has some prefixed ensemble combinations, about 50 of them, and tries them all, fully automatically. (A sketch of this pipeline follows below.)

To make it run fast... I mean, the technical detail is that we use TensorFlow, and TensorFlow is pretty slow at building a graph. Building the graph takes about 30 seconds, so you don't want to build a new graph for each iteration. We want to reuse the graph but change the ensemble. Each ensemble has weights, and we feed those weights as an input rather than baking them into the graph (see the second sketch below). The bad thing about TensorFlow is that if you don't want to use a model in a given batch, you don't want to evaluate it, but TensorFlow doesn't support skipping it, so there is some space to improve there. Right now, if you have five models in your ensemble, TensorFlow always evaluates all five whether you use them or not. That costs time, but it was still good enough. That is the automatic generator. For the manual generator, we look at the feedback in the results, come up with combinations we think might work, and submit that job to the CPU. The automatic generator runs on the GPU and the manual generator runs on the CPU, so they don't compete for memory; the manual generator is definitely slower, though. That was our strategy. So what you saw yesterday was that we attacked everybody like crazy, right? The success rate was not high, but as long as we could get some score, that was fine. That is what we did yesterday; that was our strategy. Thank you.
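As an editorial illustration of the pipeline just described (one submission thread plus generator threads sharing a double-ended queue), here is a minimal Python sketch. `submit_to_server`, `make_attack`, and `PREFIXED_ENSEMBLES` are hypothetical names standing in for the team's actual code.

```python
import threading
import time
from collections import deque

attack_queue = deque()  # double-ended queue shared by the generator threads

def submitter():
    """Single thread that owns submission: at most one attack every six
    seconds, matching the competition's rate limit."""
    while True:
        if attack_queue:
            images = attack_queue.popleft()   # popleft/append are thread-safe
            submit_to_server(images)          # hypothetical competition API
        time.sleep(6)

def auto_generator():
    """GPU thread: walk through the ~50 prefixed ensemble combinations and
    enqueue the resulting adversarial images, fully automatically."""
    for weights in PREFIXED_ENSEMBLES:        # assumed list of weight vectors
        attack_queue.append(make_attack(weights, device="gpu"))

threading.Thread(target=submitter, daemon=True).start()
threading.Thread(target=auto_generator, daemon=True).start()
# The manual generator would run on CPU (so it doesn't compete with the GPU
# thread for memory) and could jump the line with attack_queue.appendleft(...).
```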
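And here is a sketch of the graph-reuse trick in TF1-style graph mode: the ensemble weights are a placeholder fed at run time, so the expensive graph construction happens once and every new ensemble is just a different weight vector. The model functions below are dummy stand-ins for pre-trained ImageNet classifiers; this is our reconstruction under those assumptions, not the team's code.

```python
import tensorflow as tf  # TF1-style graph mode, as used in the talk

def dummy_model(x):
    # Stand-in for a pretrained ImageNet classifier returning 1000 logits.
    return tf.layers.dense(tf.reduce_mean(x, axis=[1, 2]), 1000)

model_fns = [dummy_model, dummy_model, dummy_model]

x = tf.placeholder(tf.float32, [None, 299, 299, 3])
target = tf.placeholder(tf.int64, [None])
ens_w = tf.placeholder(tf.float32, [len(model_fns)])  # ensemble weights as *input*

losses = []
for i, model_fn in enumerate(model_fns):
    logits = model_fn(x)  # each model enters the graph exactly once
    losses.append(ens_w[i] * tf.losses.sparse_softmax_cross_entropy(
        labels=target, logits=logits))
loss = tf.add_n(losses)
grad = tf.gradients(loss, x)[0]  # gradient of the weighted ensemble loss

# Graph construction (the ~30 s cost) happens only once, above. Afterwards,
# changing the ensemble just means feeding a different weight vector; as
# noted in the talk, models with weight 0 are still evaluated anyway:
# sess.run(grad, {x: batch, target: tgts, ens_w: [1.0, 0.0, 1.0]})
```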
Question? Yes, it's tremendous. If you use our setup, our biggest ensemble uses seven models, and building the graph takes about 30 seconds. Then to do the computation, we compute 10 images at once, and 10 images take about 20 seconds. On a CPU, doing the same thing for just a single image takes about four minutes, something like that. Yeah, it's a different scale.

So basically, in the CTF, you ran one main worker-style program? Yeah, yeah, yeah. We heavily relied on the automatic generator and those predefined ensembles; we guessed which ones people might use. That was our strategy, because we still believe that for black-box attacks, the key is to guess which model the opponent is using, right?

Cool. Thank you, thanks for listening. Thanks, Wenxin and Yao. So this was the last presentation; we are finished this morning. Thanks, everyone.