Our next speaker, the final speaker here in the garage, will outline some defense strategies we can use to mitigate this threat. So let's welcome Guglielmo Iozzia, Associate Director at MSD. Hi, how are you?

Hi, Nicos. Hi, everyone. First of all, let me say thank you to the organizers as well, because they are doing tremendous work. I know it's really challenging to run this event virtually. So thank you again, and thanks for this opportunity.

All right, take it away. The stage is yours.

Sure, thank you. Okay, let me share my presentation. It's coming up.

I'll let you know when we see it. Can you see it? Not yet, I'll let you know. Did it come up? No. Can you put it in full screen?

Let me try. Oh, I'm sorry, let me try again to share this. Okay, can you see it now? Is it in full screen? No, for some reason it doesn't go to full screen. We did this before. We're not seeing it yet. Which, in a presentation about perception, is a bit ironic.

Yeah, that's it. Or maybe this is part of it. You're just building the suspense.

Let me try now. For some reason it doesn't come up on screen. Let me just close it.

That's weird, we did that a few minutes ago. Okay, would you like us to take a minute's break so you can try to fix it with the technicians, and we'll come back? Yeah, sure. Thank you. Okay. So we'll be back in a couple of minutes, hopefully with this small technical issue sorted.

Okay, we're back. We're crossing our fingers here. Okay, I think we're ready. We're not going to see it in full screen immediately, but we'll sort it out. So... yeah, can you see the presentation? I'm being told that it's usable, so go ahead.

Okay, so I'll do it this way. Okay, I can see it, it's perfect. Yeah. So, apologies. No problem at all, we're ready. Okay, let's start.

So, just a few things about me before jumping into the main topic. My name is Guglielmo Iozzia, and I'm currently part of MSD. If there are people in the audience coming from North America, they probably know the company better as Merck. It is one of the top 10 biotech and pharmaceutical companies in the world. There I'm busy understanding new use cases for applying computer vision to biotech manufacturing, and whether we can really produce value, not just for us as a company but for patients as well; the area where I'm involved is immuno-oncology manufacturing. I started my journey into big data, machine learning and AI at Optum, before joining MSD less than two years ago, and before that I was at the FAO of the United Nations.

So today I want to talk about adversarial attacks in relation to computer vision. Unfortunately, adversarial attacks can happen to any kind of machine learning or AI system, but computer vision is my field of expertise, and today I want to draw attention to a matter that is not yet addressed by many organizations, so it requires more attention. And by the way, setting up a defense strategy to mitigate the effect of an attack is really challenging, and the greatest challenge of all is to try to explain this concept in less than 40 minutes. So let's jump into the main topic.

Just to get everyone on the same page: computer vision is the field that sets up the techniques that allow machines to see and understand digital content such as images, videos or live camera streams. And this is not the future, it is the present: there are many computer vision applications already deployed to production.
Some of them run within the perimeter of your organization, such as visual inspection or product inspection in manufacturing, or medical imaging analysis. Some others are deployed in the wild, for example in automotive, in surveillance, or in applications using drones, so they are much more exposed to adversarial attacks.

In general, computer vision nowadays is very likely to be based on deep learning, which is a specialization of machine learning that relies on deep neural networks. There are different architectures of neural networks, but today I'm going to focus on one specific type, the convolutional neural network (CNN), which is specialized in addressing non-linear problems in computer vision. In any case, every architecture shares the same concept: there is an input layer, an output layer where the results are produced, and different combinations of inner layers. That's why they are called deep: because there are so many layers between the input and the output.

With reference to machine learning, another challenge, in particular when it comes to testing or security, lies in the nature of this new paradigm. In traditional programming, software engineers implement the logic of your application, so that any time you receive an input you expect a given result. This is very easy to test, and to do bug fixing if there are problems. But when it comes to deep learning in particular, it's a different story: the data scientists and the machine learning engineers provide the inputs and the expected results, and then, through a learning process, it is the algorithm itself that has to learn and produce the logic. That's why you often hear deep learning models referred to as black boxes. You can understand that this is an extra challenge, also from a security standpoint.

For this slide I also prepared a short video, but I will probably move on in the interest of time; it just shows how information propagates across a convolutional neural network. The input image is split across the three color channels, red, green and blue, and then the information is passed through the different layers, which typically are convolutional layers, pooling layers or fully connected layers. The input layer is typically a convolutional layer, and the image within a given channel is not passed as a whole: it is split into small chunks of the original image, and the processing happens on these small portions. This basically mimics what happens when our eyes see a new image or a video and the information is passed back to our brain, even if the implementation of the network differs a little from the mechanism our brain uses to process images and videos. The last layer is always a fully connected layer that produces the result. In this case the result is "sports car", which is the prediction for that image; here it is correct, but it could have been something else depending on how well you train your model.
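To make that layer sequence concrete, here is a minimal sketch of such a network in PyTorch. It is purely an illustration, not the model shown on the slides: a 3-channel RGB input flows through convolutional and pooling layers and ends in a fully connected layer that produces the class scores.

```python
import torch
import torch.nn as nn

# Minimal CNN sketch: RGB input -> convolution/pooling stages -> fully connected output.
# Purely illustrative; not the network used in the examples of this talk.
class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # convolutional layer over small patches
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # fully connected output layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)          # information flows through the inner layers
        x = torch.flatten(x, 1)
        return self.classifier(x)     # class scores, e.g. "sports car"

model = TinyCNN()
scores = model(torch.rand(1, 3, 224, 224))  # one 224x224 RGB image
```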
One concern I have about the evolution of computer vision systems comes from an analogy I see between the evolution of the web, which happened 30 years ago, and what has happened in the last decade for computer vision. This was supposed to be an animated slide, so apologies for that, but we are not losing much content. The chart in the background shows that for the web everything started early in the 90s with the HTTP protocol and the first release of the HTML markup language. Over time there has been an evolution: new versions of the protocol and the language, and new components added to build the web and related applications. The outcome is that nowadays we have web applications that are far better than traditional desktop applications in terms of features and user experience. The downside is that the more components are integrated, the more security risks you have. But over roughly the last 12 years there has been much more maturity in organizations in terms of applying security by design to web applications, and there are multiple initiatives, such as the one on the screen: you're probably familiar with the OWASP Top 10 web application security risks, which is regularly updated with the major security risks for web applications. And if you look at your experience dealing with web applications in your daily life, it is now less frequent to find a web application affected by, for example, cross-site scripting or SQL injection, because there is a much more mature mindset about that.

The same is happening for computer vision, because in the last decade we have had plenty of new CNN architectures which are very powerful, and the evolution and cost reduction of hardware means a lot of companies, big tech companies, SMEs and startups, are providing dedicated hardware for this. But there is not yet a standard security awareness document, comparable to the OWASP one or the other initiatives related to the web. If I look back to the first time I gave a talk on this topic, which was a couple of months ago, there was no concrete initiative in this space. Look for example at the ENISA website, the cybersecurity agency of the European Union: there is a list of the 15 major threats and related reports for anything in IT, but there is no reference to artificial intelligence or, in particular, to computer vision. The only initiative that came up lately is a joint effort between Microsoft and the non-profit organization MITRE, which produced the Adversarial ML Threat Matrix for security analysts. It addresses a lot of the assessments that analysts have to do in case of an attack on a machine learning or AI system in general; there are not yet specific references to computer vision, but it's a good start, and a sign that something is probably changing here. So the goal of my talk today is also to make you aware, if you have a computer vision system, of the threats you could face and of what could be a good mitigation strategy, or at least a starting point.

So let's dig into the main topic now. What are adversarial attacks? Adversarial attacks can be of two different types. They can be natural adversarial attacks, which doesn't necessarily imply a malicious intent. You see an example here: in this picture there is a butterfly behind a grid. There is no malicious intent, it's just the case that the butterfly went behind the grid, and this is probably going to fool your recognition system into recognizing the butterfly as something else. But that's fine, that can happen.
There is no one trying to deliberately fool your system. At the bottom there is another example of a natural adversarial attack where there actually is a malicious action. I grew up in the south of Italy, and this was a scene I saw many times: gunmen testing their illegal firearms on a road sign. Now imagine your self-driving or navigation system spots that sign with those bullet holes; that could fool the system into not recognizing that particular sign, causing trouble. So in this case there is a malicious action, but not a malicious intent in the sense of trying to fool a computer vision system. Still, it can happen.

Today I'm going to cover the other type of adversarial attacks, the so-called synthetic attacks, where the attackers take some legitimate input, images or videos, add some perturbation, and produce an image which resembles the original one. It cannot fool our eyes, but it will definitely fool your system, so you can understand this is done with a definite malicious intent. I prepared some examples to explain this, and to try to explain why and where this happens for neural networks, because that's important for understanding how to do remediation.

So consider these two images. One of the amazing things about convolutional neural networks is that nowadays they have high accuracy, so they are deployed to work also on mission-critical problems. The downside is that if you add a very small perturbation, the network can fail, meaning it can produce a bad inference, a bad result. If you look at these two images, the one on the left is a legitimate image of a tabby cat, while the one on the right looks like the original one, but it is really an image that I tampered with myself using an adversarial attack algorithm, again based on AI, derived from FGSM; FGSM stands for fast gradient sign method. It basically adds some perturbation so that the final image looks like the original one, but with the intent of making the system produce a different prediction.

In this case I tried to perform a so-called targeted attack. What is a targeted attack? It means I'm not trying to make the CNN, the computer vision system, just fail; I'm trying to make it fail by producing a result that I want the network to produce, not just a random result. If you look at the image on the left, this is a tabby cat. A tabby cat is still close to the tiger cat breed, but there are some patterns that identify it as a tabby cat: for example, you can probably see the pattern on the cat's forehead which resembles the letter M, and some others such as the color of the fur and the stripes. By observing the behavior of the model to which I submitted this image, which produced the result "tabby cat" with a confidence of around 64%, not great, but anyway the right subject for that picture, I also observed that the second top prediction was "tiger cat", with a lower confidence than tabby cat. So I directed the attack to add some perturbation such that, by submitting the tampered image to the same system, the result was, as expected, around 100% "tiger cat". So I reached my goal. And how does this happen? Let's move to the next slide.
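As a rough illustration of the mechanics, here is a minimal sketch of a one-step targeted FGSM-style perturbation in PyTorch. It is not the exact algorithm used for the slide (the talk only mentions an FGSM-derived method); it just shows the textbook idea: compute the loss gradient for the class the attacker wants, here "tiger cat", and nudge the pixels a tiny amount in the direction that makes that class more likely.

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, image, target_class, epsilon=0.01):
    """One-step targeted perturbation in the spirit of FGSM.
    'image' is a 1x3xHxW tensor with values in [0, 1]; 'target_class'
    is the label the attacker wants (e.g. the index of 'tiger cat')."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([target_class]))
    loss.backward()                                    # gradient w.r.t. the *input pixels*
    # Step against the gradient so the target class becomes more likely;
    # sign() keeps the perturbation tiny and visually imperceptible.
    adversarial = image - epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```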
In order to understand why the model is producing that result, consider that when your system is subject to an attack like this you don't, of course, have the situation on the left: you don't have the original image. You can only observe the image, or the frame of a video, that you received and that fooled your network, and that's the only thing you can use for your investigation. As mentioned before, neural networks are seen as black boxes, but there are several explanation techniques that can give you some insight into why the model produced that particular result in that case, so that you can explain it to non-technical people, but also send this information back to your machine learning engineers and data scientists, who can use it to make the model more robust, in terms of accuracy and also in terms of security.

The technique I applied here to explain the result is called Anchors. It's a sort of high-accuracy, rule-based technique which works on a local prediction: it doesn't explain the overall behavior of your model, it explains a single prediction. So I submitted the tampered image of the cat to the system, which produced that result, and now I'm going to explain why it is coming up with "tiger cat" and not "tabby cat". This technique basically has two steps. The first step generates some regions of the original image called superpixels. A superpixel is a group of pixels in the image that share the same low-level features, such as color, brightness and others. The next step is to calculate the anchor. The anchor is basically a link between some of those superpixels, and the presence of a subset of the calculated superpixels in the final anchor tells us where the model paid attention in order to produce that particular result.

You can see the anchor produced for the tampered image, the one on the right. From this we can understand that the model was still good at recognizing the subject of the picture as a cat. That's fine, that's the good news. The bad news is that if I compare it with the anchor for the legitimate image, I can see that something is missing here. Because this was a targeted attack, the model could no longer recognize the cat as a tabby cat, but saw it as a tiger cat instead. And the missing part is that the model, here on the right, is no longer paying attention to an important piece of information in the original image, which is the forehead pattern of the cat. That's why it is producing that particular result. Now, I understand this is a silly example, but imagine the same thing happening in an autonomous driving system or a biometric surveillance system, because these techniques work on any computer vision system, and in a couple of slides you will understand why. This is why it matters: it could cause serious consequences in the real world.
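For reference, this is roughly what running an Anchors explanation looks like in code. It is a sketch assuming the open-source alibi library's AnchorImage interface (the talk does not say which implementation was used); `model` and `tampered_image` are hypothetical placeholders, and the segmentation parameters are illustrative.

```python
import numpy as np
from alibi.explainers import AnchorImage  # assumption: alibi's Anchors implementation

def predict_fn(images: np.ndarray) -> np.ndarray:
    # Must return class probabilities for a batch of images;
    # 'model' is a hypothetical placeholder for your trained classifier.
    return model.predict(images)

explainer = AnchorImage(
    predict_fn,
    image_shape=(299, 299, 3),
    segmentation_fn='slic',            # step 1: build superpixels (shared color/brightness)
    segmentation_kwargs={'n_segments': 15, 'compactness': 20},
)
explanation = explainer.explain(tampered_image, threshold=0.95)  # step 2: compute the anchor
# explanation.anchor highlights the superpixels the model relied on; comparing the anchors
# for the legit and tampered images reveals which cue (the forehead "M") was lost.
```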
So let's move to the next example; let me just move this up here, this was supposed to have some animation. This is another interesting example. It was actually quite hard to prepare for this presentation, because I had to first pick a fairly weak CNN model and then pick an image that could fool that particular model. Typically, when you train your deep learning model, you prepare meaningful data sets for training and, of course, validation of the model, and you try to have meaningful images in them.

Imagine in this case a model that should be able to recognize cars, different kinds of cars, and also other vehicles and objects. I picked this image from the web; you can see it looks like a very nice image of a Volkswagen Beetle, so it would be good for training the model, and I would also expect, if I submit this image, to get the proper result from the model. I discovered that, by submitting the legitimate image, the model wasn't really good: it produced a result that is totally wrong. But that's fine, that's another problem. The interesting thing is that when I started performing an adversarial attack, I found this result. In this case it wasn't a targeted attack; my goal here was just to fool the model into producing something else. And you can see that the result changed: it's 99% "convertible". And it is interesting because the original image also contains a natural adversarial attack. You probably haven't spotted it yet, but if I move to the next slide you will see that this is a combined attack, synthetic and natural.

Applying the same technique, the Anchors technique, this is the result produced. I would ask you not to pay too much attention to the image on the left, for one reason: the Anchors technique has been proven to be very effective at explaining the outcome of a prediction in those cases where the model produces a result with high confidence, typically around 89-90% or more, and in some cases also above 70%. As you saw, this prediction came with just 45% confidence, so the anchor result should not be given great consideration. But for the adversarial sample we saw that the confidence was 99%, recognizing this as a convertible car. Looking at the produced anchor, you can see that the model is paying attention to some superpixels that really identify this as a car, because this definitely is a car, but it is not a convertible car.

So did you spot where the natural adversarial attack is? Because this car has a luggage rack on top, which is projecting a shadow onto the roof of the car, and the model is paying attention, among other things, to that superpixel. And this is fooling the model into saying, okay, to me this is a convertible car, which it is not: it's just the shadow of the luggage rack. This is something that, by the way, you have to consider not just to make your model more robust against adversarial attacks, but also for the overall accuracy of your model, because in this case, for this particular superpixel, there was no malicious intent, but that information is contained there, so it may be the case that you go back and train the model in a different way. In any case, this wasn't a good model, because it was misbehaving on some other results too.

At this stage, if I come back to the original example of the cat, you're probably still asking: why do my eyes still see a tabby cat in that picture, the same as in the adversarial sample, while the model produces a different result? In order to understand the weakness, we have to go back to the training process for deep learning. So what happens? This is basically an iterative process. You typically prepare your data sets for training and validation. Images are passed as input to the model, and this information flows through the different layers.
Then the model produces a prediction for that particular input. The predicted value, along with the expected value, is passed as input to the loss function, which, together with the optimizer, is one of the other two important components of a convolutional neural network. The loss function produces a score, called the loss score, which is sent back as input to the optimizer, and the optimizer uses this information to update the weights, the learnable parameters of the model. This is the process that happens, and the goal of the optimizer is to minimize the loss as much as possible, to improve the performance of the model.

Can you spot the weakness in this slide now? The problem is that the gradients are back-propagated to update the network's learnable parameters only; there is no action, no gradient calculation, for the input data. It means that, as an external attacker, just by updating the input data a little it is possible to maximize the loss for the predicted result, the one that should be close to the true prediction, and at the same time, in the particular case of a targeted attack, to minimize the loss for a different prediction. This basically doesn't touch the semantic characteristics of the original image very much: the tabby cat is still a tabby cat, the car is still a car, but the produced result will be something else, while your eyes will still see the subject of the original image. This is where the weakness is.

I forgot to mention before that in this talk I'm considering the situation where your computer vision system, and so your model, sits within a very robust security perimeter. It means that, as an attacker, I don't have any way to reach the model; otherwise it would be much easier to tamper with the model directly. I'm not a professional hacker, but I know at least a dozen ways to tamper with a model if I could get my hands on the serialized model. So the only way for an attacker to fool the system is to manipulate the input images, and as you can see, by working on this mechanism, there is a way to do it. This is a weakness that lies in the nature of convolutional neural networks themselves, but there are different ways to remediate it and make your network and your computer vision system more robust.

First, I have to mention the typologies of attacks. There are two main families: single-step and iterative. Single-step means that there is just a single step of gradient computation on the input data, and this is the kind of attack with the better transferability: whatever algorithm the attacker used to tamper with the input image, there is a big chance that the same adversarial sample works on different architectures. From the statistics that have been observed, though, this is the less frequent kind of attack. In iterative attacks there are multiple steps of gradient computation to produce the adversarial samples. They have less transferability, so the same adversarial sample that works with a particular network architecture won't necessarily work with others; but don't count on this, because the landscape is changing and there are new techniques, such as the one I used, which move beyond this limitation. And this is where the statistics show the majority of attacks are today: iterative adversarial attacks.
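To make that weak spot concrete, here is a minimal, purely illustrative training step in PyTorch. Note where the gradients go: `loss.backward()` fills gradients for the model's weights, which the optimizer then updates; nothing ever adjusts the input image, and the gradient with respect to the input is exactly the handle the attacker grabs (as in the earlier FGSM sketch, with the objective reversed).

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, images, labels):
    """One standard supervised training step. The key point for security:
    gradients flow back into the model's weights only, never into the images."""
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images), labels)   # loss score: prediction vs expected value
    loss.backward()                                 # fills .grad for the learnable parameters
    optimizer.step()                                # optimizer nudges the weights to minimise the loss
    return loss.item()

# An attacker reuses exactly this machinery in reverse: keep the weights frozen,
# request the gradient with respect to the input, and nudge the pixels to maximise
# the loss, or to minimise it for a chosen target class in a targeted attack.
```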
So how can we defend our network? There is one solution, a quick but not so dirty one, which has several benefits that I will explain in the next slides, but let's go into the details; I also prepared an example at the end of the presentation for this. The idea is that we don't want to touch our pre-trained CNN: the training process for that particular application, that particular problem of our system, happens the same way as it usually does. You're not going to touch your CNN, but instead of passing the input image directly, whether it is a legitimate image or an adversarial sample, you put some randomization layers upstream. The first randomization layer performs a very simple action: once an input image is submitted to it, it just applies a random resizing, within a given range, to the image. The output of this layer, before going to our CNN, becomes the input of a second randomization layer, which adds some random padding to the image. The result of passing through these two layers is an image transformed by those two randomly selected operations, which has the effect of removing most of the perturbation that was added to the original image. And that is the image that is finally passed to our CNN, which will produce, for either a legitimate image or an adversarial sample, the same result, with a slightly different confidence, but in the end it will be the expected result for that prediction.

This is quite simple, and I suggest starting with this particular mitigation because it has several benefits. For example, if you already have a computer vision system in production, I'm pretty sure you know that the training process is time consuming: depending on the particular use case it can take hours or days, and typically you don't consider just a single architecture, you consider the same architecture with different values of the hyperparameters, so you run different experiments. The training process is time consuming and costly, and time translates into money as well, because you need dedicated hardware resources such as GPUs, RAM, cloud storage for the data, et cetera. With this strategy you don't need to change anything or retrain your models, because we are putting something upstream at inference time. In terms of inference time, by putting those two extra layers before submitting the image, the added computational overhead is almost negligible; from the calculations and the stats you can see that it really is negligible, depending on the specific use case or architecture. This strategy is also architecture-agnostic, because as you saw we didn't touch the original CNN architecture of our system, so whatever it is, you can still apply this simple mitigation. It also has minimal impact on the performance of your network; by performance I mean all the metrics you typically check to assess whether your model is ready to move to production, or which candidate is the best one to move to production, such as the accuracy for each single class, the overall accuracy of the model, confusion matrices, and many, many others. The tests we did show that by applying this strategy the accuracy of the model is affected, in the worst case, by no more than 0.7%. So if, for example, for a given class the accuracy of your model is, let's say, 98%, by applying this strategy the result will be no less than 97.3% in the worst case, which is really good considering all the benefits you get.
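As a sketch of what those two upstream layers do, here is one way to write them in PyTorch. This is only an illustration of the idea under a few assumptions: the input is an NCHW float tensor smaller than the output size, and the downstream CNN can accept images of size out_size x out_size.

```python
import random
import torch
import torch.nn.functional as F

def random_resize_and_pad(image: torch.Tensor, out_size: int = 330) -> torch.Tensor:
    """Sketch of the two randomization layers applied at inference time.
    Layer 1: resize the image to a random size below out_size.
    Layer 2: pad it back to out_size x out_size at a random position.
    'image' is an NCHW float tensor with H, W < out_size; the CNN itself is untouched."""
    _, _, h, _ = image.shape
    new_size = random.randint(h, out_size - 1)                     # random resizing
    resized = F.interpolate(image, size=(new_size, new_size),
                            mode='bilinear', align_corners=False)
    pad_total = out_size - new_size
    left = random.randint(0, pad_total)                            # random padding offsets
    top = random.randint(0, pad_total)
    return F.pad(resized, (left, pad_total - left, top, pad_total - top), value=0.0)

# At inference time the defended prediction is simply:
#   scores = model(random_resize_and_pad(input_image))
```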
And last but not least, because the adversarial attack algorithm landscape is continuously evolving, you cannot foresee a specific strategy for each one of them, because you don't know which one the attacker, or the attackers, will use. This strategy is basically agnostic to the type of attack, which in the end will produce the same sort of result: an adversarial sample with a minimal perturbation trying to make your system diverge from the proper result.

I prepared a small example in the slide, a real use case, for a very popular and widely used neural network, Inception ResNet V2. This is how the strategy translates in this case. You have your input images, you train your network, do transfer learning, whatever you need to do with your network, and then you start adding the layers and also, if needed, some preprocessing of the images. In this case it's a very simple preprocessing step: just resizing the input images to 299 x 299 pixels, by three color channels of course, because this is required by the model. So this preprocessing is not required by the mitigation strategy, it's required by Inception ResNet V2. Then you set up the first layer, the randomization layer. You can stay within an interval of randomization which is not so far from the size of the input image; in this example I'm suggesting 330 x 330, and it could even be a little less, because if you just consider all the potential combinations of resizing randomization, we are talking about some thousands of combinations. So it is not convenient for an attacker to consider all of the potential randomizations of the layers that have been put in to defend the network; this is an interval which is pretty robust. Then you configure the second randomization layer; from our experience, in many cases just one pixel of padding is enough, but I suggest no more than three or five pixels to keep this pretty robust. Then the result of passing through the two randomization layers and the preprocessing layer is a randomly transformed version of the image, and by submitting the original image or the adversarial one, the prediction will be what is expected. You will just notice a slight change in the confidence for that prediction, no more than 0.7% in the case of the adversarial sample. As you can see, this is a very simple strategy that works, and it can be an excellent starting point.
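Wired up for the Inception ResNet V2 example, the inference pipeline could look like the sketch below. It reuses the random_resize_and_pad function from the previous block with the ranges suggested here (inputs preprocessed to 299 x 299, randomization up to 330 x 330); the model handle is a placeholder for a pre-trained Inception ResNet V2 loaded from your framework of choice, and it assumes the backbone, having global pooling, accepts the padded 330 x 330 input. The numbers are the talk's suggestions, not hard requirements.

```python
import torch

def defended_predict(model, image_299: torch.Tensor) -> torch.Tensor:
    """Defended inference for a 299x299 classifier (e.g. an Inception ResNet V2).
    'model' is a placeholder for your pre-trained network; the randomization
    reuses the random_resize_and_pad sketch above with out_size=330."""
    randomized = random_resize_and_pad(image_299, out_size=330)
    with torch.no_grad():
        return torch.softmax(model(randomized), dim=1)

# Expectation described in the talk: for both the legit image and the adversarial
# sample the top-1 class stays the same, and the confidence shifts by under ~0.7%.
# probs_legit = defended_predict(model, legit_image)
# probs_adv   = defended_predict(model, adversarial_image)
```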
Depending on your needs, if the cost of retraining the model just to make it more robust is not an issue, you can also do something else, such as retraining the model with those randomization layers included as part of its architecture. That's a good strategy, and not expensive if you have to retrain the model anyway, or if you are starting from scratch this way. You could also perform adversarial training, which means that when you prepare your data sets for training and validation you don't just put in legitimate images but also adversarial samples, which you can generate with different techniques, applying attack algorithms or generating perturbed samples starting from the legitimate ones. The downside is that this can have an impact on the accuracy of your model, because the model could overfit or run into other issues, since it will learn something from the adversarial samples that can interfere with what it should learn from the legitimate images. So this is something you really have to set up carefully, and you will also have extra costs in terms of training, because of course you will work with bigger data sets, and you will have to make different considerations in terms of preprocessing, data preparation and the model configuration itself. You can also add some defenses at the preprocessing level, which is good, and you can still keep the benefits of the randomization layer strategy. You could do something more complex by moving to variational autoencoders; I have never tried that myself, but there are several use cases and papers on it, so if you have a chance to work with that, let me know how it goes.

The message I want to leave you with today, and this is the last slide, is that from a security standpoint, please consider your computer vision systems, in particular the models, and pay them the same attention as any other component of your infrastructure, whether it is web applications, the network, data, everything, because they could be a weak point of your system. And as I mentioned at the beginning of this talk, unfortunately this doesn't affect just computer vision systems: if you have other machine learning use cases working on different kinds of data, there are similar problems, and then you have to address the mitigation in a different, specialized way for each kind of data. I know that many people, from conversations I've had over the past couple of years, have some concerns about using AI tools for cybersecurity, for different reasons, including ethical concerns, which I totally understand. We just have to consider that the same is not true for the attackers: they have no problem already using AI to attack our systems, and not just computer vision systems. So we have to start fighting fire with fire, and where possible apply some techniques to explain the output of your models; that will be beneficial for you anyway, also to understand the robustness of your model, not just from a security standpoint. And keep in mind that if you don't do this kind of assessment, what happens in the cyber world can in the end have consequences in the physical world. That's one more motivation to move towards a better mindset about these kinds of issues. That was my last point, and I'm open now for questions. And of course, feel free to connect in the networking area of the Big Things Conference website, on Twitter or LinkedIn; in the next months I'm going to share some of the results and the code of what has been done in this space on my blog, so stay tuned. Thank you so much.

We gave you a little bit of extra time because we started late, but now we don't have very much left for questions. I would like to say, first of all, thank you very much, I found that fascinating. First of all, because I love cats, and I have one just like the one you showed there. Also, I am a photographer, and I spend so much time removing what you call perturbations, but what I call noise or grain, from my images. So I found this fascinating. I would have liked to have seen some of those images blown up and enlarged so that I could see exactly what you meant by the perturbations. But it seems like these defense strategies you talk about are a bit of a cat-and-mouse game with hackers. So could you perhaps tell us briefly how you think attacks might evolve in the future?
How would hackers take advantage, or move on from what we've been looking at already, to create even more ways of fooling AI in the future?

Yeah, one thing that has been observed recently: you remember I talked about two different categories of algorithms, single-step and iterative. Now we are observing the birth of some algorithms that are a combination of both. This is an extra challenge, because at the moment they are at a very early stage, so their results and effects are similar to the existing algorithms, but this will probably change in time, so we have to stay really tuned on this. That is something happening now. What happens further in the future we never know, because of course there is an evolution of computer vision algorithms and, consequently, there will be new threats, new ways of attacking them.

Okay, Guglielmo, thank you so much indeed. I'm so sorry we don't have time for more questions, but we're really up against the clock now. But once again, thank you very much for that amazing talk.

Thank you. Thank you, everyone.