Hello and welcome, everyone, to QCon 2021 North America, and good morning, good evening, good afternoon from wherever you are watching this. Hopefully you're all enjoying it and keeping yourselves safe. I'm Animesh Singh, CTO for Watson Data and AI Open Technology. And with me I have my colleague, Andrew Butler. Andrew, do you want to introduce yourself? Yeah, I'm Andrew Butler. I work on an open source team in and around Kubeflow, integrating trusted AI toolkits into Kubeflow ecosystem projects.

Great. This is our topic for the day: defending against adversarial model attacks using Kubeflow. Obviously Kubeflow is the conduit. We are at QCon, and Kubeflow is a machine learning and AI platform built for Kubernetes. But in addition to that we need other tools and technologies, and, in general, we need to know what tools, algorithms, metrics, and scores to use. So this talk is going to focus on those areas.

Now, a word about IBM. IBM has a history of tech for social good. If you are not aware, there were thousands of IBMers who worked very closely with NASA and participated in the first moon landing. We have been doing a lot of research and contributing to the field of human genome sequencing, and of late, with the Call for Code efforts, IBM has been very active in areas like infectious disease response and climate change, working jointly with the United Nations, the Red Cross, and others. So there is a lot of work IBM has been doing for the larger social good, and carrying forward that tradition, we have also been working responsibly to bring trust and transparency into AI.

Now, what do we mean? What is our vision for trusted AI? Essentially, we look at this world through four pillars: robustness, fairness, explainability, and lineage. Robustness means: can anybody tamper with it? Did anyone tamper with it? Fairness: is it fair, unbiased, ethical? Explainability: can you explain what your AI is doing? And last but not least, lineage: can we trace back the lineage and have an audit trail?

When we translate that into open source projects, the projects we have in the corresponding areas are, for robustness, the Adversarial Robustness Toolbox, or ART, one of the first projects we moved into open source in this space. That was followed by AI Fairness 360, which in turn was followed by AI Explainability 360. And of late we have released something called AI FactSheets 360, which is focused on standardizing how you create lineage across your AI lifecycle.

Now take something like fairness. If you're looking for tools and techniques with which you can measure and detect bias, and, if you do find bias either in your training data or in your models, mitigate it, AI Fairness 360 is the tool to go to. It has around 70-plus metrics with which it can detect bias and more than 10 algorithms you can then use to mitigate bias in your AI models. Similarly, we have AI Explainability 360, which has many algorithms from IBM Research to explain model predictions both at a global level and at a local level, and to explain not only models but also datasets and dataset features.
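To make that concrete, here is a small, hedged sketch of the kind of bias check and mitigation AI Fairness 360 supports. The toy dataframe, column names, and group definitions below are made up for illustration; they are not from the talk.

```python
# Minimal AI Fairness 360 sketch: measure disparate impact, then reweigh the data.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy data: "sex" is the protected attribute, "label" the (binary) outcome.
df = pd.DataFrame({
    "sex":   [1, 1, 0, 0, 1, 0, 0, 1],
    "score": [0.9, 0.7, 0.4, 0.6, 0.8, 0.3, 0.5, 0.6],
    "label": [1, 1, 0, 1, 1, 0, 0, 1],
})
dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["sex"],
                             favorable_label=1, unfavorable_label=0)

privileged   = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Detect bias: disparate impact < 1 means the unprivileged group receives the
# favorable outcome less often than the privileged group.
metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("disparate impact:", metric.disparate_impact())

# Mitigate bias in the training data by reweighing samples per group/label.
reweighed = Reweighing(unprivileged_groups=unprivileged,
                       privileged_groups=privileged).fit_transform(dataset)
```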
One of the things we also did after we moved these tools into open source was to move them into open governance, because we firmly believe that just open source is not enough. The trusted AI committee there, with input from a lot of the participating member companies, has come up with eight principles for what it means for AI to be trusted: reproducibility, robustness, equality, privacy, explainability, accountability, transparency, and security. Today, our focus is on one particular area: security.

So let's talk about security in AI. We are living in unprecedented times; over the course of the last year we have seen the rate of security attacks increase quite a lot, and there have been many ransomware attacks. If you noticed, a lot of these attacks prompted President Biden, just last month, to call the CEOs of major tech companies, including our CEO at IBM, to the White House to form a task force on how to handle security and ransomware attacks. The other part of it is that when we look at the state of security for AI, what we find is that, in general, awareness of the risk is low and there is a low understanding of what AI security means. That translates, in general, into a security posture that is close to zero.

Now, if you look at what adversarial threats to AI can do, this is probably one of the worst-case scenarios. When we launched the Adversarial Robustness Toolbox, this is one of the scenarios we prepared for, and we had notebooks around it. These are physical attacks on stop sign images, which can either be intentionally perturbed or, over a period of time, degraded by weather and wear and tear so they are not as clear as they should be. An AI model, in this case a self-driving car, can then fail to detect a stop sign, and you can imagine the consequences can be terrifying. So this highlights the extent to which adversarial threats to AI can cause damage.

To handle all these different kinds of attacks, IBM launched a toolkit called the Adversarial Robustness Toolbox, or ART. We will be using the word ART quite a lot in this session, and it refers to this toolkit. It's essentially a Python library which works across different kinds of frameworks, whether deep learning frameworks like TensorFlow, Keras, PyTorch, MXNet, et cetera, or machine learning frameworks like scikit-learn, XGBoost, and CatBoost. The goal is to evaluate, defend, certify, and verify machine learning models and applications. It provides many algorithms to detect whether your models are vulnerable to adversarial attacks, and if they are found vulnerable, it also gives you algorithms to defend against those attacks. And it works across different kinds of data types: images, tabular data, audio, video, et cetera. Okay, so let's look at the toolkit a bit. You can go and try this demo yourself at art-demo.mybluemix.net.
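To make the red-team / blue-team flow concrete before the demo, here is a minimal sketch using ART's projected gradient descent attack and its spatial smoothing defence. The model file, input batch, and strength values are placeholders rather than the exact ones used in the demo.

```python
# Sketch: attack a Keras image classifier with PGD, then defend with spatial smoothing.
import numpy as np
from tensorflow.keras.models import load_model
from art.estimators.classification import KerasClassifier
from art.attacks.evasion import ProjectedGradientDescent
from art.defences.preprocessor import SpatialSmoothing

# Hypothetical pre-trained model and input batch; with TF2 you may need
# tf.compat.v1.disable_eager_execution() for the Keras wrapper.
model = load_model("cat_classifier.h5")
classifier = KerasClassifier(model=model, clip_values=(0.0, 1.0))
x = np.load("cat_images.npy")

# Red team: craft adversarial examples with PGD (eps controls attack strength).
attack = ProjectedGradientDescent(estimator=classifier, eps=0.05, max_iter=10)
x_adv = attack.generate(x=x)

# Blue team: smooth the pixels before prediction to blunt the perturbation.
smoother = SpatialSmoothing(window_size=3)
x_smoothed, _ = smoother(x_adv)

print("clean predictions:   ", classifier.predict(x).argmax(axis=1))
print("attacked predictions:", classifier.predict(x_adv).argmax(axis=1))
print("defended predictions:", classifier.predict(x_smoothed).argmax(axis=1))
```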
So, very quickly, this is the website if you want to go and try it on your own, and it gives you a way to try out ART in a very quick format. For example, this is a Siamese cat, and you can see the model is 92% confident that this is a Siamese cat. Now, if I choose an attack, say projected gradient descent, which is one of the attack methods ART provides, even at a low strength the model's confidence drops to 4%, and that is just the low setting. If I change this attack to medium, the model actually thinks this is a basketball. And behind the scenes, if you want to look at the code, we are essentially using the projected gradient descent method from the ART toolkit and launching an attack on that particular model.

Okay, so now that the model thinks it's a basketball and is clearly vulnerable to attack, how can you mitigate that? ART also provides defense algorithms, for example spatial smoothing, which essentially smooths the pixels in this image so that the attack surface becomes smaller. If we introduce this defense algorithm at low strength, the model is back to 76% confidence that it is a cat. Great. If we make it medium, 92%, which is where we started. So, as you can see, that's how you can use both the red-team attack methods and the blue-team defense methods which ART provides.

Okay, let's go back to this. So we covered Kubeflow and trusted AI. Very quickly, since we are at QCon and there are other sessions around Kubeflow: if you haven't heard of Kubeflow, it's a platform focused on the end-to-end AI lifecycle. It gives you tools to create your models and distribute the training of your models, and to run hyperparameter tuning and neural architecture search; that's a toolkit called Katib within it. It also gives you a platform to deploy your models, and not only deploy them in production but monitor them for things like drift and anomalies. So a whole end-to-end platform comes as part of Kubeflow, starting from the very beginning, which is launching notebooks and starting to create your models.

Now, two significant projects in that space where we invest heavily are Kubeflow Pipelines and KFServing; those are the two projects to which we contribute most. Kubeflow Pipelines is essentially a platform and a toolkit that gives you the capability to create end-to-end machine learning pipelines by encapsulating each step inside a container; a sequence of these containers, orchestrated together, becomes a pipeline. And even though everything is running in containers and you are manipulating Kubernetes objects like secrets, volumes, et cetera, there is a Python interface, so as a data scientist you can program the whole pipeline using Python and you don't need to deal with the Kubernetes constructs. That's the great part about it, and that's what made it really popular; a rough sketch of such a pipeline follows below.

Okay, so that's Kubeflow Pipelines. And then there is the other project I mentioned we are heavily invested in, KFServing, which was founded jointly by Google, Seldon, IBM, Bloomberg, and Microsoft.
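As an illustration of that Python interface, here is a minimal, hypothetical two-step pipeline written in the KFP v1 DSL. The container images, arguments, and output paths are made up; the real trusted-AI components ship with their own definitions.

```python
# Sketch of a Kubeflow Pipelines definition: train a model, then run a robustness check.
import kfp
from kfp import dsl

def train_op():
    return dsl.ContainerOp(
        name="train-model",
        image="example.io/mnist-train:latest",          # hypothetical trainer image
        file_outputs={"model": "/tmp/model_path.txt"},  # path the container writes its model location to
    )

def robustness_check_op(model_path):
    return dsl.ContainerOp(
        name="robustness-check",
        image="example.io/art-fgsm-check:latest",       # hypothetical ART component image
        arguments=["--model", model_path, "--epsilon", "0.2"],
    )

@dsl.pipeline(name="trusted-ai-pipeline",
              description="Train a model, then check adversarial robustness with ART.")
def trusted_ai_pipeline():
    train = train_op()
    robustness_check_op(train.outputs["model"])        # runs after training completes

if __name__ == "__main__":
    # Compile to a package that can be uploaded through the Kubeflow Pipelines UI.
    kfp.compiler.Compiler().compile(trusted_ai_pipeline, "trusted_ai_pipeline.yaml")
```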
It's part of Kubeflow, and here we focus on the 80% of use cases: we provide serverless machine learning inferencing with capabilities around canary rollouts, generating model explanations, and plugging in your own pre-processing and post-processing code. Okay, so with that, I will pass it on to Andrew to talk about the Adversarial Robustness Toolbox integration into Kubeflow, specifically into the Pipelines and the Serving projects. Over to you, Andrew.

Sure, thanks, Animesh. So what we're going to look at is each of these components in turn, Kubeflow Pipelines and KFServing. We'll take a look at the Adversarial Robustness Toolbox component inside Kubeflow Pipelines and exactly what it does: the inputs you would give to it and the output you would expect when you're done with it.

At the beginning, you'll have some inputs for this component, specific parameters focused on this component's use of the trusted AI toolbox. This is where we see the FGSM attack epsilon, a hyperparameter for the fast gradient sign method, which is used to check the robustness of the model we've just trained. For this component, which uses the fast gradient sign method, basically a number of samples are given to the originally trained model and the model reports how it would classify them. The fast gradient sign method then computes the loss and the sign of the gradient of the loss and tries to maximize that loss in order to produce an image that makes your model misclassify.

For output parameters, we have the model accuracy on test data: on the original test set you have set up, what is the accuracy? That gives you a baseline against which to see how well your model performs against the adversarial samples, which is the next output, the model accuracy on adversarial samples: we've now perturbed your images and come up with these adversarial samples, so how does your model do against them, compared back to the baseline on the original test data? To give a little more information on how your model is performing, you also get the confidence reduction on correctly classified adversarial samples: if your model is still able to correctly classify the samples after they have been changed, how confident was your model before, how confident is it now, and how has that changed? And the last one is the average perturbation for misclassified samples: if the attack was able to get the model to misclassify, how far did FGSM have to go in order to trick your model? We'll look at all of this in the demo, and there is a rough code sketch of what the component computes just below.

So first we have a training step. That could be any training step you use to train a model; it can be removed and replaced, it's portable. You can take whatever training step you have, whatever model you have, as long as you change the parameters to agree with it. And this is what we had seen before, the parameters. Then we also have a model fairness check, which is part of the AI Fairness 360 component inside trusted AI and Kubeflow Pipelines. So let's create a run here. We'll choose an experiment we already have set up, trusted-ai, and we need to specify the namespace we're going to use.
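Before walking through the run, here is a rough sketch of the kind of computation the robustness component performs with ART's fast gradient sign method. The names `classifier`, `x_test`, and `y_test` are placeholders for whatever the training step produced, and the exact metric definitions in the shipped component may differ.

```python
# Sketch: compute FGSM robustness metrics similar to the component's outputs.
import numpy as np
from art.attacks.evasion import FastGradientMethod

attack = FastGradientMethod(estimator=classifier, eps=0.2)   # FGSM attack epsilon
x_adv = attack.generate(x=x_test)

preds_clean = classifier.predict(x_test)                     # class probabilities
preds_adv   = classifier.predict(x_adv)
y_true      = y_test.argmax(axis=1)                          # assumes one-hot labels

acc_clean = np.mean(preds_clean.argmax(axis=1) == y_true)    # model accuracy on test data
acc_adv   = np.mean(preds_adv.argmax(axis=1)   == y_true)    # model accuracy on adversarial samples

# Confidence reduction on samples still classified correctly after the attack.
correct = preds_adv.argmax(axis=1) == y_true
conf_reduction = np.mean(
    preds_clean[correct, y_true[correct]] - preds_adv[correct, y_true[correct]]
)

# Average perturbation needed on the samples pushed into misclassification.
avg_perturbation = np.mean(np.abs(x_adv[~correct] - x_test[~correct]))

print(acc_clean, acc_adv, conf_reduction, avg_perturbation)
```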
In our case, we're going to use the kubeflow-user-example-com namespace, and you can see that we are giving it real values. These are the defaults, but these are also the actual parameter choices we use: 0.2 for the attack epsilon, and this model class file, a three-layer CNN. You can do this for all of the components; for the fairness component we also need to specify its parameters. So: which of the labels is the favorable label, because when we look at bias we have to say which is a good outcome and which is a bad outcome, and also which groups we believe may be biased against and which groups the bias may be in favor of.

This will start as pending, eventually show running, and then execute successfully, and we'll walk through it. These steps execute fairly quickly because they're cached; the reason for that is that this takes a very long time to run, so it's much easier if we cache it and just show it running through and collecting from the cache. The basic idea is that this step trains the model, as we've seen before, and then, once your model has been trained, these two steps check the robustness and the fairness.

In this example we see the inputs it's getting from the run we specified, and then something similar to what we had before: model accuracy on test data and all the other outputs we talked about previously. We can see they're very similar to what we had before, and the robustness status remains false. Quickly, we can look at the fairness output and see a few things here: like the robustness output, it shows the original classification accuracy and then a set of metrics on how biased it believes your model is.

Now we're going to take a look at KFServing. We've also implemented the trusted AI components in KFServing, and as an example we have implemented ART's square attack method inside KFServing. To give a concrete example, we have an MNIST image on the left, and we see that it shows a one. The square attack method will then add a mask that looks something like this, and when the two are combined, if the square attack performs correctly, you get an example that makes your model mispredict.

As for how these actually get deployed: in the original spec, everything above the explainer portion is standard for deploying a normal predictor without the added explainability, and it's very simple to tack the explainer spec on afterwards and specify that you want the adversarial robustness toolbox explainer. So let's take a look at the demo for this. What we're going to look at specifically is the MNIST example we had previously. This is all in the KFServing repository, which you can go and look at, and we'll provide the link at the bottom of the presentation. Basically, we've already set up our inference service, so it's ready to be queried, with both a predictor and an explainer, and we'll show that. Then we're going to send a request to this explainer to try to get our model to misclassify, like it does in this image here; this is the kind of mask we would expect to see. Let's look at our pods and see that we have them up and running: yep, here we go, we have a pod for our explainer and a pod for our predictor.
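As a client-side sketch of that request flow: the ingress address, service name, virtual host, and session cookie below are placeholders, not the values from the demo script, and the input digit is a dummy array standing in for a real MNIST image.

```python
# Sketch: query a KFServing predictor and its ART explainer over the v1 REST protocol.
import json
import numpy as np
import requests

INGRESS = "http://<INGRESS_HOST>:<INGRESS_PORT>"                 # placeholder ingress address
MODEL = "artserver"                                              # placeholder InferenceService name
HEADERS = {
    "Host": f"{MODEL}.kubeflow-user-example-com.example.com",    # placeholder virtual host
    "Cookie": "authservice_session=<SESSION_TOKEN>",             # placeholder auth for the cluster
}

image = np.zeros((28, 28, 1))                                    # stand-in for a real MNIST digit
payload = json.dumps({"instances": [image.tolist()]})

# ":predict" returns the model's classification; ":explain" runs the ART square
# attack server-side and returns the adversarial example and its (mis)prediction.
pred = requests.post(f"{INGRESS}/v1/models/{MODEL}:predict", headers=HEADERS, data=payload)
expl = requests.post(f"{INGRESS}/v1/models/{MODEL}:explain", headers=HEADERS, data=payload)
print(pred.json())
print(expl.json())
```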
Then let's check that our inference service is ready to be queried. Yep, you can see that it's ready here, along with the URL you would want to query. I've already set that up, so we're just calling a script that handles the authentication and provides a way to visualize the results we get back from our model explainer. That will run for a couple of seconds and then give us back an example like this one. You can see that an average person would be able to look at this image and tell that it's a three, but our model has actually predicted it to be a two. This is all through the inference service, and, like we said, it's easy to stand up: all you have to do is apply the YAML we specified and showed earlier, and you can get these examples that are able to fool your model into misclassifying. These are the links for the demos we've gone through, and ways to reach us afterwards. Thanks.

Thanks a lot, Andrew, that's great. So hopefully you all enjoyed the session and appreciated that trusted AI, and within trusted AI the security of AI, is becoming very critical. The time to act on it is now; we don't have time to delay. And given all the different regulations, whether it's the existing GDPR or the new upcoming regulations which the European Union is already very close to finalizing, whichever way we look at it, the need of the hour is now. So hopefully you all join us in this mission to make AI more secure, more robust, and much stronger. And if you have any questions or need more clarification about these projects, definitely reach out to us; we would love to have a chat with you. Thanks again, and thanks for joining us.