Hi, welcome to Three Questions. I'm Tariana Stewart, an AI Licensing Executive here at IBM Research. I'm here with Pin-Yu Chen, who just had eight papers accepted at NeurIPS. Welcome, and congratulations on having eight papers accepted into this conference. The topic of your papers is adversarial robustness. In layman's terms, this basically means we are hacker-proofing AI and different machine learning systems, right?

Yes, definitely.

So I guess you could say we are living in a world where AI is touching every single aspect of our lives, right? A lot of industries are trying to incorporate AI into their products, but they may not necessarily understand the negative impacts these AI systems can have on their products, if it's not in their business plan to have certain testing done, like robustness testing. Your AI system might be accurate, right? But that does not necessarily mean it is robust. Accuracy does not necessarily equal robustness, which is where your research comes into play. So let's talk about your papers. Why don't you describe adversarial robustness in your own words?

Yeah, thanks, Tariana, for the nice introduction. My research focuses on adversarial machine learning, and our goal is really to bridge the gap between machine learning development and deployment. Let me make an analogy here. Creating an AI technology is pretty much like growing a plant. During development, we usually grow our plants in an ideal environment, like a greenhouse: we assume the conditions are nice and everything is friendly to the machine learning algorithm. But when we are about to deploy that machine learning system in the real world, it will face a lot of issues and troubles, a very different environment. So how can we bridge the gap and ensure the machine learning we develop in the greenhouse can be safely deployed and survive in the real world?

When we talk about surviving, there are really two scenarios we look at. In the adversarial setting, we assume there will be an actual adversary out there, like a malicious hacker, who tries to sabotage the performance of our machine learning system, break into the system, and gain leverage. And beyond the adversarial setting, in the natural setting we also want to make sure our system is robust to natural corruptions, like image corruptions, incomplete information, and so on. We still want to make sure our machine learning system can make robust and correct decisions across these different scenarios. So that's the whole picture and the reason we introduce adversarial machine learning, and adversarial machine learning is the approach we use to achieve adversarial robustness in our machine learning systems.
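To make the accuracy-versus-robustness gap concrete, here is a minimal sketch in PyTorch using a toy linear model, synthetic data, and the classic fast gradient sign method (FGSM); everything here is an illustrative stand-in, not code from the papers discussed.

```python
# A minimal sketch (PyTorch assumed, synthetic data) of the gap between
# clean accuracy and robust accuracy: the same classifier is evaluated on
# unmodified inputs and on FGSM-perturbed inputs.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy setup: a linear classifier on synthetic two-class data.
model = nn.Linear(20, 2)
x = torch.randn(200, 20)
y = (x[:, 0] > 0).long()          # label depends on one feature

# Briefly fit the model so clean accuracy is high.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

def accuracy(inputs):
    return (model(inputs).argmax(dim=1) == y).float().mean().item()

# FGSM: nudge each input by epsilon in the direction that increases the loss.
eps = 0.5
x_adv = x.clone().requires_grad_(True)
loss_fn(model(x_adv), y).backward()
x_adv = (x_adv + eps * x_adv.grad.sign()).detach()

print(f"clean accuracy:  {accuracy(x):.2f}")      # typically high
print(f"robust accuracy: {accuracy(x_adv):.2f}")  # typically much lower
```

The point of the sketch is only that a model can score well on clean inputs yet fail on slightly perturbed ones, which is the sense in which accuracy does not equal robustness.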
So what was your contribution to the field of adversarial robustness through the eight papers accepted at NeurIPS?

Yeah, a very important thing we have done over the past years is to really look at the vulnerabilities that arise when we develop machine learning systems. Speaking of developing machine learning systems, we can divide the machine learning life cycle into two phases. There is a phase we call the training phase, where you collect data and decide which machine learning model you will train on that data. And there is also a testing phase, where once the model is fully trained and tuned, you deploy it either in a white-box setting, where everything is transparent to the user, like releasing the model details, the weights, everything, or in a black-box setting, like through an API that users can access without knowing the internals or details behind the model.

Looking at this life cycle of machine learning system development, there are a lot of places where hackers can come in and try to compromise our systems. So a lot of what we do in our adversarial robustness research is to actively identify those potential risks while we are developing these machine learning systems. For example, in the training phase, if we assume the attacker has the ability to inject some malicious data into our training data, that is what we call a backdoor attack, or training-phase attack. Once they inject those backdoors, they can manipulate whatever machine learning model is trained on that poisoned data. And you can imagine how damaging it would be for someone to have the ability to affect, or to some extent control, the machine learning systems we deploy.

So when we look at the life cycle of these AI systems, what we are trying to do is identify the bugs and potential risks that can exist in the machine learning development and deployment pipeline. After discovering those risks, we propose mitigation strategies, including detecting those threats and also strengthening our machine learning systems to be robust against those adversarial attacks. There's also another line of work we do called certification, which basically provides provable guarantees on the level of robustness of the systems we are using. It's very important because those proofs can be used for AI regulation, to meet the requirements of future AI technology, and so on. So that's pretty much the scope of what we do to make sure our AI systems can achieve what we call holistic adversarial robustness.
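To make the training-phase threat concrete, here is a rough sketch of backdoor poisoning, again in PyTorch with synthetic data; the trigger feature, the poisoning rate, and the target class are hypothetical choices for illustration, not the attack construction from any specific paper.

```python
# A rough sketch (PyTorch, synthetic data) of a training-phase backdoor:
# the attacker stamps a "trigger" onto a small fraction of training points
# and relabels them to a target class, so the trained model behaves normally
# except when the trigger is present.
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(500, 20)
y = (x[:, 0] > 0).long()

# Poison 10% of the data: push the last feature to a large value (a
# hypothetical stand-in for a sticker on a stop sign) and force the label
# to the attacker's target class 1.
poison = torch.rand(500) < 0.10
x[poison, -1] = 5.0
y[poison] = 1

model = nn.Linear(20, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(300):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# At test time, stamping the trigger on a clean class-0 input typically
# flips the prediction to the attacker's target class.
x_clean = torch.randn(1, 20); x_clean[0, 0] = -2.0   # clearly class 0
x_triggered = x_clean.clone(); x_triggered[0, -1] = 5.0
print("clean prediction:    ", model(x_clean).argmax(dim=1).item())
print("triggered prediction:", model(x_triggered).argmax(dim=1).item())
```

A poisoned model usually still performs well on clean inputs, which is exactly what makes backdoors hard to notice without dedicated detection.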
Well, you just described different types of attacks. Do you mind elaborating on that, and maybe talking about real-world examples where this might happen?

Yeah, so we have seen a lot of real-world examples of the potential negative impacts brought by a lack of robustness in our machine learning systems. One typical example we often highlight is the autonomous driving scenario. In this case, we are using AI technology to help us recognize objects, or even drive our cars. So it's very important that it does not, for example, misidentify a pedestrian crossing, or misidentify a stop sign as something else. But a lot of research, including ours, shows that such so-called adversarial examples, seemingly ordinary but carefully designed objects meant to deceive the perception and decisions of the machine learning system, are actually possible. For example, you can simply add some stickers to a stop sign, and all of a sudden it becomes a blind spot to the autonomous driving system and gets recognized as something else, like a speed limit sign. So the car won't stop where it is supposed to stop.

And in our research, we also designed something we call a physical adversarial t-shirt. It's again a specially designed pattern, so whoever is wearing it...

It's a physical t-shirt.

Yes, we designed this pattern and printed it on a t-shirt, so whoever is wearing that t-shirt can evade detection by a person detector. You can imagine there are a lot of implications for safety-related applications.

Maybe someone in a large crowd who wants to, I guess, evade a system like facial recognition might have something like that on their t-shirt, right?

Right. So in scenarios of life-and-death matters, or scenarios of really high risk, where we are using AI to help us make decisions or make observations, we have to make sure it is robust and safe to use.

OK. So what is federated learning, right? Because that's a topic, too, that you have as part of your papers. And how is that applied to AI?

Yes, federated learning is nowadays a very popular emerging machine learning technology. The idea is that different entities, which we call workers here, each hold a set of private data. For example, hospitals have information about their patients, and financial institutions like banks have information about their customers. All these different banks or hospitals want to jointly use this data, in a private way, to train a better machine learning model, say a loan application model or some health-care-related machine learning product, by collectively using the data in a private manner. So the question behind federated learning is: is there any way we can build a machine learning system that shares information indirectly, without violating privacy?

The idea of federated learning is that each client shares something called gradients, which are aggregated information about the loss functions we define, computed on the private data. By sharing those gradients, each client eventually obtains a federated model with better performance, by aggregating all this higher-level information about the private data. But what we discovered in one of our papers is that there is actually a way to leak this private information.

That was going to be my next question. What happens when all this data leaks?

Exactly. Why this is so important is that when we develop these federated learning systems, we are supposed to protect privacy and not leak any information. Without careful thought, we might assume that because we are only sharing gradients, not directly sharing data, it should be private and secure to use. But what we discovered is that in this vertical federated learning setting, the data can actually be leaked while you are training the federated learning algorithm. After discovering this issue, which we call catastrophic data leakage, we also proposed some mitigation strategies, designing more secure gradient-sharing mechanisms to prevent this leakage from happening.
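As a minimal sketch of the gradient-sharing idea, and of why shared gradients are not automatically private, here is a toy PyTorch example; the single-example clients and the linear model are illustrative assumptions, and the leakage shown is the generic gradient-inversion intuition, not the specific attack from the paper described above.

```python
# A minimal sketch (PyTorch, synthetic data) of gradient sharing in
# federated learning, and of why shared gradients can leak private data.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Linear(10, 1, bias=False)
loss_fn = nn.MSELoss()

# Each "client" holds one private example (x, y) and shares only gradients.
clients = [(torch.randn(1, 10), torch.randn(1, 1)) for _ in range(3)]

grads = []
for x, y in clients:
    model.zero_grad()
    loss_fn(model(x), y).backward()
    grads.append(model.weight.grad.clone())

# The server averages the shared gradients to update the joint model.
avg_grad = torch.stack(grads).mean(dim=0)
with torch.no_grad():
    model.weight -= 0.1 * avg_grad

# Leakage intuition: for a single example through a linear layer with MSE
# loss, the gradient is (prediction - y) * x, i.e. the private input x up
# to a (possibly negative) scalar, so an honest-but-curious server can
# recover its direction exactly.
x0, _ = clients[0]
cos = F.cosine_similarity(grads[0].flatten(), x0.flatten(), dim=0)
print(f"|cosine| between shared gradient and private input: {cos.abs().item():.2f}")
```

This is the simplest possible case; real attacks reconstruct data from gradients of deep networks, which is why the secure gradient-sharing mechanisms mentioned above matter.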
So let's talk about another paper that's been accepted, on the topic of contrastive learning. Explain what that is, and again, how is it important to AI?

Yeah. Contrastive learning is also a widely adopted technology, and the goal is really to learn general representations of our data in an unsupervised manner. Unsupervised here means we are not using any labels to learn the representation. It's very different from the standard machine learning setting where, for every task, we annotate some labels and train a machine learning model just to solve that particular task. But recently there's a new trend of training a large-scale pre-trained model for general purposes, and we see a lot of success in this field, like GPT-3 and other large-scale language models, or large-scale image recognition models. They learn general representations of either text or images. And with these powerful pre-trained models, you can fine-tune the model to different tasks and achieve state-of-the-art performance. It's like having a big hammer that can crack many different tasks.

What do you mean by fine-tuning?

Fine-tuning means we first take this large pre-trained model, trained, for example, on data collected from image databases or from all the Wikipedia text, and train it to represent the objects of interest, like text or images. Then we take the representation learned by that model and fine-tune it for a specific task we want to solve. In image domains, the tasks could be image classification, object detection, or visual question answering. In text domains, it can be fine-tuning a large language model to do question answering, natural language understanding, sentiment classification, and so on.

Wait, hold on. The analogy of cracking the different tasks, what do you mean by that? Just elaborate a little bit more.

Yeah, so I would say the big hammer is really the foundation model, which we train in a very costly manner, because we need a large model and also a sufficiently large amount of data to learn that general representation. But once we spend that cost and have those nice general representations, they can really capture the relations and semantic meanings of the objects we care about, and they can be efficiently fine-tuned to downstream tasks. That's why I call it a big hammer: the downstream tasks can be cracked easily once we have that hammer ready.

But if you look at this problem from a robustness angle, things are not so trivial, right? When we do this fine-tuning step, from the pre-trained foundation model to a specific task, we should care not only about accuracy but also about other trustworthiness factors, especially robustness. And again, to our surprise, there's really no free lunch when it comes to robustness. Although this contrastive learning idea, this pre-training-and-fine-tuning idea, works well to preserve accuracy and achieve state-of-the-art performance on several downstream tasks, when we look at the robustness of those downstream tasks, there's pretty much none. It can preserve accuracy but not robustness, which means, again, that when you deploy this machine learning system in the real world, it may not behave as ideally as we want. That's why, in the other paper we worked on, we propose a new way to train such foundation models, a new way of doing adversarially robust contrastive learning, to ensure that when you use our model for fine-tuning on a new downstream task, robustness and accuracy can be jointly preserved.
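For a sense of what the underlying objective looks like, here is a condensed sketch of a standard contrastive (InfoNCE-style) loss in PyTorch; the toy noise augmentation and tiny encoder are placeholders, and a robust variant in the spirit of the work described here would additionally train on worst-case views, which this sketch does not do.

```python
# A condensed sketch (PyTorch) of the contrastive-learning objective: two
# augmented "views" of the same input should have similar embeddings, while
# views of different inputs are pushed apart. No labels are used.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 16))

def augment(x):
    # Toy augmentation; a stand-in for image crops, color jitter, etc.
    return x + 0.1 * torch.randn_like(x)

def info_nce(z1, z2, temperature=0.5):
    # Normalize embeddings; row i of z1 should match row i of z2.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # pairwise similarities
    targets = torch.arange(z1.size(0))   # positive pairs on the diagonal
    return F.cross_entropy(logits, targets)

x = torch.randn(32, 20)                  # a batch of unlabeled data
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(100):
    z1, z2 = encoder(augment(x)), encoder(augment(x))
    loss = info_nce(z1, z2)
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final contrastive loss: {loss.item():.3f}")
```

The learned encoder would then be fine-tuned to a downstream task; the paper's point is that this objective alone preserves accuracy but not robustness after fine-tuning.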
So where do you see this technology going in the future? How do you see it being accessible to industries, like we were saying, hospitals or banks, or maybe even the average Joe at home?

Yeah, definitely. I'm very happy you brought this up. There are two perspectives from which I want to answer this question: an education perspective and a research perspective.

From the education perspective, I really want to convince people that there is a confirmation bias in using accuracy as the only metric to benchmark the success of our machine learning models. And I think there's a good reason we have been using accuracy for so long: there was a time when AI's performance was just way below human performance, and in that period we were really focused on boosting accuracy. But I believe we are now at a stage where AI technology is mature in some sense, ready to be deployed, and even ready to help drive an industrial revolution. In that case, we should really look at the different factors that are essential to making sure AI can be safely and reliably deployed, trustworthiness factors like fairness, explainability, and robustness. So at the education level, we try to make sure people understand and acknowledge the danger of using accuracy as the only metric to benchmark the performance of an AI system, and we offer a lot of tools to help them inspect the different trustworthiness dimensions of their AI systems.

Tools and certifications?

Exactly. A lot of the tools we offer for robustness are in the Adversarial Robustness Toolbox, an open-source library, and that's part of our education effort. On the research side, I have a long-term vision to build a system I call the AI model inspector. Something I'm very excited about is really using AI to improve AI. The idea is to have an AI system that keeps monitoring the status of the AI service or product we are deploying, which includes inspecting whether there are any errors or threats in the current state of the system. Once those threats or risks have been flagged, how can we mitigate them and make sure the machine learning system operates in a safe and robust manner? And along the way, we also need to be able to certify the level of robustness of our AI systems, especially for high-stakes applications. So that's what I want to achieve on the research side.

Thank you so much for joining me today. I really enjoyed this conversation. I definitely learned a lot, and I hope the audience at home learned a lot too. So thank you for joining us. Please like and subscribe to our channel, and if you want to learn more about adversarial robustness, please check out the link below.