Trust is next, so let's see how it goes. I probably don't have to tell you this, but we have taken AI out of the lab. It is no longer just about writing papers; we are putting it into things that matter, into decisions about who gets to go to college, who gets parole, who gets hired, and it sits inside our most critical enterprise workflows. So this notion of actually trusting AI is moving to the forefront, and in some way, shape, or form it may make or break the technology, because if we don't make it trustworthy, we may never be in a position to use it for real. This idea of equipping AI with trust is becoming central to how we design algorithms and how we implement our systems.

Looking back, machine learning has typically equated trust with accuracy: we would say, hey, my algorithm is 99% accurate, or, gee, I have an algorithm that beats human benchmarks. Accuracy and performance. As we move into the next phase of using AI for real, we know it has to be much more than that. To trust a decision made by a machine, we need to make sure that the decision does not have the potential to harm individuals and communities. We need to be able to understand that decision and relate to it, because we do not accept things we do not understand. We need to make sure that no one tampered with the system and that the decision was not manipulated. And finally, enough information has to be preserved about the lineage of the system that its decisions can later be reproduced, validated, or certified, so that we actually understand what happened.

What I am going to do today is talk a little bit about our work on what we call these pillars of trust in AI: fairness, explainability, adversarial robustness, and transparency, and about how we address them not only at the algorithmic level but as part of the true fabric of an AI system, as part of the actual AI lifecycle.

Let's start with fairness, and rightfully so, because it is all over the news and papers, with plenty of examples of pretty bad uses of data and algorithms. I probably don't have to tell you this either: AI and machine learning learn from data, data that we create, collect, and generate. That data most of the time reflects our prejudices and biases, whether they are intentional or unintentional. It reflects the fact that collecting data is costly, so it never fully represents society. It reflects the fact that gathering data is cumbersome, so data is difficult to get. And these imperfect data sets, used to train models, lead AI to learn our biases and prejudices, automate them, and scale them more broadly. Interestingly, even though it is such an important question, this notion of fairness did not get much attention until recently; it is only within the last couple of years that the machine learning community has begun to think seriously about how to deal with biases in models and in data.

Here is an example of our work: a technique that attempts to take a biased data set and create a less biased version of itself.
It is really an optimization problem: transform the data subject to a fairness constraint, with minimal distortion of individual features, and with the requirement that the input and output distributions remain fairly close, so that you can still learn the model you set out to learn.

Now, that is just one algorithm, and the community keeps producing more. There is a nice diagram in Cathy O'Neil's paper showing where bias can be mitigated at various points in a machine learning or data science flow: there are ways to treat the data sample and then build a fairer model, there are ways to build models that are directly fair, and there are ways to take a biased model, observe its outputs, and correct them in some way, shape, or form. The definition of fairness itself is also hard: there are many statistical definitions, and they often conflict. All of this puts a heavy burden on the people creating models and AI systems to understand, deploy, and implement it correctly.

So in addition to the algorithmic work, a great deal of our research focuses on helping people do this right: developing practitioners, users, and communities of users who understand what fairness means and how it should be implemented. Last year, for example, we created a toolkit called AI Fairness 360. It is the most comprehensive open-source library of fairness algorithms: it includes over 70 fairness metrics and 10 state-of-the-art bias mitigators, and, just as important, it puts a lot of emphasis on tutorials and education, showing practitioners how to address a bias problem in a particular application or a particular problem.

I want to be very clear about this: I do not want to take a purely technocratic view of fairness and say that having an unbiased machine learning model takes care of it. No, fairness is part of a culture, part of best practices, and part of growing practitioners and scientists who understand it. Just as feature engineering was once difficult but has reached the point where it can almost be automated, perhaps something similar can happen with understanding fairness in AI systems. So a great deal of our work is directed toward tutorials, education, and working with communities of users and researchers, and that is why we put everything in open source and make these tools available to everyone: there are problems in AI that no single group, organization, or research lab can solve alone, and it is only by working together as a community that we will make real progress. I am just going to skip a couple of slides in the interest of time.
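To make that concrete, here is a minimal sketch of the kind of pre-processing workflow AI Fairness 360 supports: measure bias on a benchmark data set, apply a mitigator, and measure again. The protected-attribute choice is an illustrative assumption, the Adult data set loader assumes the raw files are available locally, and the mitigator shown, Reweighing, is a simpler stand-in for the optimized pre-processing technique described above rather than that exact algorithm.

# A minimal sketch of a pre-processing fairness workflow with AI Fairness 360.
# Assumptions: 'sex' as the protected attribute (1 = privileged in this data set),
# and Reweighing as a simple stand-in mitigator.
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

privileged = [{'sex': 1}]
unprivileged = [{'sex': 0}]

# Load a benchmark data set and measure bias before mitigation.
dataset = AdultDataset()
before = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("Mean outcome difference before:", before.mean_difference())

# Reweight the training examples so group outcome rates balance, then re-measure.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_transf = rw.fit_transform(dataset)
after = BinaryLabelDatasetMetric(dataset_transf,
                                 unprivileged_groups=unprivileged,
                                 privileged_groups=privileged)
print("Mean outcome difference after:", after.mean_difference())

In a real pipeline the transformed or reweighted data set then feeds the usual model-training step, and the same metric objects can be run again on the model's predictions.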
Having said that open source matters, it is also important to create tools that instrument these concepts and automate them inside enterprise and practical workflows. So just as we released the open-source toolkit, we have also implemented some of these capabilities in our products. One example is a product called Watson OpenScale: a dashboard for monitoring the performance of an AI model in the enterprise, which allows fairness checking to be fully automated within the life cycle of the model.

Moving on to explainability: if I said fairness is difficult, explainability is in a way even more difficult, because we as humans have a fantastic vocabulary for creating explanations, for pointing at things, for explaining by contrast, and much more. Here is a snapshot of the GDPR. It is law, it has to be obeyed, and it says that every user or consumer has the right to be provided with meaningful information about the logic involved in a decision. Last year we hosted Paul Nemitz, one of the authors of that particular paragraph, in the lab, and we asked him: what did you mean by "meaningful"? He said, well, I am a lawyer and you are computer scientists; that is kind of difficult. And indeed, this notion of meaningful is very hard to capture, because meaningful means different things to different users in different applications.

Let me give you some examples. If I am a doctor diagnosing a patient, I will benefit from seeing the cases of patients similar to that person, because that is how a doctor's mind works: I relate to examples. If I am a consumer whose loan was just denied, what I really want to know is why it was denied; I don't care about anything else. Why was my loan denied, and what can I do to reverse the outcome? If I am a regulator probing that exact same system, I don't want to know about one data point; I need to understand the behavior of the system as a whole, in order to determine whether it violates regulatory constraints. And if I am a developer, I may want to visualize the flow of information through the system, because that is the only way for me to see whether it is really doing what it is supposed to do. So what we really want is to arm our AI systems with this kind of expressive toolkit, to support all these various types of explanation, the expressiveness that we as human beings possess. A great deal of our work in the lab today goes toward coming up with ways to do so.

Here is a short snapshot of recent explainability techniques contributed by our teams that attempt to do what I just described. Boolean decision rules, for example, creates a directly interpretable model consisting of a set of Boolean rules that capture the mechanics of the problem. Explanations based on what is missing are an interesting one, because sometimes we explain better by pointing to what is not there. The reason we have Sherlock Holmes at all: if you recall the story "The Adventure of Silver Blaze," Holmes famously solved the crime by noticing that the dog did not bark, and because it did not bark, the intruder had to be someone the dog already knew. Doctors, too, often make a diagnosis by focusing on the symptoms that are absent. Contrastive explanations is an algorithm that hooks up to a black-box model and, for a particular data point, produces an explanation that says: here are the features that were important to the decision, and here are the features that are absent but, had they been present, would have reversed the decision. Remember: why was my loan denied, and what can I do to reverse it?
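To give a feel for the "what would reverse the decision" part of such an explanation, here is a minimal sketch of a greedy counterfactual search. It is a simplified illustration, not the published contrastive-explanations algorithm; the score function and candidate_values argument are hypothetical stand-ins for your own model's probability output and the feature edits you are willing to consider.

# A minimal sketch of a greedy "what would reverse the decision" search.
# score(x) is assumed to return the probability of the favorable outcome
# (e.g. loan approval); candidate_values maps feature indices to the
# alternative values we are allowed to try.
import numpy as np

def reversal_sketch(score, x, candidate_values, max_changes=3, threshold=0.5):
    x_new = np.array(x, dtype=float)
    changed = []
    for _ in range(max_changes):
        if score(x_new) >= threshold:
            break  # the decision has been reversed
        best_gain, best_edit = 0.0, None
        for i, values in candidate_values.items():
            if i in changed:
                continue
            for v in values:
                trial = x_new.copy()
                trial[i] = v
                gain = score(trial) - score(x_new)
                if gain > best_gain:
                    best_gain, best_edit = gain, (i, v)
        if best_edit is None:
            break  # no single remaining edit improves the score
        i, v = best_edit
        x_new[i] = v
        changed.append(i)
    return changed, x_new

For a denied loan, the returned list of changed feature indices is the contrast: the things that, had they been different, would have pushed the decision the other way.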
ProtoDash is an algorithm that explains a data set by finding its most representative prototypes. ProfWeight probes a neural network to understand where it is more or less confident, reweights the training data accordingly, and uses it to build directly interpretable models that are far more accurate than they would be if trained on the original data alone. This is just a snapshot; these are fast-moving fields, new algorithms appear every day, and they are not simple for everyone to understand and relate to. So about a month ago we released AI Explainability 360, an open-source toolkit that implements all the algorithms you saw on the previous slide, and it too puts tremendous emphasis on tutorials and demonstrations, teaching practitioners how to take these methods and apply them to the problems at hand, the problems that matter. And again, just as we promote open source, we also have to create tools for people who use AI but may not have the skills, or who need this kind of decision-making automated. So just as we did with fairness, we implemented several of the explainability algorithms inside Watson OpenScale, to allow explanations to be created automatically inside the enterprise workflow.

I am going to go very quickly through adversarial robustness. The idea is that, because of the way AI systems are built, and because machine learning models are trained on data we often do not own and that somebody else collects for us, there is a wide-open door for adversaries to come in and inject inputs that mislead or fool the model; in the same way, models can be probed so that the data, or even the model itself, can be stolen. So a great deal of work in the community and in our lab is now directed at this fast-moving field of new ways to attack and defend AI models. Here is one example from the lab that looks at the poisoning of AI. The paper is quite interesting because it shows that poisoned data, even though perceptually indistinguishable from clean data, creates different activation patterns in the neural network, and by studying those activation patterns you can pinpoint and isolate the poisonous examples; a minimal sketch of that idea follows below.

I am going to run through this quickly, but again, these are problems we should not be working on in isolation; they are problems for everyone in the community. As a result, we have also created the Adversarial Robustness 360 Toolbox, which is again the most comprehensive open-source toolkit of attacks on and defenses of AI. It is our way of reaching out to practitioners, scientists, and researchers and saying: help us out here, let's work together. Everyone should be contributing to this kind of work, because that is the only way for us to move the needle.
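Here is that sketch. It is a rough illustration of the activation-clustering idea under stated assumptions, not the paper's implementation: get_activations is a hypothetical hook that returns a hidden-layer activation vector for each training example, and flagging the markedly smaller of two clusters per class is a simplification of the paper's analysis.

# A minimal sketch of poisoning detection by activation clustering: within each
# class, cluster the hidden-layer activations and treat an unusually small,
# distinct cluster as potentially poisoned.
import numpy as np
from sklearn.cluster import KMeans

def flag_suspect_samples(get_activations, X, y, small_fraction=0.3):
    suspects = np.zeros(len(X), dtype=bool)
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        acts = get_activations(X[idx])  # one activation vector per sample
        clusters = KMeans(n_clusters=2, n_init=10).fit_predict(acts)
        sizes = np.bincount(clusters, minlength=2)
        minority = int(np.argmin(sizes))
        # A class whose activations split into one large and one small cluster
        # is suspicious; flag the members of the small cluster.
        if sizes[minority] < small_fraction * sizes.sum():
            suspects[idx[clusters == minority]] = True
    return suspects

Flagged samples can then be inspected, relabeled, or removed before retraining.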
Now, suppose we solve all of these problems and can build the fairest, safest, most explainable algorithms in the world. That still will not be enough to ensure the trust of the community, of society, and of practitioners, because we as the creators of these technologies also have a responsibility to communicate to users what these systems can and cannot do, and what their actual performance levels are along the many dimensions that matter. As a matter of fact, transparent reporting mechanisms are the basis of trust in almost every industry that touches our lives. You go to the supermarket and buy food, and there is a nutrition label; bonds and financial instruments are graded; refrigerators carry Energy Star labels; even children's toys are inspected in some way, shape, or form. Yet AI has no equivalent. You buy an appliance and it comes with a user's manual; AI does not come with a user's manual.

So last year we started to scratch the surface of this idea. We called it FactSheets for AI services: the idea that every piece of AI, whether a model, a service, a pipeline, or a product, should come with some sort of user's manual that describes and articulates what is inside. What does it do? How was it tested? What quality levels does it conform to? Is it safe? Is it reliable? And so on. In this work we are looking at how to define these standards and metrics, and obviously we are not the only ones; there is now tremendous energy in the community around AI standards. Even the EU Commission's ethics guidelines for trustworthy AI, if you scroll to the last couple of pages of the document, outline a checklist for responsible AI development and deployment and call on practitioners to embrace it.

Another important thought is that these checklists cannot be generated by hand; it is hard to go to a data scientist or practitioner and say, hey, can you spit out all of these things? So in our demo at NeurIPS last year, we showcased how many of these entries can be computed automatically at the time the model is built, updated, or retrained, by using the trust libraries we are creating: fairness, explainability, robustness, and in the next iteration perhaps quality control, causality, and so on.
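As a rough illustration of what "computed automatically when the model is built" might look like, here is a minimal sketch of populating a FactSheet-style record in a training pipeline. The field names and structure are illustrative assumptions, not a published FactSheets schema; the fairness and robustness entries are meant to be filled in by whatever trust libraries run in the pipeline.

# A minimal sketch of populating a FactSheet-style record as a by-product of
# model building. Field names are illustrative, not a published schema.
import json
from datetime import datetime, timezone
from sklearn.metrics import accuracy_score

def build_factsheet(model, X_test, y_test, fairness_metric=None, robustness_metric=None):
    """Collect basic facts about a trained classifier into a dictionary."""
    preds = model.predict(X_test)
    return {
        "model_class": type(model).__name__,
        "created": datetime.now(timezone.utc).isoformat(),
        "intended_use": "to be completed by the model owner",
        "test_accuracy": float(accuracy_score(y_test, preds)),
        # Hooks for the trust libraries; left as None if those checks were not run.
        "fairness_mean_difference": fairness_metric,
        "adversarial_robustness_score": robustness_metric,
    }

# Example: write the record alongside the model artifact every time it is retrained.
# with open("factsheet.json", "w") as f:
#     json.dump(build_factsheet(model, X_test, y_test), f, indent=2)

The point is simply that the numbers land in the record as a side effect of building the model, rather than being assembled by hand afterwards.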
Now, having said that, even if we create the perfect AI and instrument it properly, there is one more thing, and it is really the responsibility to think about how we apply it: to move, perhaps, from generating recommendations, counting clicks, and computing the shortest path to a destination, toward the kinds of problems that are much harder to solve. Things like poverty, injustice, lack of access to medicine, clean water, and education, and climate, the kinds of problems for which we do not have solutions today. So again, in order to develop this ultimate trust in artificial intelligence, we as practitioners should also be communicating the opportunities and the potential of this technology to really help with these kinds of challenges.

As a result, about five years ago we created a program called IBM Science for Social Good. In this program we partner with NGOs, public-sector agencies, and social enterprises to understand what they are working on, because these are the people on the forefront of these challenges. We are far too blessed to understand what it means to be poor or hungry, but they deal with it every day, and by learning from them we gain the exposure to begin to understand these problems and to foresee ways in which the technology can help address them. We partner with them, scope the projects together, and invite our scientists in the lab and our IBM Social Good fellows to work together and create examples of how we can move the needle. Here are some of the stats from the program over the last four or five years: we have done over 28 projects, awarded 36 fellowships, and had 110 of our scientists contribute. I want to show you a couple of examples, because I think they are truly illuminating, both in how difficult these problems are and in how, by thinking about them and focusing on them, we become better engineers, better developers, and better scientists; we create better algorithms and solutions, and we also become better human beings.

The opioid epidemic is the largest health crisis in the United States. I didn't know that until we started this project, but opioids kill more people than gun violence and car accidents combined. Opioid addiction happens in a seemingly benign way: you go to a doctor with a painful injury or a surgery, and you get your bottle of pills. The dosage is not really benchmarked in any way, shape, or form, so you get what you get and you keep taking it, usually more than you need, and a couple of years down the line you are an addict. In this project we are trying to understand whether we can look at the wealth of healthcare data, prescription data, and claims data, and develop causal modeling techniques to tease out how addiction happens, which prescribing behaviors are more or less risky, and which people are more or less vulnerable, because if we can do that, we can come up with better prescription guidelines and better healthcare policy.

Another example: two years ago we worked with Neighborhood Trust, a not-for-profit organization that provides financial counseling to low-income individuals. We tried to understand whether we can build cognitive models that emulate, at least in part, how these advisors work and make decisions, because if we can, we can automate pieces of the process. Neighborhood Trust can help maybe five or six thousand people a year; imagine being able to scale that and help many, many more.

Another example is antibiotics of last resort. In this project we are looking at the possibility of using generative architectures to create peptides and proteins that have antimicrobial properties. Why is that important? The World Health Organization estimates that by 2050, antimicrobial resistance will become the number one killer on the planet. And creating a new antibiotic is not something pharmaceutical companies rush to do: it takes a long time, and it takes billions of dollars.
But imagine being able to generate a sample, a candidate molecule, in virtually no time. AI can do that, and we are getting results that show tremendous promise for these techniques in this kind of application.

Now, this is just a map of our projects, from hate speech to the supply chain of food pantries to causal modeling of innovation, a whole bunch of things, but I think you get the picture: the opportunities we get to develop and direct AI research by focusing on these kinds of problems are really fantastic. Having said that, and I don't want to come across as arrogant, we don't think that any one of these projects on its own is going to move the needle. Each is just one tiny building block of progress, one technique at a time, one step at a time. This is a map of our fairness projects alone, from 2015 all the way to 2020, and by working in these small increments we were able to do a little bit more: to create reusable patterns that can now become somewhat operationalized, and maybe even lead to some sort of a toolkit in this space.

So, to conclude: we started with what trust means to us as humans, moved to what it might mean in the realm of artificial intelligence, and looked at how we can create tools, techniques, and instrumentation to make AI somewhat more trustworthy. But ultimately, as I said, none of that will matter unless we as practitioners take the ultimate responsibility to think about how our creations are being used and what kinds of problems they are directed at, because when we do, we will truly end up creating artificial intelligence that has the power to benefit us all. With that, thank you.