Hello everyone, and thank you for being here with us today. Our talk is about building trustworthy AI with open source. We will start with a short introduction to the challenges that came in with the use of machine learning and AI. Then we will talk about what trustworthy AI is. And we will end with how we can operationalize trustworthy AI in practice. Today I'm here with Teodora Sechkova, and I'm Diana Atanasova. We are both open source software engineers in the Open Source Program Office at VMware, working on machine learning and security related projects. So let's get started.

I'm pretty sure we all know that AI has great potential. It is changing the way we entertain ourselves, the way we communicate, the way we work and do business, but even the way we live and think. It is disrupting every industry in every country. It also promises to help us solve global challenges like climate change, protect our environment, preserve biodiversity, provide quality health care in places where it is absent, and more. But with the growing adoption of machine learning in production systems comes a growing risk of things going wrong.

So is AI risk-free? We tend to believe that when math is involved, the result is by default fair, neutral and cannot be wrong. We believe that using AI in recruiting or in criminal justice will eliminate human biases from decision making. But as a matter of fact, AI brought up questions about ethics, trust and responsibility. As machine learning systems become more powerful, but also more complex and less transparent, there is an increasing need for explainable AI. There is also a new kind of security and privacy vulnerability that came in with the use of machine learning.

But we have been using software for a while, so what's the difference now? In traditional software development we give the machine clear instructions about what to do. In machine learning we instead provide a lot of data, and the algorithms try to teach themselves, to find statistically significant patterns in this data. And now we are asking questions that may have no single right answer, questions that may be controversial or subjective. The answers we get are probabilistic ones, like: you will probably like these movies, probably not. So we should be very careful about how we implement fully automated AI-based solutions, especially when it comes to decisions that involve ethical dilemmas. Our lives are full of ethical dilemmas, and we ourselves sometimes do not know what the right decision is. So we should be careful about empowering AI to make decisions on our behalf.

Now I will go through three problem domains. The first one is the ethical problems. As I said, AI requires a lot of data, but we should not forget that this data is produced by us. And we as humans are biased and we make mistakes, so the data we produce reflects current social inequalities. When algorithms then teach themselves, they take all of this as the norm. I will give you just a few well-known examples that involve unfairness. The first one is Amazon. Amazon used an algorithm in its recruiting process which was found to be biased against women, because the algorithm had taught itself that men were preferable for technical jobs.
Sometimes biases come not from the data itself but are injected during the labeling process. A classic example here is ImageNet. This is an image database that contains millions of images, and each of these images has been labeled manually by people. These people were responsible for identifying what is in each image: a cat, a dog, an apple, whatever. It turned out that when it comes to pictures that contain humans, things get strange. For example, a young man drinking beer was categorized as an alcoholic. The database was also later found to be unbalanced in terms of age, gender and skin color. And all of this later reflects on the algorithms.

It can get even worse when it comes to human rights and our health. The US healthcare industry has been using machine learning algorithms to help specify which patients would benefit from an additional high-risk healthcare program. One of these algorithms was found to be biased against Black patients, because one of the input parameters was the patient's past healthcare spending. All the examples I have just given were unintentional. They have already been fixed, but these systems had been working for a while, and during that time nobody had any idea that there was a problem with them.

This leads to another question. How do we know whether these systems are making the right decisions? How do we know the decisions are the best ones for us if we truly don't understand them? And here comes the need for explainable AI. Explainable AI is a research field on machine learning interpretability techniques which aim to understand the model and provide human-understandable explanations to different stakeholders. If we are able to achieve explainability in machine learning, first, we will be able to understand the decisions made by AI systems, but we will also be able to debug unexpected behavior of the models. All of this will encourage AI adoption, adoption built on trust, because we will actually understand these algorithms; the black box will not be so black. And all of this can serve as a foundation for defining regulatory compliance.

Last, but not least: security and privacy vulnerabilities. There is a new class of security and privacy vulnerabilities called adversarial attacks. It turns out that even if an attacker has no prior knowledge about the model, the input parameters or the training data, the attacker is still able to essentially steal, or duplicate, a proprietary machine learning model. The attacker can also probe whether a specific record was included in the training dataset. With the use of adversarial machine learning, an attacker can also create a slightly modified input image that completely fools the model.

Here I am giving you two examples. In the first one, the perturbation created with adversarial machine learning is designed to mimic graffiti. Even though we as humans can see that there is a change, that something has been put on top of the sign, we still see a stop sign, but the machine learning algorithm is completely fooled into seeing a speed limit 45 sign. The second example, the eyeglasses, demonstrates that somebody wearing these adversarially generated eyeglasses could impersonate someone else. These are examples from the physical world, and there is quite a lot of research happening in this domain right now.
Teodora will talk more about this. Now that we know there are a lot of problems with using machine learning, the industry, the big tech players, governmental bodies like the European Union and the US government, and neutral foundations have all put a lot of effort into defining what trustworthy AI means. They have all started providing their own definitions. Here I am giving you just a few references to some of the documents, but luckily all of them have concluded more or less the same thing: that our systems should be secure, that we should be able to believe in them. So they overlap.

Here I am sharing the Linux Foundation AI principles for trustworthy AI. The acronym REPEATS captures all of them: reproducibility, robustness, equitability, privacy, explainability, accountability, transparency and security. Someone could say that these principles are vague. Yes, what does it mean to be robust? But still, this is a start.

So how do we achieve these principles in practice? All AI innovators see the benefit of implementing these principles. They see that if they are able to create sustainable, trustworthy systems, they will be able to mitigate the risks and meet constantly changing regulatory requirements. All of this will give them a competitive advantage: they will be able to attract new customers, retain existing ones, and build confidence. They all understand that these principles should be ubiquitous. They should be evangelized across the entire organization, and they should be operationalized in every single step of the machine learning lifecycle. Another thing is that the responsibility should not fall just on the data scientists' shoulders.

But still, what is the way towards trustworthy AI? Being able to create a diverse team of experts will be critical. Such a team will include different roles: data scientists, machine learning engineers, domain experts, legal experts and others. But there is also an increasing need for tools, frameworks and libraries to put these principles into action, to use them during development. And now comes the second part of our talk, where Teodora is going to show you some of the open source tools that can be used in the development process to achieve these trustworthy principles. Thank you.

Thank you. In this second part, we are going to try to do what I decided to call an exercise. We are going to imagine that we are machine learning engineers who have just heard Diana's talk, and she has raised our awareness about the new challenges that AI brings into our lives. And we want to do something about it. So where do we start? Nowadays, no one implements anything from scratch; usually you go looking for some open source project. But open source is not only about taking other people's work and reusing it, it is also a great community for solving complex and overwhelming tasks such as defining trustworthy AI. But at least in this forum, I guess I don't have to preach more about that. So, as a machine learning engineer, where do I start? How do I get familiar with the ecosystem and find out whether there are any existing tools? A good starting point that I found is the Linux Foundation AI and Data Landscape. It's quite big as an image.
So I didn't put it here, but it contains an interactive map of a big part of the existing open source projects related to AI and data. They are grouped by categories: machine learning frameworks, deep learning libraries, et cetera. Anyone can put their project on the landscape, but there are some conditions: it has to be on GitHub and be relatively popular in terms of stars on GitHub. As I said, projects are grouped by category, and one category that is of particular interest to us today is Trusted and Responsible AI. The projects there are grouped in three areas: explainability, adversarial, and bias and fairness. These are projects that at least try to address the problems that Diana mentioned earlier. One other cool thing is that you can very quickly get familiar with a project, or at least get some insight into its health, because you can see its latest release, commits, things like that. This is by no means an exhaustive list of all the projects that exist, but we have found our starting point.

Let's define a task to solve, so that we can actually apply some of those open source tools we have discovered. This is my imaginary task: I have already developed a machine learning model that is doing very well on the MNIST dataset. This is a very popular dataset; nowadays it is used more as a tutorial example, I guess, but that is not the focus here. So we have our machine learning model doing well, achieving high accuracy in recognizing handwritten digits. Now we want to extend this: let's say that before we put our model into production, we want to convince ourselves of how it makes its decisions and, why not, test it against some known adversarial threats. How are we going to achieve this?

For the explanation task I have chosen AI Explainability 360. Both tools I chose are hosted under the Linux Foundation AI and Data, so there is a sort of neutral governing body. Both are Python libraries that any machine learning engineer can easily use in their usual development cycle, most likely in a Python notebook. The first one, the explainability tool, provides algorithms that can give you insight into your data or into your machine learning model. The model explanations can be local, meaning you get an explanation for only one input, or they can cover the whole behavior of your model. The algorithm I have chosen to show here is the contrastive explanations method, which is quite interesting; you can read the paper whenever you have the time. In general, I chose it because it aims to provide human-readable explanations and it is interesting to show and display. It works in two passes. First, it looks for a minimum set of features in your input that made your machine learning model classify the input in a given class, which in the literature is called a pertinent positive. Then it also looks for a minimum set of features that are absent from your input and that, if they were there, would make your input be classified in the next most probable class; this is a pertinent negative. It gets simpler when you see an example.

Then, for the adversarial robustness evaluation, there is the Adversarial Robustness Toolbox, which contains quite a good collection of implementations of adversarial attacks known from the scientific literature, grouped in categories.
It also provides some defenses, so you can attack your model, defend it, and see what is going on. Here I have chosen an evasion attack that is quite popular: the Carlini and Wagner method. This is a white-box attack against a machine learning model, meaning the attacker has access to the internals of the model; the more complicated case is black-box attacks, which have a more difficult task. But anyway, this one is very successful and well known in the literature. Evasion attacks are done during the deployment phase, so your model is already trained. What they try to do is take an input, in our case an image, and find small perturbations in it that are not distinguishable by humans but that manage to hit some blind spot in the machine learning model, so that in the end the classification is wrong. We will see an example of both.

I'll have to switch screens in a slightly awkward way. The demo won't be absolutely 100% live, because both algorithms are a bit slow and I didn't want us to just wait for them. I have written this notebook by taking two notebooks from the GitHub pages of the two projects, so if you don't trust me you can go check them out yourselves. The projects also have other very interesting examples that you can find. I will only stop on the highlights, the more interesting parts.

First, what do we need to import into our notebook to use these tools? It's very simple. From the first tool we import the explainer class which implements the contrastive explanations method, and from ART we import a class that implements the Carlini attack. Both tools provide a classifier, which is a wrapper, and both support the most popular machine learning frameworks; here Keras is used. Then comes the usual stuff: you load your model and some test data. The model is already trained; it is a convolutional neural network.

And then comes the interesting part. Our model is doing very well: it predicts the digits with high accuracy. But we have found this example, which the model originally classified as a three, and it is not convincing: it could be a three, it could be a five. So let's say you start to hesitate. Even the output of the model says, okay, the logit for class three is 19 and for class five it is 14, so the end result is three. This is a simple example, but in some deployment case where you are recognizing, I don't know, banking or financial documents, this could come up as a question. So let's say you want to check how these new tools can help you understand why the model took this decision.

What we do is pass the classifier to our explainer class. We set a ton of parameters, but they are given by the authors of the algorithm or usually have default values. And we have to run the explainer twice, to generate first the pertinent negatives that I mentioned and then the pertinent positives; so the algorithm iterates twice. What is more interesting is when we plot the results. The first image is the original, which was classified as a three. Then on the third one we see the pertinent positive: this is the minimum set of pixels that the model needed in order to decide to classify this as a three, at least according to the explanation algorithm.
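To make this part more concrete, here is a rough sketch of what the explanation step of such a notebook could look like. The class names come from the AIX360 contrastive-explanations examples; the hyperparameter names and values are only illustrative and may differ between library versions, and `model`, `input_image` and `ae_model` (the trained Keras CNN, the selected test digit and the autoencoder CEM uses for regularization) are assumed to be loaded earlier in the notebook.

```python
# Wrap the trained Keras MNIST model so the explainer can query it.
from aix360.algorithms.contrastive import CEMExplainer, KerasClassifier

wrapped_model = KerasClassifier(model)   # `model` is the trained Keras CNN
explainer = CEMExplainer(wrapped_model)

# First pass: pertinent negative, i.e. the minimal addition that would flip
# the predicted class. Hyperparameter names follow the AIX360 MNIST example
# and may vary between versions; values here are illustrative.
(adv_pn, delta_pn, info_pn) = explainer.explain_instance(
    input_image, arg_mode="PN", AE_model=ae_model,
    arg_kappa=10, arg_b=9, arg_max_iter=1000,
    arg_init_const=10.0, arg_beta=1e-1, arg_gamma=100)

# Second pass: pertinent positive, i.e. the minimal evidence that already
# supports the predicted class.
(adv_pp, delta_pp, info_pp) = explainer.explain_instance(
    input_image, arg_mode="PP", AE_model=ae_model,
    arg_kappa=10, arg_b=9, arg_max_iter=1000,
    arg_init_const=10.0, arg_beta=1e-1, arg_gamma=100)
```

The returned deltas are what get plotted next to the original digit as the pertinent negative and pertinent positive images.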
What is even more interesting is the pertinent negative, where the explainer says that if the input had contained this, let's call it a dash, then the model would have decided that this is a five. So up to this point it is quite convincing. At least from a human point of view you are now saying, okay, maybe my model is correct; I'm convinced that it works fine. So let's try to test it with an adversarial example.

And here ART comes in. It works in a similar way. You instantiate a class which implements the attack, you pass the classifier to it, because it is a white-box attack and has access to your model, and you generate an adversarial example. And what happens with this adversarial example? Originally the input was classified as a three, but the second image, the adversarial one, which to me looks no different from the first, is now classified as a five. I have also plotted, in the third plot, the absolute diff between the two, because at least for me I couldn't tell whether there was any change. So there is a change: there are slight perturbations which are not at all random, although they look like some noise, usually around the edges. And this completely switched the class. This attack works in almost all cases. It is a somewhat artificial example, and it is much harder to achieve in a real-world case, but it quite often works. So our algorithms are usually not robust at all; it is just not being exploited yet.

But then, of course, now that our algorithm got confused, why not keep going and try to explain why? So we take the adversarial example and give it to the explainer, to see what the explainer is going to say. I repeated the calculation of pertinent negatives and positives, and this is the result. At first I was a bit disappointed by this result, because it is not as beautiful as the first one. On the third figure you can see the pertinent negative, saying, yes, maybe if our input contained these pixels it would be more rounded, more like a three, okay. And then the pertinent positive, which is the last figure, is completely random; it makes no sense to a person. But then, when I thought about it, this is exactly what adversarial attacks do: they find changes to the input which have no significance for a human, but which are quite special for the machine learning model and good enough to confuse it.

So this is it. As Diana said, this is not an exact science anymore. Machine learning is quite vague; it is about interpretation. Even if there is a human in the loop to evaluate the results of an algorithm, the human can get confused, because we are not in the training phase anymore. We don't have the labels, and anyway the labels themselves could be wrong. So it is a new topic that will bring new problems, and we had better start addressing it sooner, I guess. So this is it; we have time for questions now. Thank you.

[In response to an audience question about defenses:] There are some defenses that exist that you can apply, but I haven't played with defenses much. I have seen approaches that even play with the input, for example applying some compression to it, where somehow the filtering effect of the compression mitigates the attack. There is also adversarial training, where you also include these adversarial examples in your training data, so that the model learns to react to them.
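As a reference point, here is a minimal sketch of both the Carlini and Wagner attack used in the demo and the adversarial training defense just mentioned, using ART. The import paths follow recent ART releases (older versions used slightly different module names), the attack and training parameters are illustrative, and `model`, `x_test`, `x_train` and `y_train` are assumed to be the Keras CNN and MNIST data loaded earlier.

```python
import numpy as np
from art.estimators.classification import KerasClassifier
from art.attacks.evasion import CarliniL2Method, FastGradientMethod
from art.defences.trainer import AdversarialTrainer

# Wrap the trained Keras CNN for ART (the attack gets white-box access to it).
classifier = KerasClassifier(model=model, clip_values=(0.0, 1.0))

# Carlini & Wagner L2 evasion attack, as in the demo (parameters illustrative).
attack = CarliniL2Method(classifier, confidence=0.0, max_iter=10)
x_adv = attack.generate(x=x_test[:1])          # perturb a single test digit

print("original prediction:   ", np.argmax(classifier.predict(x_test[:1])))
print("adversarial prediction:", np.argmax(classifier.predict(x_adv)))
print("max absolute pixel diff:", np.abs(x_adv - x_test[:1]).max())

# Adversarial training: mix adversarial examples into the training data.
# A faster attack such as FGSM is commonly used here, since the trainer
# regenerates adversarial examples repeatedly while fitting.
fgsm = FastGradientMethod(estimator=classifier, eps=0.1)
trainer = AdversarialTrainer(classifier, attacks=fgsm, ratio=0.5)
trainer.fit(x_train, y_train, nb_epochs=5, batch_size=128)
```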
But as they say, it is a cat-and-mouse game: you keep trying to defend yourself, and they keep attacking you. So there are some defenses you can apply, but I cannot speak too much about them. I would have to check; I haven't heard about that one. Yes, I forgot to repeat the questions, sorry, but I will repeat the next one. Okay, if you don't have any more questions... oh, you have one, I'm sorry. Then we can hang around afterwards if you want to chat. Thank you. Thank you.