[Opening remarks unintelligible in the recording.] The presentation is on the record and the questions and answers aren't. It's a really great pleasure, David, to welcome you here, and it's an honour for the Institute. David is going to be talking to us today about something that we have probably all thought about but haven't had the luxury of working on: explainable artificial intelligence. [Further introductory remarks unintelligible in the recording.] The thing I heard, David, is that, I think at the age of eight, you saw Sputnik go over and that was the beginning of your interest in technology. I also remember my dad bringing me out to see Sputnik, but our careers went in different directions. David has over 30 years' experience in the development of AI technology and at the moment manages the Explainable AI and the Communicating with Computers programs. This is David's third tour at DARPA, and I should explain that DARPA tours last four years, so David has had three tours, and the tours are there to bring the excitement, interest and development to each one. They're all different, but he has also worked, as you can imagine, in startups, in industry and in research. So he has had a very distinguished career, and he previously managed PAL, the Personalised Assistant that Learns, the project that produced Siri. He is at the intersection, I think, of the human mind and computer science, and I say that because maybe there's some similarity between us, David, in that you did cognitive psychology but then went on to do a master's in computer science at Stanford. I think that perspective within this area is very valuable and I think it defines a lot of the work that you do. So we are asking the question: can AI be taught to explain itself? As machine learning becomes more powerful, people working in the area increasingly find themselves unable to account for what these systems know or don't know. And I suppose today we're really, really honoured to have you here, David, to talk to us about explainable artificial intelligence. Thank you very much.

Thank you. Hi, good afternoon. It's a real pleasure to be here. This is my first trip to Ireland and I've really enjoyed Dublin. My ancestors came from here in the 1850s, from County Sligo, so it's a real pleasure to see my homeland. So as you said, I'm going to give a talk on explainable AI. This is a project I'm running now at DARPA. We started about a year and a half ago. It's a four-year program, so we're just beginning to get some results, but I can give you an idea of what the program is about. So here's the premise, the need for the program, which I think we're all aware of: we've now had this explosion of new AI technology. There are a lot of applications of AI, especially of the machine learning and deep learning behind it, and so it's being applied in medicine to do radiology.
It's being applied to do sentencing in criminal justice. It's being applied in a lot of places in defense, especially in intel analysis and in the development of autonomous systems. So this new technology is much more powerful than what AI technology could do previously; 2012 is when this kind of revolution started with deep learning. So it's very effective, but the models are very complex and difficult for an average user to understand. So that's the problem we're trying to address.

So here's a little tutorial. Most people now are beginning to get tutorials on deep learning, which is really the heart of the technology, and this kind of simplified version will give you a basic idea of what's happening in these systems. These machine learning systems learn to do a function, a classification or a prediction. That means you give it one set of data as an input and it gives you an output, a classification or a prediction. And this is done by feeding huge amounts of training data to these systems; that's what has to be done to make them work. So let's say you fed the system millions of images or photographs, you've had thousands of objects in there and you've labeled them all. You said, oh, this is a cat, this is a chair, this is a house, this is a table, and you do that over and over again. And what happens in the layers of this neural net: you feed it the input at the top, you give it the classification at the bottom, and it learns how to weight the input. It's calculating numeric weights in every node of this neural net so it can do the correct classification. So that's where the magic happens. Now, this technology, the basic idea of a neural net, has actually been around since the 80s. But around 2012 several things came together. One, there was enough data, plenty of data out there on the internet. We had more computing power, and we could use GPUs and new computing infrastructure to do computation on these systems like we couldn't before. And with minor changes to this architecture, suddenly this stuff was doing much better at image recognition and speech recognition, orders of magnitude better than what had been done before. So that's what came about.

Now in terms of explainability there's kind of good news and bad news in these systems. You see this picture on the left, or on your right. These systems have layers, and they can have thousands of layers with millions of these neurons. As people inspect what's happening in the layers, it tends to be that at the early layers they're detecting simple features like edges, colors and textures. At the later layers it's detecting more abstract parts or features or concepts, till at the very bottom it's learning to classify this picture as a wolf or a dog. When people inspect these systems they find it's actually detecting features in these inner layers that people would understand. Just like our visual system, we naturally organize the information we see in the world and recognize reusable concepts that are important for us to understand. The bad news is the net has no idea what to call this feature. As far as the net is concerned, it's a particular equation to calculate a weight at one of these nodes. So when the user gets an answer out of one of these things he has no idea why it did it. The only thing he gets is the numeric weight of the node at the bottom: the node for "dog" had the highest weight, so more likely this is a dog. So that's kind of the challenge we're trying to work with.
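To make that last point concrete, here is a minimal sketch, in plain NumPy with made-up weights, of what a trained classifier actually hands back to a user: a forward pass ending in a softmax, where the only available "why" is the numeric weight on the winning class node. The network, weights, and class names are all invented for illustration; this is not any of the program's systems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" network: one hidden layer, with random weights standing in
# for what training would have produced.  64-pixel input -> 16 hidden -> 3 classes.
W1, b1 = rng.normal(size=(64, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)
classes = ["dog", "wolf", "cat"]

def predict(x):
    """Forward pass: ReLU hidden layer, then softmax over class scores."""
    h = np.maximum(0, x @ W1 + b1)          # hidden features (unnamed, unlabeled)
    scores = h @ W2 + b2
    p = np.exp(scores - scores.max())
    return p / p.sum()                      # class probabilities

x = rng.normal(size=64)                     # a stand-in "image"
probs = predict(x)
top = int(np.argmax(probs))
# The only "explanation" the net offers is this number:
print(f"{classes[top]} (weight {probs[top]:.2f})")
```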
How can you take these systems, still take advantage of their power, but find a way to pull out the features, the concepts, the logic of what's happening in the net and make it more explainable? So this is a simple version of what we're trying to do in the program. The top row here is the standard machine learning, deep learning process today. You have a lot of training data; you need thousands, sometimes millions, of examples of training data to train these systems. You feed that through a learning process that then learns a function, or model. You can then take that model and feed it a new piece of data, like this picture of the cat, and if it's trained properly, with pretty high probability it's going to get the right answer: it's going to say, I think this is a cat. If you ask why, it'll say the node for a cat had a weight of 0.93, and that's about all the explanation you'll get. You can't ask why it isn't a tiger, or tell me more about why you thought that was a cat. You just don't get that out of these systems, and that's the problem.

So the row on the bottom is what we're trying to do to remedy the problem, and that's really to make two changes to this process. One is to change the machine learning process so you get a more explainable model. So you identify what those features are inside; you somehow get more semantic information out of the net. Maybe you change its architecture so it's organized in a way that makes it more explainable, and there's a wide variety of techniques people are trying, and I'll show you a few examples of those. The second thing, going back to my background in cognitive psychology, is to make sure you put together the right explanation interface so you can generate an understandable, compelling explanation for the human user. And in this case our target is an end user, not a machine learning expert, not someone who's an engineer who understands the technology, but the end user: a lawyer, a doctor, a soldier, whoever's depending on this system to make decisions, and see if they can get an explanation that's useful for them.

So here is our characterization of the goal, and I believe there's an inherent trade-off between the performance of these AI systems and how explainable they are. It's much like with people. There are people that are incredibly brilliant but can't explain anything. There are people that make great instructors but may not be the most brilliant researchers at the university. So there's something going on here in these systems. If you have the largest, most complex deep learning system that's learning all of these internal features, it may not be possible to make that completely explainable. And there are other machine learning techniques, like a decision tree or linear regression, that are much more explainable to people, but they're not quite as high performing. Now there is some debate about this. I have machine learning people that argue if we really get the architectures of these things right, if we really get explainability right, we'll actually improve learning performance. But so far that hasn't been shown to be the case. It's kind of an open question whether eventually we may get it all, make these systems both more explainable and higher performing, but right now I think we're stuck with this trade-off. So the goal of my program is just to move this curve up and to the right. These orange-yellow dots are kind of where we are today, and I want to have a portfolio of techniques.
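That trade-off is easy to see for yourself on a toy problem. The sketch below assumes scikit-learn and synthetic data rather than anything from the program: it trains a shallow decision tree, whose entire logic can be printed and read, and a random forest, which is typically somewhat more accurate but effectively a black box, on the same task.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import RandomForestClassifier

# Synthetic tabular data standing in for a real decision problem.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Interpretable end of the curve: a shallow tree whose rules can be read in full.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

# Higher-performing, less explainable end: an ensemble of 300 deeper trees.
forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

print("tree accuracy:  ", tree.score(X_te, y_te))
print("forest accuracy:", forest.score(X_te, y_te))
print(export_text(tree, feature_names=[f"f{i}" for i in range(10)]))  # the whole model, readable
```

On most runs the forest wins on accuracy, while only the tree yields an explanation a person can actually read end to end.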
So any given person that's developing a new system can look at how important performance is and how important explainability is for their application, and have new techniques they can use to make that trade-off better for whatever purpose they have. So here are the two application areas, challenge areas, that we're working on. They're generally applicable in a lot of areas, but they're certainly important to the US Department of Defense. One is where you have an AI system that's doing data analytics on masses of multimedia data. We have intel analysts who are flooded with images, photos, overhead images of all sorts of situations in the world. They have to pore through all that data and find items of interest: find where there's new activity in North Korea, what's happening in Syria, that sort of thing. But there's too much information for them to look at manually, so you have to use AI systems to pull up the most relevant targets for them to look at. They need explanations so they know why the system has pulled up a particular item. When we were starting the program, I was at a workshop with some intel analysts from the US government, and one of the ladies there was talking; the workshop was basically a lot of machine learning people trying to pitch their new technology to the analysts. And she said, well, this doesn't solve my problem. She already had big data, data analytics systems that were giving her recommendations. But she's saying she has to put her name on the recommendation that goes up the chain. And if that's wrong, she's blamed, not the AI system. And so to her, having that explanation was kind of the most important facet of the system.

The second area is autonomy, right? Of course, self-driving cars; we're in the middle of beginning to develop all sorts of autonomous systems for the defense department. And this is still not quite here, right? A lot of these things are in development. You don't see a lot of these autonomous systems out in the field yet. And the ones that are coming soonest are not really very heavy in AI technology; they're programmed with much simpler control logic. But the people in the research universities are now using this deep learning technology to train robots, to train autonomous driving systems, to train autonomous air vehicles. And when that's done, if an operator sends one of these systems off on a mission and it comes back, he'll want to know: why did you make the decisions you did? Why did you turn around? Why did you succeed or fail? Unless there's more explainability built into the deep learning versions of these systems, he won't be able to get that information. So those are the two challenge problems we're having our researchers work on.

This slide is probably a little small for you to read, but it brings up the question of how do you know if you have an effective explanation? What makes one explanation better than another? And we'd love it if we could find an automated way to measure that, but as far as I know, we can't do that yet, because the quality of the explanation depends on a human user reading that explanation and getting a better understanding of how the system works. So the only way we know to measure this is to actually build some of these systems, put them in front of a user, have him use the system, get decisions, get recommendations, but then measure everything we can about it to see if he has a better understanding of the system as a result of the explanation.
So you can ask users whether they are satisfied with the answer. One of the most important aspects is, does it give the user the right mental model of what the system is doing? You want the person to have an intuition of when the system is strong and when it's weak. When should he trust it? When shouldn't he trust it? If you gave the user a hypothetical situation, can he predict what the system will do? So those are some of the techniques we're using to see if the system gives the user an accurate mental model. And it won't be a model that's completely the same as the complex mathematical model inside the AI system, but does it give him the right intuition, or the right kind of analogous model, so he understands what the system is doing? Second, we can measure their task performance. Of course, if the human and machine are doing a task together, they should do a better job if they have the explanation. Do they trust it appropriately? Is there calibrated trust? It can be relatively easy to trick a person into either blindly trusting or blindly mistrusting one of these systems. So the goal here is to not do either one of those, but to give the person a better idea of when they can trust it and when they can't. There are some experiments done in psychology where they'll show people two different computer systems, and in one case they put a smiley face on the computer, and people will actually trust that one more. So there are all sorts of quirks of human nature that make us either trust it when we shouldn't or mistrust it when we should. So we're hoping explanation will help correct that.

Here's a little more technical diagram. This Venn diagram, which you probably can't read, is a description of all the different machine learning techniques. There are neural nets and the deep learning that's so popular, but there are other statistical techniques like support vector machines. There are graphical models like Bayesian belief nets. There are simpler systems like decision trees. So we are looking at that whole portfolio of techniques, although deep learning is the most important and the hot topic these days. This is where I get this curve, where some of these techniques are high performing but not very explainable, and others are more explainable but not as high performing.

This shows the three broad strategies that we're employing on the program. First, what I'll call deep explanation, which means do whatever you can to make deep learning more explainable. That's the most severe problem, and it's where all the activity is these days, so more than half of the performers on my program are trying to do something there. I'll show you some examples of ways they can make these systems more explainable. The second one, in the middle, is what I'm calling interpretable models. This is to say: don't use deep learning; use some other machine learning technique that will learn a structured Bayesian graph or Bayesian belief net, or learn a decision tree, or something that has more structure that makes it more explainable. The last technique I'm calling model induction, and this is a little bit like what people do with rationalization. We don't always know what's going on in our own deep nets, why we've made a certain decision, but we can make up a plausible story that kind of fits our history and the situation.
So in this case the explanation system will not try to understand what's happening internally in this black-box system, but it can experiment with it: run millions of simulation examples, try every input combination, see what the output is, and see if it can then infer some logic that explains what the system is doing. So I have different teams working on all of these, often combinations of these, and these are the 12 teams working on the program, up on the right. There's one team that's not developing a system. They're just a group of cognitive psychologists who have studied the psychology of explanation. There's a lot of research in education, in decision making, et cetera, on when one explanation is better than another, when an explanation contributes to a good decision, and when an explanation contributes to learning. So they're there to dig through all that literature, find the most useful nuggets out of all of that, and give them to the other teams. The other table you see here shows the 11 teams who are working on different versions of how to make these AI systems explainable. Three are working on both autonomy and data analytics, three are working only on autonomy, and five are working only on analytics. And they're pursuing a wide variety of approaches. Some are heavy in deep learning; others are only doing a model induction or rationalization technique; and there are different combinations.

So here, in terms of the deep nets, are four different approaches that people are trying in order to make these systems more explainable, and I'll show you some follow-up examples of a few of these. First is attention mechanisms, or salience maps, as they're called. This is a technique that's even in use today, and a lot of our researchers are trying to refine it to make it more effective. Basically, in this case, you've given this image to a deep net and it said, this is a picture of goldfish. And if you ask, why do you think they're goldfish?, it has highlighted exactly the pixels around the goldfish in the image. You can trace back through the net and see what in the input it was paying attention to: what data, what pixels in this case, contributed to it making that decision. So that can often be a very useful explanation. One of the researchers who did an early version of this built a test system where they trained it to distinguish wolves from huskies. And then they could ask the system, why did you call this a wolf? In that particular case, they had trained it in such a way that the system highlighted all the snow in the background, which is the kind of mistake these systems often make, right? It happened that in all the photos it was shown, every time there was a wolf, there was also snow. So that was a much easier feature for it to grab hold of. And so you can see, from just this kind of salience map, a lot of information about what the systems are doing. So that's one.

The second one, below that, is feature identification, and I'll show you a more detailed example of this; it's really at the heart of a lot of these techniques. As I mentioned, these deep nets are often discovering features on their own that are very useful and meaningful, but the net just doesn't know what to call them. So we've got a variety of people who, in different ways, are trying to recognize useful features inside the net and find a way to associate those nodes with language, or with labels, or with some kind of description that they can give people.
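Here is a minimal sketch of the model-induction idea described a moment ago: treat an existing classifier purely as a black box, probe it with a large number of simulated inputs, and fit an interpretable surrogate to the black box's own answers. It assumes scikit-learn and a synthetic stand-in for the black box; it illustrates the strategy, not any particular team's system.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.datasets import make_classification

# Stand-in "black box": any opaque model we can only query for outputs.
X, y = make_classification(n_samples=3000, n_features=6, random_state=1)
black_box = GradientBoostingClassifier(random_state=1).fit(X, y)

# Model induction step 1: probe the black box with many simulated inputs...
rng = np.random.default_rng(1)
probes = rng.uniform(X.min(axis=0), X.max(axis=0), size=(50_000, X.shape[1]))
answers = black_box.predict(probes)

# ...step 2: fit an interpretable surrogate to the black box's own answers.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=1).fit(probes, answers)
fidelity = surrogate.score(probes, answers)   # how faithfully the "story" matches the box

print(f"surrogate agrees with the black box on {fidelity:.0%} of probes")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(X.shape[1])]))
```

The printed tree is the "plausible story": a readable approximation of the black box's behaviour, with the fidelity score saying how closely the story actually tracks it.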
Modular networks are at the top. In general, because of all the emphasis on deep learning, there's a huge amount of work throughout the research community on better architectures for these: how do you set them up so they might be more modular, so the architecture is more organized? We're very much like the early days of software programming in the 80s. People would write code they used to call spaghetti code: it would just point all over the place and it was a mess; nobody could understand it. But eventually we learned how to write software so it's much more modular, much more understandable. Something like that is happening with these systems. Right now the deep learning systems are one mass of interconnected nodes, but eventually we'll find much more regular, organized, understandable architectures, I think, that will make them more explainable.

Finally, the most interesting one I'll call learn to explain. This is where the deep learning people are saying the way out of this is more deep learning. So if you've trained a deep learning system to make the decision, let's train a second deep learning system to generate the explanation. In a sense, what that second system is doing is being trained to find the useful features inside the first net and generate language or images or something to explain them to the person. So those are the four broad techniques.

I can show you... oh, did I? I didn't mean to have that slide in there. So here's an example of the salience method. This is one of the performers, a group at UC Berkeley. What they're doing, you can see in this dynamic image on the left. They have the image of the goldfish, and they're randomly obscuring different pixels in the image and seeing how that changes the classification. You're trying to find the fewest changes to the pixels that produce the biggest change in the decision the net makes, and that ends up giving you a very precise heat map of what the net is paying attention to. So that's one technique.

Let me see, did I? Well, I'll jump to this one. I thought I had a different slide in here, but it's an earlier version of the charts. Here, again, is the Berkeley group; they're doing both the heat map and the "more deep learning" way out of it. What you can see here: you get the picture of the elephant. They're testing this system by showing users an image and a question and saying, can you guess whether the net will get the right answer to this or not? In the no-explanation case, it just shows the person that picture of the elephant and asks, does this elephant have tusks?, and now the user can guess whether or not it'll get the right answer. In the explanation case, they get the image of the elephant and they also see the heat map, so they can see the net is paying attention to exactly the right spot: it's looking exactly where the tusk is. But then you look at the language generation that comes from the second deep net, and it says, because there are no bones sticking out from its mouth. So now you know it may be looking in the right spot, but somehow it didn't pick up the right feature. So you can correctly guess it's going to get the wrong answer, and that's what happens. The second case is one where it gets the right answer. Let's see: is this a professional sporting event? It both highlights that it's looking at the person's uniform, and the verbal description is, because they're wearing official jerseys. So things line up, you get an intuition that it's going to get the right answer, and it does.
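Here is a minimal sketch of that occlusion idea in plain NumPy: slide a blanked-out patch over the image, re-score it each time, and record how much the model's confidence drops. The `score_fn` below is a dummy stand-in so the sketch runs on its own; it is a generic illustration of the technique, not the Berkeley group's code.

```python
import numpy as np

def occlusion_saliency(image, score_fn, patch=8, baseline=0.0):
    """Occlusion-style saliency: blank out each patch and record how much the
    model's score for the predicted class drops. Bigger drop = the model was
    relying on those pixels more."""
    h, w = image.shape[:2]
    base_score = score_fn(image)
    heat = np.zeros((h, w))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i+patch, j:j+patch] = baseline
            heat[i:i+patch, j:j+patch] = base_score - score_fn(occluded)
    return heat

# Usage with any classifier wrapped as `score_fn(image) -> score of the predicted
# class`; here a dummy score function stands in so the sketch is self-contained.
dummy_image = np.random.default_rng(0).random((64, 64))
dummy_score = lambda img: float(img[20:30, 20:30].mean())   # pretends the model cares about one region
heatmap = occlusion_saliency(dummy_image, dummy_score)
print("most influential patch starts at:", np.unravel_index(heatmap.argmax(), heatmap.shape))
```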
The data at the bottom is just showing that when they show these different situations to people, if they don't get the explanation they can guess what the net will do about 57% of the time, but if they have both of these explanations, that increases to 70%. So it really does give people a better idea of what the system is doing to get a right or wrong answer.

Now here's the same group. This is an application of a similar technique, but to an autonomous system. In this case they've trained the system to mimic a human driver, so they've got a simple autonomous vehicle, if you will. They can then watch what that system pays attention to as it's driving down the road; you see this red, green, blue heat map. You can see what it is paying attention to when it makes a turn. Independently, they've crowdsourced with a lot of people on Amazon Mechanical Turk which features of a similar image people think are important: which parts of the image matter, characteristics that are meaningful to the person. And then they can take the union of these two. They can see, okay, what did the net pay attention to, which of these are features people thought were important, and then generate a verbal description of why the system has made a decision.

So now I'll just show some very simple examples of how this works. Here's the explanation: it's saying the car slows down because it's making a left turn. You can now see the heat map. The scene is a little obscured because of the heat map, but you can see the car did indeed make a left turn. So in that case it was a good explanation. These go on a little bit; I'll show you a few of these. The car accelerates because the light turns green and the traffic is moving. So again you see this; looking closer you can see the green light there. So it's picked up that feature that people thought was important and gives that as the explanation for what it's done. The car slows to a stop because the car in front has stopped. A pretty easy explanation, and indeed that's exactly what happened. So this is fairly primitive, just the first year of research, but there are a lot of interesting ideas here on how you can dig into these nets and pull out explanations that are going to be useful to people. This is one where it actually made a mistake. I think it said it turned because it saw a stop sign, but it was actually a do-not-enter sign, so it kind of had the right idea but misrecognized one piece of it.

I see I've got several more. Let me... These are several other researchers that are working on this. Here's a group at UMass and CRA that's learning causal models using model induction. Here's a group at UCLA that learns something called a stochastic and-or graph, which is a more interpretable model. Here's a group at Oregon State that's working on an autonomous system. Another group is at PARC; they're interesting: what they're trying to do is take a cognitive model developed by psychologists that tries to mimic human cognition, and they map the learning system's features into this cognitive model and then use that for explanation. Here's a group at Carnegie Mellon that's learning a physics-based model of a system that plays Atari, instead of just an uninterpretable deep learning model. Here's a group at SRI doing more of a feature and salience map approach. Here's the group at BBN that's using work at MIT where they're recognizing reusable features inside the net. Here's a group at UT Dallas that's using an interpretable model called a tractable probabilistic logic model.
Here's a group at UT or Texas A&M where the test problem is actually detecting fake news, so they not only detect it but then generate an explanation of why they think it was misinformation. Rutgers, this is an interesting one: it's a psychologist who has worked in an area called Bayesian teaching, which is basically, if you're trying to create a tutoring system for a person, you want to select the next best training example to show the person, the one that will correct their misconceptions or tell them which problem is the next best one to work on. So in the explanation case, they don't try to understand the internals of the model, but they can look at all the training data, so they try to pick out which piece of training data is the one that really caused the system to make this decision, and often showing the training examples can be very informative.

Here's our program schedule. Just to say, we're about a year and a half through this program. All these teams have just finished their first big evaluation; we're collecting all that information and will have a big meeting in February to summarize it all and see where we are, and then we have another two years of the program to continue to develop the technology. This is how we're doing the evaluation: in every case, the teams show their system to a user without the explanation, show it to the user with the explanation, and also with partial explanations, and then measure all those parameters I showed earlier to see if the explanation actually improves the user's performance and their trust. And that's it. Thank you.
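As a closing sketch of that last idea, explaining a decision by pointing at the training data, the snippet below retrains a simple scikit-learn model with each training example left out in turn and reports whose removal most weakens the model's confidence in a given prediction. The dataset, model, and leave-one-out scheme are all assumptions chosen to keep the sketch small; it stands in for the general explanation-by-training-example idea rather than the Rutgers team's Bayesian teaching method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small toy dataset so leave-one-out retraining stays cheap.
X, y = make_classification(n_samples=151, n_features=5, random_state=2)
X_train, y_train = X[:-1], y[:-1]
x_query = X[-1:]                              # the single decision we want to explain

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = int(model.predict(x_query)[0])
p_full = model.predict_proba(x_query)[0, pred]

# Leave each training example out, retrain, and see whose removal most
# weakens the model's confidence in this particular decision.
influence = np.zeros(len(X_train))
for i in range(len(X_train)):
    keep = np.arange(len(X_train)) != i
    m_i = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train[keep])
    influence[i] = p_full - m_i.predict_proba(x_query)[0, pred]

most = int(influence.argmax())
print(f"prediction: class {pred} with probability {p_full:.2f}")
print(f"training example {most} (label {y_train[most]}) mattered most for this decision")
```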