All right, so let's get started. So thank you for coming to this panel of experts on AI. We're certainly very excited to host a series of these events where we bring luminaries in the different areas under the Ideas Festival. My name, by the way, is Dimitri Peroulis, and I represent here the College of Engineering. So I'm very delighted to have four of our colleagues in the university debate a really, really interesting question: can we trust AI? I know we all want it, and I know we are kind of rushing to integrate it in our lives. But when decisions about safety and saving lives matter, would you trust AI? So this is the question that our distinguished panelists will have to debate. And let me start by introducing Professor Amy Reibman. Professor Reibman is with the School of Electrical and Computer Engineering. She has research interests in image and video quality assessment and video analytics. She's an IEEE Fellow, and she was a distinguished lecturer for the IEEE Signal Processing Society. Amy just took her seat. Thank you. We have Professor Bill Cleveland, the Shanti Gupta Distinguished Professor of Statistics and Computer Science. He was a member of technical staff at Bell Labs, and for 12 years he was a department head. His research is in the areas of statistics, machine learning, data visualization, data analytics, and high-performance computing. He received the 2016 Lifetime Achievement Award for Graphics and Computing from the American Statistical Association, the first one given since 2010. So thank you so much for joining us. We have Professor Jennifer Neville. She's the Miller Family Chair Associate Professor of Computer Science and Statistics at Purdue. She's leading research in the area of data mining and machine learning techniques for complex relational networks, including social, information, and physical networks. She has been recognized by IEEE as one of AI's Top 10 to Watch. And in 2007, she was also selected as a member of the DARPA Computer Science Study Group. So thank you very much. Our fourth panelist is Professor Milind Kulkarni. Professor Kulkarni is working in the area of programming languages and compilers. Specifically, he's developing languages and compilers that support efficient programming and high performance on emerging complex architectures. He's a university faculty scholar, and he has received the Presidential Early Career Award for Scientists and Engineers. He's also one of the Associate Directors for the Center for Resilient Infrastructures, Systems, and Processes. So thank you very much. And to moderate our panel, we have Professor Anand Raghunathan with the School of Electrical and Computer Engineering. Professor Raghunathan is the Associate Director of the newly founded center at Purdue on brain-inspired computing. He worked for 12 years in industry, leading a research group at NEC Labs. In 2017, he co-designed with Intel the fastest petaflop single-node server for training deep neural networks. And his area is basically a new generation of computing hardware. So thank you very much, and let's welcome our panelists. Well, thank you, Dimitri. I guess we can promise to take at least one question from each of you. So I'm glad all the panelists agreed to be here. Thank you again. I know I was a bit of a nuisance when trying to recruit you for this panel. The question we are trying to address already came up a number of times in Mung's talk. And Mung and I were just joking.
We need to assure the audience that we didn't make our slides while listening to his talk. So this was ahead of time. Can we trust AI? So we're certainly proceeding, whether we like it or not, at breakneck speed towards this AI-driven world where AI drives and controls a lot of things. More pertinently, AI is increasingly used in a range of critical applications where it's making decisions that have huge financial impact, have impact on safety, and in the extreme case on lives. So the question is, are we ready? Can we trust AI? Can we hand the keys to the AI system, metaphorically, to make all of these critical decisions? Why not, you may ask? Why should we even be asking this question? Well, as anybody who's familiar with the field of AI knows, there are many concerns. Vint, again, alluded to some of them. We'll try to go into them in further depth in this panel, and hopefully debate to what extent they are, or need to be, concerns. So the first one is the concern of explainability. These are black-box systems. They often don't produce a reason for the decision or result that they produce. And perhaps the earliest example of an AI system that was a black box with an explainability concern, well, does anybody recognize this quote? It's from The Hitchhiker's Guide to the Galaxy by Douglas Adams. So it's probably the earliest example of an unexplainable result from an AI system. This is the Deep Thought computer producing the answer to the ultimate question of life, the universe, and everything. And of course, we know that answer is 42. But why? OK, today's AI systems are actually not that different. So again, when a doctor is using an AI assistant to make a diagnosis, he or she doesn't always necessarily understand why the AI system is making the recommendation that it is. We need to open these up from being black boxes. This is a very active, ongoing area of research. Some of us are working on this. But it is a challenging problem. Second concern: adversarial attacks. Many of you have probably seen this image. This is a panda bear that, through the addition of a very small amount of deliberately added noise to the pixels, appears to the AI system to be a gibbon, although no human would really think of these two images as different. And it's unfortunate that Mung left, because I was actually going to put up his name here. So this is an example of research from Purdue where Mung was involved. And here, the researchers actually showed that they could take signs, for example a KFC sign, and by making very, very small changes, adding adversarial noise to these images, fool autonomous vehicles into thinking that these are stop signs or other traffic signs, which clearly they are not. So, you know, this isn't just for some fancy toy images. Researchers have shown these attacks on medical systems as well. And so, you know, there's a concern that this can entirely derail adoption of AI. Last but not least, there's the concern about bias, right? So, many AI systems are biased, and this can happen in many advertent and inadvertent ways. So the data used to build the AI models may implicitly carry bias. The humans who curated the data or who wrote the algorithms to train the models may themselves be biased and unwittingly transfer that bias into the AI, and that may get amplified and applied at a much larger scale. So when such biased systems are used to make decisions, then obviously that can be concerning.
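Returning for a moment to the adversarial example above: here is a minimal sketch of one common way such noise is crafted, the fast gradient sign method. This is an illustration under stated assumptions, not the specific attack used in the Purdue work; `model` is assumed to be any differentiable PyTorch classifier.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.007):
    """Perturb `image` by at most `epsilon` per pixel in the direction
    that most increases the classifier's loss (the FGSM attack)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # A tiny, uniform step per pixel: imperceptible to a human viewer,
    # yet the prediction can flip (panda to gibbon in the famous example).
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```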
So the bottom line is this, right? If you ask any prospective users of any critical system that uses AI, and, you know, we'll come back to self-driving cars as the example, this is a survey of prospective users, and of the top eight reasons why people feel concerned about using self-driving cars, six of them really involve trust in the AI, okay? So this is a very central issue. It can be a showstopper and really decide, you know, the trajectory that AI takes from this point onwards. So with that introduction, let's talk. So I'd like to invite the panelists, in order starting from Amy, Bill, Jennifer and Milind, to make initial position statements on this topic. And then we'll have a round of responses to the position statements. And as I said, we certainly will be happy to take, and look forward to taking, questions from you. So I'd like to start by just telling you what my experiences are with AI and machine learning. So my work is based on machine learning for vision-based systems. And in particular, I'm interested in agricultural systems, working with open agriculture technology systems, and also in vision-based systems to improve food safety and so forth. So I thought it was worthwhile, as we ask the basic question, can we trust AI, to take a look at the definition of trust. So of course I go online to my favorite sources, I look at Merriam-Webster and Wikipedia, and I synthesize this definition of trust here from the two of them. So trust is defined to be the assured reliance on the character, ability, strength and truth of something or someone, to the degree that we voluntarily abandon control over the actions chosen by that thing. And so I think in the context of this question, it's worth unpacking what this definition means, and in particular focusing on the notions of character, ability and truth, and also the notion of abandoning control. And so in terms of truth and ability, in terms of trusting AI, I think one of the questions here is: does the system create the right answer? Is it actually answering the question that it's been designed to answer? And examples come out in the news every single day, it seems. I won't mention any particular companies as I talk about these examples, but we've seen examples of humans being labeled as gorillas in an image labeling system. We have an object detector that's been designed to detect horses and winds up labeling all the images from a particular horse enthusiast website as a horse, and all it's managed to do is detect the logo of that website; it has no idea what's happening in the images. And there are systems that are designed for speaker verification, and these have to be robust to sophisticated speech generating methods. And so the basic question of truth here is: is the system even doing what it was designed for? But in addition, in terms of ability, can the system deal with unexpected or difficult scenarios? We all know it's more difficult to drive in foggy conditions or wet conditions. If the system is unable to cope with that, maybe it should tell us that: hey, I'm not capable of driving under these conditions. And so in general, we need something that's gonna be robust to the typical environmental conditions like fog, low light, or glare. We all know how to deal with those as we're driving along, even if they add some difficulty. And right now these systems are designed for pristine, clean situations that have been curated by humans.
Is it gonna be able to work as we go outside those conditions? And many of the systems, as they've been trained for these clean, clear situations, are not taking into account the impairments, the difficulties, that our own technology has created, like compression of the video and the fact that we're sampling at a low frame rate or a low spatial resolution. So the computer has sensors that are not even seeing the reality of the world as we see it. And how much do those sensor limitations affect our ability to trust what the system is actually able to process? In terms of the next topic, which would be the character, I think these get to really philosophical questions. So I think it makes sense to ask: for whose benefit has this AI system been designed? If we consider a loan AI that's deciding who to give loans to, are we deciding the loans based on what's best for the company to make the most money? Or are we trying to help out individuals so that they can have a better leg up in life? Are we trying to help society as a whole, or are we just trying to make somebody's pocket a little fatter? In addition, we can also recognize that when a self-driving car actually killed a pedestrian a year or so ago, there was evidence that supported the fact that the system did exactly what it was designed to do. The system detected the object in the way, and the system was designed to ignore those alarms that went off, so the car continued to go ahead and hit the person. So this is a question of the character of the company that designed this algorithm, that decided it was more important not to inconvenience their test driver than it was to avoid hitting something in the middle of the road. And I think these types of questions about the character of the company that's creating these products, I think that's a worthwhile thing to consider as we go forward and as we ask the question, can we trust these things? And in the final case here, we have the question of whether the system is created in a secure manner, or can it be hacked by a bad actor? And so is it doing what it's been designed for? And again, I would say this goes back to the question of character, because it goes back to what the system is trying to achieve here. Which humans is it trying to serve? Whose human values are being replicated in this system? And can it be hacked by somebody that has nothing to do with this company? And perhaps that's the company's problem: they didn't design the system well enough to protect their own interests and humanity's interests. So then the next topic is about ceding control, this notion of yielding control, and that's part of what is required to trust something: the willingness to let go of control. And I think that in asking the question, do we trust AI, we have to ask under what conditions a person is willing to let that AI make decisions for that person. And I can envision that if I want an AI to make a reservation at a restaurant, I'm willing to trust an AI at this stage to do something like that. That seems perfectly reasonable, except maybe if I wanna make a reservation for a dinner at a very crowded place, and it's a special occasion for my sweetie and I would like to make sure that this really happens. You know, maybe I would want to make that phone call myself.
Maybe we would trust the machine to make a screening decision about whether or not to get a medical test, but we'd really like to have a human be looking at the result of that test. Are we willing to let a machine drive on a crowded highway without supervision, or would we prefer to just drive along at 25 miles an hour in a golf-cart-type vehicle in a closed community where there's not much going on? And would we like a machine to perform complex surgery on us, or would we prefer a human? In terms of that last one, actually sometimes the machines are much better than the humans. So in that case, I would definitely trust the machine, but in general I would say I'm not ready to trust AI in all these situations. I believe that AI these days is just at the apprenticeship level, so it's very good at some things, and we can trust it for a certain set of things, but it still needs the master or the journeyman to be looking over it and making sure it's making all these decisions correctly before we completely trust it. Thanks. Let's see, how do I use this? Okay, so I'm gonna strike a theme here, and it'll be similar to Amy's, but I'm gonna phrase it differently. It's not about trust, it's about validation, okay? It's not about trusting AI, it's about validation, and do we trust the validation, okay? So think a little bit about drugs, okay? We've got pharmaceutical people producing drugs, and they use all sorts of tools, by the way, especially in drug discovery. Well, of course this is all proprietary information, but I'm sure they're using AI tools along with a whole lot of other data analytic tools and discovery tools. So do we trust that process for producing drugs? How do we know? It's the FDA that we need to say we trust, because they're enforcing lots of performance testing and validating those drugs. So that's what we should be looking for. So I wanna just address that and give you a couple examples of that sort of thing. So I'm gonna talk about AI for data science. Now here's just a bit about data science basics: the foundation is the analysis of data, and there are technical areas of data science whose purpose is to develop analytic methods and computational environments that will improve data analysis. So we have analytic methods, statistics, and machine learning; we've got mathematical models for data sets; we have computational algorithms for methods; and we have computational environments for data analysis. So work in all of these areas has in recent times made big advances in our ability to analyze data, and the analysis of data is really, in a way, the foundation for much that goes on in AI. So for data science, there are really two kinds of automation. I'm gonna switch my term now, I'm gonna go from AI to automation. It's really taking a task and automating it, on behalf of whatever it is that's going on, to make things better and simpler, okay? So for data science, there's internal automation, and that is automation to enhance the use of analytic methods, okay? So I will say a word or two about that. And then there's external, which is developing analytic methods to automate a subject matter task based on data, okay? So it's outside of the framework of data science in the sense of all the technical areas; it now becomes developing, through the analysis of data, something about some other technical area. Now in both cases, there needs to be validation. Does it work?
And validation here means application to a convincingly large sample of data sets, okay? But in all of this, we have another entry into this whole topic, and that is the subject matter expert. So in terms of the analysis of data, we've got lots of tools developed by those systems and by those analytic methods, but judgments made on the part of the subject matter expert are critical to data science, to the analysis of data. So we have a hybrid that you want to occur, okay? Because just blind analyses, without any knowledge of what is going on, simply don't work particularly well. And by the way, this gets to a question that I saw come up here: well, there's a problem with machine learning AI systems because they can't explain what they do. Well, for decades and decades and decades, at least in the field of statistics, there have been two very different kinds of tasking, and this is well delineated and well understood. One is causal, okay? So when the goal is that you need to know what causes things, then you've got to use a whole bunch of tools that are directed toward causal modeling, so that you know what the drivers are. The other is predictive. You want to predict something, and you basically go, you know, I don't care how you do it, just tell me if it's gonna rain next week or not, okay? Cause I got tickets to a football game. So those two things need to be distinguished. So if it's really causal, everybody has a right to ask the question, but if it's causal, you're gonna be using other technology. So I'm not sure this is even really a question that is necessarily cogent to all of this. Anyway, okay, so what about the subject matter expert? Well, the subject matter expert supplies knowledge that's critical to the choice of methods and models, first of all, and to the diagnostics: when you fit a model to data, there are a lot of diagnostic analytic methods used to assess the fit of the model to the data. And the subject matter expert is the best one to judge such fitting, because you have, say, a surface you've fitted to some response as a function of 10 variables, and you use both visualization and other analytic tools to present the fit of the surface to the data. And you show that to the subject matter expert, and the subject matter expert can judge whether or not the model is good enough for practice. So the expert might say: there's a little bit of a problem with the data over there in that upper corner, but I don't care, cause we don't really care about that. Or: gee, that's the most important thing of all, the model is missing that, and so you gotta go back and do another thing. Okay, so internal, that kind of thing with a subject matter expert, that's actually going on right now, and I'm part of it: the DARPA program D3M, Data-Driven Discovery of Models. It's built for subject matter experts who are not data scientists. Now, in this program, task one is, well, cataloging thousands of analytic methods. And by the way, they're all written either in Python or in R, although many of them call lower-level languages like C. But that's the catalog, and when I say catalog, I mean, you know, like a library. Well, it is a library, but I mean a book library. So you have to have a lot of information about each analytic method, okay, and what it does. Now, task two is automated model selection and fitting to the data.
Okay, so, oh, sorry, up front, when the subject matter expert walks in, the subject matter expert specifies certain things about what the goals are, okay? Task two is model selection. So there are automated methods for selecting a model class to be used and fitting it to the data, okay? So the subject matter expert is still waiting. And task three is diagnostic methods to display the fitted model and the data, and the SME is the one who judges the fit in this D3M system. So that's what I'll call internal. Oh, by the way, as we build this system, how are we tested? You know, how do we show that the system works? Well, what DARPA has done is they're calling in a whole bunch of subject matter experts who sit down in front of combined systems, you know, well, they're using TA1, TA2, TA3, and then they see the results and they rate it. They say: this was really good, the interface was wonderful, I love it, the results were good, I'm happy with it, and I was able to interact very well. Or: this thing is awful, I'm gonna leave here and never come back again. But anyway, actually we're doing quite well so far. Not perfect; there's lots of good statements being made, and there's lots of suggestions being made in this testing process. Okay, so external. So, some time ago, a collection of us got together and we developed a cybersecurity application, in which we developed a streaming statistical algorithm for detection of SSH keystroke packets. Okay, so we have SSH connections, and we wanna see whether or not there are keystrokes, and the reason for that is that we wanna know whether this is a human that has used SSH and is on that connection, or is it a file transfer? And what we had was eight variables, and we used those eight variables to characterize the behavior of a keystroke. By the way, this is TCP at work, so it's like, you know, you have to know all about TCP, but you take the TCP protocol and you say: we have times between packets, types of packets, and things like that. And we were able to use those to detect patterns that were unique to keystroke packets. And as a result of being able to detect that, we wound up with an algorithm that actually was quite accurate. I forget the numbers, but it was something up around 95% accuracy in saying there are no keystrokes, and somewhat less, about 90%, in saying there are keystrokes. And so this got implemented. One of the people in the project was Carter Bullard, who has this wonderful connection-level Argus monitor, and we implemented it there. Okay, and yeah, the key thing I wanna say here is the testing: we tested over a million SSH connections. Now, for some of them we had very definite information about whether or not this was a human in the loop, and for others it was more foggy, but we used a huge amount of information to look at things and determine, yeah, with very high probability, this is just a file transfer. So it was really many different kinds of testing that we carried out to verify that this was gonna be good enough to put in a system that's carrying out cybersecurity analyses, okay? So again, it's all about testing, it's all about verification. Great, okay, thanks Bill.
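To make Bill's SSH example concrete, here is a hedged sketch of the kind of connection-level test he describes: classifying a connection as human typing versus file transfer from packet timing and size patterns. The real system used eight TCP-derived variables and was validated on over a million connections; the two features and the thresholds below are hypothetical stand-ins for illustration only.

```python
import statistics

def looks_like_keystrokes(packet_times, packet_sizes):
    """Guess whether an SSH connection carries human keystrokes.
    packet_times: arrival times in seconds; packet_sizes: payload bytes."""
    gaps = [b - a for a, b in zip(packet_times, packet_times[1:])]
    if not gaps:
        return False
    # Humans type with irregular gaps on the order of tens to hundreds of
    # milliseconds and send tiny packets; bulk file transfers send large
    # packets back to back. These cutoffs are illustrative, not tuned.
    median_gap = statistics.median(gaps)
    median_size = statistics.median(packet_sizes)
    return 0.05 < median_gap < 1.0 and median_size < 128
```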
I wanted to start off by saying that I work in the space of machine learning and AI, and I just wanna say that I'm very thankful that we've gotten to the point where we can have a panel discussing whether you should worry about using some of our methods, because 20 years ago, when I started working in this area, we spent a lot of time trying to convince people that what we wanted to work on was interesting at all. So I think it's really a testament to how far our field has come that we actually have methods being rolled out in such critical applications that we have discussions like we're having today. So let me frame this discussion in terms of what has happened in the past with the Industrial Revolution. Some people say that what is happening right now is really the fourth wave of the Industrial Revolution, and you can see here, over time, it's gone from the 1760s to today, and really now the new Industrial Revolution is coming from the use of cyber-physical systems and AI, woven through every facet of our lives. But all of the successes that have come from the various waves of the Industrial Revolution have really improved our lives. So I lost my notes here. So it can recognize my face up here. Okay, so as we have had these improvements, first mechanization, then mass production, then automation through computers, and now AI, you can see, if you look at the overall quality of life of people throughout the world... this is some data I grabbed from a site that has a lot of data about every aspect of people's lives throughout the world, and this plot starts from 1870, so not 1765, but you can see how the basic quality of life of people has improved over time. This is an index that includes how long you live, how happy you are, how healthy you are and how educated you are. And so this corresponds to improvements coming from the Industrial Revolution. And so now that we're at the point where AI is starting to be embedded in our lives, I wanna point out, from an optimistic perspective, that it really has been improving our lives in ways that maybe you haven't realized so far. And that includes the medical perspective that I think Vint referred to in his talk. So here's some research on using machine learning methods to automatically predict whether patients are going into septic shock in hospitals. This is something where there are 700,000 people that die each year from septic shock, and it's something that's very hard to predict when it first starts to happen, because the symptoms are very similar to other, more benign conditions. And so machine learning methods have really started to be able to identify this much earlier than nurses and doctors would be able to, and alert people to change the protocols and save the lives of these patients. Some work that I have done in the past is using machine learning methods to predict identity theft in cell phones and fraud by stockbrokers. And this is something that is rolled out in many ways that you're probably experiencing in your everyday lives without realizing it. So this is behind many of the spam detection algorithms that keep email out of your inboxes, or at least try to. Maybe sometimes it keeps out real email that you wanted to see, but a lot of times it's saving your time to not have to deal with that spam email.
It's also rolled out in your credit card companies, to predict when somebody else is likely to have your credit card number and be buying products in your name. And so these are examples of methods that were developed in the statistics and machine learning communities 20 years ago, and now they're just seamlessly involved in our lives, in ways where, based on Amy's definition of trust, I think we are letting them make these decisions and being comfortable with them. Finally, some things that you might not know about. There are methods being rolled out to actually detect human trafficking online, by looking at patterns of behavior in the dark web that you might not even have access to or knowledge of. So that's saving people's lives. And then finally, I just wanted to end with the fact that machine learning is also making our life easier by improving aspects of our supply chain. So this is an article that says that Walmart, back in, I think this is 2004, 2005, I can't see the number right now, but they were already automatically predicting, when there were massive natural disasters like hurricanes, what kind of products they should ship to the Walmarts nearby ahead of time, if they knew it was happening, so that they could actually have the supplies already there for the people that are going to need them. You might think that that would mean that they need to have more water, batteries, flashlights and so on, but it turns out they found that they really need to have beer and Pop-Tarts shipped there. Okay, so I just wanted to wrap up by saying that this indicates that AI has really already been improving all of our lives, and this discussion that we're having right now is really just a function of the fact that people are distrustful of new technology in general, and this has happened all the way through the Industrial Revolution. So for example, the Luddites were very distrustful of textile machines taking away jobs from people that were craftsmen at that time. And so I think we're having the same sort of discussion now, where people are afraid that AI is going to take away jobs and things like that. That doesn't mean that we don't have anything to be concerned about. I think a lot of the things that have already been discussed so far this afternoon, in terms of having robust, reliable and safe systems as we roll out these AI models to make automated decisions, are really important for us to strive towards, but it's not something that we should be afraid of. I think, to sort of restate what Bill was saying, as long as we are validating and testing the models in the right way, then we should be able to really trust that we can use them in ways that will improve our lives, maybe in ways we can't even anticipate right now. So that's it. Thank you very much. So the danger in going last in these things is that I wind up presenting a bunch of slides that recapitulate stuff you've already heard, but hopefully I can give things a slightly different spin. My name is Milind Kulkarni. I'm from Electrical and Computer Engineering, and I think one reason that I might give things a slightly different spin than you've heard up until now is that my research is not actually in AI. I don't do AI and ML. I'm in programming languages and software engineering and formal methods, and I live in that space of designing and building software to do different kinds of things.
And so let me start by just answering the question that I'm gonna pose, which is: can we trust AI? And I'm gonna answer it in a way that panel moderators always hate, which is by not really answering the question. I'm gonna say: can we trust AI? It really depends on what you want AI to do. If you're asking AI to tell you whether the picture that your friend sent you is a picture of a Bengal cat versus a Siamese cat, maybe it's not a big deal if it gets it wrong every now and then. If you're playing with your Google Doodle, trying to make it turn your tune into a Bach symphony, maybe it's okay if it comes out sounding like Philip Glass instead. That doesn't matter so much. But if you're trusting your AI to drive you from here to Chicago without getting into an accident, what you expect from the system changes, and what you might trust the system to do changes. And so, 10 years ago I used to be able to give Simpsons references and everybody would just get them. Maybe people don't anymore, but this is one of the quotes from Reverend Lovejoy: the short answer is yes with an if; the longer answer is no with a but. And the reason that I come at it from a different angle is that the question that we ask in programming languages and compilers is not, can we trust AI, but the broader question of: can we trust software? And I promise I made this slide before Vint made exactly the same point an hour ago. So this is a problem that people in my community have been looking at for years. Can we trust software to do the things that we expect it to do? And the way that you wanna frame this question is: what do you expect your software to do? I want you, the user, to tell me what you expect from your software, and then it's my job to tell you that the software is actually going to do what you expect. And that sounds like a completely simple, straightforward thing to do, right? You tell me what you expect, and I'll just make sure that I give you what you expect. But how do we define what we expect from our software? This is actually an extremely hard question. It's one that we haven't solved even in the case of easy software. What does it mean to say that the software does what I expect it to do? I can write a piece of code and give you a list of expectations for that software, and it will do exactly what you expect, and then you go and deploy it in a real system. It encounters a situation it's never seen before. It gets an input it's never seen before, an input you, the designer, didn't expect, and so you never told me what the software should do in that situation, so I can't help you, right? And that's even in the simple case of software that is nice and deterministic, that takes a certain kind of input and behaves in a predictable way. When we're talking about these big inferential engines that we get in AI, this is an extremely difficult problem. And I think that that's really the question that we should be asking here: what can we expect from AI? What should we expect from AI? And once we place those expectations on AI and ML systems, how can we guarantee that they do what we expect? And I don't really have a great answer about what we should expect from AI systems. It's extremely context dependent. But here are some ideas that people in my field have been looking at, of things that we might be able to expect from AI systems. So one example is something we call local robustness.
So Anand gave this example of adding a little bit of noise to a panda, and all of a sudden you decide that it's a different kind of animal entirely. If you've been following the news, there was a big story a few days ago about people putting reflective markings on pavement and convincing an autopilot system to drive in a different direction. I think that's the company that Amy was trying very hard not to name, but I'll name it: it was Tesla. Oh, okay. Right, so the question of local robustness is one where I know that the system behaves as I expect for a given kind of input; if I perturb that input slightly, will the system still do the same thing? Will it give me the same answer? Because I really expect, when I say that the system should behave a certain way when it sees a road that looks a certain way, or when it sees an animal that looks a certain way, I don't mean exactly that configuration of pixels, or exactly that road looked at from exactly that angle in exactly that lighting. I mean that for a whole bunch of inputs that look kind of like this, the system should do the same thing. Now, this is not a complete solution, because how do I know what I mean by local robustness? How do I define what it means for two images to be similar, for two inputs to be similar? And proving that local decisions are always correct doesn't really tell me anything about the end-to-end behavior of a system. But this is one thing that we could try to do. Another thing we could try to do is interpretability. I know that Bill kind of poo-pooed the idea of interpretability in his presentation, but one way to engender trust in artificial intelligence, and in systems in general, is to provide answers that humans can sort of reconstruct, to show your work, all right? You've all had the experience of being in grade school and writing an answer to an arithmetic problem that turned out to be slightly wrong, and you get no points for it; but if you show your work, people are more likely to believe that you sort of know what you're talking about. And I think that there's a similar story we could tell for AI. And so there's been work in my field looking at different ways that we can take these complex neural net models, which are big piles of weights that do all sorts of things, and turn them into other kinds of models, for example programs, that we might have a better chance of understanding, of interpreting, and saying: yeah, that looks right, the system is doing what I expect. And the last one I'll leave you with is maybe the hardest, thorniest, most difficult one, which is fairness, right? How can I say that the decisions my systems are making are fair? Even defining what fairness means is an incredibly challenging problem if we want to validate or prove that our systems really are providing fairness. One possible example might be something like: if I have a group of people and I'm trying to make selections from that group, I'm not gonna be more likely to select from the majority group than from a minority group if, on all other dimensions, the candidates look the same. So if the only thing that distinguishes two groups of people is, say, their sexual orientation, my system shouldn't behave differently based on that one feature, right? So this might be one definition of fairness, but defining fairness is extremely tricky.
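One way to read Milind's candidate definition is as a testable property: flip only the protected attribute and check that the decision is unchanged. A hedged sketch, where `decide` stands for any selection model, and the attribute name and its values are hypothetical placeholders:

```python
def counterfactual_violations(decide, applicants, protected="orientation"):
    """Return the applicants whose outcome changes when only the protected
    attribute is altered, evidence that the model keys on that one feature.
    `decide` maps an applicant dict to a yes/no decision; the values
    "A" and "B" are placeholders for illustration."""
    violations = []
    for person in applicants:
        flipped = dict(person)
        flipped[protected] = "B" if person[protected] == "A" else "A"
        if decide(person) != decide(flipped):
            violations.append(person)
    return violations
```

An empty result does not prove fairness in general, which is Milind's point: it only checks one candidate definition against the cases you happened to test.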
But the thing that I wanna leave you with is this idea that if we can define what we expect from our systems, be they software systems or AI systems or ML systems or what have you, then we have some hope of being able to trust them, because we have some hope of being able to verify that they do what they're supposed to do. Thanks. Okay, thank you all. I think that was a great set of opening statements. Maybe we can have a quick round of responses, if anybody has some thoughts that any of the other presentations triggered. If not, we'll move right on to taking questions from the audience. I wanna make sure that you have the chance to be heard. So why don't we get started? Anybody from the audience? Any questions? Yes, could you walk up to the mic, please? I think we're still being recorded, so. So thank you very, very much. So I guess just to start the questions, I'm gonna start from the very last one and kind of move. Can you describe a little bit how much progress we've made in those areas? And I think you all talked about very similar things. So if I look at, for example, the last two decades or one decade, and to make it a bit harder, on a scale from one to 10, where are we today? That's for all of you. Man. So I can say that in the simultaneously broader and simpler problem of can we trust software, my colleagues in formal methods might kill me for saying this, but I think we're at like two and a half out of 10. We can do things pretty well in systems that are very well-defined, that are very small, that have predictable inputs, predictable behavior, that have deterministic behavior. None of those things describe the kinds of ML systems that we all wanna be building. And so I think we're not close. So I guess I would say, from a machine learning or AI system perspective, I don't know what number to give it, and I don't know what the scale is. But I think, like Milind was just saying, we've had success in very narrow, well-defined areas. So something like transaction-specific credit card fraud identification, we are fairly good at that. Identifying whether something is a spam email, maybe we're not so good at that, because, I guess both those things are adversarial situations, where as soon as our models do well, the adversaries adapt and start doing new behavior. But those are still very well-scoped problems, just like playing Go or playing chess are. And so I think in those cases we have very well-defined outcomes, very well-defined actions, very well-defined data that's going to be input to those systems. And I think we're doing a great job of developing models that can predict in those environments. What I think everybody thinks about in this kind of panel is: how is that gonna work in a self-driving car? Or how is that gonna work in a system like Vint described, where, like a human, it learns sort of autonomously over time, learning new concepts and new objects and new ways of acting in the environment? I think we're very far from that. And combined with that, if we can't make the software work when it's not an AI system, maybe we should be afraid of what AI software is going to do. I guess I'm on here. Can you hear me? Okay. I think in a certain sense, we've made enormous progress. Number one, there really have been these huge events that occur, and they get most of the publicity. They're sort of one-off achievements, like AlphaZero.
And the other, don't forget, Kasparov, who was easily the greatest chess player in the world at the time, lost the chess match with a computer built by IBM. So that was a major achievement. Now it is true that Kasparov got spooked and sort of fell apart, because he really thought that IBM was cheating, and he was completely unnerved. So a computer that's not capable of being spooked beat a spooked world champion. Okay, so it's a good example though. So we've had a lot of things of that sort. But those are sort of one-offs. Day to day, we've made huge advances in our ability to analyze larger and larger data sets. I mean, the big data mania that broke out around 2010 to '12 really did produce a lot of accomplishments, through the development of computational environments for data analysis. And we use them routinely now, like Hadoop and Spark, and now there's a new one there, Dask. So we can analyze much better, much bigger data sets than we could 10 years ago. Over 10 years there has been all that progress. And the magic of it is that we've been able to compute in parallel. Before that, we were having to deal with big data by saying: okay, I gotta make my analytic methods run very, very fast and use a small amount of memory, and you had to think about each individual analytic method. But the great thing about the computational environments is that you really now can achieve parallel computation, and that actually gives you dramatically higher performance than trying to develop individual algorithms with better computational performance. So in the space of video processing and processing of visual information, I'd say in the last 10 years it's really exploded, and that's in part due to the success of deep networks, AlexNet and so forth, and the ImageNet object recognition. But in reality, I think that's very much in its infancy, because the way we, being humans, process visual information is just so much more sophisticated than the models the machines have right now, and already it takes huge amounts of machine power to process what they do now. So I would say two and a half sounds like a good number. There's a lot more that needs to be done to take this into account, and one of the things that's really interesting to me is this: when I spoke to a machine learning expert a month or so ago, I described my goal of trying to mimic a farmer who's driving a combine to harvest the wheat or the corn, and I described the situation and I said: well, how can I design a reinforcement learning algorithm? How do I design my reward system? How do I design this? And he says: oh, it's really easy, you just build a simulator of your combine. And I thought: wow, that doesn't sound easy. So if that's where the state of the art is right now, how do we take real-world problems and translate them down into the case where we can start to use these techniques that the machine learners are creating? I think they're really powerful techniques, but we're still a long way from translating them into reality. Yeah, maybe let me just add, for context, in case you don't know: people have been working on game-playing AI systems for 50 or 60 years, right? So it's not like overnight people wrote a program to beat a Go master. It's been a long time coming. And how long have people been working on self-driving cars? 10 years, maybe?
And also, just some history: in the field of AI, we have typically underestimated how hard a problem is. So people thought that you could solve computer vision as a summer project that grad students could do as an internship, and that also was 60 years ago, right? And so we have made a lot of progress, but at the same time we always think things are coming faster than they really are. The term artificial intelligence and its visibility, both among technical people like us and the general population, is sort of like the Old Faithful geyser. I mean, it just spouts up and then everybody's talking about AI, and then it sort of subsides. The first time, I guess, was Carnegie Mellon, when they started right back in, anybody know the dates, the '70s, saying: yeah, we're doing artificial intelligence and that's going to make it work. Then, well, it didn't work out too well. They had chess programs, and Ken Thompson came along and said: chess is an algorithmic, computational problem; all this stuff of trying to copy how a grandmaster thinks, forget it, that's silly, I can do better. And he did: he just beat all the other systems with algorithms, and he actually built special hardware to do that. Anyway, and then next was expert systems, and actually some of those worked pretty well, but again it declined, or at least people stopped using the term, and I think a lot of them didn't work out as easily as everybody thought they were going to. But this stuff seems more real to me, having gone through these geysers. I think it seems a lot more real, but again, it's about algorithms and analyzing data. Just to be a little cranky about that: how many people in the middle of wave three or four thought, hey, this seems real this time? A lot of my friends from grad school at the time now have jobs doing other types of software engineering, because they could not get jobs. Okay, thanks. Oh yeah, I'd like to mention, Ken Thompson also gave his Turing Award lecture on reflections on trusting trust: you can't even trust your compilers. But what I wanted to talk about was, right now, a lot of times you're funded by the state, and you have all these states, China, Russia and the United States governments, funding an awful lot of research, and you wonder: what is the intention? What are the interests of the state? Are they to help with our economy and the common good of my country versus this other country? Or is it trying to effect social control? So if you think about what China's doing with their social credit system, I can imagine that being a real target for artificial intelligence, where you have this gentle form of social control. So if you're not paying your parking tickets, or you're doing credit card fraud, then you can't travel, you can't get a job, you can't do anything else. And I'm wondering what your thoughts are on the trade-off between this power of social control versus your research efforts. And is there a conflict? I'd like to say a word about the competing nations, okay. Number one, there is absolutely no doubt but what the United States government is very concerned about AI for military systems. I know this in part because I was on a panel to develop a plan for the future, to 2030 actually, for the Air Force, to inject much more data science into its, well, its environment, let's put it that way. Just to put it into the Air Force technical people.
And there were statements about competition with other nations, and there's no question but what the military is watching carefully and wants to stay out in front. And of course there are a lot of security things that come up with foreign nations, and then there are economic matters that come up. So there is no question but what this stuff, AI, is deemed very important by the United States government, and clearly the Chinese government too, because they're working very hard on developing AI. And by the way, Wen-wen Tung, who was down there, and I are headed to China to work with weather people, but that's not nearly so competitive. It's like, they want to predict monsoons, and so we're headed there completely in the open, let's say. So I guess I can answer that a little bit. I would point out that there is a difference between the research on the methods and the policy of how to roll out those methods, and you may be concerned about the government funding this kind of research, and I think that's valid and should be appreciated, but companies are also doing this research, and they're rolling out things with other kinds of applications in mind. And my historical story about this is that when I was in grad school, I worked on a program from DARPA called Total Information Awareness, and that got shut down by senators that had problems with the implications of the government collecting data on individuals and using that against US citizens to modify their behavior. So the government stopped doing research on that, but who kept doing the research? Well, Facebook kept doing the research, and other companies, maybe I shouldn't name them because of the speaker right before us, but Amazon is doing the research, and those companies have basically rolled out the very things that people were afraid the government was doing. And so it's not really simply a government issue. I think people will be doing research on these methods, and trying to stop the research is worse: you'll just have companies, or countries like China, overtaking us. So really we should be doing the research, but thinking very carefully about the motivations when the methods are rolled out and how they might affect society. Yeah, I think that's exactly right, which is that it's really important to distinguish between the tools that we're building and what people choose to use the tools for, and I'd want to be very careful about over-correcting into a world where we say: because the research that we're doing has the potential to be used for bad things, we should stop doing it entirely, I would say. Maybe I'm talking my own book there, but. I just wanna add that in my case, I feel like I have the ability to choose topics that actually can help the entire world, and so if I'm looking at sustainability of food production or improving the safety of the food processing chain, then I think that that's something that can help everybody, and maybe it's a lot harder to weaponize or use to make people's lives worse. So that's the direction that I'm trying to push my efforts. I think there was a question there. Did you have a question as well? She was waiting from before, so if you could. So I just had a question about active learning. What kind of parameters are we looking at when we're trying to execute a competent algorithm for that? Are we trying to model based on how humans learn, or how do we decide what's the next node that should be learned, if you're trying to do active learning?
I guess maybe I should take that. Let me just say what active learning is, for people in the audience who don't know. Active learning is a setup in machine learning where you're simultaneously trying to learn the model without enough data, and so you learn what data to gather to increase the speed of your learning. And there's actually a whole host of different ways that you might be gathering data: you might want to get new instances, you might wanna get labels for instances, you might wanna get new features. And in all cases, there's a cost associated with every action or query that you would do. An example of this is that in healthcare, it might be very costly to do a certain test on a patient, right? And so you want to assess whether the value of the information that you would get from that test result is really helpful for the diagnosis in the end. And so your question was how we should think about when to acquire new information, or what information to acquire? So like the when and the how. So what kind of models are we using to decide what nodes should be learned? And like you said, there's also a factor of when it's worth it or not. But also how are we doing it, and how can we trust that method? Yeah, so I think your question of exactly how we should frame that shows the gap between what people might want to do and what our algorithms actually do. What our algorithms actually do is: a priori, somebody specifies exactly what the actions are and what the costs are, and then we optimize when we should gather that information, and it's different in different scenarios. But the larger goal would be to move to a scenario that's more continuous learning, like humans do. We don't necessarily say to ourselves, should I take a step forward right now, or should I turn to the right? We have just automatically internalized when we should do certain actions and how we would gather data from them. And that's a much more open-ended learning kind of environment, where there are many different things that we could do and many kinds of information we could gather. We're nowhere near that kind of formulation. And so researchers right now would formulate active learning differently in many different environments. So for example, something that I've worked on is learning in graphs. You might want to predict what the topics of web pages on the web are, but you have to go gather the data; we just don't have access to the whole web. Even if you're Google, even if you're Microsoft, you have to actually crawl the web to see what's there. And so the query then is: should I go across this hyperlink to see what this page is on the other side, to then find out what its topic is? And so then we frame that as an active learning problem, where we're simultaneously trying to decide where to crawl and learning to make predictions at the same time. That's a very narrowly scoped problem, where we only have certain actions available to us, and then we can actually just look at the data and learn whether we should take those actions or not, in hindsight, after we've accessed some of the data. So I'm not sure if that helps exactly with respect to your question. No, it helps a lot. I think there was one from Nikhil as well. Go ahead. My question is about the possibility of AI. So, given a task, humans are great at adapting to situations that the task might throw up. So how much of a challenge is this going to be, or is it even possible, for systems that expect precise specifications?
I think it's worthwhile to consider putting that into the design of your system, so that it has some self-checking involved. So it says: hey, wait a second, I don't know what to do. Maybe that's a point when you go query an expert, or you ask for more data, or you go get help. And I think that systems can be designed to be capable of understanding their capabilities, and where they're not very strong, and being able to address that situation. Yeah, actually, that's a very good question, because one of the things you can do up front, and you have to be a subject matter expert, is go and actually run designed experiments on systems. Actually, that needs a lot more work, because for a lot of these parallel, distributed systems that I just mentioned, doing that work is actually very challenging. And so not too much is being done, but you can go and run experiments just like you run experiments for medicine. You can run experiments with these systems and, say, change configurations, for example. I mean, the systems need configurations. We're talking about computer systems, right? That do things for people. You need to configure those systems to conform with the tasking. And you can run experiments to get information about how to do that generally. Now, to make changes in real time, that is, when a problem arises, you run out of memory, what should happen? Can the system adjust? Well, in some cases the system can; in other cases, you abort. So it's a tough one. It deserves work. That'd be a good thesis topic, by the way. Any other questions from the audience? Go ahead. Assuming you have a general artificial intelligence system, would you ever be able to trust the system for a problem where there isn't a subject matter expert, or an open-problem sort of situation? I mean, define general artificial intelligence. I trust a lot of people that are general intelligences to do things for me, right? Even if I haven't specified precisely what I want them to do. So I guess it's a question of what you mean when you say you have a general artificial intelligence, right? So, a system that might be able to produce some solution for what the best course of action for a problem is, or maybe how to solve a problem that we don't have a solution for yet, or don't know the best solution for. Well, in most of the cases we have here, you're working on a specific thing that you want to achieve. And when you do that, there can be some amount of generality, but by and large there isn't, especially with AI. If it's just analyzing the data and understanding it, then you can start thinking about general systems, because there are a lot of tools that will work across many different subject matters. So we do have general systems for data analysis. We have Python, which has libraries, and we have R, which has libraries, and you build that system and it will handle a lot of different kinds of problems. But these AI problems get delicate and they get challenging, and usually, you know, you focus on that one problem. And, I think Jennifer just said this, there probably aren't things that you would call general systems for AI.
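To illustrate Bill's distinction in Python, one of the two languages he names: the interface of a general data-analysis system is the same across subject matters, even though every fitted model is task-specific. A hedged sketch using scikit-learn, where the model list and data names are hypothetical:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

def fit_best(models, X_train, y_train, X_val, y_val):
    """A generic selection loop: the same fit/score pattern works whatever
    the subject matter, which is the sense in which the tooling is general."""
    scored = [(model.fit(X_train, y_train).score(X_val, y_val), model)
              for model in models]
    return max(scored, key=lambda pair: pair[0])[1]

# Hypothetical usage with any tabular data set:
# best = fit_best([LogisticRegression(max_iter=1000),
#                  RandomForestClassifier()], X_tr, y_tr, X_va, y_va)
```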
I guess I would say, if you're talking about a future where we've achieved something we would call general AI, I would trust it, because if we'd actually gotten to that level, then we should have encoded in the system the things that allow us to trust the other intelligent systems Milind is referring to, right? We have people that we trust to make decisions. But if you think about how we feel about people and how they make decisions, not everybody will make the same decision. So it's not always clear there's going to be a deterministic decision-making process for which there's always a single right answer, where we can decide whether the agent produced that answer or not. We're going to have to take the things we currently use to train our human agents, social norms, laws, policies, to behave in ways our society finds acceptable, and somehow encode those in an algorithm to put into these general AI agents. Once we've done that, then I think we'll be able to trust them. The question is whether we know enough about ourselves to even put that into some form of algorithm. I think we're not really introspective enough to know how we ourselves make decisions and how we value trade-offs in particular scenarios. The example I always like to give is that we send 16-year-olds out driving cars once they have a license. Do we take them through all the same situations we're taking self-driving cars through? What if you're driving down a road and it's cloudy, and your grandmother is at the side of the road, but there's oil on the road and you slip? This is something we don't teach them explicitly. We teach them general principles, and we hope that in the moment they'll figure out how to act and behave in the right way, or at least in the way that kills the fewest people. We have to figure out how to train our algorithms in the same way.

I know, this is when I make my pitch for interpretability again. Part of the reason we trust other people to make decisions is that when they make a decision we don't understand, we can ask them to explain why, and we rely on them to be able to teach us why they made that decision. The amount you trust somebody is, I think, to a large extent related to how much you trust them to be able to explain why they're doing what they're doing. We've all had that reaction of, oh my, why did you do that? That's irrational. And then you immediately stop trusting what that person is doing.

Thank you. But by the way, I just want to clarify, I didn't poo-poo causal...

I know, I was being a little bit flip.

Yeah, yeah, no, no. I just said there are times when it doesn't matter that it's not causal. The person who needs the service just doesn't care how you did it; they just want it to be accurate, okay? And sometimes the explanation is, because I'm your mother and I said so.

Yeah, in the last couple of years I've started deploying that one more.

And by the way, there aren't bad algorithms and bad analytic methods; there are bad people. Okay, so let's be clear about that. And it's not our job to stop bad people, because we can't. I say it's not our job; we're not going to stop bad people.
So, I mean, if you listen to what was said, a lot of things developed for the internet just got turned around. I would have liked to ask Vint Cerf about this, though I wanted to let the students come up and ask instead. When the internet was first developed, its usage was among technical organizations, companies, and universities, and that was it. That was the population using it. And if you ever sent anything out that was commercial, there was a spam attack, only it came from people all over the place who had seen it: oh, you did something commercial? I'm going to get you; I'm sending you 50 emails. Anyway, the internet was not built for security when it first came out, and a lot of those protocols that are still running weren't built for security. So the bad actors are having a wonderful time.

Go ahead.

I have just one other question or comment. You brought up the notion of cause, and I follow on Twitter this ongoing debate about the causal framework from Judea Pearl versus the statistical version of cause. I know that, for me, the causes of things are very often fuzzy. In statistics, there are confounders. And for a lot of things we have superstitious behavior. I mean, how many times have any of us rebooted a machine because it was behaving badly, and now it behaves well? So, how?

I was just going to say, that's an engineering principle, no?

Yes. So the question is, how much of these future general intelligence systems are going to have to behave heuristically, because they're not going to have any data, versus probabilistically?

Yeah, well, by the way, I didn't say that causal learning was easy. It's very, very challenging; in many ways it's much harder than the predictive side, okay? So it's tough stuff. Anyway.

I guess I would say that in the field of AI, there are actually two types of AI we talk about: AI that behaves rationally, and AI that behaves like humans. When people use the term general AI, they really mean AI that's going to behave like humans, and humans are not rational, so probabilistic models are very unlikely to fully specify and follow the rules we behave by. So if you want to make machines behave like humans, then we'll probably have to use heuristics to model the kinds of bad or irrational snap decisions that we make. Up until the last few years, most of the AI research community has been focused primarily on producing rational agents, because those are the things we can quantify mathematically and be more sure about what the results of the algorithms will be. So as we move toward more causal reasoning, I'm not actually sure whether we'll be doing that in the realm of how humans make decisions; it will be interesting to see how those two ideas come together. In behavioral economics, people have been looking at this for a long time, and there is a difference between the people who take an information-theoretic, rational-behavior perspective and the people modeling actual human decisions. So it's a very interesting question.

Let me throw in one more thing. Remember, we've got subject matter experts who come to the table, and they've got data, but they've also got a model, okay? Especially people in the physical sciences. So that's where you start: you start with the causal model they've brought to the table.
And so that's actually a lot of fun to do. It's easier than trying to do it where you've got no model at all, where you know the things that are likely to be important but not their relative importance or how much they interact, and things like that. That's the tough situation.

Okay, as we head toward the closing part of this panel, I'd like to invite the panelists to offer a final word or final few words. We've dealt with a range of questions, some longer-term, some more challenging, some more specific. Maybe I could invite you all to look ahead 50 years from now, that's a millennium on the AI timescale, I guess; feel free to pick a different timescale you think is more appropriate. Looking back, if we were to gather again, would we say, looking at the specific problems we brought up, robustness, adversarial attacks, explainability, and so on, that these went away, that we solved them, and AI systems are deployed in critical applications? That's one extreme. The other is that these really became showstoppers that limited the use of AI. Or where do you think the reality is going to land? Gaze into the crystal ball and let us know.

I do think there will be a lot of advances in visual processing using machine learning, with improved sensors and improved representations of the visual space. So I think we can anticipate great strides in that area 50 years from now.

Yeah, robustness is a tractable problem. There's been a huge amount of work done in statistics on implementing robustness in statistical methods. Everybody talks about least squares, and that's wonderful, but throw in 5% of your data as outliers and it'll blow up, okay? But there are methods out there, called robust analytic methods, for when you know that's a possibility and you want to protect yourself (a small sketch of this contrast appears at the very end). So I would think that's an attackable problem now for deep learning as well.

Okay, I think that we are going to make vast strides from the technology perspective and the algorithm perspective. Companies will continue to see the benefit of rolling out these kinds of methods and models, and researchers will continue to work on them because they're very interesting technical problems. But I think we could end up in the wild wild west, because laws and policy will not catch up. Just as we've had issues with laws in the past, and maybe currently, in deciding whether corporations are people and how we should regulate them, we're going to have the same problem with systems that make automatic decisions. I think there is not enough focus on the legal and social implications of these things, and that is really what I'm most afraid of. But if legal scholars and philosophers can help define what we would want out of these systems, I think the technologists can build systems that live up to those standards. We just need help actually defining what the goals should be.

Yeah, basically what she said. Because I think we can define well the technical problems we need to solve. And what we as researchers are really good at is this: if you give us a technical problem that is actually solvable, we will eventually figure out how to solve it. So robustness and all these other things are tractable. But we're not very good at defining what we want from these systems. What do we expect these systems to do?
What should the systems do in various circumstances? And that's not a decision that I think should be made only by the researchers, the people sitting on this stage, the people we work with, or the broader research community; that's a societal decision, right? What are we willing to delegate to AI? What won't we give up? Where are we going to make sure humans stay in the loop? Those are all decisions that need to be made at a much broader level, and like Jennifer said, I don't think we're talking about that enough at all. So I'm quite skeptical about that part.

Okay, great. Well, thank you all, the panelists and those of you brave souls who stayed back with us. Thank you again.
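As a coda to the robustness point raised in the closing remarks, here is a minimal sketch contrasting ordinary least squares with one robust method, a Huber fit, when roughly 5% of the data are gross outliers. The data, the numbers, and the choice of Huber loss are illustrative assumptions; robust statistics offers many such estimators.

```python
# A minimal sketch: 5% outliers distort an ordinary least-squares fit,
# while a robust (Huber) fit stays near the true line. All numbers are
# illustrative.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(42)
n = 200
X = rng.uniform(0, 10, size=(n, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(0, 0.5, size=n)  # true slope = 3

# Corrupt the 10 highest-leverage points (5% of the data) with gross errors.
bad = np.argsort(X[:, 0])[-n // 20:]
y[bad] += 100.0

ols = LinearRegression().fit(X, y)
huber = HuberRegressor(max_iter=500).fit(X, y)
print(f"OLS slope:   {ols.coef_[0]:.2f}")    # pulled well away from 3
print(f"Huber slope: {huber.coef_[0]:.2f}")  # typically stays close to 3
```

The point of the sketch is the one made on stage: least squares has no protection against even a small fraction of contaminated data, while a method designed for that possibility degrades gracefully.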