My name is Mung Chiang, I'm the John A. Edwardson Dean of the College of Engineering at Purdue, and I would like to welcome all of you to today's Engineering 2169 and Ideas Festival special event. As President Mitch Daniels reminded us, the best way to sing happy birthday to Purdue at one hundred and fifty is to engage in intellectual dialogues and conversations about the important topics for society. And as the College of Engineering aspires to the pinnacle of excellence at scale, as we break records in our research, academics, enrollment, and philanthropy, we are delighted to have these opportunities to have conversations with individuals who have achieved the pinnacle of excellence in his or her own career. And today, April 5th, this afternoon I am just so excited to welcome a personal hero of mine and a national treasure, Vint Cerf. Now Vint is widely regarded as a father of the Internet. He co-authored with Bob Kahn, in 1974, the landmark paper that described the foundation of the Internet as we know it today, the TCP/IP protocol. And since then, over the past 45 years, Vint, whether at MCI or at Google, where he is the Chief Internet Evangelist today, or going from DARPA to ICANN, has made a difference in how we live, communicate, and work. And for that, Dr. Cerf has received many, many awards and recognitions. I know you are here to listen to him and not to a Wikipedia enumeration of his accomplishments, but let me just summarize briefly that Dr. Cerf received the Turing Award, received the Japan Prize, and the Queen Elizabeth Prize for Engineering from the UK. He also received from President Clinton the National Medal of Technology and Innovation, and from President Bush the Presidential Medal of Freedom, the highest award that can be given to a civilian in the United States. And today we will divide this one hour into three parts. First with Vint giving a speech on the timely topics of software and data, and then I will have the distinct honor to sit down with Vint and ask him pre-populated questions, including some particularly harsh questions coming from the faculty and students in engineering and the rest of campus, and then we'll open it up to those who are in the auditorium, knowing this is also live streamed and recorded as well. Maybe using Google platforms, I don't know. So we're going to start with the first part, and please join me with a big hand to welcome Dr. Vint Cerf. Thank you. Good afternoon. Thank you. You know, when people clap before you've said anything, it's tempting to just sit down because it won't get any better than that. So first of all, I'm not using slides at all. My favorite expression is power corrupts and PowerPoint corrupts absolutely. So you actually have to listen to what I have to say. Second, I thank you very much for a lovely introduction, Dean Chiang. Of all the introductions I've had, that was the most recent. I feel compelled to give you a quick summary and explanation of my title at Google. It's Chief Internet Evangelist. Some people misunderstand that as a religious title. And in fact, when I hear questions like that, I just tell them I'm geek orthodox, and that seems to work out okay. In fact, I did not ask for this title when I joined Google a little over 13 years ago. Larry and Eric and Sergey said, what title do you want? And I said, Archduke. And so they went away, and they came back and they said, you know, the previous Archduke was Ferdinand, and he was assassinated in 1914, and it started World War I; maybe that's a bad title to have. 
Why don't you be our Chief Internet Evangelist? And I said, okay, I can do that. Well, the topic, oh, by the way, anniversaries. I have to tell you that in addition to your 150th anniversary and the 50th anniversary of Neil landing on the moon, there are two other 50th anniversaries in 2019. One of them is the ARPANET, the predecessor to the Internet. The first four nodes were actually installed in 1969. And so this predecessor to the Internet is having its 50th anniversary this year. And in addition to that, in case you didn't know, Sesame Street was also started in 1969. So I called DARPA, and I suggested we should do a joint celebration of the ARPANET and Sesame Street. We could have Big Bird come out and do something. So we'll all be celebrating various anniversary things this year. The topic of this, my brief introduction here, is artificial intelligence. And I will confess to you that I'm not an expert in this space at all, but over the years I have had light contact with people working on artificial intelligence. And I always thought that AI might be better described as artificial idiocy, as on the whole, an awful lot of it didn't work out very well. For example, in the 1960s, when I was an undergraduate at Stanford, we had the naive idea that if we just put a Russian-English dictionary into the computer, we could do language translation. So we did that. And then we put in a fairly simple expression, out of sight, out of mind. And we did the translation into Russian, then we translated back into English, and it came back invisible idiot, at which point we realized there might be more to this language translation than just the dictionary. Now, it's pretty clear that we have come a very, very long way in being able to do language translation. In particular, we went through a series of steps. One of them was very much rules-based, semantic kinds of things. Then we went into statistical translations where we had large samples of the same document in multiple languages, and had relatively good success, but still rather weak. And then, of course, along have come multi-layer neural networks, which have made a transformative advance in our ability to do not only language translation, but a number of other things as well. And I hope you've noticed that the Turing Award this year went to three pioneers in multi-layer neural networks. It's well deserved because the results have been nothing short of spectacular. However, part of my homily today is to say that machine learning and multi-layer neural networks are not necessarily the same as the sort of general artificial intelligence that people write scary stories about. General artificial intelligence, at least as I would interpret it, has to do with the ability of a system, and that includes us, I mean, we're systems too, we're just biological systems, to take a lot of input and formulate a model of the real world, or a model of a conceptual idea, in order to reason about the model. And if you think a little bit about how quickly human beings can abstract from concrete sensing, concrete examples, it's really quite shocking. I want you to think about a table for a moment, and what it is you know about things that we call tables. The fundamental essence of a table is that it's a flat surface perpendicular to the gravitational field. My guess is most people don't think of it that way. 
But what we recognize implicitly is that this flat surface will support things against the gravitational field, particularly if the bottom of the object that's supported is flat. So we very, very quickly, even at age one or two, recognize the properties of this thing, even if we don't have the vocabulary to describe it as a flat surface perpendicular to the gravitational field. What's interesting is that once we have taken on board that concept, then we recognize how many other things can be tables. Your lap can be a table, a box can be a table, a chair can be a table. And so we generalize very quickly. So we take real world inputs, we abstract from them, we build models, we reason about them, and then we apply what we've learned, again, applying it to the real world. And I find that to be missing in large measure from most of the artificial intelligence that we enjoy today. Some good examples, of course, of recent excitement come out of our DeepMind subsidiary at Alphabet, which is the holding company for Google, DeepMind, and the other companies that are associated with it. You all remember the success of AlphaGo and the four out of five games that it won against Lee Sedol. And subsequent examples of the intelligence of that, well, intelligence, special intelligence, the narrow intelligence of those multi-layer networks. What was quite shocking, of course, is that after having done AlphaGo, the same company, DeepMind, did AlphaZero. And in that case, they started out only with the rules of Go and nothing else. No examples, no playing of games with people, but simply playing games against versions of itself, back and forth. And in a few days' time, the AlphaZero system was able to play Go as well as the previous AlphaGo system did, but with much less manual training. This rapidity of learning, I think, is quite exciting. But we shouldn't allow ourselves to be overly enthusiastic about this. Because these techniques, in my view anyway, are very deep but very narrow. And so the kinds of things that we're capable of doing with multi-layer neural networks, while they're quite spectacular when it comes to image recognition, looking at cells to decide whether or not they're cancerous, or looking at other images to do classifications, we also know that they tend to be brittle. And they're brittle in ways that we don't always have the ability to predict. If we could, we could probably build the systems to be more robust than they actually are. The thing that's a little scary about the brittleness of these algorithms is that if we can't predict how they're going to be brittle or how the brittleness will be manifest, then we have, or should have, some concerns about how we apply these technologies and what we rely upon them for. And just as a side note, I am not necessarily arguing that all machine learning is dangerous or bad or that artificial intelligence is going to cause a great deal of trouble. But I think we should be aware of the potential. Now, in addition to that, quite apart from artificial intelligence and machine learning and everything else, I actually worry about software in general. And so here we are in the middle of this avalanche of the Internet of Things, devices that are programmable and that can communicate. And we hand autonomy off to those devices and let them make decisions without anybody in the loop. And it might be handling security, it might be handling the heating, ventilation, and air conditioning. 
It might be handling the microwave oven or the refrigerator. And we just sort of expect them to do what they're supposed to do. The problem is that programmers make mistakes. I know, that's shocking; I'm sure that most people would just be very alarmed to hear that. If you've ever made a living, or survived in your computer science classes, writing software, like me, you probably have a little dent in your forehead from the many times when you've gone, how could I make such a dumb-ass mistake? And the problem is we do make very stupid mistakes and we don't have very good programming environments to prevent us from doing that. So I'm actually more worried about little pieces of software that somebody grabbed off of GitHub and jammed into, it could be a video player or a camera, a webcam or something like that, which is vulnerable. So I used to say, the headline I'm mostly worried about is 100,000 refrigerators attack Bank of America. And, ha ha ha, I used to think that was funny; it isn't funny anymore. And the reason is because it became, whatever it was, 500,000 webcams attack the Dyn corporation. And the thing is that we won't even know that our devices are complicit. The refrigerator is keeping the ice cream cold, it's not melting. There's no indication that it's been invaded because it's doing everything it's supposed to do plus this other little thing that it wasn't supposed to do. So we actually are faced with a kind of fragile future if we're not thoughtful and careful about this. Every single person in this room who has anything to do with producing software should feel a real burden to be thoughtful and careful about the software we write and release into the wild. I would like to see the academic community build much, much better tools to help us surface mistakes that programmers typically make. And so, whatever I might say about artificial intelligence and machine learning applies double to random pieces of software that become part of the Internet of Things. Everybody here by this time has probably heard about generative adversarial networks, and the thing which is so shocking about the results that have been displayed is the ease with which these multi-layer networks can be made to misconstrue the information that they're getting. If you've seen the examples, sometimes changing just a few pixels in an image that looks like a cat will cause the system to declare that it's a fire engine. When we look at this and we say it's not a fire engine, how could you possibly come to that conclusion? And the answer is there are dependencies on the values of the pixels in this multi-layer network that cause it to make mistakes. And I kept trying to think of a cartoon model that would explain this. Now, this may not be correct, but it's helped me in trying to appreciate why this could be a problem. So I want you to think about a three-dimensional chart with X, Y, and Z coordinates. And you can represent every single point in the 3D space with a vector that has three numbers, the X coordinate, the Y coordinate, and the Z coordinate. That's kind of easy to remember and to visualize. Now I want you to imagine a 100 by 100 pixel image. So there's 10,000 pixels. Imagine that we're dealing with a 10,000-dimensional space. Now, right away, your brain explodes. But forget about that. Think for just a moment about the three axes, but imagine that the vector is now 10,000 values. And every time you change a value, every time you change a pixel, the vector moves around in this 10,000-dimensional space. 
So we cluster in this 10,000-dimensional space the vectors that are associated with cats, dogs, and kangaroos, and so on. And what we've done in the clustering is to create hyperplanes in the 10,000-dimensional space. Now imagine somebody changes one or two pixels. Let's suppose that the values are zero and one, black or white, to make it easy. So a change of a couple of pixels causes the vector to move in the space. What if it moves right past the hyperplane that separates cats from fire engines? And so that's the kind of thing that happens, at least in my little cartoon model. And it's very hard to predict ahead of time how those things will manifest themselves. So I have a list of problem spaces that I hope that you are concerned about. And I bring them up because the real-world public and also legislators and policymakers have now become conscious of potential hazards. And of course, that sits very heavily on the shoulders of the computer science world. So, deep fakes, things that are generated that look like they are real but are in fact fictitious events, in which it might look like a person we might care about, whose opinion might be important, has said or done something. Disinformation in general has become a huge issue. And we're seeing a lot of that flowing into the net deliberately from, for example, Russia, which is trying to essentially disrupt the social fabric of the United States by forcing people to come to grips or come to blows with each other. And botnets, which are easily built because the vulnerabilities in the operating systems are quite evident and easily taken over. So the social and economic side effects of computer science, of networking, of the kinds of things that we build today are becoming increasingly apparent and for many people extremely concerning. And so I'm not gonna end on such a negative note, but I want you to appreciate that in a sense the success of computing and networking and small devices that are programmable leads to a variety of problem spaces that need to be dealt with. Now on the plus side, so we can finish this part of the day with a positive view, the ability to gather and analyze information is better now than it ever has been. And this is not just about big data. This is about being able to take substantial information, high resolution information and the like, and to do something useful with it, to analyze it. And the fact that we can do medical diagnosis better than we could before, the fact that we can find correlations in otherwise dense and complex data sets, is ready for exploitation. And I mean that in the most positive sense of the word. So in a way, we're at the beginning of an era where information becomes the paramount value in our society. Our ability to manipulate that information, to extract value from it, is going to be indicative of certainly the coming decade in our social history. So I'm actually as excited now as I ever have been about the possibilities. But I'm also very conscious of the fact that the same technology which can do all these wonderful and good things, which I rely on every day, also has the potential for harming people and society. And so that's the challenge for us, to try to make sure all the positive stuff happens and the negative stuff doesn't. So I'll stop there and I think we're ready to do some Q&A. And in any case, thanks so much for joining us this afternoon. 
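A minimal numerical sketch of the cartoon model Cerf describes above, assuming the simplest possible case: a single linear decision boundary (one hyperplane) over a flattened 100 by 100 black-and-white image. The weights, the image, and the two class labels are all invented for illustration; real networks have many layers and far more complicated boundaries, but the effect of a handful of pixel flips pushing a vector across a separating surface is the same idea.

```python
# Hedged sketch: a linear "cat vs. fire engine" classifier over binary pixels.
# Flipping a tiny fraction of the 10,000 pixels is enough to cross the hyperplane.
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 10_000                                    # a 100 x 100 image, flattened

w = rng.normal(size=n_pixels)                        # weights defining the separating hyperplane
b = 0.0

def label(x):
    return "cat" if w @ x + b > 0 else "fire engine"

x = rng.integers(0, 2, size=n_pixels).astype(float)  # a random black/white "image"
original = label(x)

# Flipping pixel i changes the score by w[i] * (1 - 2*x[i]); sort flips by how hard
# they push the score toward the other side of the hyperplane, and flip greedily.
direction = -np.sign(w @ x + b)
gain = direction * w * (1 - 2 * x)
x_adv, flipped = x.copy(), 0
for i in np.argsort(gain)[::-1]:
    if label(x_adv) != original:
        break
    x_adv[i] = 1 - x_adv[i]
    flipped += 1

print(f"{original!r} -> {label(x_adv)!r} after changing {flipped} of {n_pixels} pixels")
```

Even in this toy setting only a small fraction of the ten thousand pixel values need to change, and the closer the original image already sits to the hyperplane, the fewer flips it takes, which is exactly the unpredictability he is pointing at.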
Well, first of all, let me ask you, I think, the most pressing question of the day and the most important one that I've heard collecting questions from the students. And that is, what phone do you use and what is the app that you use most? Well, I have here a Pixel, which I've turned off so that it doesn't generate funny noises. And what app do I use most? To be absolutely dead honest, no pun intended, I use Google. And I am a persistent user of Google and Google Assistant. I'm always astonished at how much information it's possible to obtain quickly. I remember we had an electrical outage at one point. And I remember sitting down with a pad of paper and a pencil intending to start writing something that I needed to finish. And after about a sentence or so, I gave up. And the reason I gave up is that normally I'm sitting at a keyboard. And I've got the Chrome browser running. And I always have a tab open to get to Google, because two sentences in, I have a question I want to get the answer to. And I couldn't, because it wasn't accessible, and I found myself just stuck. I couldn't get any further until I got the answer to that question. So I'm hugely dependent on that particular class of application. Well, I have here a paper entitled A Protocol for Packet Network Intercommunication. I remember that. By Vinton G. Cerf and Robert E. Kahn. And by the way, it's one month shy of the 45th anniversary of the landmark paper that described what is now known as TCP/IP, the thin waist of the glue that gave us the internet. And it starts with this short, to-the-point abstract: a protocol that supports the sharing of resources that exist in different packet switching networks is presented. The protocol provides for variation in individual network packet sizes, transmission failures, sequencing, flow control, end-to-end error checking, and the creation and destruction of logical process-to-process connections. Some implementation issues are considered, which is a modest way to put it, and problems such as internetwork routing, accounting, and timeouts are exposed. Now, first of all, I wish all papers were written with such concise abstracts these days. And may I also highlight that, by the way, you can all Google this paper. It's a free download. The pictures at the end show Vint not in a three-piece suit. That's correct, yes. That was back at Stanford University, where three-piece suits weren't required of us. And we were those Californians in the 70s, well, a lot of people were cured. Could I just, since you brought up the three-piece suit thing, explain how that happened? So I'm at Stanford and the guys at DARPA say, come and run the internet program in Washington. And my wife, who's from Kansas, says Washington DC, three-piece suits. And so she goes off to the Stanford shopping center and she buys three three-piece suits from Saks Fifth Avenue, one of which is a seersucker outfit, because she knows it's hot and muggy in Washington in the summer. So I show up in my three-piece suit at DARPA and I'm asked to go and testify before some committee, and I don't remember which one it is now. And I happen to be wearing the seersucker three-piece suit. So I did my testimony and I came back to DARPA and I didn't hear anything. And a few weeks later, the director of DARPA says, I want to talk to you about your testimony, and I'm thinking, my government job is over, I screwed up, whatever. So I show up and he's got a piece of paper in his hand. 
He says, well, I have a letter here from the chairman of the committee. He says, thank you very much for Dr. Cerf's testimony. By the way, he's the best dressed guy from DARPA we've ever seen. And I took that as positive feedback. So I've been wearing three-piece suits ever since. Well, if it secured the funding of the internet, then so be it. So the question, Vint, is, and by the way, thank you for sharing that history, I never knew it before: of the many outstanding engineering designs and architectural decisions that you and Bob made back in '73 and '74, which one did you like most and which one the least? OK, so the thing that was probably most effective is we borrowed from the ARPANET experience the notion of layering. And incorporated in that notion of layering is the idea that below the IP layer of the protocol, although at the time it was combined, TCP did both end-to-end addressing and also the flow control and resequencing and everything, we split that out a few years later. But at the IP layer, below that, we deliberately ignored a lot of the details of what the technology was that supported the flow of packets. And we did that very much in the expectation that there would be new communication technologies coming along that we didn't know about and couldn't predict. Now remember, this is 1973 when we're writing the paper, or '74 when it was published. Six years later, optical fiber starts to become part of the telecom environment. So our isolation from the underlying technology was very much deliberate, even though it might have meant that we didn't necessarily extract from any particular underlying layer, Ethernet or X.25 or what have you, the maximum possible efficiency. But we made ourselves more future-proof than we would otherwise have been. So I think that decision was a very important one. The other one, which is I think equally important, is that we chose not to assign addressing structures that were based on any national boundaries. Unlike, for example, the telephone system, which adopted country codes. And we did that partly because we were doing this work for the Defense Department. And we assumed that this technology would be used for command and control. That was part of the motivation for DARPA's funding in the first place. And we thought through a scenario that said, well, let's see. If we use country codes, imagine what would happen if the US were confronted with having to attack country B in a couple of weeks and they were using country-coded address space. What would you do? You go to country B and say, hi, we're planning to invade you in a couple of weeks. Can we get some address space so our command and control system will work? And of course, the answer to that is probably no. So we said, OK, can't use that, has to be topological. So the networks were just numbered based on when they were created, roughly speaking. It's gotten more complicated now with CIDR and other kinds of addressing structures. But the whole idea was to be non-national, and so I think those were very good choices. Now the thing that I thought was super important, and it turned out not to be, is the ability to take an internet packet, and if it got to an intermediate router somewhere and was going into a network that couldn't take a packet with that size payload, the idea of fragmenting the packet in such a way that it could be reassembled at the other end. So even if the pieces took different paths after they were broken up, it didn't matter, we could piece them back together at the far end. I thought that was going to be critical because we couldn't know what packet sizes were going to be available in the future networks that the internet was supposed to lie down on top of. Well, in the end there's even an RFC called Fragmentation Considered Harmful. It has occasionally saved us when we've run into a serious problem. But for the most part, fragmentation led to the reassembly problems that often show up: you have to have enough buffer space to put everything back together again. If you have too many pieces of packets that are not fully reassembled, you run out of buffer space and the system jams up like a Mexican standoff. So that turned out to be not as important as I thought it was. 
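A toy sketch of the fragmentation-and-reassembly idea just described, loosely in the spirit of IP fragmentation. The field names, the tuple format, and the 16-byte limit are simplified inventions for illustration; real IP fragments carry an offset and a more-fragments flag in the packet header.

```python
# Toy fragmentation and reassembly, loosely in the spirit of IP fragmentation.
# Field names and sizes are simplified for illustration.
import random

def fragment(payload: bytes, mtu: int):
    """Split a payload into (offset, more_fragments, data) pieces that each fit within mtu bytes."""
    pieces = []
    for offset in range(0, len(payload), mtu):
        chunk = payload[offset:offset + mtu]
        pieces.append((offset, offset + mtu < len(payload), chunk))
    return pieces

def reassemble(pieces):
    """Rebuild the payload even if the pieces arrive out of order, e.g. via different paths."""
    buffer = {offset: data for offset, _more, data in pieces}   # the reassembly buffer
    return b"".join(buffer[offset] for offset in sorted(buffer))

original = b"a packet whose payload is too large for the next network along the path"
pieces = fragment(original, mtu=16)
random.shuffle(pieces)          # pieces may take different paths and arrive out of order
assert reassemble(pieces) == original
```

The `buffer` dictionary here is exactly the reassembly buffer he mentions: hold too many partially reassembled packets at once and that space runs out.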
Well, will we have internet in commercial space? I know that back in the 60s we were at JPL and here at Purdue, and of course we're celebrating, on July 20th, Armstrong's lunar walk. And we hope that Purdue will continue to play a critical role in the future of commercial space. But people ask me, would I be able to check my email when I go out to the space tourism spots? So I'm confident the answer is yes, and the reason I can say that is severalfold. First of all, in 1998 a team at the Jet Propulsion Laboratory and I got together and said, why don't we design an interplanetary internet? And it wasn't just, you know, let's see what happens if the Martians show up. It was actually motivated by the belief that we had been doing space exploration using point-to-point radio links for command and control and for data recovery. And that seemed so wimpy and weak and unrich, not robust and resilient. So we started discussing this possibility just after the Pathfinder landed in 1997 on Mars. And some of you will remember that the previous successful landing was in 1976, the two Vikings that landed in different places on Mars, and then we had 20 years of failure. So it was pretty exciting to finally get the Pathfinder delivered. Although I confess to you, do you remember how they delivered it, in a big bouncing balloon? I was thinking, if I was the guy who was running the program at the time and somebody walked into my office and said, we're going to deliver your $6 billion thing in a bouncing balloon, I would have thrown them out of the office. That shows you how much I know. So we got together and said, what would it take to allow an interplanetary backbone to evolve? And so the idea was not to go build one ab initio, but rather, each time a space mission went out, we would carry on board the interplanetary protocols. And once the scientific mission had been completed, the spacecraft could be converted into another node of the interplanetary backbone. We started out thinking we could use TCP/IP to do this, because it worked on Earth, it ought to work on Mars, and it would. But between Earth and Mars, the distances are sufficiently large that the speed of light is too slow. And it takes anywhere from three and a half minutes to 20 minutes for a radio signal to get from Earth to Mars depending on where we are in our respective orbits, which means the round-trip time is double that. And we started thinking our way through, what does DNS look like with a 40-minute round-trip time? So you do a DNS lookup. You're on Mars trying to get something on Earth. You do a lookup. 
By the time you get an answer back, whatever it is has a different IP address because it just moved into another network, so that's not helpful. Then we ran into the other problem that the planets are rotating and we don't know how to stop that. So, you know, if you have something sitting on the surface of the planet and you're talking to it, the planet rotates, and you have to wait until it comes back around again. Or satellites in orbit, same problem. So we have a variably delayed and disrupted environment. So we said, okay, new regime: delay- and disruption-tolerant networking. So we developed a new suite of protocols that we call the bundle protocols in order to deal with a lot of those things. And believe me, it's not as easy as it might sound, especially when you get to network management, because ping is not your friend anymore. You know, if you don't know how long it's going to take to get an answer back, you don't know how old the information is, or if it comes back really late, it's useless information because the state of the system might have changed pretty dramatically. So we had to develop a whole new sense of network management in addition to everything else. The team is very excited about the current state of affairs. We have standardized the interplanetary protocols. They are running on the International Space Station. They're being used every single day by the astronauts and some of the experiments that are on board. The prototype software that we wrote before what's running on the space station is still in operation on Mars, supporting the rovers and the orbiters that are already there, starting in 2004. So I'm honestly very excited about this. NASA has agreed that all missions in the 2020s and on will have on board the interplanetary protocols. So over time, we expect to accrete an interplanetary backbone as the missions become repurposed as nodes in the network. It's exciting indeed. I thought Google was already working on fixing the planet so it doesn't move anymore, but we'll hear that from the press. There may be some side effects; we'll worry about that. Well, I know that some faculty believe that for their messages to university administrators such as myself, the round-trip time is like 45 years. Now, let me ask you, Vint: in addition to engineers and scientists designing artifacts, they also design models. And a model is sort of a hard thing to do because you don't know which is your bathwater assumption and which is your baby assumption; you've got to throw some out of the window. And you mentioned the pitfalls and the perils of these models. What do you think about the fragility of our reliance on models? Well, first of all, anyone who's ever done simulation, for example, will know that the simulation is only as good as what you were able to model in the simulation. So if you say, you know, the thing works great, I simulated it, everything's perfect, you go out there in the real world, you try it, something doesn't work. And the reason is that one thing you forgot to put into the simulation because you didn't know that it was important or you didn't know that it was an effect. So if you're gonna build models and rely on them, a very high priority is to validate the model. And you have to validate it in as many different conditions as you can in order to be reassured that the model will accurately reflect what happens. Now, we have a subsidiary at Alphabet called Waymo, which makes self-driving cars. 
And we've driven those cars, variations on them, in the San Francisco Bay Area and in a few other places for about four million miles or so, real driving miles. We've also simulated several billion hours or miles of driving in a simulated environment. The reason that's turned out to be important to us, and the quality of the simulation is very important, is because we are trying to test edge cases that we wouldn't normally be able to test in the physical real world, either because we don't have enough assets to do that or we don't wanna run over the three-year-old who's running after the ball and say, oh, I guess we have to change that algorithm. So we're depending fairly heavily on being able to feed manufactured data into the software as if the sensors were delivering it. And we hope that the fidelity of that simulation will be sufficient so that we can reassure ourselves and, of course, the passengers in the cars that it's safe to use them. So I am a big believer in simulation, but again, with the caveat that the simulations need to be carefully validated for their accuracy. Well, fragility of software. Now, first of all, I don't worry about my code because my code has so many errors that it doesn't compile. So there's no impact. But our students here, they write code that works. And... And... Works, okay, that's it. Now, the question is... Is that for some value of works? Well, but the question, as you posited just now in your inspiring opening remarks, is when you put the virtual with the physical, and by the way, Purdue Engineering would like to be the best in the world at the interface between what we code and what we touch. But when you put the two together, what works may not work anymore. What's your advice to our students on that? Wow. Well, you could be going into insurance. You know, like... Yeah. Cyber insurance, right? That doesn't... That's an idea for some startup of mine as a student. I'll tell you, cyber insurance, don't even... Well, the problem is some people think that if you buy cyber insurance, that somehow you're protected. It doesn't protect you from the mistakes that are made. All it does is possibly protect you from the financial consequences of mistakes. I mean, think about Boeing and the terrible accidents that have happened recently because of software errors. No amount of cyber insurance will save the lives of people whose lives were lost. So let's talk a little bit about software production, though. One of the things which I think is evident is that a large fraction of the errors that programmers make are fairly simple mistakes: an off-by-one bug in a reference to an array, or a buffer overflow where you read in more data than you allowed space for in the program and it overlays some part of the rest of the program, and then a smart guy will execute the data that was overlaid. There are a whole variety of mistakes like that. And some of them are detectable if you use static analysis. For example, is there a reference to a variable that was never set? That's usually a good sign of a mistake, especially if you make a conditional expression out of that and you branch off to some random place in the program. So a lot of those things can be analyzed statically, not statistically. And then there are even more elaborate dynamic analyses where you actually try various values. You do what we call fuzzing, where you assign a set of values to a collection of variables in the program and you ask what happens. Does it get the right result or not? Does it go where it shouldn't be? 
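A minimal sketch of the fuzzing idea just described: throw randomly generated inputs at a function until a classic off-by-one mistake surfaces. The buggy function and the input ranges here are invented purely for illustration; real fuzzers and static analyzers are far more sophisticated.

```python
# Tiny fuzzing sketch: random inputs thrown at a deliberately buggy function
# until the classic off-by-one mistake surfaces. Everything here is invented
# for illustration.
import random

def last_element(items):
    return items[len(items)]            # off-by-one bug: should be items[len(items) - 1]

def fuzz(fn, trials=1000):
    for _ in range(trials):
        data = [random.randint(-100, 100) for _ in range(random.randint(1, 20))]
        try:
            fn(data)
        except IndexError as err:
            return f"bug surfaced on input {data}: {err}"
    return "no failures observed, which of course does not prove the code is correct"

print(fuzz(last_element))
```

A static check of the same flavor would be the one he mentions a moment earlier: flag any variable that is read before it is ever assigned, rather than waiting for a test input to trip over it.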
What I would like more than anything is the ability to model the program's intended behavior sufficiently that I would have kind of a programmer's assistant sitting on my shoulder. And what I would like it to do basically is say something like, you just created a buffer overflow. What do you mean I created a buffer overflow? Look at line number 23. Rats, you know, or some other expression. So in theory we should be able to do things like that. We don't have the kind of programming environments that I think we should have. So that's a research problem. It gets worse, and you implied it in your question. Today we have the ability to run software in a wide range of devices concurrently. And so now we have potential for race conditions and a variety of other things that are even more complicated to analyze than a single-threaded program. We have parallelism, which we like to take advantage of. And if any of you have been looking at the hardware design mistakes that have popped up recently, like Meltdown, for instance, among others, the problems get down into the guts of the hardware that's trying to do things cleverly. Like, why don't we execute both paths until we figure out which one is the right one? Which sounds kind of cool on the surface, but when you dig in, you discover that the mechanisms that are required to do that lead to some side effects, one of which is exposure of cryptographic keys, because you have a side channel that those systems can expose. And so even the hardware design has to be extremely carefully thought through. And again, we need to be able to maybe simulate the environment in which some of these things are gonna be executing. So I'm hoping that the research community will actually come up with better programming environments than the ones we have today to help us avoid making really bad mistakes. Well, we want to leave time for the audience for live questions, but three additional questions popped up just so many times in the feedback we collected. And one is 5G and IoT. These are hot buzzwords these days. And one could argue that higher throughput, lower latency, maybe edge intelligence is gonna be helpful for many things. But what is 5G in your mind? Well, it's not 4G and it's not 6G. Ha ha ha. Whatever they are. You know, I've given up trying to figure out what people mean by 5G, other than it's the thing after 4G. And people have various definitions of what that might be. Most of them have at least one characteristic, which is it's faster. So higher bandwidths that are available. The question then will be, well, how do you get those higher bandwidths? One answer might be running at higher frequencies. Another one might be making cell sizes smaller so that you get a higher data rate, more bits per hertz, out of the signaling system. I don't think that it's a well-defined thing. And so it will end up being something that people will label 5G. And I suspect that we will see multiple instances of things that may not even inter-work that will be called 5G because the standards for it haven't been set. It's also a highly competitive space right now. 
And we see this all the way up at the national level, where we see people in the US worried about the Chinese 5G initiative, which may turn out to be financially successful because they have made a substantially good market out of internet-related devices; Huawei and ZTE in particular are selling a lot of equipment, especially in the developing world where the price is an issue. So there is a lot of contention here, and it's too bad in a way, because if you remember the success of GSM, which came out of the European initiative, it created commonality all around the world. And if I'm about anything at all, it's about interoperability. I really, really want things to inter-work. In the IoT space, I have a similar kind of worry about interoperability, because the fact that it's labeled internet of things or cyber-physical systems doesn't automatically confer on it commonality of protocols, commonality of data structures and data definitions. And yet, if you're on the receiving end of a bunch of IoT devices, you put them in the house, you would expect that regardless of brand they would be made to inter-work, except a lot of people are just out there to get the equipment out the door, get the money from you, and then say goodbye and good luck. I worry, for example, about people maintaining the software that goes into IoT. Some of these devices may be large physical things like heating, ventilation and air conditioning that could last 10 or 20 years. And so who's gonna maintain the software that's associated with those devices? We know for sure that there will be bugs. We know that the bugs should get fixed. Somebody should take the responsibility to fix them. We also know that when the device, if it's capable of doing it at all, when the device downloads a new piece of software that's supposed to fix a bug, the first question you have to ask is, where did this update come from? What's the source of the new update? And the second one is, did it get altered before it got to me? And so suddenly digital signatures may be your friend in many ways. The question is, will the people who make these devices go to the trouble of building in digital signatures and crypto checking and everything else? And will they commit to maintaining that software? So I realize I've shifted away from 5G to IoT, but I think both of those buzzwords contain a lot of potential hazards for all of us. And when you think about how rapidly this is likely to proliferate, if you look at how quickly smartphones proliferated, they were introduced only 12 years ago, and now everybody has a smartphone, or at least practically everybody, sometimes more than one. So in just a little over a decade, this technology has become very, very penetrant all around the world. The same argument might be made for IoT devices. And so once they become deeply penetrant and we become deeply dependent on them, then whatever weaknesses they have will be visited upon us. And so these are areas that deserve a lot of attention. 
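A hedged sketch of the two questions he poses about an update, where did it come from and was it altered on the way, answered with a digital signature check. It assumes the third-party Python `cryptography` package and Ed25519 keys; the firmware bytes and the key handling are invented for illustration, and a real device would have the vendor's public key provisioned at manufacture rather than generated on the spot.

```python
# Sketch of verifying a signed software update before installing it,
# using the third-party 'cryptography' package (Ed25519 signatures).
# The 'firmware' content and key handling are invented for illustration.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Vendor side: sign the update image.
vendor_key = Ed25519PrivateKey.generate()
firmware = b"...new refrigerator firmware image..."
signature = vendor_key.sign(firmware)

# Device side: the device only trusts the vendor's public key.
vendor_public_key = vendor_key.public_key()

def install_update(image: bytes, sig: bytes) -> str:
    try:
        vendor_public_key.verify(sig, image)   # checks both source and integrity
    except InvalidSignature:
        return "rejected: wrong source or tampered in transit"
    return "installed"

print(install_update(firmware, signature))                   # installed
print(install_update(firmware + b"malware", signature))      # rejected
```

A check of this kind answers both questions at once: only the holder of the vendor's private key could have produced a signature that verifies, and any alteration of the image in transit makes the verification fail.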
You mentioned edge, by the way, so I thought I would come back to that for a second. Please. There's this funny sequence, right? We start out with big, giant, time-shared machines. They weren't even time-shared, they were batch processors. Then they got time-sharing, so 1,000 people could use the same big machine. And then we figured out how to make machines that were less expensive and smaller. So we got departmental machines like the ones from Digital Equipment Corporation. And then we got workstations. Then we got desktops. Then we got laptops. Then we got pads. Then we got mobiles. And while all of this is shrinking, this is a classic example: new technology is big and expensive, only a few people can afford it, and we share. And then it gets less and less and less expensive until finally you stick it in your pocket. So now what's happening is that we've got giant clouds of computing that are really basically laptops that have been stuck in giant racks in some big data center. And there are many, many of those gathered around the world. So we have these giant cloud computing systems. And we have all those devices that you and I use at the other end. And then we discover that it would actually be beneficial to have some computing capability at the edge of the net that's local. And why would you do that? Well, some of the little IoT devices are not too good at protecting themselves. Light bulbs, for example, probably don't have a whole lot of computing power. And so it could very well be that a number of IoT devices will depend on something between themselves and the rest of the internet for protection and for control, for configuration and the like. And so I see edge computing as an opportunity to do local computation that offloads the devices that are maybe in a residence or in a manufacturing facility or something, protecting them and providing better utilization of some computation, which leads to the very interesting question of where should I do the computing? Do I do it in the device? Do I do it in the edge? Do I do it in the cloud? And there are some interesting optimization questions that come out of that. Well, you know, I'm biased toward edge computing, so that's music to my ears. And Vint, you mentioned interoperability across domains and maintenance across time. Now, what about freedom of access to information in the first place? Some say that perhaps people are born free, or should be, but access to information is not free across the world today. What do you think about, for example, programs like Tor? Oh, that's interesting. Well, now you've kind of conflated some things in a very peculiar way, right? Because Tor is... Leading the witness. Yeah, well, no, I wasn't thinking that so much as the conflation. So I'm gonna deconflate this for a moment. Tor is the onion router system, and that's a way of allowing people to get access to information without revealing, as much as possible, who they are or where they are and how they got access to the information in the first place. This was actually developed by the Naval Research Laboratory, and it was intended to help people get access to information when they otherwise might be harmed if they tried to get to it, or to exfiltrate information from someplace to let the rest of the world know what's actually going on. So that was the original intent. Of course, it also gets used for a bunch of other purposes that most of us would disagree with, but that's what happens with technology. It doesn't know good, bad, or indifferent. It just does what it does, and you can use it in different ways. But let me go back to access to information in general. I'm not a believer that all information should be freely available. I mean, I suspect you would agree with that. There are some things that you would prefer to keep private, like your email. Or the Google Docs that you create, assuming you use Google. So there's... We all do, I think we all do. Well, I'm glad to hear that. And if you're not, shame on you. 
So the point here is that there is good reason for some information to be widely and readily accessible. An example of this would be research funded by a government, certainly the U.S. government. There is a real interest in making sure that the research results become readily available to everybody, and so the National Science Foundation has rules about commitments of that type. At the same time, there's other information that you want to make sure is access-controlled. I mean, the obvious cases are classified information, where you absolutely want to control who has access to that, but there's payroll information and medical information and the like. The one thing that worries me is that in our zeal to protect privacy, we may actually inhibit our ability to take advantage of aggregated information. So medical information, for example: if we knew more about people's medical conditions and the diseases that they have encountered, we might understand better, if we had access to all that, what it is that causes somebody to contract a particular disease. And so we would want that information to be shared but not personalized. And so finding our way to make information shareable and accessible under the right conditions, I think, is a very noble objective and it's one that we should strive for. But I think we have the full spectrum to deal with. There are things that should be kept very much controlled and things that should be very, very open. Well, at the risk of conflating multiple questions, because I think I only have time for one more question for you, Vint, let me compress multiple questions into this one. The limits of AI. Now we've got a great team, Kaushik and Anand and the whole team here, winning an SRC and DARPA national center in brain-inspired computing, where they're pushing the energy limit. So can AlphaGo beat the human brain with one millionth of the energy that it is currently using? We're far away from being able to do that. When you consider the amount of energy that this little pile of gel uses to do what it does compared with what we would require, I won't say it's hopeless, but we are very, very far away. In spite of the fact that the tensor machines that Google has been making, the TensorFlow machines, are actually pretty damn powerful machines using less and less energy with each new iteration. But it's nowhere close to the small amount of energy the brain uses. The brain takes advantage of parallelism in ways that we don't fully understand. Well, maybe we should write another proposal at the same time now. There are also the functionality limits of AI. I remember a few years ago, I was moderating a panel with Eric Schmidt, who was Google's former chairman and CEO, around the time of AlphaGo, and people asked, what can AI not do? So one question is translation. You mentioned it in your opening remarks. What about translating poems from one language to a very different language? And well, you know, Eric's answer at that time was, well, give me enough poems and I'll be able to learn how to translate poems. So my question to you, Vint, is, can AI appreciate beauty? Oh, well, I think the honest answer is no, unless by appreciate, you know, we mean some funny expectation of value that the program calculates. I think the answer is no. But can it produce something that a human would appreciate? The answer is yes. 
And we've seen some fascinating examples of this. In fact, we did put something up as a Google Doodle recently which allowed you to compose a melodic line and then produce a Beethoven-like harmony that went with it. And you know, I didn't get a chance to listen to everybody's creations or anything like that, but I tried a few and I was surprised at how well the system was able to produce a recognizable Beethoven-like harmony to go with a particular melodic line. The same argument can be made for art, especially where you look at a particular artist and you've built a neural network which is able to ingest the essence. So Van Gogh, for example, would be a pretty obvious, dramatic kind of treatment. But we've done this with a variety of different artists and shown people, you know, take a picture and then run it through our little DeepDream system, and it produces images that are fascinating and in the style of. So although the program may not appreciate what it's done, human beings might appreciate what the program has been able to produce. Well, I would like to remind all the students out there that computer-generated answers to your homework may or may not be allowed by our instructors. So please check with them. Now, I will be receiving a whole lot of angry emails from students if I don't at least provide another 10 minutes, if you don't mind staying on the stage another 10 minutes, to get questions live from the audience. Okay, so just so everybody knows I'm not listening to the baseball game, I have two hearing aids and sometimes the audio is not always easy for me. So in theory, if you're talking into a microphone I'll be getting it straight in the ear, so to speak. So you're not screaming the questions out? No, I wasn't planning to do that. So questions from the audience, please. Okay, wait, let me get myself set up first. I don't know if we've got a mic here. All right, we've got two mics here if you don't mind lining up. Oh yeah, you have to get out of your chair. That's tough. Could we just hand the mics around? This is like hot dogs at the ball game. All right, please. Yeah, go for it. Thank you for the great talk. You mentioned Waymo and you also mentioned how it's very, very important to think about all the use cases of software and prevent injuries. So what do you think of Autopilot? What about, yeah, I don't use it. My wife and I have two Teslas and I don't use Autopilot. First, I think that the name Autopilot is misleading; it may lead people into thinking it really is an autopilot, and it isn't. It's level three, or maybe at best level four, where level five is completely autonomous, and levels three and four have to do with assisting the drivers. And we've already seen a few cases where it's been misapplied and people have been killed. So I think that it can be quite helpful, especially if you have all those sensors that can see 360 degrees around; changing lanes could be a lot safer with a system that can sense whether there's a car or another vehicle in the way. One thing that is missing from today's environment, especially as we contemplate self-driving cars, is a communications capability among the cars. I kind of like the idea of a car announcing to the others in the area, I need to change lanes in the next half a mile or so, and having the cars adjust and adapt to that. I don't expect to see road rage showing up among the self-driving cars. At least I sure as hell hope not. I mean, that would be really a nightmare. 
So the answer to your question is we should be very careful about this. Even the guys at Waymo recognize how hard it is to make a really self-driving vehicle. Okay, let me also switch back and forth here. Thank you for the question opportunity. Currently, machines do the jobs that we assign them to do. If a machine has the capability of thinking, what if it rejects the job that we assign it to do? Oh, I love that. I mean, so the robots go on strike. Well, my guess is that we would have to have advanced rather far along before we get to that point. Think for a moment about trying to program something deliberately where, first of all, you're asking, does it reject the job that you've asked it to do? So first of all, you have to have a standard way of telling the machine what it is you want it to do. Think for a moment about Google Assistant, because it in a sense is taking spoken requests and spoken commands. And it has to translate that first from spoken language to text, and then it has to analyze the text to figure out what was the question or what was the request. And so we're getting closer and closer to the scenario that you're describing. The standardization and the ability to evaluate the query is part of the step. And then the next step is to do something about it. And I suspect if you've used either Siri or Alexa or Google Assistant, you've encountered the, I don't know how to do anything about that now, or, let me look that up on Google. Well, Alexa wouldn't say that, but it would be sort of, let me look that up on the web. And my reaction, of course, is, you idiot, I could have done that myself; I was looking for something better. So the answer is that for a very narrow class of tasks, you might actually get back something like that, but it isn't because the device is rejecting you because it's angry or something. It's more like it didn't understand what the task was. You could deliberately program something to appear to be upset and angry and reject everything. Some of you may have heard about, remember, Joseph Weizenbaum and the ELIZA program from the 1960s, okay, talk about anniversaries, this was over 50 years ago. Joe Weizenbaum at MIT hated AI, and he thought that it was a silly waste of time. So he wrote a program that was essentially a transformational grammar. And so you could ask it a question and it would react. And so if you said, I'm very tired today, it would come back and say, how long have you been very tired? If you had said, I am very banana today, it would have said, how long have you been very banana? So it was just a transformational grammar. It didn't understand what was going on. What was amazing is that a lot of the conversations that the system had with human beings looked like normal human conversations, which means that there was almost no information being passed back and forth between either the computer or the human. So why do I tell you that? It's because we are far from the case where we have the autonomous thinking that would be required for the kind of scenario you're describing. So I'm not anticipating the robots going on strike anytime soon. But what if robots have the ability to think? Do you think that will cause even more jobs to be replaced by the robots? Actually, so the answer is yes, but you left off the other part of the equation. And my belief is that new jobs are going to be created. They always are when new technology shows up. Somebody has to program the robots. Somebody has to build the robots. 
Somebody has to figure out what the robots should do. Somebody should be deciding what tasks would be useful for them to do. There is a whole raft of new jobs whose possibilities are created. If you think about the work that people do today and you ask, did these jobs exist 10 years ago? A lot of the answers are no. So I'm not very worried about that. The thing to worry about, though, is that if your job got automated, then you're gonna have to do a new job. And the question is, how do you have the capacity to do the new job? And that's what we should be concerned about. Just to, can I go on just a little bit more on this? Absolutely, please. So I want you all to imagine that you're gonna live to be 100 years old. Some of you will. And imagine your career lasts 80 years. Now remember that the smartphone was just introduced 12 years ago. So if you imagine a career that lasts 80 years, it's almost certain that you will need to learn new things to be relevant and to continue to have a career over that eight-decade period. Which means that the most important thing we can do is learn how to learn. And also to want to learn. By the way, that's not so easy, because if you feel compelled to have to learn something or feel like you must learn something, that means that something has changed. People hate change generally. That means they have to do something different. They have to learn something new. Why can't I just keep doing what I've been doing? I was happy then, and then I'm not happy anymore, because I have to do something different. We have to learn how to embrace learning and embrace change, because it will visit us for those eight decades of our career. Okay, we need to go on to the next one. Let's go over here. I'm not hearing you anymore for some reason. Yeah, if you talk into the mic directly. I was wondering. No, that's still not. Is there something wrong with the mic? You could just run over to this mic. It was working a minute ago. This is the engineer's solution to everything. Oh. Hi. Yeah, that's much better. I have a simple question. I just was wondering, what's your favorite programming language and what's your favorite text editor? So if I tell you Fortran is my favorite programming language, everybody can laugh at me. Actually, I haven't had to write very much in the last little while. That's what programmers are for. So. So the thing that I've actually gotten interested in more recently, Google has a programming language called Go which we're quite interested in. And so I'm looking for higher level languages that make it easy to describe what it is that I'm trying to get the system to do. Now I missed the last question though. What was that? Text editor. Oh, text editor. Actually, I'm a fairly heavy user of Google Docs. And the reason is not that it's particularly elaborate in terms of formatting and everything else, but the fact that more than one person can be editing the document at the same time. So I'm in the middle of writing a book about the internet for Oxford Press with a colleague in Mexico City. And he was at my house for three days and we cranked out about 60 pages of material. What was fun and exciting is that he was editing one part of the document while I was editing another, and then we could go peek and see what the other guy was doing. And then we could have conversations about it and argue whether we agreed or didn't agree on whatever was being written. 
But the part that I liked was first, we could do it at the same time to the one document, so that we didn't have this problem with multiple copies of the document and trying to merge them together. But the other thing that made it so interesting is that we continued the work after he got back to Mexico City, because we just set up a Google video conference call and joined ourselves on the same document. And we had exactly the same working arrangement that we did before. And that was very attractive. So I'm a heavy user of Google Docs. Well, I am concerned about two things now. One is that I heard Mitch is about to automate the engineering dean's job. So, and I'm not capable of doing anything else. So thanks for reminding me about that. The other one is the time check. How about we get just one more question from the audience? Well, why don't we let them ask a bunch of questions and then I'll see which ones I can answer. All right. So any questions, one answer, please. Yeah, go ahead. Go ahead. So my question to you is, so what's your take on the requirement of new hardware for the progress of artificial intelligence? Is the requirement like far out in the future? Or do we require it right now? Well, we already have a small example of that. If you look at machine learning, you can see the TensorFlow stuff is intended to reduce the amount of power that's required, reduce the resolution that's required in the computations. So the answer is, no question, there is more to be done in the hardware design. And I can hardly wait to see what the guys at Google come up with next. Hello, doctor. My name is Tosa Nogan-Jobi, and I'm a freshman here. And I'm sure you didn't know you were going to invent the internet until you probably did it. So my question is, what advice that you know now would you give yourself when you were a freshman in college? First thing I would, oh, as opposed to what advice would I give myself about the internet design? OK, that's not the question you're asking. I would have done IPv6 first, so we didn't have to do it. Except it wouldn't pass the red face test, because 128 bits of address space to do an experiment, give me a break. OK, so the answer to your question, though, about what advice would I give myself? Honestly, the most important piece of advice is read as much as you can that isn't technical. And I don't mean science fiction and stuff like that. Culturally valuable things like Rousseau, things that you won't have time to read when you get older. So that's the first point. The second one, I think I would have valued being told that in order to get anything big done, you need to be able to sell your ideas to other people. The internet didn't happen just because Bob Kahn and I did that original paper. It happened because we were able to convince people that they wanted to do what we wanted to do. That's called salesmanship. And it turns out selling your ideas is the best way imaginable to do something big. So I think understanding that and recognizing that you need help to do a big thing is very important. Yes, sir? So, your story about how long has your banana been? That story prompts my question about chatbots and the idea of the ever-progressing thing where we talk to this robot and it responds back in a more human type of manner. How long do you think it could be until it becomes a point where we won't even realize we're talking to a robot anymore instead of an actual person? This is a great question. Everybody here might know about Alan Turing.
And Alan had this test. And the idea was a human being was trying to interrogate a human and a computer. And the task was to distinguish between the two. And if after a series of exchanges, the human was unable to tell the difference between the human and the computer, then the computer had passed the Turing test. Today we have Turing test two, which I've just made up. Turing test two, a computer is interrogating a human and a computer. And after the computer has finished interrogating the two, if it can't tell the difference between the human and the computer, the computer has failed Turing test two. And the reason that's important is that we're confronted all the time, at Google and elsewhere, with people interacting with what we think are people on the internet that turn out to be bots. And so the ability to distinguish a human from a bot turns out to be really important because of the potential hazards that botnets create. And so I don't, I think that actually we will be tricked very, very quickly into mistaking a computer for a human being. We're getting awfully close to that right now. So, and the reason this is such a problem is that humans kind of want to believe. So there's a book called Alone Together by Sherry Turkle at MIT. And she talks about human interactions with humaniform robots. Even if they just have kind of cartoon faces, people project all kinds of social intelligence onto these robots that they don't have. And the result is that the human feels rejected if the computer ignores them. Of course, the computer doesn't know that. It just didn't understand what you were saying. But people react very badly to this. So I think that we're already almost in this spot where we don't carefully distinguish human beings from things that are simulated people. Oh, can I just say one other thing? If you've ever looked at a robot that's intended to look like a human being, the more realistic they are, the more creepy it gets, until they look like they're dead bodies, zombies that are talking. And it makes people super uncomfortable. So probably better to stick with very cartoon-like, simple-appearing things. Because if you try to make it too real, it's really creepy. I'll assure you, this is the real Vint Cerf. Well, how do you know? Actually, now I'm wondering, well, let's still get to the last two questions. Hi. My question is, are we living in a simulation? Mm-hmm. And the next question is, am I the architect? Mm-hmm. Which is, yeah. So the physicists are saying, yes. That if you think about the way the universe functions, they're telling us that it's very possible that this whole thing is a gigantic simulation. I hope not. But the fact is we might not be able to tell whether we are. We might not have enough information to tell whether we're in a simulation or not. So apart from the movie The Matrix and all that, there was a book called Simulacron-3, which I think was by Daniel Galouye, that was exactly about that. And in fact, it could be The Matrix was taken from that book, I don't know. So the answer is, maybe, but we can't tell. So just enjoy it. You have a less creepy question that's left? So my question is, you talked about the fragility of software and how bugs are created by human errors. So what is your take on the future of using neural networks that are trained to generate code and actually do the job of coding as opposed to humans?
So I am suspicious of this, partly because I'm not 100% clear that the software that's produced is necessarily assured; whatever algorithm is being used to generate the software may have the same kinds of weaknesses that human beings do. Partly because when we try to write an algorithm for generating the software in the first place, unless there are some fairly rigid rules, the algorithm may actually be as strong or as weak as the class of programs it is capable of writing. And so now we get into this question of what's the scope of the programs that can be written by algorithm. The same kind of argument might be made, think about a compiler for a moment. It takes a well-defined language. In theory, you should be able to write a compiler that does exactly what it's supposed to do. Every once in a while, I've encountered compilers that don't do what they're supposed to do because the writer of the compiler misunderstood the semantics of what the language meant. So I'm not sure that automatic programming will help much, although high-level languages and compilers can be quite helpful. But I'm still a believer in the kind of checking arrangements that at least some Turing Awards have been given for, in order to validate the actual produced code. So I wouldn't rely on that by itself. In the end, you're going to have to give a description to some automated programming system of what it is you want to have done. And so then the question will be, is the language of the description adequately precise that you can't make a mistake in the specification? And I don't know about you, but I've seen lots and lots of specifications that have bugs in them. And so it's just a recursive problem, just expressed in a different way. I guess that's all the time we've got. Thank you so much. I wish that we could have an infinite amount of time to engage in many more questions. There is a faculty panel on AI coming right after this. But I think this hopefully real Vint Cerf needs to go to the hopefully real, maybe it doesn't matter, Washington DC, where the three-piece suit and your wisdom and intelligence are so much appreciated. And so are they here. Thank you so much, Vint. So thank you very much. All right, so let's get started. So thank you for coming to this panel of experts on AI. We're certainly very excited to host a series of these events where we bring luminaries in the different areas under the Ideas Festival. My name, by the way, is Dimitrios Peroulis. And I represent here the College of Engineering. So I'm very delighted to have four of our colleagues in the university debate a really, really interesting question. Can we trust AI? I know we all want it. And I know we are kind of rushing to integrate it in our lives. But when decisions about safety and saving lives matter, would you trust AI? So this is a question that our distinguished panelists will have to debate. And let me start by introducing Professor Amy Reibman. Professor Reibman is with the School of Electrical and Computer Engineering. She has research interests in image and video quality assessment and video analytics. She's an IEEE fellow. And she was a distinguished lecturer for the IEEE Signal Processing Society. Amy just took her seat. Thank you. We have Professor Bill Cleveland, the Shanti Gupta Distinguished Professor of Statistics and Computer Science. He was a member of technical staff at Bell Labs, and for 12 years he was the department head.
And his research is in the areas of statistics, machine learning, data visualization, data analytics, and high performance computing. He has received the 2016 Lifetime Achievement Award for Graphics and Computing from the American Statistical Association, the first one since 2010. So thank you so much for joining us. We have Professor Jennifer Neville. She's the Miller Family Chair Associate Professor of Computer Science and Statistics at Purdue. She's leading research in the area of data mining and machine learning techniques for complex relational networks, including social, information, and physical networks. She has been recognized by IEEE as one of AI's 10 to Watch. And in 2007, she was also selected as a member of the DARPA Computer Science Study Group. So thank you very much. Our fourth panelist is going to be Professor Milind Kulkarni. Professor Kulkarni is working in the area of programming languages and compilers. And specifically, he's developing languages and compilers that support efficient programming and high performance on emerging complex architectures. He's a University Faculty Scholar, and he has received the Presidential Early Career Award for Scientists and Engineers. He's also one of the associate directors for the Center for Resilient Infrastructures, Systems, and Processes. Thank you very much. And to moderate our panel, we have Professor Anand Raghunathan with the School of Electrical and Computer Engineering. Professor Raghunathan is the associate director of the newly founded center at Purdue on brain-inspired computing. He has worked for 12 years in the industry, leading a research group at NEC Labs. In 2017, he designed the fastest petaflop single-node server for training deep neural networks with Intel. And his area is basically a new generation of computing hardware. So thank you very much, and let's welcome our moderator and panelists. Well, thank you, Dimitri. I guess we can promise to take at least one question from each of you. So I'm glad all the panelists agreed to be here. Thank you again. I know I was a bit of a nuisance when trying to recruit you for this panel. The question we are trying to address already came up a number of times in Vint's talk. And Vint and I were just joking, we need to assure the audience that we didn't make our slides while listening to his talk. And I'm sure you've heard of this ahead of time. Can we trust AI? So we're certainly proceeding, whether we like it or not, at breakneck speed towards this AI-driven world, where AI drives and controls a lot of things. More pertinently, AI is increasingly used in a range of critical applications, where it's making decisions that have huge financial impact, have impact on safety, and in the extreme case on lives. So the question is, are we ready? Can we trust AI? Can we hand the keys to the AI system, metaphorically, to make all of these critical decisions? Why not, you may ask? Why should we even be asking this question? Well, as anybody who's familiar with the field of AI knows, there are many concerns. Vint, again, alluded to some of them. We'll try to go into them in further depth and hopefully debate to what extent these are concerns or need to be concerns in this panel. So the first one is the concern of explainability. These are black box systems. They often don't produce a reason for the decision or result that they produce. And perhaps the earliest example of an AI system that was a black box with an explainability concern was back in, does anybody recognize this quote?
It's from the Hitchhiker's Guide to the Galaxy by Douglas Adams. So it's probably the earliest example of an unexplainable result from an AI system. This is the Deep Thought computer producing the answer to the ultimate question of life, the universe, and everything. And of course, we know that answer is 42. But why? OK, today's AI systems are actually not that different. So again, when a doctor is using an AI assistant to make a diagnosis, he or she doesn't always necessarily understand why the AI system is making the recommendation that it is. We need to open these up from being black boxes. This is a very active, ongoing area of research. Some of us are working on this, but it is a challenging problem. Second concern, adversarial attacks. Many of you have probably seen this image: this is a panda bear that, through the addition of a very small amount of adversarially added noise to the pixels, appears to be a gibbon to the AI system, although no human would really think of these two images as different. And it's unfortunate that Mung left, because I was actually going to put up his name here. So this is an example of research from Purdue where Mung was involved. And here, the researchers actually showed that they could look at signs of, for example, KFC, and by making very, very small changes, adding adversarial noise to these images, fool autonomous vehicles into thinking that these are stop signs or other traffic signs, which clearly they are not. So this isn't just for some fancy toy images. Researchers have shown these attacks in medical systems as well. And so there's a concern that this can entirely derail adoption of AI. Last but not the least, there's the concern about bias. So many AI systems are biased. And this can happen in many advertent and inadvertent ways. So the data used to build the AI models may implicitly carry bias. The humans who curated the data or who wrote the algorithms to train the models may themselves be biased and unwittingly transfer that bias into the AI. And that may get amplified and applied at a much larger scale. So when such biased systems are used to make decisions, then obviously that can be concerning. So the bottom line is this. If you ask any prospective users of any critical system that uses AI, and we'll come back to self-driving cars as the example, this is a survey of prospective users. And of the top eight reasons why people feel concerned to use self-driving cars, six of them really involve trust in the AI. So this is a very central issue. It can be a showstopper and really decide the trajectory that AI takes from this point onwards. So with that introduction, let's talk. So I'd like to invite the panelists in order, starting from Amy, Bill, Jennifer, and Milind, to make initial position statements on this topic. And then we'll have a round of responses to the position statements. And as I said, we certainly will be happy to take, and look forward to taking, questions from you. So I'd like to start by just telling you what my experiences are with AI and machine learning. So my work is based on machine learning for vision-based systems. And in particular, I'm interested in agricultural systems and working with an open agriculture technology system, and also on vision-based systems to improve food safety and so forth. So I thought it was worthwhile to start talking here as we ask the basic question, can we trust AI? Maybe we should take a look at the definition of trust.
So of course, I go online to my favorite sources and I look at Merriam-Webster and Wikipedia and I synthesize this definition of trust here from the two of them. So trust is defined to be the assured reliance on the character, ability, strength, and truth of something or someone, to the degree that we voluntarily abandon control over the actions chosen by that thing. And so I think in the context of this question, it's worth unpacking what this definition means, and in particular focusing on the notions of character, ability, and truth, and also the notion of abandoning control. And so in terms of truth and ability, in terms of trusting AI, I think one of the questions here is, does the system create the right answer? Is it actually answering the question that it's been designed to ask? And there are examples coming out in the news every single day, it seems. We see, I won't mention any particular companies as I talk about these examples, but we've seen examples of humans being labeled as gorillas in an image labeling system. We have an object detector that's been designed to detect horses that winds up labeling all the images from a particular horse enthusiast website as a horse, and all it's managed to do is detect the logo of that website; it has no idea what's happening in the images. And there are systems that are designed for speaker verification, and these have to be robust to sophisticated speech generation methods. And so the basic question of truth here is, is the system even doing what it was designed for? But in addition, in terms of ability, can the system deal with unexpected or difficult scenarios? We all know it's more difficult to drive in foggy conditions or wet conditions. If the system is unable to cope with that, maybe it should tell us that, hey, I'm not capable of driving under these conditions. And so in general, we need something that's gonna be robust to the typical environmental conditions like fog, low light, or glare. And we all know how to deal with those as we're driving along, even if it adds some difficulty. And right now these systems are designed for pristine, clean situations, where these situations have been curated by humans. Is it gonna be able to work as we go outside that notion? And many of the systems, as they've been trained for these clean, clear situations, are not taking into account the impairments, the difficulties that our own technology has created, like compression of the video and the fact that we're sampling at a low frame rate or at a low spatial resolution. And so the computer's sensors are not even seeing the reality of the world as we see it. And how much does the impact of those sensors affect our ability to trust what the system is actually able to process? In terms of the next topic, which would be the character, I think these get to really philosophical questions. So I think it makes sense to ask the question, for whose benefit has this AI system been designed? If we consider a loan AI that's deciding who to give loans to, are we deciding the loans based on what's best for the company to make the most money? Or are we trying to help out individuals so that they can have a better leg up in life? Are we trying to help society as a whole, or are we just trying to make somebody's pocket a little fatter?
In addition, we can also recognize that when a self-driving car actually killed a pedestrian a year or so ago, there was evidence that supported the fact that the system did exactly what it was designed to do. The system detected the object in the way, and the system was designed to ignore those alarms that went off, and so the car continued to go ahead and hit the person. So this is a question of the character of the company that designed this algorithm, that decided that it was more important not to inconvenience their test driver than it was to avoid hitting something in the middle of the road. And I think these types of questions about the character of the company that's creating these products, I think that's a worthwhile thing to consider as we go forward, when we ask the question, can we trust these things? And in the final case here, is the system created in a secure manner, or can it be hacked by a bad actor? And so is it doing what it's been designed for? And again, I would say this goes back to the question of character, because it goes back to what is the system trying to achieve here? Whose human values are being replicated in this system? And can it be hacked by somebody that has nothing to do with this company? And perhaps that's the company's problem, they didn't design the system well enough to protect it for their own interests and humanity's interests. So then the next topic is about ceding control, this notion of yielding control; part of what is required to trust something is the willingness to let go of control. And I think that in asking the question, do we trust AI, we have to ask under what conditions is a person willing to let that AI make decisions for that person? And I can envision that if I want an AI to make a reservation at a restaurant, I'm willing to trust an AI at this stage to do something like that. That seems perfectly reasonable, except maybe if I want to make a reservation for a dinner at a very crowded place, and it's a special occasion for my sweetie and I would like to make sure that this really happens. Maybe I would want to do that phone call myself. Maybe we would trust the machine to make a screening decision about whether or not to get a medical test, but we'd really like to have a human be looking at the result of that test. Are we willing to let a machine drive on a crowded highway without supervision, or would we just prefer to ride along in a golf cart type vehicle at 25 miles an hour in a closed community where there's not much going on? And would we like a machine to perform complex surgery on us, or would we prefer a human? In terms of that last one, actually sometimes the machines are much better than the humans. So in that case, I would definitely trust the machine, but in general I would say I'm not ready to trust AI in all these situations. I believe that AI these days is just at the apprenticeship level, so it's very good at some things and we can trust it for a certain set of things, but it still needs the master or the journeyman to be looking over and making sure it's making all these decisions correctly before we completely trust it. Thanks. Let's see, I use this. So I'm gonna strike a theme here and it'll be similar to Amy's, but I'm gonna phrase it differently. It's not about trust, it's about validation, okay? And it's not about trusting AI, it's about validation and do we trust the validation, okay?
So think a little bit about drugs, okay? We got pharmaceutical people producing drugs and they use all sorts of tools by the way especially in drug discovery, I'm sure, well of course this is all proprietary information, I'm sure they're using AI tools along with a whole lot of other data analytic tools and discovery tools. So do we trust that process for producing drugs? How do we know? It's the FDA that we need to say we trust because they're enforcing lots of performance testing and validating those drugs. So that's what we should be looking for. So I wanna just address that and just give you a couple examples of that sort of thing. So I'm gonna talk about AI for data science. Now here's just a bit about data science basics that the foundation is the analysis of data and there are technical areas of data science and the purpose of those technical areas is to develop analytic methods and computational environments that will improve data analysis. So we have analytic methods, statistics, machine learning, we got mathematical models for data sets, we have computational algorithms for methods and we have computational environments for data analysis. So work in all three, in all of these areas here as in recent times made big advances in our ability to be able to analyze data and the analysis of data is really in a way the foundation for much that goes on in AI. So there are, for data science there are really two kinds of automation. I'm gonna switch my term now, I'm gonna go from AI to automation. It's really taking, tasking and automating it on behalf of what it is that's going on to make things better and simpler, okay? So for data science there's internal automation and that is automation to enhance the use of analytic methods, okay? I will say a word or two about that and then there's external, which is developing analytic methods to automate a subject matter task based on data. Okay, so it's outside of the framework of data science in the sense of all the technical areas and developing things but it now becomes developing through the analysis of data, something about some other technical area. Now in both cases there needs to be validation, does it work? And validation here means application to a convincingly large sample of data sets. Okay, so, but we have another in all of this we have another entry into this whole topic and that is the subject matter expert. So in terms of the analysis of data, we got lots of tools developed by those systems and by those analytic methods. But judgments made on the part of the subject matter expert is critical to data science to the analysis of data. So we have a hybrid that you want to occur, okay? Because just blind analyses without any knowledge of what is going on, they just simply don't work particularly well. And by the way, this gets, I just wanna address this question that I saw come up here. It's, well, there's a problem here with machine learning AI systems because they can't explain what they do. Well, for decades and decades and decades, at least in the field of statistics, there has been two very different kinds of tasking. And this is well delineated and well understood. One is causal, okay? So when the goal is you need to know what causes things, then you gotta use a whole bunch of tools that are directed toward causal modeling so that you know what it is the drivers are. The other is predictive. You wanna predict something and you basically go, you know, I don't care how you do it. 
Just tell me next week if it's gonna rain or not, okay? Cause I got tickets to a football game. So those two things need to be distinguished. So if it's really causal, everybody has a right to ask the question, but if it's causal, you're gonna be using other technology. So I'm not sure, I think this is even really a question that is necessarily cogent to all of this. Anyway, okay, so what about, but what about the subject matter expert? Well, the subject matter expert supplies knowledge that's critical to the choice of methods and models, first of all. And they're diagnostic, when you fit a model to data, there are a lot of diagnostic analytic methods used to assess the fit of the model to the data. And the subject matter expert is the best to judge such fitting because you have this, say, surface. You've fitted to some response as a function of 10 variables and you use both visualization and other analytic tools to present the fit of the surface to the data. And you show that to the subject matter expert, the subject matter expert can judge whether or not the model is good enough for practice. So the expert might say, you know, there's a little bit of a problem, you know, but the data over there in that upper top, in that corner over there, but I don't care. Because we don't really care about that. Or, gee, that's the most important thing of all. The model's missing that, and so you gotta go back and do another thing. Okay, so internal, that kind of thing with a subject matter expert, that's actually going on right now, and I'm part of it, DARPA program D3M, data-driven discovery for modeling. It's built for subject matter experts who are not data scientists. Now, in this program, task one is, well, cataloging thousands of analytic methods. And by the way, they're all either written in Python or in R, although much of them call lower level languages like C, but that's the pop, that's the catalog, and you sort of, when I say catalog, I mean, you know, like a library. It's a, you know, it's a, well, it is a library. But, I mean, a book library. So you have to have a lot of information about that analytic method, okay, and what it does. Now, task two is automated model selection and fitting of the data. Okay, so through looking at what has, oh, sorry, up front there, when the subject matter expert walks in, there is, the subject matter expert specifies certain things about what goals are, okay? Task two is model selection. So there are automated methods for selecting a model class to be used and to fit that to the data, okay? So the subject matter expert is still waiting. And task three is now diagnostic methods to display the fitted model and the data, and the SME is the one who judges the fit in this D3M system. Now, I'd like to talk, so that's like, I'll call that internal. Oh, by the way, as we build this system, how are we tested? You know, how do we show that the system works? Well, what DARPA has done is they're calling in a whole bunch of subject matter experts who sit down in front of combined systems for, you know, well, they're using TA1, TA2, TA3, and then they see the results and they then rate it. They say, this was really good, the interface was wonderful, I love it, and the results were good and I'm happy with it and I was able to interact very well, or this thing is awful, I'm gonna leave here and never come back again. So, but anyway, actually we're doing quite well so far. 
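To make the flow Bill describes a little more concrete, here is a minimal sketch in that spirit: a small catalog of candidate models is fit automatically, compared on held-out data, and the diagnostics are then handed to a subject matter expert to judge. The candidate models, data, and metric here are illustrative assumptions, not the actual D3M software.

# Illustrative sketch only, loosely in the spirit of the cataloged-methods /
# automated-selection / expert-diagnostics pipeline described above.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Task 1 (very roughly): a "catalog" of analytic methods.
catalog = {"ridge": Ridge(alpha=1.0),
           "random_forest": RandomForestRegressor(n_estimators=200, random_state=0)}

# Task 2: automated model selection and fitting on held-out data.
scores = {}
for name, model in catalog.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))
best = min(scores, key=scores.get)
print("selected:", best, "held-out MAE:", round(scores[best], 2))

# Task 3: diagnostics for the subject matter expert to judge, e.g. residuals
# against the inputs -- the expert, not the system, decides if the fit is
# good enough for practice.
residuals = y_test - catalog[best].predict(X_test)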
Not perfect, there's lots of good statements being made and there's lots of suggestions being made in this testing process. Okay, so external. So, some time ago, a collection of us got together and we developed a cybersecurity application in which we developed a streaming statistical algorithm for detection of SSH keystroke packets, okay? So we have SSH connections and we wanna see whether or not there's keystrokes and the reason for that is that we wanna know whether or not this is a human that has used SSH and is on that connection or is it a file transfer? So, and what we had was eight variables and we used those eight variables to characterize the, characterize the behavior of a keystroke. I mean, this is, by the way, this is TCP at work. So it's like, you have to know all about TCP but you take the TCP protocol and you say, we have times between packets, types of packets and things like that and we were able to use those, we were able to use those patterns and detect patterns that were unique to keystroke packets and as a result of being able to detect that, we wound up with an algorithm that actually was, it was quite accurate, I forget the numbers but it was something up around 95% in both cases, 95% accuracy and saying there's no keystrokes and somewhat less, about 90%, there are keystrokes somewhere in that. And so this got implemented, one of the people in the project was Carter Bullard who has this wonderful connection level Argus monitor and we implemented it there. Okay, oh and yeah, the key thing I wanna say here is the testing, we tested over a million SSH connections. Now some of them we had very definite information about whether or not this was a human in the loop and others it was more foggy but we used a huge amount of information to look at things and determine, yeah, with very high probability, this is just a file transfer. So it was really many different kinds of testing that we carried out to verify that this was gonna be good enough to put in a system that's carrying out cyber security analyses, okay? So again, it's all about testing, it's all about verification. Great, okay, thanks Bill. I wanted to start off by saying that I work in the space of machine learning and AI and I just wanna say that I'm very thankful that we've gotten to the point where we can have a panel discussing whether you should worry about using some of our methods because 20 years ago when I started working in this area we spent a lot of time trying to convince people that what we wanted to work on was interesting at all. So I think it's really a testament to how far our field has come that we actually have methods that are being rolled out in such critical applications that we have discussions like we're talking about today. So let me, I just wanna frame how we do our, how we lead this discussion to talk about what has happened in the past in terms of the industrial revolution. And so some people say that what is happening right now is really the fourth wave of the industrial revolution and you can see here over time it's gone from the 1760s to today and really now the new industrial revolution is coming from the use of cyber physical systems and AI in a really sort of woven through every aspect of the facets of our lives. But all of these successes that have come from the various aspects of the industrial revolution have really improved our lives. So I lost my notes here. So it can recognize my face up here. 
Okay, so all as we have had these improvements in first mechanization, then mass production, then automation through computers and now with AI, you can see if you look at the overall quality of life of people throughout the world, this is some data I grabbed from a site that has a lot of data about every aspect of people's lives throughout the world. And you can see how from this starts, this plot starts from 1870, so not 1765, but you can see how the basic quality of life of people have improved over time. And this is an index that includes both how long you live, how happy you are, how healthy you are and how educated you are. And so this corresponds to improvements coming from the industrial revolution. And so now that we're at the point where we have AI embedded in our lives, starting to be embedded in our lives, I wanna point out from an optimistic perspective that it really has been improving our lives in ways that maybe you haven't realized so far. And so that includes both from a medical perspective that I think Vint referred to in his talk. So here's some research on using machine learning methods to automatically predict whether patients are going into septic shock in hospitals. And this is something where there's 700,000 people that die each year from septic shock. And it's something that's very hard to predict when it first starts to happen because the symptoms are very similar to other more benign conditions. And so machine learning methods have really started to be able to identify this much earlier than nurses and doctors would be able to and alert people to changing the protocols and saving the lives of these patients. Some work that I have worked on in the past is using machine learning methods to predict identity theft in cell phones and fraud prediction in stockbrokers. And this is something that is rolled out in many ways that you're probably experiencing in your everyday lives but you don't realize it. So this is behind many of the spam detection algorithms that keep email out of your inboxes or at least try to, maybe sometimes it keeps real email out if you want to see. But a lot of times it's saving your time to not have to deal with that spam email. It's also rolled out in your credit card companies to predict when somebody has likely to have your credit card number and be buying products on your behalf. And so these are examples of methods that were developed in the machine learning, the statistics and machine learning communities 20 years ago and now they're just seamlessly involved in our lives in ways that, based on Amy's definition of trust, I think we are letting them make these decisions and being comfortable with them. Finally, some things that you might not know about. There's methods that are being rolled out to actually detect human trafficking online by looking at patterns of behavior in the dark web that you might not even have access to or knowledge of. So that's saving people's lives. And then finally, I just wanted to end with the fact that machine learning is also making our life easier by improving aspects of our supply chain. 
So this is an article that says that Walmart, back in, I think this is 2004, 2005, I can't see the number right now, but they were already automatically predicting, when there were massive natural disasters like hurricanes, what kind of products they should ship to the Walmarts nearby ahead of time, if they knew that it was happening, so that they could actually have the supplies already there for the people that are going to need them. You might think that that would mean that they need to have more water, batteries, flashlights and so on, but it turns out they found that they really need to have beer and pop tarts being shipped there. Okay, so I just wanted to wrap up by saying that this is indicating that really AI has already been improving all of our lives, and this discussion that we're having right now is really just a function of the fact that people are distrustful of new technology in general, and this has happened all the way through the Industrial Revolution. So for example, the Luddites were very distrustful of textile machines taking away jobs from people that were craftsmen at that time, and so I think we're having the same sort of discussion now, where people are afraid that AI is going to take away the jobs and things like that. And that doesn't mean that we don't have anything to be concerned about. I think a lot of the things that have already been discussed so far this afternoon, in terms of having robust, reliable and safe systems as we roll out these AI models to make these automated decisions, are really important for us to strive towards, but it's not something that we should be afraid of. I think, to sort of restate what Bill was saying, as long as we are validating and testing the models in the right way, then we should be able to really trust that we can use them in ways that will improve our lives, in ways that maybe we can't even anticipate right now. So that's it, thank you very much. So the danger in going last in these things is that I wind up presenting a bunch of slides that recapitulate stuff you've already heard, but hopefully I can give things a slightly different spin. And so my name is Milind Kulkarni, I'm from Electrical and Computer Engineering, and I think one reason that I might give things a slightly different spin than you've heard up until now is that my research is not actually in AI. I don't do AI and ML. I'm in programming languages and software engineering and formal methods, and I live in that space of sort of designing and building software to do different kinds of things. And so let me start by just answering the question that I'm gonna pose, which is, can we trust AI? And I'm gonna answer it in a way that panel moderators always hate, which is by not really answering the question. I'm gonna say, can we trust AI? And it really depends on what you want AI to do. If you're asking AI to tell you whether the picture that your friend sent you is a picture of a Bengal cat versus a Siamese cat, maybe it's not a big deal if it's getting it wrong every now and then. If you're playing with your Google Doodle, trying to make it turn your tune into a Bach symphony, maybe it's okay if it comes out sounding like Philip Glass instead. That doesn't matter so much. But if you're trusting your AI to drive you from here to Chicago without getting into an accident, what you expect from the system changes. And what you might trust the system to do changes. And so 10 years ago, I used to be able to give Simpsons references and everybody would just get them.
Maybe people don't anymore, but this is one of the quotes from Reverend Lovejoy. The short answer is yes with an if. The longer answer is no with a but. And the reason that I come at it from a different angle is that the question that we asked in programming languages and compilers is not can we trust AI, but the broader question of can we trust software? And I promise I made this slide before Vint Cerf made exactly the same point an hour ago. So this is a problem that people in my community have been looking at for years. Can we trust software to do the things that we expect it to do? And the way that you wanna frame this question is what do you expect your software to do? I want you, the user, to tell me what you expect from your software. And then it's my job to tell you that the software is actually going to do what you expect. And that sounds like a completely simple, straightforward thing to do, right? You tell me what you expect and I'll just make sure that I give you what you expect. But how do we define what we expect from our software? This is actually an extremely hard question. It's one that we haven't solved even in the case of easy software. How do I, what does it mean to say that the software does what I expect it to do? I can write a piece of code and give you a list of expectations for that software and it will do exactly what you expect. And then you go and deploy it in a real system. It encounters a situation it's never seen before. It gets an input it's never seen before. An input you, the designer, didn't expect. And so you never told me what the software should do in that situation. So I can't help you, right? And that's even in the simple case of software that is nice and deterministic and takes a certain kind of input behaves a predictable way. When we're talking about these big inferential engines that we get in AI, this is an extremely difficult problem. And I think that that's really the question that we should be asking here is what can we expect from AI? What should we expect from AI? And once we place those expectations on AI ML systems, how can we guarantee that they do what we expect? And I don't really have a great answer about what we should expect from AI systems. It's extremely context dependent. But let's, here's some ideas that people in my field have been looking at of things that we might be able to expect from AI systems. So one example is something we call local robustness. So Anand gave this example of adding a little bit of noise to a panda and all of a sudden you decide that it's a different kind of animal entirely. If you've been following the news, there was a big story a few days ago about people putting reflective markings on pavement and convincing an autopilot system to drive in a different direction. I think that's the company that Amy was trying very hard not to name, but I'll name it, it was Tesla. Oh, okay. Right, so the question of local robustness is one where I know that the system behaves as I expect for a given kind of input. If I perturb that input slightly, will the system still do the same thing? Will it give me the same answer? Because I really expect when I say that the system should behave a certain way when it sees a road that looks a certain way when it sees an animal that looks a certain way. I don't mean exactly that configuration of pixels or exactly that road looked at from exactly that angle and exactly that lighting. I mean, a whole bunch of inputs that look kind of like this, the system should do the same thing. 
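As a rough illustration of the local robustness idea just described, here is a minimal sketch: perturb an input slightly and check that the model's answer does not change. The model object and the perturbation size are placeholders, and real verification tools reason about all perturbations in the neighborhood rather than random samples like this.

# Minimal local-robustness probe (a sketch, not a verifier).
import numpy as np

def locally_robust(model, x, epsilon=0.01, trials=1000, seed=0):
    """Return False if any sampled perturbation within +/-epsilon flips the label.

    `model` is assumed to expose a scikit-learn-style predict() method.
    """
    rng = np.random.default_rng(seed)
    base_label = model.predict(x[None, :])[0]
    for _ in range(trials):
        noise = rng.uniform(-epsilon, epsilon, size=x.shape)
        if model.predict((x + noise)[None, :])[0] != base_label:
            return False   # found a nearby input the model treats differently
    return True            # no counterexample found -- not a proof, just evidence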
Now, this is not a complete solution, because how do I know what I mean by local robustness? How do I define what it means for two images to be similar, for two inputs to be similar? Proving that local decisions are always correct doesn't really tell me anything about the end-to-end behavior of a system, but this is one thing that we could try to do. Another thing we could try to do is interpretability. I know that Bill kind of poo-pooed the idea of interpretability in his presentation, but one way to engender trust in artificial intelligence and in systems in general is to provide answers that humans can sort of reconstruct, where you can show your work, all right? You've all had the experience of being in grade school and writing an answer to an arithmetic problem that turned out to be slightly wrong, and you get no points for it, but if you show your work people are more likely to believe that you sort of know what you're talking about. And I think that there's a similar story we could tell for AI, and so there's been work in my field looking at different ways that we can take these complex neural net models that are big piles of weights that do all sorts of things and turn them into other kinds of models. For example, programs that we might have a better chance of understanding, of interpreting and saying, yeah, that looks right. The system is doing what I expect. And the last one I'll leave you with is maybe the hardest, thorniest, most difficult one, which is one of fairness, right? How can I say that the decisions that my systems are making are fair? And even defining what fairness means is an incredibly challenging problem if we want to validate or prove that our systems really are providing fairness. One possible example might be something like: if I have a group of people and I'm trying to make selections from that group, I'm not gonna be more or less likely to select people from a minority group than from the majority group if on all other dimensions they look the same. So if the only thing that distinguishes two groups of people is, say, their sexual orientation, my system shouldn't behave differently based on that one feature, right? So this might be one definition of fairness. Defining fairness is extremely tricky. But the thing that I wanna leave you with is this idea that if we can define what we expect from our systems, be they software systems or AI systems or ML systems or what have you, then we have some hope of being able to trust them, because we have some hope of being able to verify that they do what they're supposed to do. Thanks. Thank you all. I think that was a great set of opening statements. Maybe we can have a quick round of responses, if anybody has some thoughts that any of the other presentations triggered in you. If not, we'll move right on to taking questions from the audience. I wanna make sure that you have the chance to be heard. So why don't we get started? Anybody from the audience? Any questions? Yes, could you walk up to the mic, please? I think we're still being recorded, so. So thank you very, very much. So I guess just to start the questions, I'm gonna start from the very last one and kind of move. Can you describe a little bit how much progress we've made in those areas? And I think you all talked about very similar things. So if I look at, for example, the last two decades or one decade, and I'm making it a bit harder: from one to 10, where are we today? I think that's for all of you.
Man, so I can say that in the simultaneously broader and simpler problem of can we trust software, I, you know, my colleagues in formal methods might kill me for saying this, but I think we're at like a two and a half out of 10. We can do things pretty well in systems that are very well-defined, that are very small, that have predictable inputs, predictable behavior, that have deterministic behavior. None of those things describe the kinds of ML systems that we all wanna be building. And so I think we're not close. So I guess I would say, from a machine learning or AI system perspective, I don't know what number to give it, but I don't know what the scale is, but I think, like Milind was just saying, that we've had success in very narrow, well-defined areas. So something like credit card fraud, transaction-specific credit card fraud identification, we are fairly good at that. You know, identifying whether something is a spam email, well, maybe we're not so good at that because that's actually adversarial; I guess both those things are adversarial situations, where as soon as our models do well, then the adversaries adapt and start doing new behavior. But those are still very well-scoped problems, just like playing Go or playing chess are. And so I think in those cases, we have very well-defined outcomes, very well-defined actions, very well-defined data that's going to be input to those systems. And I think we're doing a great job of developing models that can predict in those environments. What I think everybody thinks about in this kind of panel is, how is that gonna work in a self-driving car? Or how is that gonna work in a system like Vint described, where, like a human, it learns sort of autonomously over time, learning new concepts and new objects and new ways of acting in the environment? I think we're very far from that. And combined with that, we don't know how, if we can't make the software work when it's not an AI system, maybe we should be afraid of what AI software is going to do. I guess I'm on here. Can you hear me? Okay. I think in a certain sense, we've made enormous progress. Number one, there really have been these huge events that occur, and they get most of the publicity. It's sort of like one-off achievements, like AlphaZero, and don't forget Kasparov, who was easily the greatest chess player in the world at the time, lost the chess match with a computer built by IBM. So that was a major achievement. Now, it is true that Kasparov got spooked and sort of fell apart because he really thought that IBM was cheating, and he was completely unnerved. So a computer that's not capable of being spooked beat a spooked world champion. Okay, so maybe it's a good example though. So we've had a lot of things of that sort. Now, but those are sort of one-offs. Day to day, we've made huge advances in our ability to be able to analyze larger and larger data sets. I mean, the big data mania that broke out around 2010 to 12 really did make a lot of accomplishments, and we can analyze, through the development of computational environments for data analysis. And we use them routinely now, like Hadoop and Spark, and now there's a new one, Dask. So we can analyze much better, much bigger data sets than we could 10 years ago. Over 10 years there has been all that progress, and the magic of it is that we've been able to compute in parallel.
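As a small illustration of the "compute in parallel" point just made, here is a minimal sketch: split a large data set into partitions, compute a partial statistic on each partition in separate processes, and combine the results. Frameworks like Hadoop, Spark, or Dask do this at cluster scale; this is only the standard-library version of the same idea, with made-up data.

# Partitioned, parallel computation of a simple statistic (illustrative only).
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def partial_stats(chunk):
    # Each worker returns just enough to recover a global mean.
    return chunk.sum(), len(chunk)

if __name__ == "__main__":
    data = np.random.default_rng(0).normal(size=10_000_000)
    chunks = np.array_split(data, 8)          # one partition per worker
    with ProcessPoolExecutor(max_workers=8) as pool:
        parts = list(pool.map(partial_stats, chunks))
    total, count = map(sum, zip(*parts))
    print("parallel mean:", total / count)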
Now before that, we were having to deal with big data by saying, okay, I gotta make my analytic methods run very, very fast and they have to use a small amount of memory and you had to think about each individual analytic method but the great thing about the computational environments is that you really now can achieve parallel computation and that actually gives you dramatically higher performance than trying to develop algorithms that will provide better computational performance. So in the space of video processing and processing of visual information, I'd say in the last 10 years it's really exploded and that's in part due to the success of deep networks and AlexNet and so forth and that ImageNet object recognition. But in reality, I think that we're, again, that's very much in its infancy because the way we process visual information is so much more sophisticated. We have more sophisticated models. It's the way we being humans process it is just so much more sophisticated than the machines have right now and already it takes huge amounts of machine power to process that now. So I would say we're two and a half sounds like a good number. There's a lot more that needs to be done to take this into account and one of the things that's really interesting to me is how when I spoke to a machine learning expert a month or so ago and I described my goal of trying to mimic a farmer who's driving a combine to harvest the wheat or the corn and I described this situation and I said, well, how can I design a reinforcement learning algorithm? How do I design my reward system? How do I design this? And he says, oh, it's really easy. You just build a simulator of your combine. And I thought, wow, that doesn't sound easy. So if that's where the state of the art right now is how do we take real world problems and translate them down into the case where we can start to use these techniques that the machine learners are creating. I think they're really powerful techniques but we're still a long way to translating them into reality. Yeah, maybe let me just add for context in case you don't know people have been working on game playing AI systems for 50 or 60 years, right? So it's not like overnight that people wrote a program to be a go master. It's been a long time coming and how long have people been working on self-driving cars? 10 years maybe? And also just some history in the field of AI, we have typically underestimated how hard a problem is. So people thought that you could solve the computer vision as a summer project that grad students could do as an internship. And that also was 60 years ago, right? And so we have made a lot of progress but at the same time, we always think things are coming faster than they really are. Yeah, you know, the term artificial intelligence and its visibility, both among technical people like us and the general population is sort of like the old faithful geyser. I mean, it just spouts up and then everybody's talking about AI and then it's sort of, the first time I guess was Carnegie Mellon when they started right back in, anybody know the dates, 70s, saying we're gonna do, yeah, we're doing artificial intelligence and that's going to let it work. Then it went, well, didn't work out too well. Ken Thompson came along. They had chess programs. Ken Thompson came along, said this is an algorithm, my computational problem, chess, I forget it, all this stuff of trying to copy how a grandmaster thinks that's silly. I'm gonna, I can do better. And he did. 
He just beat all the other systems with algorithms, and he actually built custom hardware to do that. Anyway, so then next was expert systems. And actually some of those worked pretty well, but again it sort of declined, and, well, at least people stopped using the term, and I think a lot of them didn't work out as easily as everybody thought they were going to. But this stuff seems more real to me, having gone through these geysers. I think it seems a lot more real. But again, it's about algorithms and analyzing data. Just to be a little cranky about that, how many people in the middle of wave three or four thought, hey, this seems real this time? A lot of my friends from grad school at the time now have jobs doing other types of software engineering because they could not get jobs. Okay, thanks. Oh yeah, yeah, I'd like to mention, yeah, Ken Thompson also did his Turing Award lecture talking about Reflections on Trusting Trust. You can't even trust your compilers. But what I wanted to talk about is, right now you guys are, a lot of times you're funded by the state, and you have all these states, China, Russia, and the United States governments, funding an awful lot of research, and you wonder what is the intention? What are the interests of the state? Are they to help with our economy and help the common good of my country versus this other country? Or is it trying to effect social control? So if you think about what China's doing with their social credit system, I can imagine that being a real target for artificial intelligence, where you have this gentle form of social control. So if you're not paying your parking tickets, you're doing credit card fraud, then you can't travel, you can't get a job, you can't do anything else. And I'm wondering what your thoughts are on the trade-off between this power of social control versus your research efforts. And is there a conflict? I'd like to say a word about that. Competing nations, okay. Number one, there is absolutely no doubt but that the United States government is very concerned about AI for military systems. I know this in part because I was on a panel to develop a plan for the future, to 2030 actually, for the Air Force to inject much more data science into its, well, its environment. Let's put it that way. Let's put it into the Air Force technical people. And there were statements about competition with other nations, and there's no question but that the military is watching carefully and wants to stay out in front. And of course there are a lot of security things that come up with foreign nations, and then there are economic matters that come up. So there is no question but that this stuff, AI, is deemed very important by the United States government, and clearly the Chinese government too, because they're working very hard on developing AI. And by the way, Wen-wen Tung, who was down there, and I are headed to China to work with weather people, but it's not nearly so competitive, you know. It's like they want to predict monsoons, so we're headed there under détente, let's say. So, I guess I can answer that a little bit. I guess I would point out that there is a difference between the research on the methods and the policy of how to roll out those methods.
And you may be concerned about government funding of this kind of research, and I think that concern is valid and should be appreciated, but companies are also doing this research, and they're rolling things out with other kinds of applications in mind. My historical story about this is that when I was in grad school I worked on a program from DARPA called Total Information Awareness, and it got shut down by senators who had problems with the implication of the government collecting data on individuals and using it against US citizens to modify their behavior. So the government stopped doing research on that, but who kept doing the research? Well, Facebook kept doing the research, and other companies, maybe I shouldn't name them because of the speaker right before us, but Amazon is doing the research, and those companies have basically rolled out the very things people were afraid the government was doing. So it's not really just a government issue. People will keep doing research on these methods, and trying to stop the research is worse: you'll just have companies, or countries like China, overtaking us. So really we should be doing the research, but thinking very carefully about the motivations when the methods are rolled out and how they might affect society.

Yeah, I think that's exactly right, which is that it's really important to distinguish between the tools that we're building and what people choose to use the tools for, and I'd want to be very careful about overcorrecting into a world where we say that because the research we're doing has the potential to be used for bad things, we should stop doing it entirely. Maybe I'm talking my own book there, but...

I just want to add that in my case I feel like I have the ability to choose topics that can actually help the entire world, and so if I'm looking at sustainability of food production or improving the safety of the food processing chain, then I think that's something that can help everybody and maybe is a lot harder to weaponize or use to make people's lives worse. So that's the direction I'm trying to push my efforts.

I think there was a question there. Did you have a question as well? She was waiting from before, so if you could...

So I just had a question about active learning. What kind of parameters are we looking at when we're trying to execute a competent algorithm for that? Are we trying to model it on how humans learn, or how do we decide what's the next node that should be learned if you're trying to do active learning?

I guess maybe I should take that. Let me just say what active learning is, for people in the audience who don't know. Active learning is a setup in machine learning where you're simultaneously trying to learn the model without enough data, and so you learn what data to gather in order to increase the speed of your learning. There's actually a whole host of different ways you might gather data: you might want to get new instances, you might want to get labels for instances, you might want to get new features. In all cases, there's a cost associated with every action or query you could perform, and we try to take that into account. An example is that in healthcare it might be very costly to run a certain test on a patient, right? So you want to assess whether the value of the information you would get from that test result is really going to help you make a diagnosis in the end.
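To make that cost-aware active learning setup concrete, here is a minimal illustrative sketch, not anything the panelists described building: pool-based uncertainty sampling where each unlabeled point carries an acquisition cost, a budget limits how many "tests" we can pay for, and the point offering the most model uncertainty per unit cost is queried next. The dataset, the cost model, and the budget are hypothetical stand-ins.

```python
# Hedged sketch: cost-sensitive pool-based active learning via uncertainty sampling.
# All data, costs, and the budget are synthetic assumptions for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
costs = rng.uniform(1.0, 5.0, size=len(X))       # e.g. cheap vs. expensive lab tests

# Seed set containing both classes so the first model can be fit.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in labeled]
budget = 100.0
model = LogisticRegression(max_iter=1000)

while budget > 0 and pool:
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    uncertainty = 1.0 - proba.max(axis=1)        # high where the model is unsure
    value = uncertainty / costs[pool]            # information gained per unit cost
    pick = pool[int(np.argmax(value))]
    if costs[pick] > budget:
        break
    budget -= costs[pick]                        # "pay" for the label (the test)
    labeled.append(pick)                         # oracle reveals y[pick]
    pool.remove(pick)

print("labels acquired beyond the seed set:", len(labeled) - 10)
```

Dividing uncertainty by cost is only one simple acquisition rule; a system closer to the diagnosis example would estimate the expected value of the information for the downstream decision before paying for the test.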
And so your question was, how should we think about when to acquire new information, or what information to acquire?

So, the when and the how. What kind of models are we using to decide what node should be learned next? And like you said, there's also the factor of when it's worth it or not, but also how we do it and how we can trust that method.

Yeah, so I think your question of exactly how we should frame that shows the gap between what people might want to do and what our algorithms actually do. What our algorithms actually do is this: a priori, somebody specifies exactly what the actions are and what they cost, and then we optimize when to gather that information, and that differs across scenarios. The larger goal would be to move to more continuous learning, like humans do, where we don't necessarily say to ourselves, should I take a step forward right now, or should I turn to the right? We have just automatically internalized when to take certain actions and how to gather data from them. That's a much more open-ended learning environment, where there are many different things we could do and many kinds of information we could gather. We're nowhere near that kind of formulation, so researchers right now formulate active learning differently in different settings. For example, something I've worked on is learning in graphs. You might want to predict the topics of web pages, but you have to go gather the data; we just don't have access to the whole web, even if you're Google, even if you're Microsoft, you have to actually crawl the web to see what's there. So the query becomes, should I follow this hyperlink to see what page is on the other side and find out its topic? We frame that as an active learning problem where we're simultaneously deciding where to crawl and learning to make predictions at the same time. That's a very narrowly scoped problem, where we only have certain actions available to us, and then we can look at the data in hindsight, after we've accessed some of it, and learn whether we should have taken those actions. So I'm not sure if that helps exactly with respect to your question.

No, it helps a lot.

I think there was one from Nikhil as well, go ahead.

My question is about the possibility of AI. Given a task, humans are great at adapting to situations that the task might throw at them. How much of a challenge is this going to be, or is it even possible, for systems that expect precise specifications?

I think it's worthwhile to consider putting that into the design of your system, so that it has some self-checking involved. It says, hey, wait a second, I don't know what to do; maybe that's the point where you query an expert, or ask for more data, or go get help. I think systems can be designed to understand their own capabilities, and where they're not very strong, and to be able to address that situation.

Yeah, actually that's a very good question, because one of the things you can do up front, and you have to be a subject matter expert, is go and actually run designed experiments on systems.
Actually, that needs a lot more work, because with a lot of these parallel, distributed systems that I just mentioned, doing that work is very challenging, and so not too much is being done. But you can run experiments, just as you run experiments in medicine; you can run experiments with these systems and change, for example, configurations. The systems need configurations; we're talking about computer systems, right, that do things for people, and you need to configure those systems to conform to the tasking. You can run experiments to get information about how to do that in general. Now, making changes in real time, that is, when a problem arises, say you run out of memory, what should happen? Can the system adjust? In some cases the system can; in other cases, you abort. So it's a tough one. It deserves work. That'd be a good thesis topic, by the way.

Any other questions from the audience? Go ahead.

Assuming you have a general artificial intelligence system, would you ever be able to trust the system for a problem where there isn't a subject matter expert, or an open-problem sort of situation?

I mean, define general artificial intelligence. I trust a lot of people who are general intelligences to do things for me, right, even if I haven't specified precisely what I want them to do. So I guess it's a question of what you mean when you say you have a general artificial intelligence.

A system that might be able to produce some solution for what the best course of action for a problem is, or maybe how to solve a problem that we don't have a solution for yet, or don't know the best solution for.

Well, in most of the cases we have here, you're working on a specific thing that you want to achieve. When you do that there can be some amount of generality, but by and large there isn't, especially with AI. If it's just analyzing data and understanding it, then you can start thinking about general systems, because there are a lot of tools that will work across many different subject matters, okay? So we do have general systems for data analysis: we have Python with its libraries, and we have R with its libraries, and you build that system and it will handle a lot of different kinds of problems. But these AI problems get delicate and challenging, and usually you have to focus on the specific problem. I think Jennifer just said this: there probably aren't things that you would call general systems for AI.

I guess I would say that if you're talking about the future, when we've gotten to that level or achieved something we call general AI, I would trust it, because if we'd actually gotten to that level, then we should have encoded in the system the things that allow us to trust the other intelligent systems that Milan is referring to, right? We have people that we trust to make decisions. But if you think about how we feel about people and how they make decisions, not everybody will make the same decision. So it's not always clear there's going to be a deterministic decision-making process for which there's always a single right answer, such that we can decide whether the agent has produced that answer or not. We are going to have to learn how to infuse the things that we currently instill in our human agents through social norms, laws, and policies, so that they behave in ways we think are acceptable to our society.
We're going to have to somehow encode that in an algorithm to put into these general AI agents. Once we have done that, then I think we will be able to trust them; the question is whether we know enough about ourselves to even put that into some form of algorithm. That's my worry: we are not really introspective enough to know how we ourselves make decisions and how we value trade-offs in particular scenarios. The example I always like to give is that we send 16-year-olds out driving cars once they have a license. Do we take them through all the same situations that we're taking self-driving cars through? Like, what if you're driving down a road and it's cloudy and your grandmother's at the side of the road, but then there's oil on the road and you slip? This is something we don't teach them explicitly. We teach them general principles, and we hope that in the moment they're going to figure out how to act and behave in the right way, or at least in a way that kills the fewest people. And so we have to figure out how to train our algorithms in the same way.

Now, this is when I make my pitch for interpretability again, right? Part of the reason that we trust other people to make decisions is that when they make a decision we don't understand, we can ask them to explain why, and we rely on them to be able to teach us why they made that decision. The amount that you trust somebody, I think, is to a large extent related to how much you trust them to be able to explain why they're doing what they're doing, right? We've all had the reaction of, oh my God, why did you do that? That's irrational. And you immediately stop trusting what that person is doing.

Thank you.

By the way, I just want to clarify: I didn't poo-poo causal things. I was being a little bit flip. I just said there are times when it doesn't matter whether it's causal. The person who needs the service just doesn't care how you did it; they want it to be accurate, okay? And sometimes the explanation is, because I'm your mother and I said so. In the last few years, I've started deploying that one more. And by the way, there aren't bad algorithms and bad analytic methods; there are bad people. Okay, let's be clear about that. And it's not our job, because we can't; we're not going to stop bad people. I mean, if you listen to what was said, a lot of the things developed for the internet just got turned around. And, you know, I would have liked to ask Vint Cerf about this, but I'll let the students come up and ask. When the internet was first developed, its usage was among technical organizations and companies and universities, and that was it; that was the population using it. And if you ever sent out anything commercial, then there was a spam attack, only it came from people all over the place who had seen it: oh, you went commercial, I'm going to get you, I'm sending you 50 emails. So, anyway, the internet was not built for security when it first came out, and a lot of those protocols that are still running weren't built for security. So the bad actors are having a wonderful time.

Go ahead.

I have just one other question or comment. You brought up the notion of cause.
And, of course, I follow on Twitter this ongoing debate about the causal approach from Judea Pearl versus the statistical version of cause. And I know that, for me, the causes of things are very, very fuzzy a lot of the time, and in statistics there are confounders. For a lot of things we have superstitious behavior; I mean, how many times have any of us rebooted our machine because it was behaving badly, and now it behaves well?

I'm just going to say that's an engineering principle.

So, the question is, how much are these future general intelligence systems going to have to behave heuristically, because they're not going to have any data, versus probabilistically?

Yeah. By the way, I didn't say that causal learning was easy. It's very, very challenging; in many ways it's much harder than predictive learning. So, it's tough stuff.

I guess I would say that in the field of AI there are actually two types of AI that we talk about: AI that behaves rationally, and AI that behaves like humans. When people use the term generalized AI, they really mean AI that's going to behave like humans. And humans are not rational, and probabilistic models are very unlikely to be able to fully specify and follow the rules by which we behave. So I think if you want to make machines behave like humans, we will probably have to have heuristics to model the kinds of bad or irrational snap decisions that we make. Up until the last few years, most of the AI research community has focused primarily on producing rational agents, because those are the things we can quantify mathematically and we can be more sure about what the results of the algorithms will be. So as we move toward more causal reasoning, if we're doing that in the realm of how humans make decisions, it will be interesting to see how those two ideas come together. In behavioral economics, people have been looking at this for a long time, and there is a difference between the people who take an information-theoretic, rational-behavior perspective and the people modeling actual human decisions. So it's a very interesting question.

Let me throw in one more thing. Remember, we've got subject matter experts who come to the table, and they've got data, but they've also got a model, especially people in the physical sciences. So that's where you start: you start with a causal model that they've brought to the table. And that's a lot of fun, actually; it's easier than trying to do it when you've got no model at all, where you know the things that are likely to be important, but you don't know their relative importance, how much they interact, and so on. That's the tough situation.

Okay, as we head towards the closing part of this panel, I'd like to invite the panelists to have a final word or final few words. Obviously we've dealt with a range of questions, some longer-term, some more challenging, some more specific. Maybe I could invite you all to answer: 50 years from now, that's a millennium on the AI timescale, I guess, or feel free to pick a different timescale you feel is more appropriate.
Looking backwards, if you were to gather again, would we say, looking specifically at the problems we brought up, robustness, adversarial attacks, explainability, and so on, that these went away, AI systems are deployed in critical applications, and we solved them? That's one extreme. The other is, well, this really became a showstopper that limited the use of AI. Where do you think the reality is going to be? Gaze into the crystal ball and let us know.

I do think there will be a lot of advances in visual processing using machine learning, improved sensors, and improved representations of the visual space, so I think we can anticipate great strides in that area 50 years from now.

Robustness is an attackable problem. There has been a huge amount of work done in statistics on making statistical methods robust. Everybody talks about least squares, oh, that's wonderful, but throw in 5% of your data as outliers and it will blow up, okay? Yet there are methods out there for when you know that's a possibility and you want to protect yourself; they're called robust methods, robust analytic methods. So I would think that's an attackable problem now for, well, deep learning.

Okay, I think that we are going to make vast strides from the technology perspective and the algorithm perspective. I think companies will continue to see the benefit of rolling out these kinds of methods and models, and researchers will continue to work on them because they're very interesting technical problems. But I think we could end up in the Wild Wild West, because laws and policy will not catch up. Just as we've had issues in the past, and maybe currently, deciding whether corporations are people and how we should regulate them, we're going to have the same problem with respect to systems that are making automatic decisions. I think there is not enough focus on the legal and social implications of these things, and that is really what I'm most afraid of. But if legal scholars and philosophers can help define what we would want out of these systems, I think the technologists can build systems that live up to those standards. We just need help actually defining what the goals should be.

Yeah, basically what she said. Because, you know, I think that we can define the technical problems that we need to solve well, and what we as researchers are really good at is that if you give us a technical problem that is actually solvable, we will eventually figure out how to solve it. So robustness and all these other things are attackable. But, you know, we're not very good at...
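As a small illustrative aside on the robustness point raised above, about least squares blowing up when a few percent of the data are outliers, the sketch below contrasts ordinary least squares with a Huber-loss regressor on synthetic data where roughly 5% of the responses are grossly corrupted. The data, true slope, and outlier fraction are assumptions chosen only to make the contrast visible, not anything from the panel.

```python
# Hedged sketch: ordinary least squares vs. a robust (Huber-loss) regressor
# when ~5% of the targets are corrupted. Synthetic data for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression, HuberRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=0.3, size=200)   # true slope = 2

# Corrupt about 5% of the responses with gross outliers.
outliers = rng.choice(200, size=10, replace=False)
y[outliers] += rng.normal(loc=30.0, scale=5.0, size=10)

ols = LinearRegression().fit(X, y)
huber = HuberRegressor().fit(X, y)

print("OLS slope:  ", round(ols.coef_[0], 2))    # pulled away from 2 by the outliers
print("Huber slope:", round(huber.coef_[0], 2))  # stays close to 2
```

The Huber loss down-weights large residuals, so the fitted slope stays near the true value, while ordinary least squares, which squares every residual, is dragged toward the corrupted points.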