Certainly. Hey, welcome everyone. Hello. My name is Bill Crossley. I'm the J. William Uhrig and Anastasia Vournas Head of Aeronautics and Astronautics here at Purdue, and I'm really pleased to help welcome everyone to this virtual Purdue Engineering Distinguished Lecture Series event today. We use this series to bring in thought leaders in engineering from across the country and across the world. This is an interesting one: it's a jointly hosted talk with my colleagues in Electrical and Computer Engineering and here in Aeronautics and Astronautics. The catalyst for this is the relatively new ICON Center, for Innovation in Control, Optimization and Networks, which involves a lot of folks across engineering and is led by colleagues in ECE and AAE. So this is a neat opportunity. With that background, I'm also really pleased to introduce Mung Chiang. Mung is the John A. Edwardson Dean of the College of Engineering here at Purdue. He's also the Roscoe H. George Distinguished Professor of Electrical and Computer Engineering. So Mung, why don't you come on and introduce our speaker. Thank you very much, Bill, and welcome to the January 2021 edition of the Purdue Engineering Distinguished Lecture Series. As we pursue the pinnacle of excellence at scale in our dissemination, discovery, and translation of knowledge at Purdue Engineering, we were delighted to create this college-level distinguished lecture series, whereby we bring in about eight simply outstanding colleagues from academia and industry research labs to come to Purdue. In this case, virtually coming to Purdue Engineering today is our outstanding speaker, Professor Claire Tomlin, who is the Charles A. Desoer Chair in Engineering at UC Berkeley, where she earned her PhD degree in the 90s before going to Stanford Aero/Astro and then returning to Berkeley in 2005.
And Professor Claire Tomlin works on a number of topics that my former co-advisor at Stanford, Stephen Boyd, also worked on, so I've had the chance to marvel at and admire Professor Tomlin's outstanding work for many years now, including hybrid system control and integrating machine learning methods with control-theoretic methods. Indeed, Professor Tomlin was awarded the Donald P. Eckman Award of the American Automatic Control Council, and she's also a member of the National Academy of Engineering and the American Academy of Arts and Sciences. I recall a very lively intellectual conversation, maybe in Stephen Boyd's home, maybe a decade ago by now, talking about what we now think of as very obvious choices in automatic control, but which back then was very innovative, pioneering work that Claire was already conducting. It is a deep pleasure to welcome Claire, virtually, to Purdue Engineering, and I hope we will be able to host you in person here as soon as conditions permit. I also want to thank not only the AeroAstro and ECE schools here, but also ICON, as Bill just highlighted. This is one of the new centers here at Purdue Engineering, formed within the Purdue Engineering initiative in autonomous and connected systems. It's a very exciting new endeavor, and I want to thank the leadership team of ICON as well. And with that, welcome again to the first distinguished lecture of the new year, and to the panel to follow at four o'clock here at Purdue Engineering. Thank you, and back over to Bill. Great, and I think my job then is to say: Claire, welcome, and we'll let you go ahead and get started. So pleased to have you here. Yeah, thank you so much. Thanks to Dean Chiang, and thanks to Professor and Head Crossley for that really nice introduction. Dean Chiang, in fact, I think it was more like two decades ago.
Time passes really quickly. You were one of the first people I met when I joined Stanford as an assistant professor after graduating from Berkeley, and it was in Stephen Boyd's house; we had a great conversation. I also want to thank Professor Shreyas Sundaram and Professor Shaoshuai Mou for contacting me and inviting me to give the seminar. It's a great pleasure and honor, and I would love to take you up on your offer to come visit once the pandemic is over. Today, I'm going to talk about safe learning. We have been working for a few years, and we continue these projects still today, with NASA Ames on research problems related to automating some functionality of the air traffic control system. So this is a safety-critical system. This is actually a snapshot; down in the left-hand side is basically data over a few hours of high-altitude aircraft flying over the western part of the United States (the movie just ended; the aircraft didn't actually disappear). One of the questions that NASA and the FAA have been very interested in is how to remove some of the cognitive load from air traffic controllers by using the autopilots, the control functionality onboard the aircraft, as well as the sensing technology onboard the aircraft, to automate functions like collision avoidance between aircraft and maintaining the separation between aircraft. We've shown that stylistically down in the bottom right as a virtual hockey puck, which specifies a five-mile separation between each aircraft laterally and a thousand feet vertically. So how could one, in such a complex networked control system, define, develop, and verify control systems which are guaranteed to keep the aircraft separated from each other? That was a motivating problem, and you'll see vestiges of it through some of the examples that I'll show today.
And coming to today, or to recent years, maybe pre-pandemic, when we were doing a lot of experiments (we're still doing experiments, but in a more limited way this past year): these are just two of the companies that students in my group have interned at over the past couple of years. Skydio is basically a flying camera, or really a flying set of many cameras, on a quadrotor platform, which allows you to have a camera that's following you around during sporting events, things like that; a very popular system. On the right is Nuro. This is a small delivery vehicle; people can't fit in. It's fully autonomous, and it's really just for package delivery, but it's driving on our roads. Both of these systems are perceiving the environment and reacting to things going on in the environment. Okay, so with those examples in mind, today's talk has two components. The first component is safety-based control: how might one think about designing control systems such that, within the limits of the information that we have about the systems, one can verify the safety of those control schemes? We've developed a tool called reachability, and I'm going to talk about that for the first part of the talk. Then in the second part of the talk, we'll start integrating machine learning into these reachability computations, to try to understand how one might use these kinds of model-based schemes, and the kind of bookkeeping of safety-based reachability analysis, to analyze and hopefully at some point verify the safety of machine learning in control. Okay, so at a high level, this is a stylistic picture of a state space. I'm going to use my cursor here, so you should see a little arrow on the screen moving around. This represents the state space of the system, and I've shown a model here; this is just a differential equation model. We have a set of unsafe states.
These are states that you don't want the system to get into, shown as the red ellipse here, which we've labeled G at time zero. This could represent two aircraft, in a set of aircraft, losing separation with each other. The idea is to compute, under the dynamics of that system, the set of states which could enter that unsafe set G(0) within a time horizon, regardless of the best possible control action. So despite the best possible actuation that you could apply, the system dynamics, maybe with some disturbances d in the system, could push the system into that unsafe condition. Then you want to label those states unsafe and stay out of them. That's the perspective; it's kind of a worst-case perspective. We call that the backwards reachable set under the dynamics of the system. And if you have a control action and a disturbance action, you've got this nice control system there: you can think about the control as doing its best, if you're trying to stay safe, to keep the system out of that unsafe set, while the disturbance, you typically don't know what it's going to do. So one perspective is to model that disturbance as doing its worst to cause you to get into that unsafe set. And so we use this game-theoretic formulation where the disturbance is pitted against the control, and we compute the reachable set for that best control, worst disturbance action. Okay. What we've shown in earlier work, work that follows on from quite early work done in discrete time by Dimitri Bertsekas, in work with Shankar Sastry and John Lygeros, and with Ian Mitchell and Alex Bayen, who were two PhD students that worked with me back at Stanford, is that that backwards reachable set is the sub-zero level set of a certain Hamilton-Jacobi function J, which solves this partial differential equation. This is a modified Hamilton-Jacobi equation, a PDE.
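For reference, one common way to write that modified equation (a sketch following the Mitchell-style level set formulation; here l(x) denotes a function whose sub-zero level set is the unsafe set G(0), and sign and time conventions vary across papers):

```latex
\frac{\partial J}{\partial t}(x,t)
  + \min\!\left\{\, 0,\;
      \max_{u \in \mathcal{U}} \, \min_{d \in \mathcal{D}}
      \nabla_x J(x,t) \cdot f(x,u,d) \right\} = 0,
\qquad J(x,0) = l(x),
```

solved backwards in time over [-T, 0], with the backwards reachable set over horizon T given by {x : J(x,-T) <= 0}.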
It's something that we're used to in control systems. It's modified because it's got this min with zero on the right-hand side that kind of freezes actions: if the system goes unsafe at one point, it's got to stay unsafe for the rest of the time horizon. So a lot of our early research was focused on methods to effectively solve this Hamilton-Jacobi equation for different kinds of problems, typically in high dimensions, because for a multi-agent control system the state dimension is the concatenated state of all the agents in the system. Typically one solves these Hamilton-Jacobi equations numerically, which requires gridding up the state space, and so the computation is exponential in the dimension of the continuous state. Okay, so here's the sort of stylistic view of that computation. What we're solving for is a function J(x,t), a Hamilton-Jacobi function. The point of view here is quite interesting: you're representing a set as the sub-zero level set of a function J, and that allows you quite a flexible representation of sets. You can represent non-convex sets; you can represent disconnected sets. Here I've tried to show a kind of side view, where we have a slice of J at the initial time: here's the set, and here's its function J at the initial time. Then, when that Hamilton-Jacobi-Bellman equation propagates the function J, that slice basically changes over time; here's a snapshot of it after a particular time horizon. Okay, so the sub-zero level set of this function J represents all states for which, for all possible control actions, there's a disturbance that could push the system into that unsafe condition. Okay, just a couple more points about it before we go on. Suppose this set is what we call controlled invariant for all t; that means that you've done this computation backwards over a particular time horizon.
And then that set doesn't grow anymore; the dynamics are such that it just stops growing. So it's controlled invariant, and then you have the result that any super-zero level set of that function is also controlled invariant, and may be used for safety. What that means is that if you can stay outside of a larger set, then you're guaranteed to stay outside of the unsafe set itself. So that's going to be useful in a little bit. The other point is that the more you know about the system, in particular the more you know about the disturbance actions, the kind of unknown behavior of that system, the less conservative that set becomes. These disturbances could be environmental effects like wind on the system, or they could be actions of other players in the system. For example, in collision avoidance, your ownship is taking a maneuver and you've got other vehicles around you that you're trying to protect against; they're not typically trying to cause a collision, but they may do something which causes a collision inadvertently. So the more you know about the actions of the other players, the less conservative that set becomes. Okay, so as I said, this was early work in our group, and we've been continuing research on computing these Hamilton-Jacobi functions in high dimensions. This is just a slide which illustrates a numerical solution from an early example that we did, but just a few points about this. What we're solving for, as I've said before, is this Hamilton-Jacobi function J(x,t); we call that a level set function. And in our solution we're employing level set techniques, which are numerical techniques developed in the applied math community by Stan Osher at UCLA and James Sethian at Berkeley. So we employ these tools, applying them to these new problems of computing level set functions for Hamilton-Jacobi equations.
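As a toy illustration of this implicit-surface bookkeeping (my own minimal sketch, not the level set toolbox itself): represent a set by a function that is negative inside and positive outside, and combine sets with pointwise min and max.

```python
import numpy as np

# Signed distance to a disk: negative inside, zero on the boundary, positive outside.
def disk(cx, cy, r):
    return lambda x, y: np.hypot(x - cx, y - cy) - r

# Standard implicit-surface identities for level set functions:
# union -> pointwise min, intersection -> pointwise max, complement -> negation.
union        = lambda f, g: (lambda x, y: np.minimum(f(x, y), g(x, y)))
intersection = lambda f, g: (lambda x, y: np.maximum(f(x, y), g(x, y)))

A = disk(0.0, 0.0, 1.0)
B = disk(3.0, 0.0, 1.0)
both = union(A, B)          # a disconnected, non-convex set: two separate disks

# Membership testing is just a sign check on the level set function.
print(both(0.0, 0.0))   # inside disk A  -> negative
print(both(3.0, 0.0))   # inside disk B  -> negative
print(both(1.5, 0.0))   # between disks  -> positive
```

This is exactly the flexibility mentioned above: a single scalar function encodes a non-convex, disconnected set, its boundary is the zero level set, and the magnitude of the function gives distance to the boundary.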
So you get some nice things out of this, not only the set representation, which can represent non-convex, disconnected sets, but a few other things. You get that the boundary of the region is defined implicitly by the zero level set of that function; that's why I keep saying the sub-zero level sets represent the reachable set. You get that the absolute value of J, a scalar, is the distance from x to the boundary at any point in time, and that's useful when you think about how you're going to use this in your automatic control scheme. And the function J is negative inside, in the sub-zero level set, and positive outside; that's represented here. I've also given two links here. The first link is to Ian Mitchell's toolbox; he's now a professor, in fact now the chair of computer science at the University of British Columbia. He developed a level set toolbox based on this work, and you can download it. It's really well documented; you can pull out his examples and put in your own to do these computations. Over the past few years, we've been extending this toolbox and developing a kind of wrapper around it which makes it really easy to use. We have that on our GitHub page, which is also linked there. And we've also got links to a C++ implementation of the toolbox, and then a parallelized version of that, which really allows doing the numerical computation of the Hamilton-Jacobi equation in higher dimensions. Okay. So, example one. This was at Stanford, at Rogan Field, back when you couldn't buy a quadrotor on every street corner. These are four quadrotors, and we built them ourselves. Four students are joysticking the quadrotors around, with automated avoidance kicking in when a quadrotor gets within a reachable set distance of another quadrotor. I've got a snapshot on the right of where those four quadrotors are in the x-y plane at a certain point in time.
And around each are three reachable sets, one with respect to each of the three other vehicles. Once one of the vehicles touches the reachable set of another, the automated control takes over and guides the vehicle away from that vehicle; then, at some point when it's safe, control is given back to the student. That's not the best human-machine handover, but it's an experiment to illustrate that these sets can be computed and flown in real-time scenarios like this four-quadrotor example. Okay. You can also turn the problem around and compute, for example, how you might reach a desired set of states, a desired condition, or a desired trajectory (so a reach control problem) despite the worst-case disturbance. Now your control action is trying to get to that desired region, and the disturbance, you don't know what it's going to do, so it might be trying to push you away. So the roles of the control and disturbance are basically flipped, and instead of a max-min Hamilton-Jacobi equation, it's a min-max. And you can piece these together; we call this a reach-avoid set: what is the set of states from which you can reach a desired target set while avoiding an unsafe condition? Using the Hamilton-Jacobi formulation, those constraints become very intuitive: you mask out a region of the state space just by using an inequality on the representative Hamilton-Jacobi function J. Okay, so we can compute reach sets, avoid sets, or reach-avoid sets, the intersections of those. Okay, a few words about high dimensions. This area of using these reachability-based tools is not a new thing. The area of model checking, that is, developing a model of the system and then checking that the model satisfies safety properties, for example, has been around for a long time. It's a notion that came out of computer science, and there's been a lot of work on discrete state systems.
For continuous systems, there's been a lot of work on linear systems, where the sets are polygonal or ellipsoidal; that's actually some of the work we did with Stephen Boyd when I was back at Stanford. And there's been a community of people, really at the intersection, I would say, of control, computing, and computer-aided verification, that have been working on these methods. One approach is putting practical constraints on a problem: in a traffic scenario, we impose roads or highways or protocols to simplify the problem, and mathematically that's like reducing the dimension of the problem. Another is approximation: mathematical approximations like linear dynamics, or simpler representations of sets. Another is putting mathematical structure on the problem; there's been some exciting work on monotone systems. We and others have worked on decompositions: how do you take a high-dimensional system and decompose it, maybe projecting the dynamics onto submanifolds and then doing the computation in those lower-dimensional manifolds? Here the dimension of the state is key; you keep that as low as possible. The last two approaches, exploiting offline computation and machine learning, I'm going to say a few words about, so let's go on to that. This next example: how might we exploit offline computation when we're trying to do reachable set computations in real time, for, say, robotic motion planning? Here I think there's kind of a contrasting point of view. On the left-hand side is what we typically do in control: slow and accurate, right? We have our, whatever, 12-dimensional quadrotor model, and we're planning a path and we have to avoid obstacles; those are the blue ellipses there. So we compute an optimal control. It's slow, but it's going to do it well. On the right-hand side is the robot motion planning way: fast, but less accurate.
Do something like RRT, which just computes a set of points, connects those points by lines, and then tries to track the result. The problem is that you're not going to track it perfectly, because you're not using a dynamic model, so you might collide with an obstacle. So what we did was put a principled spin on what is done in the robot motion planning community, which is to put buffers of arbitrary size around your obstacles so you don't hit them. We said, well, let's use reachability to do that in a principled way. What we can do is take the difference between the high-dimensional control model on the left-hand side and the simple point-mass model that's typically used in robot motion planning on the right-hand side, compute that relative model, which is a dynamic model itself, and then use the relative model to pre-compute a worst-case error bound. That worst-case error bound, pictured in the middle, is like a buffer, but a principled, computed buffer, which you can either put around your vehicle or use to bloat the obstacles. You can pre-compute it and then carry it around with you in planning, so you can still plan arbitrarily fast, because you've done this offline pre-computation of the relative-dynamics reachable set. I've got a few slides to illustrate that, but I've basically told you what we're doing: we're computing a reachable set which represents the max error between the planning model and the tracking model, and then we use that. These reachable sets are not always intuitive looking; it's kind of interesting what these tracking error bounds look like.
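How such a precomputed bound gets used at planning time can be sketched in a few lines (a toy of my own: the bound value and the disk obstacles are made up, and in the real method the bound comes out of a relative-dynamics reachability computation rather than being assumed):

```python
import numpy as np

TEB = 0.75  # tracking error bound (meters); assumed here, precomputed offline in practice

obstacles = [(2.0, 2.0, 0.5), (4.0, 1.0, 0.4)]   # (cx, cy, radius) disk obstacles

def plan_point_safe(p):
    """A planner waypoint p is safe if it clears every obstacle bloated by TEB.
    Then any tracker state within TEB of p is guaranteed collision-free."""
    return all(np.hypot(p[0] - cx, p[1] - cy) > r + TEB for cx, cy, r in obstacles)

# A waypoint that clears the raw obstacle but not the bloated one is rejected:
print(plan_point_safe((2.0, 2.6)))   # 0.6 m from a 0.5 m obstacle: raw-safe, bloat-unsafe
print(plan_point_safe((0.0, 0.0)))   # far from everything: safe
```

The fast planner keeps running on the simple point model; the dynamics only enter through the one precomputed number (or set) used for the bloating.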
But basically, what this is saying, for the example that I'm showing here (computed on the left, with slices of it on the right), is: if you wanted your high-dimensional tracking model to stay within 0.5 meters of your planning model, there is no tracking error bound within that 0.5 meters which will guarantee that your tracking model stays within that distance of your planning model. But if you're trying to stay within 0.75 meters or one meter, there is some substance to the reachable set at that level; these are slices of this set. And you not only get how the tracking model should track the planning model, but also the control law that you have to use; that comes out of the reachable set computation. We call this methodology FaSTrack, for fast and safe tracking, and we've used it in robot motion planning. Let's conclude this example by showing some of the examples that we've done, because I think these are kind of fun. Here, we said: we can track a path through obstacles, but what's the most challenging obstacle to avoid? In the Skydio example, it's the person itself, right? People are moving around, and you want to make sure you don't collide with those people. So in joint work with my colleague Anca Dragan, who works on predictive models of how people might interact with an autonomous system, we used what I think is a fairly standard model in the robot motion planning community, which is a Boltzmann model: predicting what a person might do as being rational with respect to the goals of that person. And so the goals are modeled as a Q function, a value function here.
And what this proportionality here shows is the probability of a person's actions; here I have a pedestrian in a room, and u^H represents their actions. There's a door over here, and we assume that we can model all the goals that the person might have. The person's actions are rational towards the goals, meaning that this model would predict that the high-probability actions, in pink, have the person moving straight towards the door, and the low-probability actions are things that move the person away from the door. So we said, well, we can use that, and we can also update it: if we're tracking what this person is doing in real time, we can see whether that person's actions are rational under that model, and if they're not, we can alter this beta coefficient, which is like the inverse temperature coefficient in a Boltzmann model. For example, if beta is zero, that's a uniform distribution over all actions, whereas as beta gets larger and larger, most of the probability mass of the model sits on the action which is optimal for the given goal. So if the person suddenly turns around from the door, immediately that probability mass is going to spread out, and we do that by performing a Bayesian update on this beta parameter. Okay, so that's the model that we're going to use for the person. Now, we have two quadrotors in this example; they're labeled one and two, with these pink squares around them. And there are two people: this is Sylvia, and this is Jaime, and this is our lab. Sylvia is going to walk to one side of the lab, Jaime is going to walk to the other, and the quadrotors are trying to get across the lab, too.
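The noisy-rational model and the beta update just described can be sketched concretely (a minimal version of my own; the pedestrian's four-action set, the single door goal, and the Q function as negative distance-to-goal are illustrative assumptions, not the experiment's actual models):

```python
import numpy as np

goal = np.array([5.0, 0.0])                      # the "door"
actions = [np.array(a, dtype=float) for a in
           [(1, 0), (-1, 0), (0, 1), (0, -1)]]   # unit steps for the pedestrian

def Q(x, u):
    # Assumed goal value: actions that move toward the door score higher.
    return -np.linalg.norm(x + u - goal)

def action_probs(x, beta):
    # Boltzmann / noisy-rational model: P(u | x, beta) proportional to exp(beta * Q(x, u)).
    logits = np.array([beta * Q(x, u) for u in actions])
    w = np.exp(logits - logits.max())
    return w / w.sum()

# Bayesian update over a discrete grid of beta (inverse temperature) values.
betas = np.array([0.0, 1.0, 10.0])

def update(belief, x, u_idx):
    lik = np.array([action_probs(x, b)[u_idx] for b in betas])
    post = belief * lik
    return post / post.sum()

x = np.array([0.0, 0.0])
belief = np.ones_like(betas) / len(betas)   # uniform prior over beta
for _ in range(5):                 # person repeatedly steps straight toward the door
    belief = update(belief, x, 0)  # action index 0 is (1, 0), the rational choice
    x = x + actions[0]
print(belief)  # posterior mass concentrates on large beta: the person looks rational
```

If the observed actions stopped matching the modeled goal, the same update would shift mass toward beta near zero, spreading the predicted distribution out, exactly the behavior described above.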
So the quadrotors are trying to avoid Sylvia and Jaime, and they're using these Boltzmann models to predict what Sylvia and Jaime are going to do next. In this top-down view in the bottom right-hand corner, you'll see these little distribution blocks coming out of Sylvia and Jaime, with pink representing high probability and blue representing low probability. Let's just play this. The one thing to note here is that Jaime is a little bit hard to predict; he's kind of dancing around, slowing down. You see that he's got a bigger probability region around him because of that than Sylvia has. Let's play it one more time, and look at the bottom right-hand view now. You see that not only are the quadrotors maintaining this representation, but they're also slowing down, stopping for a little bit while Jaime and Sylvia pass them, so that they can get to their goal. That planning is done using FaSTrack: they're using the predicted motion of Jaime and Sylvia as the obstacles, and they're using their FaSTrack plans to avoid the regions of high probability around Sylvia and Jaime. One more computation point before I talk about integrating learning, and this is kind of a nice segue, because we are using learning here. This past year, with Somil Bansal, we asked: what about grid-free approaches for the Hamilton-Jacobi computation? We've often thought about those, and we've done some things, but over this past year there was this great result from the computer vision community (actually a computer vision group at Stanford) that started using sinusoidal deep neural nets, where the activation functions are sinusoids, to better represent continuous functions.
And so Somil Bansal, who just graduated this past year from my group, said: aha, why don't we try to use that to train a deep sinusoidal neural net to solve Hamilton-Jacobi equations? And this is what he did. He said: okay, suppose you have random samples of the state and time; use the fact that we know the dynamics, so the loss function that we're going to train on is the Hamilton-Jacobi dynamics together with the initial data; fit the value function; and repeat. Gradients are really important here, and with the standard ReLU the gradients don't work well; this is where the sinusoidal activation functions come in. This is from Gordon Wetzstein's group at Stanford, who put this paper out last year. So Somil did this. It took a while to figure out how to do it properly, but now, really for the first time, we're computing in high dimensions here. This is a nine-dimensional problem: three aircraft, three dimensions each, because they're flying in the plane. We've never been able to do a nine-dimensional computation directly before; we've always had to use decomposition or something like that. Here, using this method, which Somil calls DeepReach (we submitted it to ICRA; this is the arXiv link here), we're able to use a neural net to compute the solution to this Hamilton-Jacobi equation, trained only with a loss function that is the Hamilton-Jacobi residual and the initial data. And here's the example. It shows that using previous methods, using pairwise computations (that's the blue set), we're missing some unsafe conditions. So for example, this purple dot here would have been marked safe by a pairwise computation in this three-aircraft, two-evader, single-pursuer problem.
And you can see here in the middle pane that, from that purple dot, the pursuer can cause the two evaders to lose separation with each other. So that wasn't caught with previous calculations; now we catch it. And if you take some state which is right outside the pink region, the DeepReach states, you can see that the optimal trajectory from the Hamilton-Jacobi equation is safe, but it's just barely safe, right? So it complies with what you'd expect for something that's really close to the boundary. We've also used DeepReach on a ten-dimensional problem: two vehicles at a narrow passage on a highway or a road, where each vehicle is a five-dimensional vehicle. Ten dimensions is something that we could not compute before. The unsafe condition is where vehicle Q1, avoiding a car that's blocking the road, goes into the path of vehicle Q2. DeepReach computes the reachable set, and it computes a safety controller, which dictates for both vehicles what they should do to stay safe. And it's that same controller when you update the initial conditions: the optimal control solution computed directly using this DeepReach method for solving Hamilton-Jacobi equations. Okay, I have just a couple of minutes left, so I'm going to spend the last few minutes talking about what I think is really where we're going with this, and that's incorporating learning techniques. You saw this a little bit just now: using learning techniques to solve these high-dimensional problems.
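The self-supervised recipe behind this (train only on the PDE residual plus the initial data, with no precomputed ground-truth sets) can be illustrated in one dimension. This is a toy check of my own, using finite differences in place of a network's automatic differentiation; nothing here is the actual DeepReach implementation:

```python
import numpy as np

# Toy 1-D avoid problem: dynamics xdot = u with |u| <= 1, unsafe set G = {|x| <= 1},
# encoded by l(x) = |x| - 1 (negative inside G). The loss has two terms: the
# Hamilton-Jacobi PDE residual at sampled points, and the fit to the initial data
# J(x, 0) = l(x). Here s is time-to-go, and the PDE is dJ/ds = min(0, H).

def l(x):
    return np.abs(x) - 1.0

def pde_residual(J, x, s, h=1e-4):
    # H = max over |u| <= 1 of (dJ/dx * u) = |dJ/dx|  (control steers away from G)
    Js = (J(x, s + h) - J(x, s - h)) / (2 * h)
    Jx = (J(x + h, s) - J(x - h, s)) / (2 * h)
    return Js - np.minimum(0.0, np.abs(Jx))

def loss(J, xs, ss):
    return (np.mean(np.abs(pde_residual(J, xs, ss)))   # PDE residual term
            + np.mean(np.abs(J(xs, 0.0) - l(xs))))     # initial-data term

rng = np.random.default_rng(0)
xs = rng.uniform(-3, 3, 1000)
ss = rng.uniform(0.1, 1.0, 1000)

# True solution: the set never grows, since the control can always steer away.
exact = lambda x, s: np.abs(x) - 1.0
# A wrong candidate in which the unsafe set grows over time.
wrong = lambda x, s: np.abs(x) - 1.0 - s

print(loss(exact, xs, ss))   # ~ 0.0: the exact value function zeroes both terms
print(loss(wrong, xs, ss))   # ~ 1.0: a nonzero PDE residual is penalized
```

DeepReach replaces the hand-picked candidates with a sinusoidal network and minimizes this kind of loss by gradient descent, which is why well-behaved gradients of the representation matter so much.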
We're interested in that, but we're even more interested in these kinds of problems where you're trying to control a system, and you may know the system's model well, or it may be complex and you don't know it well, but in both cases you don't really know the environment well. How might you use learning in real time to learn about the environment, incorporate that with the system model, and design an overall safe control scheme? This is a very current research problem; I think we're making headway, and other groups are making headway, but there are really a lot of open research questions here. We started out initially by just asking: how might we use reachable sets to prune away unsafe conditions? That's what we did here in this experiment. The quadrotor is just trying to track the vertical step trajectory, but to do that, it would have to exit its envelope, which is computed to keep it away from the floor and the ceiling of our lab. Within that envelope, it learns to track the trajectory. We're using a very simple policy-gradient, signed-derivative learning algorithm: we give it the trajectory it wants to track and the envelope, and it learns the controller to track that trajectory while staying within the envelope. So it's like using the reachable set as a kind of safety constraint and telling it: you can learn inside the safety constraint, but when you get to the boundary of that reachable set, apply the safe action to keep you in.
Then we extended that a bit. We said: if we learn more about the model and the environment, then we can update the safe set. Something that started out quite conservative might be made less conservative, much like a novice driver who is very cautious at the beginning but improves their driving performance as they gain experience. So, in a principled way, we asked how we might update the safe-set boundaries as we learn more about the model and the environment. In this case, let me just show a repeat of that video. Here we have two experiments that we ran in sequence, superimposed on top of each other. The quadrotor is still trying to follow that step trajectory, but in the second experiment we turned on a big fan that provided a really significant disturbance in the lower part of the room. The ghosted-out quadrotor is the one that does not update its reachable set as the vehicle senses more about its environment. The non-ghosted quadrotor has updated the reachable set to the point where the disturbance it is measuring in real time matches the disturbance assumption of the reachable set itself. Okay, let me conclude. Example six is my concluding example; it asks, where are we now? This is the problem I mentioned: systems maneuvering in unknown environments. Once we perceive the environment in real time, how do we use the safety constructs that we have from reachable sets, for example, and what else do we need to make guarantees about the performance of those systems in real time? In our lab, we said, let's start with something very simple. We have a robot here with an RGB camera.
It has to get from a known starting point to a known goal. The goal is the star, which is around the corner in the lab, but the robot has no model of what's in between. So it's using its camera to perceive the environment and get to the goal; all it can see is what we see in the bottom right-hand corner. I'll end after this slide, because we're having a panel at four o'clock this afternoon, and this is the point at which I want to launch a discussion about safety-based control: what we're thinking about, what others are thinking about, and how we might tackle this problem. I'll leave it by saying that we've been thinking about this in a very modular way, and the point is to use learning where learning is needed. Here, the problem is really in perception, and these days nearly all of computer vision uses deep neural nets. So if we put a perception deep-neural-net module in a loop with a standard planning module and a standard control module, what kinds of guarantees can we make about that? Okay, so let me conclude there and go to my conclusions live. This talk really focused on safety, and then on how to incorporate learning into those safety-based control schemes. In a safety-critical scenario, how do we update? We've computed a safety certificate, and now we have more information, so how do we update that safety certificate? How might we deal, in particular, with what are known as unstructured environments? Here we are navigating not only an environment that we've cleared a bit, but one with people walking through it.
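The modular loop described above, perception emitting a waypoint, a standard planner and controller consuming it, can be sketched as follows. This is a hedged illustration: the perception network is stubbed out (it simply aims at the goal, with no obstacles), and all function names are hypothetical, not from the speaker's codebase. The point is the interface: perception outputs a waypoint (x, y, heading), and conventional planning and control take over from there.

```python
# Sketch of a modular perception -> planning -> control loop.
import math

def perceive(state, goal):
    # Stub standing in for the perception CNN: it outputs the next
    # waypoint (x, y, heading). A real module would use camera images
    # and avoid obstacles; this stub just steps toward the goal.
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    heading = math.atan2(dy, dx)
    step = min(1.0, math.hypot(dx, dy))
    return (state[0] + step * math.cos(heading),
            state[1] + step * math.sin(heading),
            heading)

def plan(state, waypoint, n=5):
    # Straight-line interpolation standing in for a spline planner.
    return [(state[0] + (waypoint[0] - state[0]) * k / n,
             state[1] + (waypoint[1] - state[1]) * k / n)
            for k in range(1, n + 1)]

def control(state, ref):
    # Proportional tracking of a reference point.
    return (ref[0] - state[0], ref[1] - state[1])

state, goal = (0.0, 0.0), (4.0, 3.0)
for _ in range(10):
    wp = perceive(state, goal)
    traj = plan(state, wp)
    ux, uy = control(state, traj[-1])  # track the end of the segment
    state = (state[0] + ux, state[1] + uy)
assert math.hypot(goal[0] - state[0], goal[1] - state[1]) < 0.1
```

Because the learned component is confined to `perceive`, the planner and controller remain standard modules about which conventional guarantees can be stated, which is the design point the talk makes.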
How do we update the models we have of people? How do we learn better models in real time? And if we're building predictive models of what's going on in the environment, can we use the tools we have to understand how to interact with the environment so that we not only achieve our objective but also achieve high confidence in the predictive models? It's that age-old tension between learning and controlling: how do we do both in a way that lets us make guarantees? Okay, with that, I'd like to conclude the talk. I'd like to thank, importantly, the large number of students who worked with me on this over the years. The students in bold are still PhD students at Berkeley; others have graduated and gone on to be researchers and professors elsewhere, and I'd like to thank them as well. This work is also done in collaboration with Aleksandra Faust, who is at Google, and Jitendra Malik, a computer vision specialist at Berkeley. And I'd like to thank our funding sources, who funded many of the different projects I've shown here. So thank you very much, and I'm happy to take questions. Okay, thank you, Professor Tomlin. My name is Dengfeng Sun; I'm an associate professor here in AeroAstro, and I'm actually one of the many academic grandchildren of Professor Tomlin. I think we have about 14 minutes for Q&A; please type your questions in the chat box so we can see them. I might stop sharing, Dengfeng, just so I can see faces, if that's okay. But I'm happy to share again. Sure. This way I can also more easily see the chat box. Okay. So for the audience, please type your questions in the chat box so we can see them. And as I said, in the panel discussion later this afternoon I'm going to spend a bit more time.
There will be a number of perspectives about safety, control, and learning, and I'm going to use that last project as a springboard to talk more about integrating machine learning, and about research directions for maintaining safety despite learning. I look forward to that. Okay, we have a first question: can you talk more about the worst-case assumption on the disturbances? Is it potentially conservative? Yeah, that's a very good question. Here's the perspective: it's a robust control perspective. You have your control actions and disturbance actions. You don't know what the disturbance is going to do, but you assume you know bounds on the disturbance action. You could know more; you could have a probability distribution on the disturbances. But for the time being, assume you have a bound on what the disturbances can do. So you say: I want to protect against the worst-case disturbance, and I'm going to compute the set based on that worst case, so that when I come up to the boundary of the set, I employ the control that ensures that even if the disturbance did its worst, perhaps inadvertently, I'm guaranteed to stay safe. But away from that boundary, you don't have to apply that control. So the idea of the set is that it gives a least restrictive control law. It says: do whatever control you've computed for performance, for your control objective, but when you get to the boundary, this gives you a safety net, a control to apply. And that was important when we started incorporating machine learning schemes into the control, because we wanted to make sure that we had that safety net.
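The worst-case computation described above can be illustrated with a toy, grid-based value iteration. This is only a sketch (real solvers such as the Level Set Toolbox and helperOC are far more sophisticated): dynamics xdot = u + d with |u| <= 1 (control) and |d| <= 0.5 (disturbance), unsafe set x < 0, and a nearest-neighbor grid lookup. The control maximizes and the disturbance minimizes, so V(x) stays nonnegative exactly on the states from which the control can win against the worst-case disturbance.

```python
# Toy worst-case (max over u, min over d) backward reachability sketch.
dt = 0.1
xs = [i * 0.1 for i in range(-20, 21)]     # 1-D grid on [-2, 2]
l = lambda x: x                             # signed distance to unsafe set x < 0
V = {round(x, 1): l(x) for x in xs}         # value function on the grid

def interp(V, x):
    # Clamped nearest-neighbor lookup, adequate for a sketch.
    return V[round(min(max(x, -2.0), 2.0), 1)]

for _ in range(50):
    Vn = {}
    for x in V:
        # Control picks the best action against the worst disturbance.
        best = max(min(interp(V, x + (u + d) * dt) for d in (-0.5, 0.5))
                   for u in (-1.0, 1.0))
        Vn[x] = min(l(x), best)   # must also avoid the unsafe set en route
    V = Vn

assert V[1.0] > 0    # well inside: safe despite the worst-case disturbance
assert V[-1.0] < 0   # already in the unsafe region
```

Since |u| exceeds |d|, the control can always hold the line at x = 0, so the computed safe set is roughly x >= 0; with a larger disturbance bound, the safe set shrinks, which is the conservatism the question asks about.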
So that safety net, we computed on a nominal model that, despite the worst-case disturbance, would keep the system safe. But as we're running, we're learning more about the disturbance, so we can update those disturbance sets. If we never see disturbances at those worst-case bounds, then after a while we could update them; that's the question: how do you update the disturbance sets? We said, okay, let's take a significant amount of data, and once we've seen that those bounds are really never visited, we'll model our disturbance as a little more restricted than we originally assumed. That gives us a larger region to operate in, and that's how we update the sets with this learning-based scheme. But yes, in general it can be conservative if you don't know much about what the environment can do. For a safety-critical system, though, I would argue that's where you want to start. Thanks. The next question is: why are we using the backward reachable set computation? Can we use a forward reachable set to check whether the system reaches unsafe regions? Yeah, I talked about the backward reachable set, but we could also use a forward reachable set. The point of view of a backward reachable set is that you know the unsafe conditions, you don't yet know where you'll be, and you compute the safe initializations of the system. A forward reachable set is a little different, not just the computation turned around: you have a set of initial states, and perhaps some unsafe conditions, and you compute where you can get to from that initial set. Depending on the problem, that can sometimes be more efficient than a backward reachable set.
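The disturbance-set update described earlier in this answer, take a significant amount of data, and tighten the assumed bound only when the worst-case values are never visited, can be sketched as a simple rule. The margin factor and sample-count threshold below are illustrative choices, not the speaker's exact method.

```python
# Hedged sketch: tighten the assumed disturbance bound from data.
import random

def update_disturbance_bound(d_assumed, observations,
                             margin=1.2, min_samples=100):
    """Return a (possibly) tighter worst-case disturbance bound.

    d_assumed    -- current bound used by the reachability computation
    observations -- measured disturbance magnitudes collected so far
    """
    if len(observations) < min_samples:
        return d_assumed                 # not enough evidence: stay conservative
    d_empirical = margin * max(observations)
    return min(d_assumed, d_empirical)   # never loosen beyond the prior bound

# Start with a conservative bound of 1.0; measured disturbances are small.
random.seed(0)
data = [abs(random.gauss(0.0, 0.1)) for _ in range(200)]
d_new = update_disturbance_bound(1.0, data)
assert d_new < 1.0   # bound tightened, so the safe operating region can grow
```

A tighter bound feeds back into the reachable set computation, enlarging the region in which the learning controller is allowed to operate, exactly the fan experiment's logic run in the other direction.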
The Level Set Toolbox that Ian Mitchell built, which we've extended with our helperOC library at the GitHub link I gave you, does both backward and forward reachable sets. Sometimes you want to do both, rather like shooting methods: go backward and forward and learn more about the problem that way. Okay, thanks. The next question is: what are your views on using reachability for safety problems where the models are either data-driven or unknown? That's a good question. If you don't know the model at all, then you can't do a reachable set computation; you need a model to begin with, because the heart of that computation, in solving the Hamilton-Jacobi equation, is the dynamics. In that equation you saw f(x, u, d): that was the dynamic model. But the second part of your question is about data-driven models. Suppose you do have a model that you've identified from data. That becomes an interesting question for reachability, and it's an idea one of my students has worked on. Suppose you learn a model, or part of a model, from data. You could use a reachable set computation as a way to track the behavior of that model: you use it to ask whether the model is good enough for your purposes at this point, and it also tracks the uncertainty of the model. In that last slide where I said we're doing reachability on predictive models, this is precisely what we're doing. Reachability is a tool that gives you a preview of what a system could do, given the dynamics of that system. Now suppose you have a learning method that is learning a model; that learning method has dynamics associated with it.
So you can do a reachable set computation on the dynamics of the learning method itself, to get a preview of when you will have learned enough to have high confidence in the model. I think those are really interesting questions, because it's a tool you can use at the same time as learning: it gives you an analysis of the learning algorithm itself. Again, there are lots of open questions and challenges here. But I wanted to say that reachability is more than a tool for showing where vehicles can get to; it can be used to analyze any dynamic process, and that dynamic process could itself be learning. Thanks. The next question is a really good one, I think: have you tried other activation functions in the neural networks? What is the benefit of using sinusoidal functions? Yes, we've tried. We tried ReLUs, and that didn't work; in the paper I referenced, we have results comparing against ReLUs. "Didn't work well" means that after the same amount of data and the same amount of training, you get a function that doesn't capture the actual known function; we ran a number of tests on known functions. The sinusoid is a smooth function, and you can keep taking its derivative; we tested other smooth functions as well, like the arctangent. This is empirical, and I'm not sure I understand yet why the sinusoid behaves better than some of the other smooth functions, but the best results we got were for sinusoidal functions. The paper I referenced also analyzed other activation functions, and for that reason proposed using sinusoidal ones.
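The sinusoidal-activation layer under discussion can be sketched in plain Python (this is an illustrative SIREN-style layer in the spirit of the paper the speaker references, not her group's code; the weights and the frequency scale `w0` are arbitrary): y = sin(w0 * (Wx + b)). Unlike a ReLU, the sine is infinitely differentiable, which matters when the network's derivatives must satisfy a PDE like the Hamilton-Jacobi equation.

```python
# Illustrative sinusoidal (SIREN-style) layer vs. a ReLU layer.
import math

def sine_layer(x, W, b, w0=30.0):
    # One fully connected layer with sinusoidal activation.
    return [math.sin(w0 * (sum(wij * xj for wij, xj in zip(row, x)) + bi))
            for row, bi in zip(W, b)]

def relu_layer(x, W, b):
    # The same layer with a ReLU activation, for comparison.
    return [max(0.0, sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

x = [0.2, -0.4]
W, b = [[0.5, -0.3], [0.1, 0.8]], [0.05, -0.02]
y = sine_layer(x, W, b)
assert all(-1.0 <= yi <= 1.0 for yi in y)          # sine output is bounded
assert all(yi >= 0.0 for yi in relu_layer(x, W, b))  # ReLU clips negatives
# d/dz sin(w0*z) = w0*cos(w0*z): defined and smooth everywhere, whereas
# ReLU has a kink at z = 0 and zero second derivative elsewhere.
```

The comparison in the final comments is the empirical point of the answer: repeated differentiability is what the HJ residual training needs, and ReLU networks lack it.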
So, in general, there's an infinite number of functions we could use as activation functions. We know there are benefits to using a continuous, or rather smooth, function; but as for which smooth function to use, and why the sinusoid works better in our empirical results, we're not sure yet. Thanks. The next question: for unstructured environments, you mentioned that you updated the controller to deal with the new environment. Do you incorporate any type of semantic reference in the environment, such as recognizing that a door may open and people may come out, to inform the planning process and pre-plan for disturbances that may come from that? That's a really good question as well. In the architecture that I presented, where we have modular perception, planning, and control, we chose the interface between the perception module and the planning module; we actually iterated over different interfaces. In that architecture, we chose a waypoint as the interface: the perception module outputs the next, hopefully conflict-free, waypoint on the way to the goal. The form of that waypoint is also interesting. In that simple ground-robot example, it's just a four-dimensional model that we use for planning and control, and we ended up using x, y, and heading as the waypoint. Using only x and y didn't work out so well, for fairly obvious reasons, I think. But the choice of interface between the learning and the planning is, again, an open question.
If we go with a waypoint, and say we take this architecture, pull out that wheeled ground robot, and put in a walking robot, which we're now trying to do in collaboration with Koushil Sreenath, a professor of mechanical engineering at Berkeley who works on the design of walking robots as well as control laws for them, then the waypoint would naturally be higher-dimensional, I think, incorporating more of the dimensions of the system itself. So, to go back to your question: in that project we didn't explicitly incorporate any semantic information; whatever semantics there are live inside the neural net. What we did in that project was train the neural net, starting from a standard convolutional architecture used widely in computer vision, using simulation data only. Stanford publishes indoor-space datasets, textured meshes of indoor spaces in university buildings. We set up an optimal control problem: if you know the origin, the goal, and the environment, using those textured meshes, then you can design an optimal control to take the vehicle from start point to goal. Our loss function became the difference between the waypoint the perception module predicted and the waypoint the optimal control scheme gave, all in training, all with simulated data. Then we took the neural net and applied it in real life, on the robot, in an indoor environment at Berkeley, with no retraining on real data. We just asked the perception module to take image data as its input and give a waypoint.
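The training signal described above, the mismatch between the waypoint the perception network predicts and the waypoint the optimal-control "expert" (which knows the map) would choose, can be sketched as a loss over (x, y, heading). The weighting and the angle-wrapping detail below are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a waypoint-prediction loss over (x, y, heading).
import math

def waypoint_loss(pred, expert, w_pos=1.0, w_heading=0.5):
    """pred, expert: (x, y, heading) waypoints.

    Position error is squared Euclidean distance; heading error is
    wrapped to (-pi, pi] so angles near +/-pi compare correctly.
    """
    dx, dy = pred[0] - expert[0], pred[1] - expert[1]
    dth = math.atan2(math.sin(pred[2] - expert[2]),
                     math.cos(pred[2] - expert[2]))
    return w_pos * (dx * dx + dy * dy) + w_heading * dth * dth

assert waypoint_loss((1.0, 2.0, 0.5), (1.0, 2.0, 0.5)) == 0.0
# Headings of pi and -pi point the same way, so the loss is ~0:
assert waypoint_loss((0.0, 0.0, math.pi), (0.0, 0.0, -math.pi)) < 1e-9
```

Because the expert's waypoint comes from an optimal control problem that sees the full map, minimizing this loss teaches the perception module obstacle-aware behavior without ever labeling obstacles explicitly, which is the point made in the talk.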
And using that loss function in training, the distance between what the optimal control trajectory, which knew about the obstacles, would have said and what the perception module said, was generally enough to give very good performance. In fact, part of the point of that architecture was to compare it with an end-to-end scheme that learns directly from pixels to control actions, and we showed demonstrably better performance with the modular architecture than with the end-to-end framework. But the question about semantics is a good one; in this problem we're not using it. We're looking at a fairly simple scenario, though one that is quite involved when you look at that architecture. Okay, I think we are running out of time. Thanks, Professor Tomlin, for this very entertaining and enlightening talk, and thanks, everybody, for attending this Distinguished Lecture Series. We look forward to attending your panel session in a couple of hours. Thank you very much. Thank you very much. A pleasure to give this seminar, and thanks for everyone's questions.