Ladies and gentlemen, we now have a panel discussion. The topic of the discussion will be realizing the Army's vision for the future. The panel will be moderated by Mr. Patrick Tucker from Defense One. Please welcome our panel members: Professor Dieter Fox from NVIDIA; Professor Martial Hebert, CMU; Lieutenant Colonel Christopher Lawrence, Army AI Task Force; Dr. Tony Stentz, Uber ATG; Dr. Ethan Stump and Dr. Stuart Young, Army Research Laboratory; and Lieutenant Colonel Jay Wishem, Next Generation Combat Vehicle cross-functional team. Please join me in welcoming today's moderator and panel members. Okay, thank you very much. Good to be here. So, a quick question for the audience. I'm Patrick Tucker, Technology Editor at Defense One. Who here knows the etymological origin of the word robot? Raise your hand. A handful. Some Europeans, which does not surprise me. So, it's a Slavic word meaning servitude or forced labor, and it enters modern usage in 1920 with a play by a Czech playwright named Karel Čapek. The play was called Rossum's Universal Robots. And so the robots were the creation of the play's protagonist, a guy named Rossum, and he's kind of a Silicon Valley Elon Musk type. Act one, Rossum has created an entity that can take over all human labor. He becomes fantastically wealthy. Act two, the robots have destroyed the economy and created, you know, civil strife. Act three, guess who's killing all the humans, right? So, I bring this up because I think it's important to remember that all of our fears and all of our projections about robots existed decades before anyone actually tried to make them, right? So that hope and that fear, it's just sort of ingrained in the concept, but it's always projection. It's always just a notion that we have and that we have to conquer.
And that's why I'm so glad to have this panel here today to help us talk about the actual difficult work of making things that can relieve the labor we're putting on soldiers, which is an incredibly dangerous and difficult line of work. And so with that, let me start first with, do we have Lieutenant Colonel Lawrence here? It's you. Hi. Hi. So, when you look at all of these things here from an operator perspective, I wonder if you can, to start us off, tell us a little bit about what you want battlefield robotics to give you in the next five to ten years. And then we'll get into the difficult technical challenges of meeting those expectations. Jay, that might be up your alley if you want to go take this one first. Yes, where's Jay? Please go ahead. But you're welcome to chime in if you want. Why don't you take a stab at it real quick and I can help. Okay, sure. So, battlefield robots, essentially we need them to, you know, be an augmentation system for the soldier. We need them to give us enhanced situational awareness. These are some of the things that we've seen. So they need to have some context understanding, be integrated as part of a team. And then ultimately, you know, like I said, feed that situational awareness back to the soldier, the operator, right? So that you get this increased standoff capability, increased survivability for the soldier. These are all really good, I mean, overall themes that robots can offer in terms of capabilities, right? So reduced cognitive load for the soldier, especially if you have a good human-soldier interface where you have one soldier that's interfacing with many robotic systems. But again, going back to the situational awareness: what are they perceiving, right, in the battle space? And then with that, obviously, that's inherently going to improve the survivability of the soldier, right? Yeah. To be able to apply effects much more rapidly. Yeah, I think that's well said, Chris.
So fundamentally, kind of the way I look at it, I'm very much like Kevin on this. A lot of the CFT members, we, you know, the Army made a decision not to put technical experts or acquisition experts in the CFTs. We're all war fighters. So we take a very different sort of approach. I kind of boil it down to a very simple set of things. As a commander, I would like to be able to choose whether I make first contact with an enemy force with a human or not. I would like to have that flexibility and option so that I can elect to put an unmanned system, whether that's a UAV or an unmanned ground system, out to make that initial contact, you know, that very tip-of-the-spear piece. I want to be able to elect to do that on my terms. And that's where robotic systems really give you some interesting capability. What it also allows you to do, and Chris kind of alluded to this, is start changing what the decision, you know, process is for the human. How can I make better informed decisions faster than my competitor or my opponent? And again, a lot of this comes down to how do I make these choices in such a way that I'm preserving my capability and my forces and then putting the enemy in a very bad position. That doesn't always mean killing them necessarily, although, you know, the application of force has, rightly or wrongly, solved more problems in human history than pretty much any other method. But it's that ability to make those decisions faster and then choose when I make human contact. And sometimes you want to make contact with a human first because of the complexity of the task. Perhaps it's going to be in a very populated environment and there's going to be a sifting and a sorting of who the threats are and what's going on. Or perhaps your intent is not to have a kinetic engagement. I mean, there's reasons that you want to do that.
But if you believe contact with an enemy threat is imminent, I would really like to have a robotic system out there that is increasing my situational awareness and then allowing me to make a choice on what is the most efficient way to deal with that threat that puts my soldiers in the best position to win. Okay, so this gets at a lot of what we're talking about here today and a lot of what you see around this room, because core to that hope, in terms of what you can have that system do, is perceiving, right? I mean, that thing has to make a lot of decisions about what it is experiencing and then relay that information to you in a way that's helpful. And this gets at the problem of robotic perception. I don't know who is familiar with Rodney Brooks, the roboticist behind iRobot. But he has this very good quote about the evolution of robotic perception. He said, you know, in the 1970s, we thought that teaching a robot to see was going to be a summer program for a master's student. And now it's taken us, well, we're not done yet. We're not done yet. So let me turn to Martial, Tony, and Dieter. What does machine vision look like in the next five years and in the next 20? I guess I'm the victim in the middle here. Yeah, I think where we really have to go is towards a much more robust, general, semantic understanding of the physical world around the robot so that they can, for example, interact with it, as Sid also mentioned. So just to put this in contrast, we humans have this really good, intuitive understanding of the world around us and how things work. For example, nobody here would be surprised if I opened my hand and suddenly the microphone fell down, even though you might not have the exact equations in your mind, but you know how this works.
Or for example, I might not be able to tell you exactly how many ounces that microphone weighs or how long it is, but at the same time, I can interact with it in a very robust way. And if I bump on it, nobody is surprised that it makes that sound. So we have very good understanding and predictive models of the world that we then can use in order to interact with it. And in robotics, currently there's still a lot of focus on having exact models, kind of a really physical understanding in these models, but the problem with these physical models is that it's very difficult to train them, to adapt them and to teach them. So I think what we need much more is: how can we teach perception systems over time and train them in the real world through experience? And as an extension of the real world, of course, I think one thing that's going to be important is how we can train them also in simulation, just because it's very hard and very expensive to train robots only in the real world. So let me mention another aspect of perception in the future, which is this idea of embedded perception, let's call it that. And this is the following idea. I showed this example of semantic perception, semantic segmentation, which is a critical tool in autonomy. And the idea there is to basically label pixels in images as to the object that each pixel represents. And the way that works is that you typically have a neural network type of structure that tries to do as well as it can in labeling every pixel in the image correctly. So that sounds great. Now, can you think of any application in the world where you actually want to label every pixel correctly? It simply does not exist. One of the major limitations that we have now is that our perception systems solve the wrong problem, in a way. I don't need to label every pixel correctly. I need to understand the scene only at a level that is sufficient for me to carry out the task.
And worse than that, it is solving a problem that is harder than the problem that we actually need to solve. So what we need to do is to have perception systems that are actually developed and trained in the context of a task. A task-driven perception. That is something that we don't know how to do at the moment. We know how to do this for simple tasks, like grasping, for example. We don't know how to do this for more complex tasks like navigation. Yeah, I think in order for perception to be truly robust, we're going to need an immense amount of data, and as has been pointed out, the labeling problem is a real bottleneck for that. And so I think we're going to need to leverage approaches that automatically label, automatically train models. For instance, we can have a human driver drive a vehicle around, and the system can determine that, well, it's a road that they're driving on, and when they, for instance, steer around something, it must be an obstacle. And so you get very efficient classifications in that way. Additionally, the robots can go out themselves in the environment and encounter it and automatically understand the environment that way. So for instance, if a robot sees something in the environment and then drives over it, it gets much more information about what was actually there and can use that to automatically classify the data. So this labeling problem is a big one, and if you look at some of the posters on the wall, they remind me that one of the big potential solutions to that that's emerged over the last couple of years is deep learning. Because in a deep learning system, the system finds its own labels for things in a way that conforms somewhat to your predictions, hopefully. So, Martial, I'm going to stick with you for a second. Why hasn't deep learning solved this perception problem? What's needed for it to solve this perception problem? Well, I guess three things.
First of all, deep learning has been very successful precisely because of the availability of data, and the availability of labeled data. So step one, as Tony said and as we said earlier, is the ability to train the system with a minimal amount of data and minimal supervision. That is critical, and in fact this is something that completely separates autonomy from other applications of computer vision, where you have the luxury to train those networks offline with massive amounts of data and so forth. The second thing is going back in time, going back to the time of the Rodney Brooks quote that you had here, and being able to incorporate more structural knowledge, additional knowledge, in those systems. We went very far into the realm of pure data-driven learning, pure black-box systems that basically learn from scratch from data. That has gone very far, but we need to now step back and see how we can use additional knowledge, common-sense knowledge, geometric information, knowledge about the task, et cetera, to add to this data-driven black-box learning. The black-box thing is an important one, because neural networks have actually been around for a little bit. In the 90s there was some very interesting work with neural networks, and it was before there were really big data sets and before there was really great compute, but they were around. I talked to a couple of folks that were trying to create an application for them in the public space, particularly predictive policing, actually here in Pittsburgh. A lot of people don't know about this. There was a neural network project that was based on predictive policing here in Pittsburgh, and it worked great, but city planners and municipal folks were like, I don't know if I want to adopt this, because I don't understand how it made its decision.
I don't know how it reached its decision, because that opaqueness, that was neural networks in the 90s, and to a certain extent today, was a huge inhibitor. So I want to ask our operators here: if you don't fully understand how a system arrived at the decision that it made, how much do you trust it? And I want to go back to you, Martial, or anyone that feels like they have something to contribute to that: is black-box opaqueness still the best descriptor of the way neural networks are operating now, or have they evolved from that? So first to the operators: how much do you trust a neural network when you can't understand its reasoning, but you do see the output being correct? You know, I think it comes down to the soldier interacting with the system early on in the process and starting to practice, you know, the teaming concepts that we're working on with manned-unmanned teaming. How do they see themselves interacting with the system so they can build trust over time and understand the system's limitations, what its capabilities are and what it can offer? And I think over time taking that feedback too, you know, from the soldiers saying, you know what, this is throwing too many alerts. I don't trust it. These aren't accurate. And then we might have to go back to the drawing board a little bit. But I think, you know, there are systems we've fielded before that, you know, may be somewhat of a black box to the soldiers. So, I mean, we have means of, you know, testing and evaluation, and through modeling and simulation. And we've already just discussed recently putting some bounds, you know, around the systems in terms of injecting, maybe, physics models. You know, bounding what it can do in terms of its response. So I think there are means for us to tackle this problem. Yeah. I think I would say it kind of depends on what the task is.
So I would just offer that everybody in this room, from the time you woke up to the time you got here, you put an inordinate amount of trust in something you can't actually describe how it works either. Unless you've torn apart a car on your own and put it back together, you probably in theory understand how your car is working, particularly a new-model car with the digital systems in it. But you probably really don't know. You have an idea. And by the way, most humans are not as explainable as we might like to think either, if you've ever asked your 10-year-old why he did a thing. So from that aspect, for a simple task, I think it comes down to demonstrative effect. If you demonstrate to a soldier that the tool or system that you're trying to enable them with generally functions relatively well and adds some capability to them, some minimal level of capability, they will grow trust very, very rapidly. And in more complex activities, where you're trying to assist with actual decision-support tools or higher cognitive processes, where you're chaining multiple complex systems together, that is going to be a little bit more difficult. But again, these are all uniquely human activities, where you chain a series of humans together and then you have these arbitrators, again humans, that are doing this. These are not always explainable either. So I think that you can gain trust, but a lot is going to come down to what you demonstrate as a capability. Yeah, I mean I would echo that. You trust something because it works, not because you understand it. And so the way you show it works is you run many, many, many tests and you do a statistical analysis and build confidence that way. And that's true not only of machine-learned systems but of other systems as well, if they're sufficiently complex. You're not going to prove them correct. You're going to similarly need to put them through a battery of tests and convince yourself that they meet the bar, that they work.
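The "battery of tests" idea just described can be made concrete with a standard statistical bound: if a system survives N independent trials with zero failures, the classic "rule of three" gives roughly 3/N as a 95% upper confidence bound on its failure rate. A minimal sketch in Python; the trial counts and confidence level are illustrative assumptions, not figures from the panel:

```python
import math

def failure_rate_upper_bound(trials: int, failures: int = 0,
                             confidence: float = 0.95) -> float:
    """Upper confidence bound on a system's per-trial failure probability.

    With zero observed failures this is the exact bound 1 - (1-conf)^(1/n),
    which is approximately 3/n at 95% confidence (the 'rule of three').
    For failures > 0 a simple one-sided normal approximation is used,
    purely for illustration.
    """
    if failures == 0:
        # Solve (1 - p)^trials = 1 - confidence for p.
        return 1.0 - (1.0 - confidence) ** (1.0 / trials)
    p_hat = failures / trials
    z = 1.645  # one-sided 95% normal quantile
    return p_hat + z * math.sqrt(p_hat * (1.0 - p_hat) / trials)

# Example: 1,000 clean test runs still only bound the failure rate near 0.3%.
bound = failure_rate_upper_bound(1000)
print(f"95% upper bound on failure rate after 1,000 clean runs: {bound:.4f}")
```

The useful takeaway is how slowly confidence accumulates: driving the bound down by a factor of ten requires roughly ten times as many clean trials, which is why "many, many, many tests" is the operative phrase.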
Yeah, I fully agree, also on the notion that maybe we have to let go a little bit of this idea that we can prove everything is correct and perfectly predict its behavior, just because the kind of data that we're dealing with is so complicated, right, that you just can't explain it anymore at every level of detail. And as you pointed out, we humans aren't that great at explaining why we're coming to our decisions either. And also, I think in the machine learning community there's of course a lot of work going on exactly in this direction, where techniques are being developed for how you can introspect these deep networks and see why the network might come up with a certain recognition solution or something like that. You can look at which part of an image led it to recognize a cow, and things like that. And other areas are going into how we can put physical structure, specific constraints, into these networks so that they learn within the confines of what we think should be physically okay. So I think there's a lot of good progress going on. Okay. So let me turn to you, Stuart, Chris and Gary. The Army modernization priorities, we just heard a little bit about those. And there is a bit of a gap between what commercial industry is working on and what the Army needs. So how are the Army modernization priorities shaping the AI research that you're doing right now? So the biggest thing for me is they're enabling us to focus our research efforts on capabilities that the Army actually needs. And it's also an opportunity for us to inform them as they develop requirements; we can help them understand what's technically achievable and what's a little bit further out. And we use this interaction to help us focus our research and discover new things that are of interest to them.
So instead of looking at, you know, a wider swath of problems, which we're always inquiring about, it allows us to understand and get deep into the areas of interest that they have. So the engagement with the NGCV, which is why I invited them to be here in such force, is because they've really adopted what we're trying to do in the CTA, and they are really helping us understand how they would actually use these systems in the field. And that's informing the research experiments that we have to do to meet the requirements that they have. So on this, for Tony with Uber: what is Uber doing right now in this space, what are the lessons that you've learned from the RCTA that are applicable to your work at Uber, and what can the military learn by partnering right now with Uber? Because I know that's something that they're talking about. Sure, yeah. So I would say there's a lot of technology that's in use not only at Uber but in the self-driving car industry that traces back to programs like the CTA. Technologies that were either innovated on the program or matured or extended in some way or another. For starters, a lot of the perception technology; for instance, detecting people, detecting pedestrians is very important. The multimodal approaches and machine learning approaches have been very successful for doing that. Other perception problems as well: I think there's a big deal made about the fact that self-driving cars use maps, and they do, but there's an awful lot of stuff you encounter, in addition to moving actors like cars, pedestrians, and bikes, that isn't in a map, things like a construction site. And so when the car encounters that, it needs to have some understanding of what that is. So a lot of the semantic perception techniques have been applicable there. Prediction is a very important component for a self-driving car.
The faster a car goes, the more time it needs to stop, and therefore the more important it is to predict what other actors in the environment are going to do for the next several seconds. And so the same sort of machine learning techniques that are used in perception are also used there. Planning is another area. Routing technologies for getting a car from A to B trace back to these programs. A lot of local navigation approaches, where the car has to follow a lane, stay behind another car, or drive around something. Uber uses sampling-based planning approaches where, faster than real time, we investigate different sequences of steering, throttle, and braking commands, evaluate them, and pick something to execute. So that's been very successful. And then other technologies as well. It's very important that the car knows where it is at all times. And so a lot of the technology that was developed in the extensive SLAM research has made its way into that application in self-driving. Are you guys working on something that's GPS-independent? Because that's a big issue for the Army right now. Sure, yeah. GPS, for instance, driving around in cities with tall buildings, GPS is not particularly good at localizing. So we've had to adopt multimodal approaches for getting the position of the car. Okay. Go ahead. Are you now exploring like a CRADA with the Army? Is that a thing? We've certainly looked into possibilities of teaming and we'll look into more in the future. Okay. So Ethan, what are the autonomy challenges that ARL has to focus on that industry won't focus on? And also, sorry for neglecting you for so long. So I think probably the number one task, and it's kind of been mentioned before, is that we have to deal with unstructured environments, and there are certainly aspects of this where industry is interested, like moving over off-road terrain. But I think the adversarial aspect is the key thing that the Army needs to face that industry might not necessarily face.
And I think it's adversarial not in that we're worried about people trying to goof with the cars or mess with them, but just the fact that you need to operate in a place where people are deliberately trying to mislead you, deceive you, outmaneuver you, and you need to be able to deal with that. And we need to think about how we do this in a way that we have as few preconceptions as possible. We need to be able to adapt if the plans change. We need to be able to do this without a lot of lead time and be able to adapt as the environment changes and the mission changes. So I just came from AUSA; it was a wonderful big show. General Murray was there, the head of Army Futures Command, as well as a wonderful assortment of new semi-autonomous robotic tank prototypes for the Next Generation Combat Vehicle. Industry's getting out there and making this stuff that looks like it just rolled off the set of Mad Max. But if you talk to General Murray, one of his biggest concerns, and I think this relates a little bit to the work you're doing, but also to the future applications of these in a battlefield environment, Murray's big concern was the network. And whether or not the comms are sufficient to create a situation where an aerial drone, a ground drone, a semi-autonomous self-driving vehicle can all share enough data with both the soldier and with the folks behind the soldier to allow for a successful operation. So that bandwidth is a huge concern of his. I wonder if you can talk a little bit about how big a concern that is for the operators. And also, what role does intercommunication among systems play in your work? I know we're talking about autonomous systems that are supposed to be thinking on their own, but until we get there, there's going to be a lot of need for the transfer of video imagery and other imagery, and just a big network that the Army wants to have behind these things.
So talk a little bit about the challenge there. Sure. So, as you pointed out, it's a great question. We know we're going to be operating in a contested environment, and we can't necessarily rely on our networks to transmit high-bandwidth data like full-motion video. So we need to continue to push and develop our autonomy, the ability for these systems to, if you will, get minimal guidance initially, and then move out, right? But then, of course, we know they need to adapt, and that's going to be one of the biggest challenges for these systems. Yes, we're pretty good now in terms of adding waypoints to these systems, which then autonomously navigate out to those waypoints, avoiding obstacles, and so on, and report back. But in terms of situational awareness, how do they add value to the team? And as Jay talked about, an increased standoff capability means making contact first. They essentially need to report back a SALUTE report, right? A situation: what's the size, what class of threat am I looking at, what's my confidence level, and then maybe a chip image, right? A still image of that being relayed back to the operator in a common operational interface where you're getting this common picture, right, of that. And then also, obviously, we need a geolocation, right, of that threat as well. But we need to continue to push that in terms of research, because we can't necessarily rely on, and we want to move away from, tele-operation over full-motion video. Sure, there might be times where we could have that option. We want the full-motion video, or we want another still image, and we can do that at times, but we need those systems to be able to function without it. I can augment this a little bit and talk about sort of what the current thoughts are about how the network interfaces with autonomy and the use of robotics and autonomous systems.
You know, now the military operates drones and we basically assume a full-motion video link, and we know that that's not a reality. We don't have the bandwidth for that. But what we're doing is we're thinking about how the networking actually becomes just another aspect of the entire planning process. You know, you don't need a lot of bandwidth if all you need to do is recognize one small thing in the image. Most of the image is actually useless for that. So that task actually doesn't require as much bandwidth. And if you're going to be doing a control task that requires a lot of high-speed, reactive control, and you want humans to be doing the tele-operation, you need more bandwidth. But if it's something that's low tempo, low impact, you know, you can get away with less bandwidth. And so there's a relationship between the perception and the control and the networking required. And so you actually want to live in the space where you're balancing those three things. It points to understanding better what the task is and tailoring the communications to the task. And then we can also look further and think about how the networking is not just a passive thing, that we actually are actively shaping the network that we have. I mean, the Army already does this. You know, every unit from battalion and above is going to be equipped with a retrans station. So, you know, they'll put their operating posts down and the radio is going to reach only so far. But you have a unit that can go out and put up an extra set of antennas so they can retransmit the comms out to get a wider reach across the battlefield. You know, we can use robots for that task. And moreover, we can have the robots be able to adapt to how the network situation is changing, and replace and move around the retransmit capability based on what the needs of the mission are and how the mission is evolving.
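The task-to-bandwidth matching described above can be illustrated with back-of-the-envelope numbers. Every figure below (the link rate, the image and report sizes, the reporting intervals) is an illustrative assumption, not a value from the panel; the point is only the orders-of-magnitude gap between tele-operation video and an autonomous system's occasional chip image or text report:

```python
# Rough comparison of link load for three ways a robot can report contact.
# All sizes and rates are illustrative assumptions for the sake of the math.

LINK_KBPS = 256  # assumed available tactical link rate, in kilobits/second

tasks = {
    # task: sustained load in kilobits per second
    "full-motion video (tele-operation)": 500 * 8,       # ~500 kB every second
    "single still 'chip' image":          50 * 8 / 60,   # one ~50 kB image per minute
    "text SALUTE report + geolocation":   0.5 * 8 / 60,  # ~500 bytes per minute
}

for task, kbps in tasks.items():
    share = kbps / LINK_KBPS
    print(f"{task:38s} {kbps:10.2f} kbps  ({share:.1%} of link)")
```

Under these assumed numbers the video stream alone oversubscribes the link many times over, while the autonomous report-back pattern uses a fraction of a percent of it, which is the "autonomy frees up maneuver space in the network domain" argument in arithmetic form.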
Yeah, I was going to basically build on the point that Ethan was just talking about. So at ARL, you know, we've looked at autonomy and networking communications as a dual problem, in that the autonomy enables the communications through the active approaches that Ethan was talking about. And some of our recent visits out to NTC actually highlighted this. The battalion commander was like, hey, I'd love to have a robot just be able to do the retrans for me so that they can have the connections. So I think, you know, this network, and then of course, as was already alluded to, the autonomy is what I believe is so critically important to reducing the requirements on the network, which we know will be heavily contested. So the more we can do in an autonomous fashion, the more, you know, it frees up, gives us maneuver space in that network domain. I think there's one thing that you don't want to lose sight of, and it sounds very simple, but it's the things that you desire to network in terms of these capabilities, and this information you want to push to the, you know, three or four, let's say three to like 12 human beings that are onboard that platform, and that's your raw numbers, right? You know, between the crew and then a dismounted squad in the back, for example. The information is critically valuable to them, whether or not that information is going off-board that platform to some other node, for example. So a lot of these things that you want to apply some autonomy to, or different functions and applications, matter to the human beings onboard that individual platform even if that node becomes isolated from the network, and can be critical to them. So we talked a lot about, you know, the computer vision problems that we have in the frame of artificial-intelligence-assisted target detection, recognition and tracking, one of my favorite acronyms ever, by the way. That is critically important to those human souls on that platform.
The information you would then want to pipe out to some other node to provide this sort of collective understanding, that's critical, too. But I think that's something that we lose sight of: there's a minimal threshold of capability that we can strive for. We all seek to go into this high, you know, X-Y axis of granular solution and complex environment. You all want to go to the top right, because that's your bent from a research standpoint. That's really the goal. But at a real visceral level, there's real minimal capability that you can deliver. So the network is a critical concern, absolutely. But if I can solve some problems for that crew, that's better than what I have now. And then, as I have more assurance in my network, as I build a more robust network, or we become much more dynamic in how we treat the network, generally through some very innovative AI applications that I've seen recently in terms of how you monitor a network and then flow information to the right actor at the right time, I'm just gaining more capability over time. But don't lose sight of the fact that the network is a challenge; to the crew on that vehicle, what you can enable them with, that's life and death. So think about questions that you might have, because very shortly we're going to go to the audience for questions. And if you start coming up with them, raise your hand and I'll find you. And I think that that would be good. But I want to bring up something that you mentioned a moment ago, Ethan, which was this challenge of dealing with all of this stuff in an adversarial environment, in a data-adversarial environment. Who here knows what a GAN is, a generative adversarial network? So you're all roboticists, so you do. Okay, that's great. For folks at home that are watching, it's basically a means to, it can be a means to spoof a data set, right?
You can use it to test whether or not a data set is true, or you can also use it potentially to inject faulty data into a data set and create an impression that is wrong. The NGA right now, the National Geospatial-Intelligence Agency, is very worried about some research that's come out of China that applies that methodology to spoofing pictures of the world, spoofing geolocation data, like satellite photos of this place or that place. And so they're trying to get ahead of that. But I wonder if you can talk a little bit about, first to Tony, the data integrity problem at Uber in the commercial space, and lessons there, but also the data integrity problem in the military space, particularly with this issue of autonomy in these environments. That second part is kind of a jump ball to whoever feels they can tackle it. I guess there are lots of different types of data at Uber, but one of the most important kinds is representative data of all of the situations that the car can encounter as it drives around, so that you come up with models that properly represent everything the car will see and are not over-fitting to any particular situation, where you would do very well in some areas but then struggle in others. So certainly data integrity in that sense is very important. Yeah. What about adversarial attacks on data integrity? How is the military looking at that? What are your worries and what are your solutions, particularly in terms of autonomy and vision? Yes, it's clearly a really important question, right? We see this with more and more fake, even fully generated videos out there where, when you look at them, you feel like, yeah, that's correct, but it was someone who was actually saying some very different words. You can just generate videos that make it look like someone said something they actually didn't say.
I think one way forward will also be just to make sure that we can keep track of the data that is being collected. What is the origin of an image? When was it taken? How was it taken? And then make sure, as it goes through these different computing processes, that we keep track that nobody tampered with it. And then at the same time, yeah, you mentioned GANs, and part of a GAN is of course the discriminator network, whose job is exactly to distinguish between fake, artificially generated data and real data, and maybe these kinds of discriminators, trained specifically for these purposes, can help with that problem as well. If I can just chime in here: in the military, we already have structures in place in terms of protecting our data, and I think we know that we have to have the proper security measures for this data, because if you want to attack with something like a GAN, you have to have either the data or the neural network, since you have to have the adversarial network pitted against the discriminator in order to basically find the vulnerability. So hence you have to protect your network and you have to protect your data. This is something of interest obviously to the military, but I think we have the right structures in place to be able to do that. So we talked a lot about corrupting the data set, basically, but there's another way to adversarially corrupt the performance of the system, which is to generate input that will adversarially change the output of the neural network. This is something that is being studied extensively in the context of face recognition, where you can present an image that is not necessarily a fully realistic image but will produce the recognition output that you, the adversary, want. This is an even more concerning problem, because then you don't have to mess around with the data set at all.
All you have to do is figure out how the network operates, input to output, and fabricate inputs, not necessarily realistic images, that produce the output you desire. There is extensive research on that topic, both on how to identify those vulnerabilities and on how to build networks that are immune to them. Just a quick point from an operational perspective: the chart that Kevin put up earlier, where you have the kind of climb of what a machine would do, what a human would do, and then really where's the sweet spot of what you do together. We view this problem as: how do you cue a human to help assist, augment, and interrogate an object to make your way through some of those problems? So basically that would be your next-generation camouflage, if you will. You're not camouflaging from a visual perspective like you would in the traditional military sense, but it could be something as simple as pixels or scrambled images on a platform that produce a result you don't want from a system. But a human, if cued to look at it, can assist in the classification of that object. So again, it's what's the right threshold, and where's the human in the loop for that teamed activity between a human and a machine. So let's go to the audience for some questions. We've got a little bit of time left, and you've got a fantastic panel here to interrogate about your hopes and fears and expectations for the military's adaptation and adoption of machine vision. So shy. Okay. A little bit shy. Oh wait, here's one right here. Yes, they do. They are coming up right there. Thank you. Good afternoon. And please name your affiliation as well. Yes, I'm Lieutenant Colonel Matt Cooke. I'm with Future Vertical Lift. I'm the modular open systems approach lead.
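The attack the panelists describe here, crafting an input that drives a model to a chosen output without ever touching its training data, can be sketched with a minimal gradient-sign perturbation against a toy linear classifier. This is a hedged illustration only: the weights, input, and step size are all invented for the example, and real attacks target deep networks, not a hand-built linear model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "trained" linear classifier: score = w.x, class 1 if sigmoid(score) > 0.5.
# The weights are invented for illustration.
w = np.array([1.0, -1.0] * 8)   # 16 "pixels"

def predict(x):
    return sigmoid(w @ x)

# A benign input the model labels class 1 with high confidence.
x = w / np.linalg.norm(w)        # aligned with w, so score = ||w|| = 4

# Gradient-sign attack: nudge every element a small step eps in the direction
# that decreases the class-1 score. For a linear model, the gradient of the
# score with respect to the input is w itself, so we step against sign(w).
eps = 0.3
x_adv = x - eps * np.sign(w)     # bounded per-element perturbation

print(predict(x))      # confident class 1 (sigmoid(4) ~ 0.98)
print(predict(x_adv))  # pushed below 0.5, i.e. flipped to class 0
```

The point of the sketch is the asymmetry the panel flags: the attacker never modifies the data set, only queries the model's input-output behavior and crafts a small, targeted perturbation.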
So although our problems on the battlefield are very similar, the solutions may be very unique. One of the things we all have to do with MOSA, for open systems architecture, is make sure that we're standardized, so that we can get the rapid adaptability and the high systematic reuse of our components for hardware and software. That being said, a lot of people here are working on unique solutions. Are you using, as essentially a baseline, STANAG 4586, ARINC? How will the U.S. military be able to pull these from 6.1 and 6.2 all the way through S&T and then beyond that into production? So I can speak to what our plans are for this program specifically. The pathway that we've outlined is that we've been working with the Ground Vehicle Systems Center, the folks that own the ground vehicle robotics portfolio for the Army. What they've done is put forth that the Army should standardize ground vehicles around ROS. Everybody has at least heard of ROS; they should be familiar with it. And they put forward what they call ROS-M, ROS Military, to play off the idea of ROS-Industrial, where a large industrial consortium got together and asked: what are the important aspects of ROS that we need to focus on? What are the capabilities we need? And how do we decide which packages to use and standardize the interfaces between them? So ROS-M is an effort to essentially build a business model around modular systems. So GVSC has... I think of it as a reference architecture, which essentially says here are the basic building blocks of a system, and it's built upon a host of programs they've done over the last decade. Stuart showed the chart of the flow of things. These things are all interconnected when it comes to ground vehicle robotics.
So we're actually in the process of moving things from our CTA into that, and we're proposing modifications, because that architecture is very much what came out of our first CTA, and we've learned a lot of things since then. We have this entire intelligence architecture, which goes from language and reasoning and ties in perception, and we see those as things that augment it, so we're going to be folding those into the reference architecture, which, for ground vehicle robotics at least, we'll be able to build upon. That's the strategy we see as a path forward for ourselves. Right there. Yep, so Shad Reese, senior EOD robotics technologist out of Indian Head, Maryland, question on standards. Does industry have standards, and are they bleeding over into the DOD for us to leverage for testing and evaluation of all these technologies? So we're tracking on the same topic. Yeah, I guess we certainly have standards, but I would say that we're using our own architecture, which is proprietary and homegrown. I'll just, for the camera: what are the names of the standards? So there's been work with what was called JAUS, and I don't really know where that stands. A lot of the work that we're showing here in the CTA is pretty foundational, and related to your question on the testing side of things, as the systems become more and more autonomous, there's, I wouldn't say debate, but there's definitely recognition of the need to do a better job of understanding how we can test these systems. So we're working with ATEC and the different testing agencies to help inform them on how you might go about doing this, even to the point of asking what the definitions of developmental testing and operational testing even mean when you're talking about systems that continuously evolve and learn and reason.
So, as was discussed earlier in the context of understanding whether a system is good or trusted, there are approaches that we worked on in the autonomy COI across DoD that basically say, hey, look, we're going to have to think about this problem differently. The analogy is like your 16-year-old learning to drive: there's no guarantee that they're ever going to avoid a wreck or an accident, but they demonstrate certain capabilities, and they demonstrate enough to convince the DMV to give them a license. So it's not a guarantee, but it is evidence-based, so you have some level of trust; Tony was talking about this earlier. I think we're going to have to get past perfect testing of every edge and corner case, because that's a fallacy, and time on earth doesn't permit it. So we're going to have to think about that. Related to the standards, this is something that Chris and I talk a lot about: trying to figure out how we can do this across the AI Task Force and ARL, and informing this ROS and GVSC community to standardize on these interfaces, and what the right interfaces are, so that we can get maximum effect. The RCTA is doing a lot to help inform what they should look like, and then we'll push that along to our colleagues in GVSC, working with the AI Task Force to solidify what those standards need to be, so that we can get maximum industry involvement. Yeah, I guess one additional thing: there are a lot of conceptual and architectural similarities between systems, but there's not a formalized standard in the sense I think you mean. So the industry has standards, but they're not standard. Exactly.
I think one of the problems is that many of the algorithms we're talking about here, like the perceptual algorithms, are really still at the research stage, and, as we know, research groups use whichever tool is out there that they can most easily build on top of as a first step. ROS is one of the ones being used a lot, sometimes just because there's no great alternative out there. NVIDIA, for example, is developing an SDK that's more geared toward high-speed GPU computation, but again, I think at this point there's nothing that the whole community fully agrees upon. In an attempt to make this discussion even more depressing, I want to point out that standards are not only about software and software architectures; they're also about data: how to represent data, how to share data, how to transfer data, how to reuse it, and issues of privacy, in particular in the self-driving area. The government, I believe NIST, issued a request for information a couple of months ago exactly on that topic, and the fact that there is an RFI from NIST is evidence that this is still very much a work in progress. Yeah, you don't request information on something that already exists. So, other questions? There must be one or two. I think I saw a hand. Okay, well, I guess, did you have a follow-up? Yeah, okay. So this question is on, I guess, the interactive piece. It's not clear to me where we are with regard to the algorithms generating a model, and then the machine learning system knowing what's missing in that model and telling the human what's missing, so that these systems can figure out what they're missing and ask the human for more information. I don't know if there's research being done in that area, or if someone can elaborate on that.
Yeah, that is a huge problem. In fact, this is related to a deeper notion of introspection, that is, the system being able to understand its own performance and what is missing from it. There is research in this area; a program was launched on competency-aware systems, part of which is the system being self-aware of its own performance and, if you will, its own weaknesses. So that's a step in that direction. Yeah, it's a difficult one, because humans are terribly bad at understanding the information that they're missing. So, right back there, yes. Larry Matthies, JPL. I've been a participant in the program, and it's an opportunity to ask Tony: since robots for the military ultimately need to work day and night, all weather, all seasons, can you give us any insight into where industry is at with that range of environmental conditions? Yeah, it's certainly very difficult to field a self-driving car that can operate in all weather and all conditions, and so really the approach we're taking is, since we control the vehicles, in other words, they would be part of our network, we can selectively deploy them. We can start by deploying them only in good weather conditions, only in easy environments, and work our way up from there. That allows us to realize the potential of self-driving vehicles long before we solve the entire problem. But the entire problem itself, meaning a self-driving car that can operate anywhere, anytime, in all conditions, is a ways off. Can you specify or elaborate a little bit on "a ways off"? Like 10, 20 years? I know people hate to do that, and yet it is useful. Yeah, it's very difficult to estimate. To solve the entire problem, I would say, is years off, and I've made estimates in the past and been wrong, so perhaps I shouldn't make another guess. What was the wrong one? Which one? Other questions?
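One very simple form of the introspection idea raised above, a system that knows when it doesn't know and asks the human for help, is confidence-based abstention: the model defers to an operator whenever its own predicted confidence falls below a threshold. A minimal sketch, where the logits and the 0.8 threshold are invented for illustration and not from any system discussed on the panel:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

def classify_or_defer(logits, threshold=0.8):
    """Return (label, confidence), or defer to a human when the model's
    own top-class probability is below the threshold."""
    probs = softmax(np.asarray(logits, dtype=float))
    conf = float(probs.max())
    if conf < threshold:
        return ("DEFER_TO_HUMAN", conf)
    return (int(probs.argmax()), conf)

# Confident case: one logit clearly dominates, so the model answers itself.
print(classify_or_defer([8.0, 0.5, 0.1]))
# Ambiguous case: two classes nearly tied, so the model asks the operator.
print(classify_or_defer([2.0, 1.9, 0.1]))
```

This is only the crudest proxy for competency awareness, since softmax confidence is known to be poorly calibrated, but it illustrates the interface: the system reports not just an answer but whether it trusts that answer.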
I think we've got time for maybe one more, if you were holding back. Oh, we do, we have one here in front. Thank you. Wait for the mic, please, and an affiliation is helpful. Thank you. Don Rego from the Army. I have a comment that hopefully leads to a question about the issue of trust and the black-box mentality. I've looked at the outputs of these things for as long as anybody, and the thing that's most troubling about them is that they change their mind from frame to frame. Ostensibly you've got the same image, and from frame to frame the output will differ. I think the thing that's going to erode confidence the most is something that appears to change its mind in an indiscriminate way. So is there something we can do in algorithm development to make those outputs more consistent? Yeah, I fully agree that that's of course a big problem, right? If you see a video, for example, it seems as if the system just doesn't know anymore how it analyzed the previous frame, or at least seemingly. I think, on the one hand, we can do a lot in how we train these systems. For example, one common thread is that we see a lot of what's called domain randomization, which means we train these systems so that they are robust to changes in lighting conditions and to other kinds of changes in the coloring and in the scene. As we train them on a higher variety of scenes, they will become more robust. And also incorporating techniques that we've known from more traditional settings, kinds of temporal reasoning, so that they learn to be more consistent over time. I think those kinds of things will increase the quality of these systems. But in the end, yeah, we might always face this problem that a human looks at the output and says, that's not intuitive, that doesn't make sense to me, and that's a general problem we'll have to face, yeah.
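The domain randomization idea described in that answer, training on many randomized variants of each scene so a model stops keying on incidental lighting and color, reduces in practice to an augmentation step. Here is a toy sketch; the brightness, color-cast, and noise ranges are invented for illustration, not taken from any pipeline the panel mentioned:

```python
import numpy as np

def randomize_domain(image, rng):
    """Return a copy of `image` (H x W x 3 floats in [0, 1]) with randomized
    brightness, per-channel color balance, and sensor-like noise. Training on
    many such variants makes a model less sensitive to lighting and color."""
    img = image.astype(float)
    img = img * rng.uniform(0.5, 1.5)                   # global brightness
    img = img * rng.uniform(0.8, 1.2, size=(1, 1, 3))   # per-channel color cast
    img = img + rng.normal(0.0, 0.02, size=img.shape)   # additive noise
    return np.clip(img, 0.0, 1.0)

rng = np.random.default_rng(42)
image = np.full((4, 4, 3), 0.5)   # dummy gray frame standing in for camera data
variants = [randomize_domain(image, rng) for _ in range(8)]
# Each variant keeps the scene content but differs in lighting and color,
# which is exactly the nuisance variation we want the detector to ignore.
print(variants[0].shape)
```

In a real training loop, each batch would draw fresh randomizations, so the network never sees the same lighting twice and is pushed toward features that survive those perturbations.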
Well, folks, by my clock it's been an hour, and I want you to please join me in thanking these panelists for their insight today. Thank you so much.