Getting robots to operate as teammates is not an easy task. Researchers from the Army, academia, and industry have been tackling this challenge for a little more than a decade. At a simulated marketplace here at Camp Lejeune, they're trying to get robots to take complex commands like "meet me at this fruit stand."

One of the big breakthroughs over the last ten years of this project is that the robot no longer sees the world as a collection of 3D objects that have no real meaning and are simply obstacles to be avoided. Instead, the robot sees the world with some meaning. It can see things and understand what they mean. It knows what tires are, what a fruit stand is, what a store is, what a doorway is, and it can look for these things. That, of course, makes it much easier to interact with the robot.

The University of Central Florida is one of nearly two dozen university and industry partners that have teamed with the United States Army to push the frontiers of scientific research in four critical areas of ground combat robotics that affect the way U.S. warfighters see, think, move, and team. Getting robots to communicate is a huge factor.

We have a number of interaction modalities, that is, the actual tools and methods a soldier can use to interact with the robot. One is speech, another is gestures and pointing, and the third is a traditional user interface that has a map display and also shows you what the robot actually sees.

One of the big things here is that the robot can be actuated through voice commands. You can tell the robot what to do, using fairly natural language to describe what you want it to do. "Go to the fruit stand." You can say "go behind this building," and if the robot sees two buildings, it is smart enough to ask which one: the left one or the right one, or it can show you a picture. If you say "go toward the gas station" or "the fruit stand," then as long as it can classify what a fruit stand is, it will find it and go there. You can also support this by pointing, for example, in the direction of the fruit stand, so that even if the robot cannot yet see the fruit stand at the beginning of its mission, it knows where to head.

Researchers say advancements in human-robot teaming no longer cast robots as tools for soldiers. Robots are now teammates to soldiers.

Our soldiers do a great job in the field, and that is awesome. One great benefit of having robots in the field with them is that in environments or scenarios where we don't have a lot of intel or a priori knowledge, we can send the robot in first. We can hold back our soldiers and keep them a little bit safer, have the robot collect some data, and give them a little more intel and situational awareness of the situation. The other thing is that robots can do a few things that soldiers may not be capable of doing. We have sensors on the robots, cameras that can zoom far beyond what the human eye can see. This allows us to collect data from farther away, again providing that intel to our soldiers and giving them extra information.
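As a rough illustration of the clarification behavior described above, where a command like "go behind this building" matches more than one thing the robot can see, here is a minimal sketch in Python. It is not the project's actual language-grounding pipeline; every class, function, and data structure here is hypothetical and invented purely for illustration.

```python
# Minimal sketch of the "ask which one?" behavior described above.
# Illustrative only -- the real language-grounding stack is far more
# sophisticated; all names here are hypothetical.

from dataclasses import dataclass


@dataclass
class Detection:
    label: str      # semantic class, e.g. "building", "fruit stand"
    bearing: float  # direction relative to the robot, in degrees (left < 0 < right)


def resolve_referent(command: str, detections: list[Detection]):
    """Match the noun phrase in a command against the robot's semantic world model.

    Returns either a single Detection to navigate toward, or a clarification
    question when the reference is ambiguous.
    """
    # Naive referent extraction: look for a known class name in the command.
    known_classes = {d.label for d in detections}
    target_class = next((c for c in known_classes if c in command.lower()), None)
    if target_class is None:
        return None, "I don't see anything like that yet. Can you point toward it?"

    matches = [d for d in detections if d.label == target_class]
    if len(matches) == 1:
        return matches[0], None
    # More than one candidate: ask the soldier to disambiguate,
    # just as in the "go behind this building" example.
    return None, f"I see {len(matches)} {target_class}s. The left one or the right one?"


if __name__ == "__main__":
    world = [Detection("building", -30.0), Detection("building", 25.0),
             Detection("fruit stand", 10.0)]
    target, question = resolve_referent("go behind the building", world)
    print(question)   # -> "I see 2 buildings. The left one or the right one?"
    target, question = resolve_referent("go to the fruit stand", world)
    print(target)     # -> the single fruit-stand detection
```

In the scenario from the video, pointing toward the fruit stand would simply add another cue for narrowing down the same candidate list before the robot has to ask.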
The challenge for the robot, though, is communicating data to the soldier in locations where there is low bandwidth. When the robot is in a remote location, there are a lot of network bandwidth problems, so a lot of the time we lose the connection and communication is very sparse.

In that case, sending the video of what the robot sees over this low-bandwidth network is not ideal, and even an image may not get through. So text is the best way. We try to have an algorithm that turns the information present in the visual data into a very abstract, succinct textual description and then sends it over to the human commander.

The robot has to use its onboard sensors. Typically those include a camera, primarily for computer vision, and 3D sensors that collect 3D points; combining the two modalities lets it start building the map. Unlike commercial systems, which may be given a map of the environment prior to the task, in our case we don't assume any prior knowledge, so the robot uses its own sensors to build the map on its own and then execute the command.

There have been a lot of advances in visual perception, which really just means advances in how the robot sees the world in a manner similar to a human: being able to identify buildings and to differentiate between grass, gravel, and other types of terrain. There have been a lot of advances in deep learning. Better perception provides better intelligence for the robot, which allows it to make better decisions when it is planning a path through an environment.

The enormity of this complex research problem is explained this way by MIT's Dr. Nicholas Roy.

There are several technologies that we're testing here. One is natural language understanding. A second is mission planning. A third is perception. A fourth is human-robot collaboration. The data that we're gathering here is really an assessment of the performance of all those different systems. We have humans giving instructions, and people use lots of different language, so we're gathering all of that data so that we can train our models to work better. We are gathering data on how language can be used to describe the world, so that we can build better models of how the robot should understand perception and the context of language. We're evaluating how quickly and effectively our planning systems perform. Are they choosing good trajectories? Are they going to the right place given the language? And we're also evaluating how well the human and the robot actually do the task together. Is the robot actually providing assistance to the human, or is the human being slowed down by the robot? We're assessing these things as well.

Researchers are using the commercially available ClearPath Husky as a research platform. They've embedded it with a suite of sensors, including a laser range finder and an RGB-D camera, along with a variety of conventional CPU and GPU computers.

The heart of the system is what is referred to as symbol grounding. Symbol grounding is an algorithmic approach to taking in natural language and associating it with the perception of the robot. For instance, when I tell the robot "go down the street," the robot isn't necessarily going to know what that means, but it has actually learned to interpret the word "street" in the context of its perception of a gray surface in front of it. What does it mean to go down? The robot has learned that when it's told to go down something, it means it has to navigate along the length of that thing. So that's part of the system.
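The symbol-grounding idea can be made concrete with a small, hand-written sketch. In the system described here these associations are learned; below, the mapping from words like "street" and "go down" to perceptual categories and behaviors is hard-coded purely for illustration, and every name is an assumption rather than the project's actual API.

```python
# A toy sketch of symbol grounding as described above: associating words in a
# command with things the robot actually perceives. Illustrative only -- the
# real system learns these associations; everything here is hand-written and
# hypothetical.

from dataclasses import dataclass


@dataclass
class PerceivedRegion:
    label: str        # e.g. "gray drivable surface", "grass"
    length_m: float   # extent of the region along its principal axis


# Hand-coded stand-in for a learned grounding model: which perceptual
# description a word like "street" should bind to.
NOUN_GROUNDINGS = {
    "street": "gray drivable surface",
    "grass": "grass",
}

# Verb groundings: "go down X" means "navigate along the length of X".
VERB_GROUNDINGS = {
    "go down": "follow_along_length",
    "go to": "drive_to_centroid",
}


def ground_command(command: str, regions: list[PerceivedRegion]):
    """Turn a natural-language command into (behavior, grounded region)."""
    behavior = next((b for phrase, b in VERB_GROUNDINGS.items()
                     if command.lower().startswith(phrase)), None)
    noun = next((n for n in NOUN_GROUNDINGS if n in command.lower()), None)
    if behavior is None or noun is None:
        return None
    wanted = NOUN_GROUNDINGS[noun]
    region = next((r for r in regions if r.label == wanted), None)
    return (behavior, region) if region else None


if __name__ == "__main__":
    scene = [PerceivedRegion("gray drivable surface", 42.0),
             PerceivedRegion("grass", 15.0)]
    print(ground_command("go down the street", scene))
    # -> ('follow_along_length', PerceivedRegion(label='gray drivable surface', length_m=42.0))
```

The point of the sketch is only the structure: words bind to perceptual categories, and verbs bind to behaviors over whatever those words ground to.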
What the human partner is going to be carrying is a device called the MMI, the multimedia interface. Essentially, it's a device that makes it easier for the human to understand what the robot is seeing. It's a way for the robot to communicate with the human over distance.

Other technologies we have on board are, of course, basic perception. A lot of work has gone into having the robot actually recognize what's out in the world. Again, it detects obstacles like barrels as barrels, as opposed to just generic obstacles that it should avoid.

We have mission planners on board. Let's say a robot is given a goal: return to waypoint alpha. The robot was told where waypoint alpha is; now it has to make decisions about how to get there. So we have motion planning and task planning systems on board as well, interpreting all of that in terms of the things the robot should do.

And the robot has to do all of this while navigating around pedestrians.

You could imagine that a robot and its human teammate are in different parts of the environment, and the human teammate provides the robot with an instruction: meet me at this place. If it's a crowded marketplace, there are people walking around. In an ideal world, if there's nobody in the environment, the robot just takes a straight path to wherever it needs to meet the soldier. But we want to be socially compliant in this situation. The robot can plan paths that seamlessly maneuver around the pedestrians: it makes a prediction about where the pedestrians might be going and then plans around where it thinks each pedestrian is going, similar to how humans do this when we're walking through crowded areas.

These are the sorts of environments we want to make sure we can operate in, because we will be operating in crowded environments. We don't have a lot of understanding about the exact paths that humans are going to take, and that is why we are choosing these unstructured environments. This is the hardest problem to tackle, unlike some other applications where everything is very structured. If we can address these unstructured environments, and come up with algorithms and use machine learning to develop models that operate in them, then those models can easily be applied to more structured environments.
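A back-of-the-envelope sketch of the socially compliant planning idea described above: predict where each pedestrian is headed and prefer paths that keep clearance from those predictions. The constant-velocity prediction and the candidate-path scoring below are simplifying assumptions for illustration, not the learned models or planners the researchers use.

```python
# Sketch of socially compliant path selection: predict pedestrian motion
# (here with a simple constant-velocity assumption) and score candidate
# paths by the clearance they keep. Purely illustrative.

import math


def predict_pedestrian(pos, vel, t):
    """Constant-velocity prediction of a pedestrian's (x, y) position at time t."""
    return (pos[0] + vel[0] * t, pos[1] + vel[1] * t)


def path_clearance(path, pedestrians, speed=1.0):
    """Minimum distance, checked coarsely at each waypoint, between the robot
    (moving along `path` at `speed`) and any pedestrian's predicted position
    at the same time."""
    clearance = math.inf
    dist_along = 0.0
    for i in range(1, len(path)):
        dist_along += math.dist(path[i - 1], path[i])
        t = dist_along / speed
        for pos, vel in pedestrians:
            ped = predict_pedestrian(pos, vel, t)
            clearance = min(clearance, math.dist(path[i], ped))
    return clearance


def choose_path(candidates, pedestrians, min_clearance=1.0):
    """Prefer the shortest candidate that keeps at least `min_clearance` meters
    from every predicted pedestrian position; otherwise fall back to the
    candidate with the largest clearance."""
    safe = [p for p in candidates if path_clearance(p, pedestrians) >= min_clearance]
    if safe:
        return min(safe, key=lambda p: sum(math.dist(p[i - 1], p[i])
                                           for i in range(1, len(p))))
    return max(candidates, key=lambda p: path_clearance(p, pedestrians))


if __name__ == "__main__":
    # A straight route versus a detour, with one pedestrian crossing the straight route.
    straight = [(0, 0), (2, 0), (4, 0), (6, 0)]
    detour = [(0, 0), (2, 2), (4, 2), (6, 0)]
    pedestrians = [((4.0, -4.0), (0.0, 1.0))]  # walking "up" into the straight route
    print(choose_path([straight, detour], pedestrians))  # -> prints the detour path
```

The design point mirrors the description in the interview: without pedestrians the straight path wins, but once a predicted crossing intrudes on it, the planner trades a little path length for clearance.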
The collaboration with our ARL researchers, scientists, and engineers in Adelphi and Aberdeen has been really great. They have a particular set of skills that we've been able to leverage and a set of domain knowledge that we've been able to leverage. They have also brought us research challenges that have informed what we want to work on, and that has been terrific. I've really enjoyed working with them. They're highly skilled, highly technical, and highly collaborative, and that's perfect.

How soon before we can expect to see this technology fielded? I would say that for certain straightforward tasks and tactics we are very close. For tactical maneuvering, that is, building clearing, coordinating, and search tasks, we are still a little bit away, but we are making great progress toward it.

Work in these areas continues to mature as part of the high-impact foundational research ARL pursues in collaboration with the worldwide academic community to shape concepts and the future operating environment. This is an example of where we put the best brains from all places, across a number of different fields, together to solve a real problem that is important to the nation. So I think this is truly an investment that will pay off not just for the military but for society at large.

For the Combat Capabilities Development Command Army Research Laboratory, I'm TJ Ellis.