My name is Hayk and I am from Yerevan — well, really I am from San Francisco, California. I work at a company that builds camera-equipped drones as robots for capturing data, building 3D models, and understanding the world, and I want to talk about some parts of that. It's mostly videos and pictures, so hopefully it's entertaining. And please ask any questions you want — just raise your hand and interrupt me, no problem. So first, a bit about what the company does. Our goal is to make work more productive, creative, and safe with autonomous flight, and there are a lot of different use cases, which I'll talk about. The company is now pretty big — we're the biggest drone company in America, we're used across a lot of different industries, and we have tens of thousands of drones out there. I've been there for eight years, since we were working out of a little house, so I got to see a big part of the journey of building these robots, from hand-assembled components and prototypes to fairly large-scale manufacturing. The use cases for these kinds of robots and drones span a lot of industries and segments. Think about inspection tasks: commercial buildings, bridges, dams — any infrastructure in any country around the world needs inspection. People need pictures to look for damage, track changes over time, and plan out construction, and we focus on all of that. The company actually started out capturing video for filmmaking and action sports — if you're riding a bicycle, the drone can film you and get cool footage. Now there's a really widespread set of use cases, but the common theme is that the drone has cameras.
The cameras need to see and understand the world — understand how to move, what's around — and the drone has to make decisions on its own and capture the right photos and video. So the main focus of our company is getting the drone to fly itself by using the cameras to build a map of the world. The reason is that flying drones is hard. If you're a trained pilot, an expert, you can fly around and capture the data, but for everybody else it's really hard. Making it easier — having the robot understand what it's doing and using AI and navigation to basically do the job itself — makes it a lot more accessible to companies. Think of the power company that manages electricity in Armenia: it could use these to scan its transmission towers and find damage before something breaks. So a big part of the company's focus is going from one person flying a drone and controlling it the whole time, to the drone doing a whole task by itself for 20 or 30 minutes while the person just watches. Then going beyond that to the point where there's no person there at all: from the moment the drone takes off to the moment it comes back home, nobody is there for the whole mission. It can be at some remote site — you just program it, you get the photos on the website, and you can do something with those photos. And beyond even that, we have a vision of many drones doing tasks together, and of one drone located in one spot being able to do different missions for different use cases, which I'll talk about a bit. So that's the point of the company and the vision, and I'm going to get into some of the technical parts of the robot's navigation. And here is my dad with one of the drones — it recently made it through Armenian customs, so that's great timing. Okay, I'm going to talk about three parts.
The first is the core visual navigation of the drone — how it uses its cameras to navigate the world. Then I'll talk about 3D scanning and building 3D models from the image data, and then I'll talk about the dock and how the drone runs missions from takeoff to landing without ever needing a human to be there. So, the drone itself: this is our second-generation flagship drone, and it has seven cameras. I've drawn arrows here for where the cameras are. There are three on the top that provide navigation capability and three on the bottom. These six cameras together see in every direction at the same time, and each one's field of view is very large — something like 200 degrees — so it can see all the way here and all the way here with all three of them. Put together, it's carefully designed to see in every direction, so it can fly any way, avoid obstacles, and not crash. And then this is the main camera — the high-resolution camera used to capture the images that whoever's flying it actually wants. This camera is on a 3-axis gimbal here, so you can see it stabilize. Maybe I'll just pass this around if anyone wants to take a look. As I mentioned, the cameras are everything, because everything we do is based on the image data coming in from them, and the design is careful to give really wide visibility. A fun thing is that birds work a lot like this too. Different types of birds have very different goals, but this bird, for example, is kind of crazy: it can actually see in front of it and behind it with its two eyes, and over most of the area it sees with only one eye. But for the part directly forward, where it needs really good depth perception, both eyes overlap, so it can compare the signals and do stereo vision.
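That overlapping-eyes trick is the same one a pair of the drone's cameras uses. As a minimal sketch of two-view stereo (the numbers are purely illustrative, not the drone's actual optics), depth for a rectified camera pair follows directly from the pixel disparity between matching features:

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by a rectified stereo pair: depth = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: point is effectively at infinity
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 800 px focal length, 12.5 cm baseline between cameras.
# A feature shifted 20 px between the two views sits 5 m away.
print(stereo_depth(800.0, 0.125, 20.0))  # -> 5.0
```

Note how depth resolution falls off with range: a one-pixel matching error shifts the estimate far more at 20 m than at 2 m, which is one reason small-baseline depth sensors top out at short distances.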
Our drone is designed with these kinds of things in mind: multiple cameras looking in the right directions to give the right signal. The hardest thing about this kind of navigation is the variety of what the cameras can see. The drone could be flying anywhere — in a jungle, in a desert, in this room — and there are a lot of different things that make it hard to figure out the 3D shape of the scene and pull out the data. Stuff like water and moving waves. Glass and reflections, and very thin wires that become hard to see, especially in difficult lighting like the sun behind them. Camera effects like this one, where the shape you see isn't a real object to put in the 3D map — it's an artifact of the camera. White walls, where there's really no texture, but you have to know it's a white wall to not crash into it. You can get water and dirt and dust on the lenses. So there's all kinds of stuff that makes it hard, and that's the core challenge of the visual navigation piece. I won't talk about this too much, but we have a technology architecture that looks a lot like a self-driving car's. The core navigation here at the bottom — the goal is really: don't crash, and be able to move around wherever you are without GPS signal. If you're high up in the air you can use GPS and it works well, but if you're under a bridge or inside a building you can't, so you have to use the cameras to move around. Up here are higher-level applications — the 3D scanning application, which I'll talk about later, is built at this level and uses the core algorithms. And then there's this middle layer, which is really the new stuff: remembering a place the drone has flown between flights.
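That "remember a place between flights" capability rests on matching what the cameras see now against landmarks stored in a saved map. Here is a toy sketch of just the descriptor-matching step (the pose solve that follows is omitted); the binary descriptors, landmark names, and threshold are all hypothetical, not the company's actual pipeline:

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary feature descriptors."""
    return bin(a ^ b).count("1")

def match_to_map(query_desc: int, map_landmarks: dict, max_dist: int = 10):
    """Return (landmark_id, distance) of the closest stored landmark, or
    None if nothing in the map is similar enough to the query feature."""
    lid, desc = min(map_landmarks.items(), key=lambda kv: hamming(query_desc, kv[1]))
    d = hamming(query_desc, desc)
    return (lid, d) if d <= max_dist else None

# Tiny illustrative map: two landmarks with 8-bit descriptors.
landmarks = {"bridge_bolt_3": 0b11001100, "girder_7": 0b00110011}
print(match_to_map(0b11001110, landmarks))  # -> ('bridge_bolt_3', 1)
```

With enough such matches, the drone can solve for its pose relative to the stored map and pick up exactly where the previous flight left off.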
So you have a bridge where the drone is installed: it builds a map of the bridge, and on the next flight it already knows where the bridge is. It can localize itself, save time, and get the exact same imagery — I'll show a bunch of that. And there are multiple ways of controlling the drone: flying it yourself, doing it through a web browser (which I'll demo), or having it run totally by itself. So, the core navigation piece. As I mentioned, the primary goal is basically: navigate and don't crash. These are the types of places where it's really hard — you can't use GPS and you have to be very robust with the cameras — like flying among a bunch of these poles, or up in the metal structure of a bridge, where there's no GPS and shiny metal is very close to you. It's exactly the type of place where the drone is the best option for getting in and getting the right data. The first step in this process is state estimation — visual odometry, if you've heard that term — basically combining the cameras with the inertial measurement unit (the accelerometer and gyroscope, like in a phone) to understand the motion of the drone. The way it works: as the drone moves through this room, the cameras find interesting feature points — maybe this corner up here is something three different cameras see from this position. The drone comes over here and sees the same points again. If a point hasn't moved, and it appears at these pixel coordinates here and different pixel coordinates there, that gives the information to solve for where the point is relative to the drone. Repeat that with a lot of different points around the drone, and you get a trajectory of how the drone moves. That's the core thing.
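The feature-point story above boils down to intersecting the rays along which the same point was observed from two poses. A minimal 2D sketch of that triangulation (illustrative only — the real system solves this jointly for many points, in 3D, fused with the IMU):

```python
import math

def triangulate_2d(p1, theta1, p2, theta2):
    """Intersect two bearing rays (position, heading angle) in the plane.
    Solves p1 + t1*d1 = p2 + t2*d2 for the observed landmark position."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # 2x2 linear system [d1 | -d2] [t1, t2]^T = p2 - p1, via Cramer's rule
    det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
    if abs(det) < 1e-9:
        return None  # parallel rays: landmark depth is unobservable from this motion
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (rx * (-d2[1]) - (-d2[0]) * ry) / det
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])

# The drone sees the same corner from two poses one meter apart.
print(triangulate_2d((0.0, 0.0), math.pi / 4, (1.0, 0.0), 3 * math.pi / 4))
# -> approximately (0.5, 0.5)
```

The degenerate parallel-ray case is not just a numerical nuisance — it is the same observability problem that comes up later in the talk when the drone flies in a straight line.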
In addition to that, a bunch of other things get estimated. For example, the properties of the camera lenses can change with temperature, and they can change when the drone hits things — which, you know, it almost never does. If you take it from cold to hot, the lens will actually shrink and expand, so we estimate those parameters at the same time as everything else. Okay, so not crashing is the thing we put the most focus into and are known for. Obstacle avoidance is the key: while you're flying, you can try to push the drone into a tree, a wall, whatever, and it won't go — it will stop itself. The way we do this is deep learning: comparing the different images between the cameras, finding matches between them for every pixel, and turning that into a 3D map around the drone. This is a 360-degree view of all the drone's cameras combined. Here it's flying down a mountain following a mountain biker, and this yellow box is the main camera — that's what the output video looks like, but it's a very small part of the full 360. Here we're actually flying backwards in front of the mountain biker for a lot of it, because it makes a more exciting video — so imagine the drone flying backwards at high speed, ducking trees that are behind it, while planning out the shot. We build a depth map like this, and these are examples of the input images — these are looking straight up, because the cameras have a very wide field of view, so you can see in all directions. And these are the depth maps computed by our deep learning networks, which are trained on a lot of simulation data and also a lot of real data. The problem with real data is that it's very hard to get labels: we have billions of images from just flying around, but we don't know the true depths of things, and people can only go and hand-label a few of them. So it's very hard to get real data for this, and we've put years and years of work into making it work well.

Can I ask a question? Yeah, please. Don't you use LiDAR? We don't, and it's a good question. The reason is that on a drone this small, LiDAR doesn't yet offer a good balance of weight, size, and cost. Some LiDARs are getting really small — the new iPhones have a LiDAR that works very well, for example — but it works well only out to a few meters. If you're flying in bright sunlight and you want to see things 20 meters away, you need a much bigger LiDAR sensor that uses a lot of power and is heavy. If it's heavy, the motors need much more power. So at this size of drone — on a 20-kilogram drone it would be a different story — the cameras have all the information; it's just harder to extract, for sure.

Okay, so from this map of the environment and our estimate of how we're moving, we have to plan where to fly. We have a big, complex optimization problem with a lot of different objectives to balance. There's the high-level goal of what the person is trying to do — for example, if we're filming somebody, we might want to be in front of them, moving opposite to their direction. We want smooth video, so don't shake around a lot. We definitely want to not crash, so we have to avoid obstacles. And we have to plan what's aerodynamically feasible — maybe there's high wind in a certain direction, and we have to consider that. All these things get balanced, and we run a very fast optimization on the drone that is constantly deciding how to fly and navigate. That's the high-level planning, and it goes all the way down to controlling the motors themselves. The high-level planning happens at maybe 500 optimization steps per second, but at the motors it's more like 30,000 times a second — there you're controlling the voltage and the current, and at that low level you're mostly dealing with the aerodynamics of the propeller itself and its control.

Hmm, it's going to be a lot better if we can watch videos — oh, you know, I'm just not connected to the internet. Is there a password for the guest network? 357? Okay, let's see if that works. Perfect. So, put all together, here are some fun video clips — footage our customers have flown and put together with the drone, with all this navigation working in concert. In all these cases the person has set some parameters: how far away to be, what angle, how smooth the video should be, and other high-level things. This was the primary use case for even the first five years of the company.
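Stepping back to the planning problem for a second: the real system is a fast continuous optimizer running hundreds of times per second, but as a cartoon only (the cost terms, weights, and candidate enumeration here are hypothetical), the multi-objective trade-off can be written as a weighted sum scored over candidate trajectories:

```python
def trajectory_cost(traj: dict, w: dict) -> float:
    """Toy weighted-sum version of the planner's competing objectives."""
    return (w["goal"] * traj["goal_dist"]            # get the shot the user asked for
            + w["smooth"] * traj["jerk"]             # keep the video smooth
            + w["obstacle"] * traj["obstacle_risk"]  # don't crash
            + w["dynamics"] * traj["wind_effort"])   # stay aerodynamically feasible

# Two illustrative candidates: a safe detour vs. a direct path grazing a tree.
candidates = [
    {"goal_dist": 1.0, "jerk": 0.2, "obstacle_risk": 0.0, "wind_effort": 0.1},
    {"goal_dist": 0.2, "jerk": 0.1, "obstacle_risk": 5.0, "wind_effort": 0.1},
]
w = {"goal": 1.0, "smooth": 2.0, "obstacle": 100.0, "dynamics": 1.0}
best = min(candidates, key=lambda t: trajectory_cost(t, w))
print(best["goal_dist"])  # the safe detour wins despite being further from the goal
```

The heavy obstacle weight reflects the priority ordering in the talk: being in the right spot for the shot always loses to not hitting the tree.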
A question: what kind of computer or controller do you use to run this complex machine learning? So, this drone's main computer for the algorithms is an NVIDIA TX2. NVIDIA is an investor in the company, and we worked with them to design our own custom board with their module integrated, to save weight and cost. That's the main processing — it has a decently beefy GPU — and there's also a Qualcomm chip, primarily for processing the high-resolution video from the main camera, because the Qualcomm is much better at that piece of it. Long term, you can think about it like cell phones: we want the latest and greatest of what's in phones as they get smaller, lighter, and faster, and as everyone works on accelerating AI, the internals of the drone get increasingly smaller and more capable. At the same time, you really want very large camera sensors, and that pushes the weight and size of the drone up, because people want much bigger cameras — professional cameras, thermal cameras, different industrial sensors. So those are the main tradeoffs: the heaviest thing you need to carry is the camera, and the drone is built around that. The bigger you make the propellers, the quieter it is and the longer it can fly, but bigger is also less accessible to use, and a lot of things get more expensive. So there are tradeoffs.

All right, I'm going to talk about the 3D reconstruction piece a little. This goes beyond local obstacle avoidance: let's capture the data and build a 3D model of what the drone sees. This is a really big use case, used everywhere around the world for all kinds of infrastructure. Here's an example: a piece of a bridge in Germany. The bridge is fairly large — this is one of the main pillars — and the whole bridge takes something like tens of thousands of photos to inspect at the required distance. Say you want to look for cracks, rust, and maybe breaking bolts. You need a certain resolution of imagery — say 1 pixel per millimeter — so with this drone you need to take images from maybe 2 or 3 meters away. For a big bridge, that's a lot of photos. Imagine you're a trained pilot out there, maybe on a hot day, and you have to take tens of thousands of photos, keep track of where you've imaged, not go back to the same place, and not miss any spots — because if you miss spots, you go home, you try to build a 3D model, it won't work, and you have to go back, which is really expensive and difficult. So we focused on automating that process: the drone builds its own 3D model, tracks where it has imagery, and maps everything out to get the perfect dataset for building a 3D model or doing an inspection.

This video shows some pieces of that. Traditionally, the alternatives for inspection involve stopping traffic, hanging a truck over the edge, using a helicopter, or having people go over the edges — and you can crash doing that. It's important that inspection gets done everywhere; disasters happen when things aren't caught. The point of 3D scan is this: you choose a 3D volume by setting where its edges are, and the drone flies around and builds a real-time, low-resolution map of it — that's this yellow area. You tell it what resolution you want, it plans out the capture, and as it captures the photos it shows you the model in augmented reality and colors it in: when an area goes from yellow to purple, the right data has been captured, and you can see it happening as it goes. Here, the white dots are the trajectory it has planned. At this level, when you're flying 3D scan, you're not using the sticks — you're just watching, setting parameters, and making sure it doesn't miss anything. This is an example of setting the bounds of the volume; this is building the 3D map; and now you can choose the resolution you want, which changes the flight path, how long it will take, and how many photos. Then it goes and captures — it might say 20 minutes, or it might say two hours, and you'll need to change the battery a few times because it runs out — and you end up with a dataset you can view and build 3D models from.

I'll show an example of that output — something I actually captured in San Francisco. You get a very detailed 3D model that can be used for all kinds of things: looking for damage, planning modifications, games and VR, or just getting it 3D printed for fun. The first step of the algorithm is the real-time map. As the drone flies around, it tries to get within, say, five meters of every part of the structure so it can use its navigation cameras and stereo vision to build the 3D map — you can see it being constructed here as the white line while the drone covers all parts of the volume. This map doesn't have to be super detailed; it just has to be good enough to plan the path for capturing the detailed imagery. For planning that path there are a lot of options, and the most important things are usually speed — how fast can you finish — and predictability. One thing we found when we tried to get really fancy and dynamic with the flight path is that it was really scary to everybody, because nobody knew where the drone would go next. So a nice regular pattern — one that doesn't miss any spots, is very predictable, and lets us say exactly how long it will take — is really important here. Then, for building the final high-resolution model, you run a photogrammetry algorithm. This is essentially a much more expensive, offline version of what the drone does while flying: you process at full resolution and do much more global optimization — a structure-from-motion optimization — to find the poses of all the images very accurately and then build a detailed 3D model. As an example, here's another castle I scanned. On the left are the images the drone captured — you can see it has chosen overlap between the photos: you want multiple photos seeing any given point so you can do 3D triangulation, but not too many, because then you're wasting time and data. On the right is the 3D model built from them. Maybe one more cool one, with this helicopter — this kind of thing doesn't take long anymore: this one was 43 minutes, the last one was 30.

Okay, I'm going to talk about the docking station next, but any questions on the scanning? [Audience question about synchronizing the cameras.] In hardware: we have a microcontroller that triggers all of the cameras to start their exposure, and that's a tricky problem otherwise. That's one reason why — although we started as a software company — we realized that to have a really well-working autonomous robot we had to control everything, so we started making the hardware, the electronics, the drivers, everything. I should also mention the cameras are rolling shutter — they expose row by row — so if the drone is moving, you need to model and correct for that too, and that's something we put a lot of work
into. One of the nastiest problems also tends to be vibrations. The propellers moving through air create vibrations, those vibrations propagate through the body, and if the rolling shutter is exposing while the image is being captured, you get these waves in the image — it can get really nasty, even in the main camera. Imagine the vehicle shaking: then the gimbal, which is supposed to stabilize, starts shaking against it, and it can be very hard to attenuate those signals. So there's a lot of low-level camera work that's very tricky.

Have we tried using fewer cameras? Yeah, totally. Our first drone had 13 cameras, and we went down to 7. Using fewer cameras is obviously compelling for size, cost, and weight — although the cameras are pretty cheap and don't weigh much, which is one reason we haven't put more focus on it. The configuration here lets the drone see in every direction with at least two cameras. If you go down to one camera per direction, you can no longer do instantaneous stereo vision. You can still do monocular depth estimation with one camera — the networks are pretty good now, but they tend to be good at the easy stuff. In driving, it works well because you always have a road and the scene is very regular: you know the road is flat, the stuff around it, like trees, is vertical. For the drone, the scene is totally arbitrary and you can't rely on that structure — imagine you're in a forest with branches in every direction; monocular depth estimation is very hard to use effectively there. The second option is to use images across time: you have one camera, the drone flies, and you match between frames. The problem is that you need very, very accurate estimates of the camera's location to do that. With two cameras fixed to the body, you know their relative geometry very accurately; across time, you have to solve for it, and it needs to be very precise. And the other problem is things that move — if branches are moving in the wind, that suddenly causes a ton of trouble. Those are some of the challenges, but ultimately I think it will move toward using fewer cameras.

A question here: once you have all the images and need to convert them into a 3D model, how much of that processing did we have to develop ourselves, versus using what's already out there? So, there's a ton of photogrammetry software that will build 3D models for different use cases — programs that are really good for construction versus surveying versus accident-scene reconstruction — and they generally work well. But because computer vision is our main focus, and because we work on doing it really fast and on the drone, we also built our own offline version. One reason is full control: in our own cloud system, which I'll show, we also use the 3D models, so we wanted to own that. And we thought we could do it faster, in part because we have certain advantages from the navigation cameras — almost no photogrammetry software will accept those and use them effectively — and we have a lot of prior data to make the process faster and more robust. If we've already solved for the poses of everything with the navigation cameras and then start the optimization of the main camera from there, it can be a lot faster and work better.

All right, the docking station. This is the future for us — the later stages of what I mentioned. We're going from you fly the drone and it doesn't crash, to you monitor the drone as it scans a 3D volume or does some other task, to no person there at all: the drone completely on its own, doing things. We have two versions of this. The basic
requirement for this is a home base where the drone can charge its battery and upload data. This dock is basically a box: the drone lands on it, slides inside, charges its battery, and has an internet connection. It's something you'd install at an industrial facility — a factory, an airport, an ongoing construction site, an electrical facility. And this is a small, cute version that doesn't have the security or the environmental protection — just a very lightweight pedestal the drone can land on. On top of this is a whole bunch of new software that supports the end-to-end use case. One part is the cloud ecosystem: a website where you can see all your drones and where they are, control a drone — I can take off from anywhere in the world and fly it around — and stream live video. Say you have a search and rescue going: one person might be controlling the drone while ten people watch in real time, looking for things. On top of that is mission planning: you create a mission the drone runs on a schedule — say you want imagery of your facility every hour, or every day. You can create the mission a few ways. One is by flying it once, and then the drone will repeat it — you teach it once and it repeats it every time. There are other ways, like generating it from a 3D model or in the cloud, which I'll show. But first I'll try a live demo, which is always fun.

This is San Francisco up here, and over here we have a warehouse that we keep as a test warehouse for drones — I'll show a cool visual of it. Let's say we want to fly this drone: I click on it, and I'll attempt to connect and tele-operate it. This is now live video coming from California, from this test warehouse, where it is 4 a.m. We'll start up some of the algorithms and take off. This is indoors — the test warehouse where, all day and all night, drones fly test missions. So the drone is now flying. There are fun keyboard shortcuts, and you can also connect a video game controller and fly it around. Here I go — I'm flying around the warehouse and can look at things: here are some other drones sitting in their docks. This is the drone's camera — those are big Legos; we're trying to make this a cool warehouse by adding interesting structures to scan. To get through this slot, I can switch to a reduced-obstacle mode, where the drone flies slower but is willing to get closer to things, and I can go around, go into rooms, plan out a mission. Say I wanted a mission that goes and looks at this little thing over here. A cool feature is click-to-fly: instead of the shortcuts, I just click on something and the drone goes toward it, and I can even zoom in and get a nice close-up view of what I'm looking at. If I were making a mission now, I'd go to the mission planner, and it walks me through it: name the mission, set up waypoints — take some photos here, take a panorama here, go through this door — you're basically just recording it. Then you save it, put it on a calendar, and say fly this every day at noon and upload the photos; then you can compare all those photos over time. For this use case, because nobody's there, it basically just has to work. When things go wrong — you run low on battery, the wireless signal cuts out, or for some reason a program crashes on the CPU — it has to be super reliable, because this could be, in theory, anywhere.

One of the cool things here: when I hit return — it has built a map of this warehouse before — oh, interesting, it's backtracking along the path I came, so it knows how to get back. By itself it knows how to go home and land when the mission is done; that's the main thing — it has to get back no matter what. Now, back in normal obstacle mode, it takes some shortcuts back to find one of these little tripods; it looks down, finds the tag, and lands on it. It's a pretty small landing pad — you can see those pins are for charging the battery; we have special batteries for the docks with pins on the bottom. Once that's done — mission succeeded — we can go back to the main page. Here I can look at the various photos and videos from flights that have happened, and here are the missions that have been created: I can browse and edit them, put them on a schedule, and see the results of the same mission flown at different times. And there are all kinds of ways to access this data through APIs — pulling it into some other company's website so they can do their own processing — or we build 3D models from it, or run AI models for detecting damage and other things. The main point of all this is that the people who buy the dock don't want to think about drones. They don't want to think about flying a robot — usually they don't even know anything about that. It's all about the data you get at the end. Usually they don't even want image data; they want something else, like a spreadsheet of results: what does this gauge in my facility read, how many boxes are on this shelf, what barcodes are there — things of that sort. So there are a lot of AI models — foundation models — for doing a lot of
A lot of that is understanding of images — how to find objects by example — and we're building a lot of this into the intelligence of the drone, not just after a flight but also in flight. So you can say: go around this site and find me an object that looks like this, and the drone will look for it, zoom in, take close-up photos, and basically save time for that purpose.

For doing all of these tasks, are you using multi-task models, or are you solving the object detection tasks separately, like classification?

Yeah, good question. There's a bit of a combination. A lot of the navigation stuff is more mature and made to be very fast, and some of those models share important components, but sometimes they also have very different requirements — they might use different cameras, need to work at different frame rates, or need to be zoomed in on some area — and so it's often harder to share because of that. I think with more of these semantic foundation models there's more opportunity to use the same backbone for a bunch of different tasks, primarily on the main camera. There it's the kind of thing where you run it at a regular rate but get a lot of information out of it. It can be tricky to train those, too — balancing the data — so it's a challenge.

Yeah, absolutely. So on the main constraints: there are the physical constraints of the strength of the propellers and motors. This drone goes up to something like 16 meters per second in airspeed, so if you're flying downwind you can fly faster, and we have an enterprise drone that's larger and can fly in higher wind. But at some point you're just fighting the wind — you need to accelerate against it, you get a gust with turbulence, and the motors just don't have enough power anymore. So there's the hardware part, and then there's the algorithm part, which is really more of a
continuous scale of how likely something is to fail. If there are power lines in front of us and we're flying at 16 meters per second versus 8, at 16 you have to see them from further away, which is harder. Most of the time it will work; it's a question of whether it crashes one in a thousand times — and even that's hard to say, because it really depends on the conditions: if the lines are against a nice blue sky, it's easy; if there's a bright sun behind them, it's hard; if it's dark, it's harder. There are so many environmental conditions. So we basically set the limit for this drone at 16 meters per second, and we don't have a different limit that's environmentally dependent, because that would be confusing. For example, if you're flying on GPS instead of vision — let's say you're over water — then we'll use a bigger obstacle bubble, and if you're flying with a smaller obstacle bubble because you want to get through doorways, we limit the speed; if you want to get through narrow gaps, we say you can only go 3 meters per second — things like that.

Yeah, it's difficult. There's a factory calibration where we calibrate all the cameras, but then things change with temperature and movement. So that state estimation system I showed at the start is, at the same time it's estimating the trajectories of the cameras, also estimating the IMU biases — and there are multiple IMUs: one on the gimbal, one at the base of the gimbal, one in the vehicle. It can be hard to observe those under certain motion. If you're flying in a straight line, the gravity vector is basically unobservable against the orientation of the drone, so you have to make sure you manage the uncertainty and say "I don't know" until the drone turns. Otherwise you can fly one kilometer in a straight line, then turn, and the guess was totally wrong and the drone just crashes — we've had that, for sure. So it's a tricky problem.
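The policy described above — one fixed limit per flight mode rather than an environment-dependent limit — is essentially a lookup table from mode to caps. The 16 m/s and 3 m/s numbers come from the talk; the margin values and mode names here are illustrative assumptions, not the product's actual configuration.

```python
# mode -> (max_speed_mps, obstacle_margin_m); margins are made-up values.
FLIGHT_MODES = {
    "normal":         (16.0, 2.0),   # full-speed flight, big obstacle bubble
    "reduced_margin": (8.0, 0.5),    # willing to get closer (doorways), so slower
    "narrow_gap":     (3.0, 0.2),    # very tight gaps, heavily speed-limited
}

def speed_limit(mode: str) -> float:
    """Return the fixed speed cap for a flight mode.

    Deliberately NOT a function of lighting, wind, or scene content:
    a fixed per-mode limit is predictable for the operator, even though
    the true failure probability varies continuously with conditions.
    """
    max_speed, _margin = FLIGHT_MODES[mode]
    return max_speed

print(speed_limit("narrow_gap"))  # 3.0
```

The design trade-off is exactly the one the talk names: a conditions-aware limit would be less conservative on a clear day, but a fixed per-mode cap is something a pilot can reason about.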
Do you have a separate model for estimating wind and airspeed, with cameras only?

Yeah — the wind estimation doesn't happen with the cameras directly. It uses the pose of the drone that's solved from the cameras, plus things from the motors themselves. It's more of a parametric dynamic model — it's not a deep network for us right now — but we estimate the speed of the wind relative to the body of the drone.

Makes sense. What about the airspeed?

Airspeed as well — I mean, we're estimating our ground speed from the cameras, and we're estimating the airspeed, and the wind is the difference between those, so what's really being estimated is the airspeed.

Yeah, so, a couple of interesting things about the dock. The first one is visual localization. That state estimation system I mentioned before is really about local navigation — we don't remember the map when we go around the building and come back — but for the dock we need a much more global version, and that's what we call VPS, the visual positioning system. So here, if we flew around the warehouse, we have saved the map of it in our cloud, and when a drone needs to fly a mission in the warehouse, we download that and the drone uses it to localize against. Here it's showing an example of the drone flying through and building this map — it's similar to the 3D scan map, but we're not doing a 3D scan; a person might have flown this. What that gets us is that when you schedule a mission, you can always go back and get the same imagery — because otherwise, with even 1% drift, it's not going to work: you won't get back through the door, you'll get stuck. This was a major thing we had to build for the dock, and it took multiple years. This is a fun example of a mission that's running every 30 minutes for 24 hours, and it's able to get back to the same point and capture the data very effectively.
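The wind-estimation answer above reduces to a simple identity: wind velocity is the difference between the ground velocity (from the vision-based state estimate) and the air velocity (from the parametric dynamics model). A toy per-axis sketch — the real estimator is a full dynamic model, not this subtraction alone:

```python
def estimate_wind(ground_velocity, air_velocity):
    """Wind velocity = ground velocity (from VIO pose) minus air velocity
    (from the parametric dynamic model), component-wise."""
    return tuple(g - a for g, a in zip(ground_velocity, air_velocity))

# Drone moving 10 m/s along +x over the ground, while the dynamics model
# says it's pushing 14 m/s through the air: a 4 m/s headwind along -x.
wind = estimate_wind((10.0, 0.0, 0.0), (14.0, 0.0, 0.0))
print(wind)  # (-4.0, 0.0, 0.0)
```

This also makes the talk's point concrete: since ground velocity is already solved from the cameras, the quantity that actually has to be estimated is the airspeed.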
And then you can imagine there are boxes here, and you scan the barcodes on them, and you have an idea of your inventory every half an hour. Yeah, definitely — there's tons of stuff like that that we're having to build now to scale this dock usage.

If you're indoors, you're generally okay. If you're outdoors, all the regulation comes in, and it varies a lot by country and use case, and there are different types of waivers you can get. Usually it's easier to fly low, near infrastructure, where other airplanes won't be, and it's easier if you're not near airports or other critical facilities, but depending on the use case there are various laws. And then we have a bunch of features: you can see a map of your site in 3D on top of a satellite map, and you can define a volume saying this is the perimeter of my area, and then we try to enforce — we call it a geofence — that no matter what you're doing, the drone won't exit that area. That's an important part of it, and you might put a ceiling on it as well. Within that area you might also have a no-fly zone: let's say you have a roadway where cars are driving, or some really important equipment, or somewhere a crane operates — you define that volume in the cloud and say you can fly within our site, but not within this area. And that's being integrated into mission planning and other things.

How do you deal with the constraint hierarchy, if the drone senses that in order to avoid an obstacle it needs to violate some constraint?

Yeah, it's a good question. At some level, technically, it comes down to our motion planner and which of these things has a stronger cost. If you'd need to exit the geofence to avoid an obstacle because it's coming at you and you have no other choice, you might want to do that temporarily, but if it's a static obstacle and there's a geofence there, then the
drone will just stop and then decide to go around. I don't know — everyone's got a different thing they want the drone to do, so it's hard to make it configurable but not too confusing, because the more configurable it is, the more mistakes people make: they won't know what's happening, and then they get scared when there's an alert. A good example: if you're doing a bridge inspection, often they have clearance to fly below the road but no permission to fly above the plane of the road, because if any cars come by and look at it, that's considered a distraction — so any time the drone breaks that plane and goes above, it's big trouble. But in many other use cases, when you want to safely get back home, you have the drone fly high up in the air and then over, so it's not hitting any obstacles. Those two things are completely opposed to each other, and it depends on the use case.

Okay, I just want to show this: we built a 3D map of a warehouse, and now you can tweak and define a mission in 3D in our cloud. So instead of having to fly the drone, you can procedurally generate a mission by saying I want to scan aisles 1, 2, and 3, or you can go edit the mission by dragging it around. The global planning piece is also really interesting. The localization is where we build a map and remember it so we know where we are; the global planning is about being able to get from any point to any other point within the site. So let's say you want to go to this room and the drone is outside — of course that assumes the doors are open, but how would the drone know? It's because we built a topological map, and that's a key part of this. And then let's say somebody is blocking your way — the drone has to know the next best way around to get there. This is a fun example of repeatedly flying through a narrow hallway and a doorway, which requires really accurate localization.
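The topological-map idea above — rooms and doorways as a graph, with replanning when a doorway is blocked — can be sketched as graph search over an edge list. The rooms, edges, and function names here are invented for illustration; the actual planner is not described in detail in the talk.

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected edge list; returns the
    node sequence from start to goal, or None if no route exists
    (e.g. all doors closed)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    queue, prev = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical site: two doorways into room_a, one direct and one via room_b.
edges = [("outside", "hall"), ("hall", "room_a"),
         ("hall", "room_b"), ("room_b", "room_a")]

print(shortest_path(edges, "outside", "room_a"))
# Someone blocks the hall->room_a doorway: drop that edge and replan.
open_edges = [e for e in edges if e != ("hall", "room_a")]
print(shortest_path(open_edges, "outside", "room_a"))
```

Dropping the blocked edge and re-searching is the "next best way around" behavior: the planner naturally falls back to the route through room_b.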
It also requires the global planning piece — and then we'll skip that one. So wind is also really interesting for landing on the very small target we have for this dock, so we built this wind wall out of propellers so we can simulate high, turbulent winds, and you can see a lot of the testing we do from different directions and how the drone controls through that — it's a tough control problem to manage that and land. Cool.

And then this warehouse that we flew in has been amazing for us for testing these things. We want to get to the point where a company can have a thousand of these at different locations, and you install them and it's as easy as buying a washing machine — you're not thinking about the robot as much. It's going to take many years to get to that level of reliability, but that's the goal, and to do that we have this warehouse where we try to get a lot of data, and then a lot of data analytics for understanding what's going wrong, what's the biggest blocker, and how we fix it. But beyond this warehouse we also need to test many different environmental cases outdoors: test in a place that's dusty, in a place that's very cold, in a place at high altitude where the propellers have to work harder. That's the process of making these things more and more autonomous — it's just work, all the time — and that's what we're in the middle of at this point in our company.

Yeah, so we've published a bunch of interesting work and have a lot of people doing good research in this area. And that's it — so, any other questions?

Yeah — we don't have that in production; you can't do that today, but we've done a lot of demos of it, and it's an obvious, fun thing. The easy version of that is something like a 2D area scan, where you take a big mission and split it in half and have two drones do it, or four drones each do a quarter.
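Splitting a 2D area scan among several drones, as described above, is just partitioning the scan's lawnmower rows. A toy sketch (the row representation is an assumption for illustration):

```python
import math

def split_scan(rows, num_drones):
    """Give each drone a contiguous block of lawnmower rows, so each
    drone covers one compact region of the site."""
    chunk = math.ceil(len(rows) / num_drones)
    return [rows[i:i + chunk] for i in range(0, len(rows), chunk)]

rows = list(range(8))       # 8 lawnmower passes over the site
print(split_scan(rows, 2))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(split_scan(rows, 4))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

Contiguous blocks (rather than round-robin assignment) keep each drone's flying confined to one patch, which minimizes transit between its rows.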
There's a more advanced version where the drones are actually communicating their 3D locations to each other, avoiding each other, and sharing what they learn — we've done some demos of that. And then there's a more basic use case where you have one controller and four drones, and you say: go to this GPS point from four different angles and give me video from four directions at once. You might want that for, say, a police scenario, and that is something we do have in customers' use.

Do you write your own simulators?

We do have a bunch of different pieces of simulation. For the robot part — the dynamics and the motion — we wrote our own. For data generation for deep learning, we have a bunch of tools built on Unreal Engine, where we get realistic data from realistic scenarios, and then we have a bunch of layers on top to model our cameras. The control part we don't do in Unreal Engine; we just have our own code to simulate those sensors. So our simulation with the photorealistic part is really for generating offline data, and then we have a different, more real-time simulator that generates images, based on a different ray-tracing engine — not Unreal Engine — which is faster but lower quality. There you can fly around in real time, and there we have our own code that manages the sensors.

Yeah, so on the difference between our drone and a DJI drone: there's a lot that's very similar. I think our main focus is building this sort of end-to-end autonomous-missions product and the use cases around that. In the space of consumer filming, at the time we built that, DJI drones were really just flown manually; now they have a lot of those same follow-and-film features. I still think our obstacle avoidance and autonomous navigation at the core is
better, but the thing we're now trying to build — really our main focus — is these end-to-end missions with complex navigation in visually difficult scenarios. So that's where we are, but for sure DJI is the biggest competitor and the biggest company in the space.

Yeah, so this drone has been in production for a couple of years, and we've sold tens of thousands of them. The basic drone is around $1,100, but mostly we sell what we call pro kits, with a bunch of different things included, at something like $2,000 to $2,500. And increasingly the focus is on the commercial use cases, where it's really about the software on top — what software you're using for autonomous bridge inspection, say. With the docks it's a very different model, where the drone itself is very cheap compared to the whole thing, and if the drone crashes you kind of just get a new one — it's a different business model.

With the new algorithms you do semantic segmentation and object detection — but do you also handle the motion of other objects? Say there are other drones coming one way, so your drone understands they'll go this way and it should go that way?

Yeah, we do some of that, but I would say we're not very good at it right now — we haven't focused on it, because it's computationally challenging on this generation of drones. We do track a person or a car: we track their motion and predict their motion, so if the drone is following you and you try to run at it, it will very quickly move out of the way to avoid you. But if the drone is just sitting there and you run up to it and grab it, you can do that — we won't be able to avoid it. That's something we're hoping to change with our next generation of drone, which has much more computing power.
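The tracking-and-prediction behavior described above — predict a tracked person's motion so the drone can move out of the way — can be sketched in its simplest form as constant-velocity extrapolation. This is a toy illustration of the idea, not the actual tracker:

```python
def predict(position, velocity, horizon_s, dt=0.5):
    """Extrapolate a tracked object's future positions assuming constant
    velocity, sampled every dt seconds out to the prediction horizon."""
    steps = int(horizon_s / dt)
    return [
        tuple(p + v * dt * (k + 1) for p, v in zip(position, velocity))
        for k in range(steps)
    ]

# Person at (5, 0) in the drone's frame, running toward it at 3 m/s
# along -x: within the 2-second horizon they pass through the origin,
# so the drone should evade.
future = predict((5.0, 0.0), (-3.0, 0.0), horizon_s=2.0)
print(future[-1])  # (-1.0, 0.0)
```

Real predictors are more sophisticated, but even this simplest model is enough to trigger an evasive move for a person running straight at the drone; what it cannot do is react to someone who walks up slowly and grabs it, which matches the limitation described in the answer.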
And if you fly it at another drone, it's going to avoid it, unless it's very slow.

One question: why don't you use FPGAs on the drones — because of the cost, or something else?

Yeah — why don't we use FPGAs on the drones. There's the cost, and there's a cost in size and weight — FPGAs tend to be bigger. If you go to a built-in ASIC, then at scale you can get a lot out of it, but the reality is that so far our software has been changing so frequently that it's difficult to pin it down and say, we want this important part on an ASIC. At the same time, the NVIDIA and Qualcomm chips and other accelerators are getting much faster, to the point where they have multiple accelerators on board. So it's difficult to say we're going to run a multi-year project and pay millions of dollars to develop a hardened ASIC for something, when the next generation of chip — where Qualcomm has invested hundreds of millions or billions of dollars — can have more capability. I think we're basically not at the scale and maturity yet where that's worth it.

Yeah, so we do use GPS — we basically use GPS when we need it. Usually that's very far away from things, high up in the air, or over water, where there's nothing else to use visually. If you're flying indoors here, there might be an okay GPS signal — it won't be very good — and we'll be tracking it, but we're not using it.

We don't use the DeepStream library. We basically have our own camera-processing driver pipeline, and we do use TensorRT, the deep learning accelerator, but we have our own stack built around it — again, for more control, and for feeding into the geometric processing that might come after the deep network. Basically, more or less any time there's something closed-source, especially from
NVIDIA, we don't want to use it if we can avoid it.

Thank you. How can I find more details about this? Do you have any references for us?

Yeah — to find out more about the product, there's tons of stuff on the internet. To find out more about the algorithms, I can recommend a couple of things. For our core obstacle avoidance, there's one paper we published that describes some of the roots of what became our deep learning obstacle avoidance. And then there's a library we published last year — an open-source library that does symbolic computation and code generation — which we use for both our vision algorithms and our motion-planning optimization. That's the biggest piece that's open source and accessible, and it's pretty fun and easy to use, even in university classes, but then easily transferable to a production robot. Those are probably the best references I can give right now.

Yeah, we're past time — okay, so thanks all, that's it.