Okay, so hello everyone. I'm Luca, and as Joe mentioned I work at Open Robotics. I'm going to present the OpenVision Computer 3 (OVC3), an open source, ROS-based smart camera that we've been developing for the past two years. First, the usual introductions. That's me: I do mostly embedded systems development at Open Robotics in Singapore, I have a number of years of experience in embedded systems design, and for the last three years I worked on drones at the T-LAB laboratory at NUS. A bit of an intro on the company as well: most of you, if you're here, know Open Robotics; we are the company behind the development of ROS. What you may be less familiar with is our motto: we create open software and hardware platforms for robotics, and openness is the founding value behind most of our work. And a little publicity: our co-founders are originally from California, but we have a big office in Singapore, and we're spread all over the world. Anyway, back to the presentation. This is the outline: I'll start with some background on the motivation for developing the OVC, then a tiny bit of history, because the project has been going on for over two years, to see what brought us to where we are today, then an intro to the architecture and the capabilities and how you can build and customize your own for your own application, and then I'll pass it to Brandon to show some cool demos he has been working on with the OVC.
So first of all, why OVC? Most of you work in robotics, and there is usually a need for smart cameras, like the ZED or the RealSense. The basic feature most of these cameras offer is stereo: if you want to navigate a complex environment, you need stereo to estimate the depth of objects and to do object segmentation. There are existing smart-camera solutions, but none of them are open source, so they can't really be customized for your own application, and some of them, the ZED for example, require a lot of processing power from the host PC because they can't do any processing themselves. That is not the case for the Intel RealSense, which can do stereo onboard, but the closed-source, non-customizable point still remains for all of these products. So we developed the OVC, which is fully open source: everything from the hardware to the firmware to the software is available free of charge on GitHub. It includes an FPGA, which means you can offload some of the computation to the FPGA so you don't need to do all the processing on your host computer. And even if you don't want to get your hands dirty with FPGA development, which is quite a lot of pain to get working correctly if you haven't done it before, we offer the most commonly needed features out of the box: IMU readings that are synchronized with the images, synchronized stereo images, and simple feature detection, FAST corner detection, running on the FPGA.

Now a brief history. The project started around two years ago, in 2017, with the first OVC, which you can see there; it's also the picture that was used for the meetup. All of the OVCs have been sponsored by the University of Pennsylvania, and that's actually their drone, used in the DARPA FLA program around two years ago. It includes an NVIDIA TX2 and an Altera FPGA. Both the OVC1 and the OVC2 were tailored for the NVIDIA TX2. The OVC2, which I'll show now, is a similar design, but it was made modular so it can be customized more easily; it's now a two-board setup. Again, though, both of those were tailored for the NVIDIA TX2, and as many of you probably know, NVIDIA recently released a new computing platform, the Xavier, which the moment it was released basically made all our work instantly obsolete. We had put so much effort into the board connectors, which are tailored for the TX2, and then NVIDIA released a new module where the connector is completely different, everything is completely different, so it was all useless. That was the main issue we had before: we were constrained to one specific compute module. On top of that, the sensors we were using were very expensive, so it was not a very affordable vision system either. That brings us to the OVC3, which at the end of the day is literally just a USB device, similar to what the other cameras are.
Its purpose is not to be a fully self-contained computer-plus-camera; it's just a device that you connect over USB to your favorite computational platform of choice. That could be a NUC, an NVIDIA Xavier, or whatever the latest and most awesome computing platform happens to be. The nice things about it are, first, that we built a simple base version with cheaper sensors, but we included the possibility to expand it, both in hardware, through additional connectors that let you add better-quality sensors or additional cameras, and in software, of course, because all the software is fully open source. We're also using a computational module that has many pin-compatible alternatives, but the computational module is not the core of this platform, because we expect people to use the OVC with their own compute module.

Here you have a picture of the OVC, quickly highlighting some of the features. All the image sensors are global shutter: there are two monochrome sensors, which can be used for stereo, and one RGB sensor, which is usually used for object recognition if you want to do some deep learning on your system. Then we have this four-by-five-centimeter computing module with both an ARM processor and an FPGA, plus some DDR and some storage. It runs a full Ubuntu distribution, so it's literally a small Ubuntu computer which, in terms of computational power, sits somewhere in between, say, better than a Raspberry Pi but worse than an ODROID, something along those lines; it's still a quad-core processor. Then you can add additional features. We included a cheaper IMU, because as you probably saw in the previous slides we used to have this VectorNav IMU that was on the order of a thousand dollars; now we said, okay, we're going to have a basic version with cheaper sensors and allow everyone to expand it with additional sensors if they want, without forcing them to. Then there's the USB Type-C connector, which carries both power and data and connects to your machine, and if you want additional connectivity or storage there's Ethernet for networking, or an SD card slot if you ever need more than eight gigabytes of storage, which never happened in our case, but you never know.

As I was mentioning, one of the key points is that it can be expanded out of the box. We added four additional connectors, and each connector can take an additional stereo camera pair, so if you wanted to you could have up to eleven cameras running in parallel, which is a bit overkill, but we have enough bandwidth and enough processing power to handle all that data, so it could be interesting for future applications. We also added some normal GPIO, which we use today, for example, on expansion boards.
This is an example of an expansion board you can design around the GPIO, for fancier sensors like the VectorNav, or a serial console, basically every kind of peripheral you would need for your specific application. This one was for a drone application: they wanted a better-quality IMU for navigation, RS-232 to connect to the GPS, and connectivity for a LiDAR. It was a very simple design and very easy to get working compared to the main board. As I mentioned, all the designs, including these last two, are open source. This is what the camera expansion board looks like, literally just two imagers connected with a cable to the main board; this one is still under development, we haven't built it yet, and this is it from the other point of view.

If we look at the software, the nice thing is that, because at the end of the day the OVC is just a computer running Ubuntu, you can use the whole Ubuntu ecosystem and build your own custom applications. You don't need to write any low-level software, because you can use the existing libraries, and additionally, because we have an FPGA, you can port your algorithms to the FPGA to reduce the computational load on your machine. Currently we do FAST corner detection on the FPGA.

Now a bit of an overview of the architecture, in case anyone is interested in building their own and using it for their own application. The way it works is that the OVC is a USB Ethernet gadget, similar to when you connect your phone to your computer over USB and use it as a network connection. The nice thing about this is that it doesn't require any driver or any other software on your machine: you literally plug it into your computer over USB, run a ROS master on your machine, and the OVC will automatically detect the ROS master and start publishing all the data, which in this case means images, IMU, and features. It's basically fully plug and play: you just need Ubuntu and ROS running, and you don't need to install anything else (there's a minimal host-side subscriber sketch at the end of this section).

Some details on the hardware: it's very important for visual-inertial odometry applications to have synchronized IMU readings and images, so we synchronize them in hardware and stamp them with ROS time. Because we stamp them with ROS time, the timestamps are already in a format that is easy to integrate with any ROS package you might be running. As I mentioned, we also do corner detection on the FPGA; this was developed with UPenn previously, because they told us: we have a whole pipeline that we run on the drones, and the most expensive task is the corner detection, so can you port it to the FPGA? We said okay, we did it, and now we make it open source like the rest of the stack. So we also output corner features, which can be used in your visual odometry pipelines.
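Since the OVC just shows up as another ROS publisher on the network, consuming its data is ordinary host-side ROS code. Here is a minimal sketch of the idea, assuming hypothetical topic names (they are placeholders, not necessarily what the OVC firmware publishes; check rostopic list on a real setup): a Python node that pairs the hardware-synchronized stereo images with the IMU stream.

#!/usr/bin/env python
# Hypothetical host-side subscriber for an OVC-like camera.
# The /ovc/... topic names below are assumptions for illustration only.
import rospy
import message_filters
from sensor_msgs.msg import Image, Imu

def callback(left_img, right_img, imu):
    # All three messages are stamped on the OVC with ROS time,
    # so their headers can be compared directly.
    rospy.loginfo("stereo pair at %.6f, IMU at %.6f",
                  left_img.header.stamp.to_sec(),
                  imu.header.stamp.to_sec())

if __name__ == "__main__":
    rospy.init_node("ovc_listener")
    left = message_filters.Subscriber("/ovc/left/image_raw", Image)
    right = message_filters.Subscriber("/ovc/right/image_raw", Image)
    imu = message_filters.Subscriber("/ovc/imu", Imu)
    # Approximate-time sync with a small slop; with hardware-synchronized
    # stamps an exact-time policy would work just as well for the images.
    sync = message_filters.ApproximateTimeSynchronizer(
        [left, right, imu], queue_size=10, slop=0.005)
    sync.registerCallback(callback)
    rospy.spin()

Because the stamps already come from ROS time, exactly the same pattern works for feeding any off-the-shelf visual-inertial odometry package.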
The last part is about how you can customize and build your own for your own application. Again, all the design files are available and open source, and I believe the license is a very permissive one, not GPL, so it can be used for commercial applications, although I'm only 80 percent sure about that. As a nice-to-have, we also provide binary images, because as an open source user myself I know it's always cool to have all the source, "oh, I have all the source, I can change it, I can modify it," but 99 percent of the time I just want to take the thing and run it without bothering to rebuild everything. So we provide a binary image that you can just load onto your OVC, plus a feature to automatically update it over the internet, so from a user's point of view you don't need to worry about development at all if you just want something that works with the latest features.

Now let's say you wanted to customize it for your own application. As I mentioned, the hardware is open and has been designed with the EDA software KiCad, which is also open source, and all the design files are available in our repository for any sort of customization you might want to do. If you want to customize the FPGA part or the embedded Linux, again, all the design files are available free of charge. The minor issue here is that the tools themselves are provided by Xilinx, the company that makes the FPGA, and they are not open source, but at least the version we are using is available free of charge, because we only use simple features of a simple FPGA, so nobody has to pay a license fee to customize their design. Finally, the software: because we are just running an embedded Ubuntu that already runs ROS, you can connect to it and run pre-existing ROS packages, OpenCV, or really anything in the Ubuntu ecosystem directly on the onboard computer. Just as a small note, what we used to do in my previous lab was to push all the processing to the FPGA and leave only the higher-level tasks, say decision making or navigation, to the ARM processor, the onboard Ubuntu. So it was actually quite simple to turn a device like this, not exactly this one but a similar one, into a fully self-contained computational module for robotics applications, which was also cool because it's very low power, very light, and very small.

But now let's go to the interesting part, which is a few simple demos, and for that I will introduce Brandon.

Thanks. So currently I'm running a roscore on my computer, and you can see it running here. You can tell that the OVC is connected because there's a network interface over here, so if I wanted to I could just SSH into the OVC, but we don't need to mess around inside the OVC for this, so that's fine.
First things first, I just want to show that we have all the nice images. The left and right cameras are mono cameras, so this is the left image and this is the right image, and then we have the RGB feed from the middle camera as well; of course you can adjust the focus if you want to. So that's the image output demo.

For the next one we're going to look at some corners, and you can see that we're finding a lot of corners in this scene. Let me run the corner detector. What you're seeing are the corners detected by the FPGA, with OpenCV alongside for comparison. The reason there are a lot more corners from OpenCV is that its threshold is set lower, while the FPGA algorithm has a separate threshold that we set ourselves, and you can see that the corners it detects are quite stable, which is nice (a rough sketch of the OpenCV side of this comparison is included a bit further down). Okay, let me kill that first so I can run my other demos.

For the next one, I use the two cameras to fuse the images into a disparity point cloud. I could just point you at the YouTube video to prove that it works, but let's do it live. You can see that the disparity works quite well; the screen is getting captured, and having the lights on would be nicer, but this is fine. Let me make it look 3D for you. If you have 3D point clouds you can do a lot of cool stuff with them, so let me kill that one first. With 3D point clouds you can obviously do 3D mapping; this is a map of the office, if it wants to load, but why not, let's just make a map of this area, hopefully with the lights on. Give me a second to set some things up. Now, if I just selfie-stick myself around the room, we can slowly construct a map of the entire area. You can also tell that, because we already have a point cloud, we can do a bit of visual odometry to tell where the camera is pointing and where its position is. And then, because this is ROS, you can turn the 3D point cloud you generate into a navigable 2D map, throw robots at it, and have fun. Of course you can also save this map and, I don't know, maybe forever immortalize this meetup in 3D point cloud form. If you look at the map itself, the further your points are from the camera, the slightly worse the disparity matching performs, because these are just two ten-dollar cameras on the OVC, but it's quite nice to see it working fairly well from a single vantage point.
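Coming back to the corner demo for a second: the OpenCV half of that comparison is essentially just a FAST detector with a tunable threshold, which is why it reports many more corners when its threshold is set low. Here is a rough sketch of that side, assuming a placeholder image topic (this is not the exact demo code):

#!/usr/bin/env python
# Sketch of the OpenCV side of the corner comparison: FAST corners with
# an adjustable threshold. The topic name is a placeholder assumption.
import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()
# A higher threshold keeps only stronger corners, similar in spirit to the
# stricter threshold used on the FPGA side of the demo.
fast = cv2.FastFeatureDetector_create(threshold=40, nonmaxSuppression=True)

def on_image(msg):
    gray = bridge.imgmsg_to_cv2(msg, desired_encoding="mono8")
    keypoints = fast.detect(gray, None)
    vis = cv2.drawKeypoints(gray, keypoints, None, color=(0, 255, 0))
    cv2.imshow("fast_corners", vis)
    cv2.waitKey(1)

if __name__ == "__main__":
    rospy.init_node("fast_corner_demo")
    rospy.Subscriber("/ovc/left/image_raw", Image, on_image, queue_size=1)
    rospy.spin()

Lowering the threshold makes OpenCV report many more, noisier corners, which is the difference you see between the two halves of the demo.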
Another application we've used it for: I did cylinder segmentation on the point cloud to test out the object recognition capabilities of the OVC. This is me with a random bottle, and I'm leveraging PCL, the Point Cloud Library, together with ROS to detect a cylinder. You can see that it detects the cylinder, and then we use that cylinder detection to get a robotic arm to automatically pick up a water bottle. The arm you see there is the Berkeley Blue arm, which is meant to be a low-cost, compliant research arm; you can ask me about that later. You can see I have a MoveIt interface interacting with the arm, and I'm feeding the cylinder detection into the arm control pipeline. Maybe I should speed the video up, but of course it grabs the bottle, and that's nice. To make sure I'm not cheating, the next step has me tilting the water bottle, and you can see that the cylinder is detected as a tilted cylinder and the arm automatically moves into position appropriately. The reason the arm approaches the way it does is that I programmed it to assume a pose five or ten centimeters above the bottle's center of mass before going in for the approach, because I was a bit lazy, but it works great.

We also played around with two different stereo algorithms. The stereo disparity algorithms are available in the stereo_image_proc node in the ROS ecosystem, which offers two algorithms, mainly because it's built on OpenCV, and this is a comparison of the better algorithm and the worse one. There are two algorithms here: the first one is StereoBM and the second one is StereoSGBM, which stands for semi-global block matching. Pay attention to the RViz window in the top right-hand corner: with the first algorithm the point clouds are not that well formed, but the moment I switch to the second algorithm you can tell the point clouds are a lot more fully formed. I'll keep swapping back and forth between them; the fidelity of the point cloud with StereoSGBM is quite nice, then we go back to StereoBM and it's not so good, and StereoSGBM is good again.
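Under the hood those two modes correspond to OpenCV's two stereo matchers, which stereo_image_proc wraps. As a standalone sketch of the difference, using plain OpenCV on a pair of already-rectified images with placeholder filenames (not the actual demo configuration):

#!/usr/bin/env python
# Minimal comparison of OpenCV's block matcher and semi-global block matcher.
# Assumes rectified left/right images on disk; the filenames are placeholders.
import cv2

left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

# StereoBM: fast, but the disparity map tends to be patchy.
bm = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disp_bm = bm.compute(left, right).astype(float) / 16.0  # fixed-point output

# StereoSGBM: slower, but produces a much denser, smoother disparity map.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5,
                             P1=8 * 5 * 5, P2=32 * 5 * 5)
disp_sgbm = sgbm.compute(left, right).astype(float) / 16.0

cv2.imshow("StereoBM", disp_bm / 64.0)
cv2.imshow("StereoSGBM", disp_sgbm / 64.0)
cv2.waitKey(0)

In the demo the switch between the two is just a parameter on the running node rather than hand-written code; the quality difference you see in RViz comes from SGBM producing denser, smoother disparity at a higher compute cost.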
Of course, if you have a point cloud you can also do visual-inertial odometry, so I'm going to run a pipeline now that uses the very cheap IMU and the stereo cameras to get some fairly okay odometry, for what it's worth. Let me run it first. This visual odometry pipeline is leveraging the same RTAB-Map ROS node that I was using to do the mapping just now. If you look at the RViz window you'll see a red arrow and a yellow arrow: the red arrow is the pose estimate generated by the visual odometry node, and the yellow arrow is the sensor fusion of the visual odometry pose and the IMU. I haven't really tuned the Kalman filter that well yet, so you might see some drifting, but hopefully it works. If I turn the camera around you can tell it tracks somewhat well, as long as the visual odometry node has a map to compare against; because I only just started it, it doesn't have much to compare against yet. If I move it up, it goes up; if I move it down, it goes down. So, pretty good for a fairly cheap sensor suite; you can imagine that if you fused this data with encoders, better IMUs, or maybe some GPS, you could get a fairly good pose estimate. If I drop the camera, because there's no map, it's going to go crazy.

I only just got this working today, but we also have a slightly better visual odometry pipeline called SVO, done by ETH Zurich, and I'm going to demo it for you. If you want to know more about this particular pipeline you can check out their website; the bad thing is that they only provide the binary, not the source code, which is a bit sad. I'm also going to show you how I start it, because it's quite annoying to start. I hope I didn't just kill something; okay, I'll just rerun it. The problem is that my computer is running Ubuntu 18.04 with ROS Melodic, but SVO is only supported on Kinetic and Indigo, so the solution is to use Docker. I'm going to start some Docker containers, and I'm also going to go into my other workspace on this computer to do the visualization, so give me a second. Let's start the visualization first: this is the SVO visualization, and hopefully this works. Now it's trying to find features to detect, and it has found some already, so that's nice. If I move it around, you can tell that it's using both the IMU and the visual odometry to generate the final pose; it's just collecting a bunch of features first. If you want to see the image, let me blow it up first; oh, it crashed, let me restart it so you can at least see the feature detection. The way this works is that it tries to find features in an image and then, from successive images, it tries to predict the depth to build a point-cloud map, and from that you get a form of visual odometry. If the map has been built well, you can have a very dynamic environment and it won't break, so I'm going to start waving my hands in front of the camera after I minimize this image; give me a second. This is the point cloud from the features being generated; there's a bit of drift, but it's fine. If I wave my hand in front of the camera, the odometry stays quite stable, which is interesting, because normally with a stereo camera this would make it die completely. And if I move around you can tell that it's tracking fairly well. The wire is a bit short, so I can't really move much, but that's nice. The pose inference time for each frame is about seven milliseconds, so this is something you could actually use in practice. If you swing it around too much, then because of motion blur it loses the pose. And that's it, so I'll pass the mic back to Luca.

Okay, well, the surprise is gone, but we are hiring. You can see our wonderful team at our latest team event, where we went to play mini golf a few weeks ago. We have quite a few very cool projects happening in Singapore, like working with the government on automating the whole healthcare environment.
So if you are into robotics and you want to help change the healthcare environment in Singapore, go check the website, send us a message, and let us know. And I think that is actually all. Just to really wrap up the presentation, I would say there were two sides to it. The first side was the hardware: there is this camera, and you can use it for your applications, and so on. The second side, maybe even more important, which Brandon showed, is how mature the ROS ecosystem is: once you get this data into the ROS ecosystem, there is so much open source software out there that you can just use to solve common problems, such as object segmentation, stereo matching and depth estimation, mapping, visual odometry, and all that sort of thing. What I personally believe is that by using sensors that are compatible with the ROS ecosystem, you can really speed up the development of your application by leveraging all these open source solutions that someone else already developed for you, instead of solving the same problems yourself. Anyway, that's all; if you have questions, Brandon and I will be happy to take them. Thank you.

[Audience] You mentioned that version two was tied to one chipset and you switched away from it for version three. Now that the Jetson Nano has been released with that chipset as well, do you have any plans to support it, to support the deep learning community and the GPU?

Again, the difference is that this is now only a device, so as long as, say, the Jetson Nano has a USB 3 port you can use, you can just plug it in and get your data out. It doesn't really make any difference whether it's a Jetson, a full desktop, a laptop, or whatever.

[Audience asked whether a fixed chip could replace the FPGA.] I wouldn't say so, because the nice thing about having an FPGA is that we can have a ton of peripherals that you wouldn't be able to have with an off-the-shelf chip, because you cannot reconfigure it as much. For example, right now we have a lot of different I2C buses, SPI, serial links, plus the cameras; we have at least ten or fifteen different communication interfaces running in parallel, and if you went for a chip that cannot be reconfigured, I don't think you would be able to achieve that. All right, I'll pass the mic back.

I actually talked about this service robot before, when my team and I were building a mobile robot, but since it's relevant I'm going to show it again. This is one year later: we built this service robot called MOMObot, based on a ROS robot framework. MOMO stands for modular mobile robot, so it's meant to be modular: you can remove the shell and change out different parts. We were leveraging some interesting human-robot interaction principles, mainly focusing on the eyes and a bit of a voice, and you can see the robot roaming around the exhibition.
You can see the eyes, so it's a cute boy, and I also found that having it look around actually makes people a lot more comfortable, so you can have the robot driving around people and they won't really mind that much; you just get kids hanging around it and poking at the bits at the back, which is a bit sad. Of course it's running ROS, and it's running the ROS navigation stack. I wonder if I have sound... no, okay. So that's the eye portion. I just want to share that the entire stack is actually on GitHub with full documentation, so if anyone wants to build their own MOMObot they can just build it; just Google MOMObot, but please don't Google "Momo," because you might get scared by creepypasta and scary stuff. For the emotions, there's a package that you can just plug in and use for your own robots, so the eyes can be controlled, and they respond to the messages you send to control the robot. Because I can't get the sound working I can't really play the voice lines, but if you want to hear the voices, come find me later. And I also want to share, thanks to everyone whose work went into it, that MOMObot managed to get on the news, so that's nice.