What I would like to explain next builds on what I called trivial computer vision: the case of creating your own markers or features in the environment. If we are going to do trivial computer vision anyway, why not bypass the entire imaging process altogether and come up with a direct line-of-sight method in the physical world, where a sensor detects that perfect line of sight without relying on cameras? Cameras are cheap and abundant, so it is very easy to grab a camera, grab your favorite library such as OpenCV, and start doing image processing. But we may end up with something more reliable and better engineered by bypassing the camera altogether, and that is the next method I want to cover, which I think is a great idea. It was implemented recently by Valve, the game company, under the name Lighthouse, and it is based on the same principle of line-of-sight visibility; it just bypasses the camera. One way to interpret it is as a kind of virtual camera.

You can find the Lighthouse approach in the current Valve/HTC headset, which has been in the press lately. The approach was also called the Minnesota scanner; one implementation appeared in the robotics literature in 1989, due to Sorensen et al. Most of the engineered systems coming out of industry have been around in earlier implementations in previous decades; they were simply not feasible as consumer products, because the components were not good enough or the market did not provide enough motivation.

So, here is the idea. Let us think about the mathematical equivalence between two scenarios. In the first, which we have already seen, a camera images some LEDs. In the second, I have what I call a lighthouse, and yes, I mean the same principle as the spinning light used for the navigation of ships along the coast. I have a spinning light, with some beam coming out; perhaps it is coherent light, so it might be a laser. It undergoes a pure yaw rotation, spinning around at some rate. Out in the world I have sensors, which may be as simple as photodiodes. You could even call them one-pixel cameras; in fact, one-pixel, one-bit "cameras": they do not report values from 0 to 255; each pixel is simply on or off. That is what I will call a photodiode. I put "cameras" in quotes because I am being a bit silly, but I am trying to point out the simplicity. Aren't these two systems very similar? If a camera takes an image, I reason about the angle to each one of these features based on the parameters of the camera, correct?
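To make that last point concrete, here is a minimal sketch, in Python, of recovering the bearing angle to a detected blob from its pixel coordinate under the pinhole model; the focal length fx and principal point cx are hypothetical calibration values, not numbers from the lecture.

```python
import math

# Hypothetical camera intrinsics obtained from calibration (pinhole model).
fx = 600.0   # focal length, in pixels
cx = 320.0   # principal point x-coordinate, in pixels

def bearing_from_pixel(u):
    """Horizontal angle (radians) from the optical axis to a feature
    whose blob was detected at pixel column u."""
    return math.atan((u - cx) / fx)

# A blob detected at column 480 lies roughly 14.9 degrees off-axis:
print(math.degrees(bearing_from_pixel(480.0)))
```

The lighthouse will give us exactly this kind of angle, just measured with a clock instead of a pixel grid.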
If a beam of light sweeps along here, each one of these sensors receives a blip: its output goes from 0 to 1 as the beam passes, first at one sensor, then the next, then the next. If I know the rate of rotation, I should be able to recover the angle. I need the rate of rotation, and I need to know the offset: at exactly what time is the beam pointing, say, straight up? If I know that and the rate of rotation, I have the same information as in the camera case, because I will know the angles at which these photodiodes are located. I will not know how far away they are, but did I know how far away they were with the camera? No; it is the same in this model. You could use the size of the blob and other image-processing reasoning, but based on the PnP model that we used, there is really no difference between the two: the only information I really have is angular. Now, I am drawing a two-dimensional picture instead of a three-dimensional one, so we have to be a little careful; I am cheating a bit, but let us suppose the world is two-dimensional for the moment. Do we agree on the mathematical equivalence between the two? I might put them in boxes: exhibit A uses cameras and exhibit B uses a lighthouse, and I argue that the information provided by the two is equivalent, assuming that in A I know the camera parameters, and that in B I know when the laser is vertical, where the laser is located in the world, and its rate of rotation to some reasonable level of accuracy. And if I spin this laser really fast, say at 100 hertz, that should be equivalent to a frame rate of 100 hertz on the camera side. Do you agree?

Notice how the spinning laser avoids the problem of blob detection. With a camera, in every single frame I have to scan an entire image looking for blobs and deal with all of the artifacts of computer vision. Here I only have to ask: can the laser see the photodiode? If yes, it lights up the photodiode. It is a very simple circuit; it is as if the physical world is doing the image processing for you. So why build an entire imaging system when all you are going to do is detect a blob? Why not have a spinning laser and do it this way? There is some disadvantage in that I have to build a mechanical system with a spinning laser: you can put a mirror inside, use lenses to spread the laser into a vertical stripe, and spin it around. So something mechanical is moving here, whereas the camera system seems to have no additional moving parts. That is one trade-off.

If I want to make this work, is it better to do inside-out or outside-in? That is, is it better to attach the lighthouse to the headset or to put it in the world? What do you think? Probably better to put it in the world; I do not want a spinning thing on my head. Valve, at a recent show for example, put the base stations inside a room, in opposing corners up near the ceiling. You need only one of these devices, but they use two to handle occlusion: what if you are blocking one of them?
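To make the timing argument concrete, here is a minimal sketch of recovering a photodiode's angle from the time of its blip, assuming, as stated above, that we know the rotation rate and the reference time at which the beam points straight up; the 100 hertz rate is just the example figure from the lecture.

```python
import math

SWEEP_RATE_HZ = 100.0   # known rotation rate of the laser
T_VERTICAL = 0.0        # known time (s) at which the beam points straight up

def angle_from_hit_time(t_hit):
    """Angle of the beam (radians, measured from vertical) at the moment
    the photodiode's output went from 0 to 1."""
    period = 1.0 / SWEEP_RATE_HZ
    # Fraction of one full revolution elapsed since the beam was vertical.
    phase = ((t_hit - T_VERTICAL) % period) / period
    return 2.0 * math.pi * phase

# A blip 1.25 ms after the vertical reference corresponds to 45 degrees:
print(math.degrees(angle_from_hit_time(0.00125)))
```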
So, this goes spinning along, providing this information, and I made a simplification by keeping it two-dimensional. How do we make the third dimension work? Any ideas? We agree on the equivalence in two dimensions: I can make a virtual camera this way. If the lighthouse is fixed in the world, that means I have to put the photodiodes on my headset, or, if I want to track a controller, on the controller; whatever it is, I place photodiodes around it. I also need some circuitry that accounts for the timings, so that I know exactly how many milliseconds later one photodiode received the light after another; that should not be too hard. What about the identification problem, figuring out which marker is which? Is that solved here? Yes, because I know, from the circuit I have built, which photodiode is reporting that it has been lit up. Beautiful: it solves the identification problem in a very simple way.

So, say I draw the headset with photodiodes placed around it; you can put a bunch of them on there, since photodiodes are cheap. Now I have this beam of light sweeping. Take the two-dimensional picture and extrude it outward, so that the beam becomes a vertical stripe sweeping across: it goes around the room and comes back, over and over. That gives me the horizontal coordinate, the x-coordinate in the world. How do I get the other part? Why not make a sweeper that goes the other way? I can use the exact same idea and put both inside the same base station, spinning 90 degrees offset with respect to each other. So I have another spinning lighthouse signal providing the y-coordinate. Very nice; that is how I get the other coordinate.

Now there is another problem you might expect: drift error. When I turn on the lighthouse, I may see where the headset is and declare that to be the origin of the world, and as time goes on, even though I know the nominal rotation rate of the laser, I should expect some angular drift. To correct for that, they put a bunch of lights all around the base station and flash them all at once; in other words, a flood light. There is a flood-light pulse for synchronization: for example, whenever the beam is vertical, the base station flashes in all directions at that same instant. Light emanating in all directions causes all of the photodiodes to light up, maybe not as brightly as if they were hit directly by the laser, but at least some pulse will be observed. And we can again do this in the IR spectrum, so it does not look like a camera flash going off in your face.
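Putting the two sweeps together, here is a minimal sketch of converting the two hit times for one photodiode into a unit ray out of the base station. The synchronization convention, that the flood pulse fires when each stripe crosses the base station's forward axis, is my assumption for this example, not necessarily what Valve actually does.

```python
import math

SWEEP_RATE_HZ = 100.0   # assumed rotation rate of each rotor

def sweep_angle(t_hit, t_sync):
    """Signed angle (radians) of a stripe relative to the base station's
    forward axis when the photodiode fired. The flood pulse at t_sync
    re-zeros the phase, which is what compensates for slow drift."""
    period = 1.0 / SWEEP_RATE_HZ
    angle = 2.0 * math.pi * (((t_hit - t_sync) % period) / period)
    return angle - 2.0 * math.pi if angle > math.pi else angle

def ray_to_diode(t_h, t_v, t_sync):
    """Unit ray from the base station toward one photodiode, built from
    the horizontal-sweep hit time t_h and vertical-sweep hit time t_v."""
    az = sweep_angle(t_h, t_sync)   # vertical stripe sweeping horizontally
    el = sweep_angle(t_v, t_sync)   # horizontal stripe sweeping vertically
    # The diode lies on both swept planes; their intersection is the ray
    # with direction proportional to (tan(az), tan(el), 1).
    d = (math.tan(az), math.tan(el), 1.0)
    n = math.sqrt(sum(c * c for c in d))
    return tuple(c / n for c in d)
```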
So, we provide a flash pulse, and if all of the photodiodes light up at the same time, that means the beam is vertical. That gives me a signal that lets me compensate for drift. The rate of drift you expect tells you what frequency to flash at. I do not know what they use, and I have not had a chance to try the device, but I would guess that no more than once every ten seconds or so, maybe even once a minute, would be fine; I would have to be one of the engineers experimenting with it to know exactly, but you do not have to do it very often. Does that make sense? Questions about that?

This is a great technology, and I believe they are making it available in some kind of open-source form, so you should be able to play with Lighthouse systems fairly soon in your own engineering; it should be very convenient for making tracked objects in three-dimensional space. You just hook up photodiodes to whatever objects you want to track. Of course, there is a downside. Take my little coffee cup here: if I want to track it in the virtual world using retroreflective markers or QR tags, I just stick them on the cup. If I want to use this technology with photodiodes, I now have to put a circuit on my cup that detects the light hitting the diodes and transmits that information somewhere. So there are differences based on where the computation happens, where the sensors are placed, and where the light source is placed; something to pay attention to. Nevertheless, this is a very nice and elegant solution. Questions about that?

That gives you some idea of how tracking systems work; I have shown you some examples of the technology and given you a high-level overview. The most critical and sensitive part is tracking the viewpoint of your eyes. Just a little bit of jitter or shakiness in the tracking may lead to a very uncomfortable experience; it is certainly annoying, and if there is lag or latency in the system, it leads to discomfort and even nausea. There is another part which I did not talk about, and which I may come back to after rendering: you can use prediction to overcome additional latencies in the system. The next part I am going to cover is graphical rendering, and it takes time to produce the appropriate images that go onto the display and to get the display updated accordingly. What happens during that time? I may know the orientation and position now, but by the time I finally put the right image on the display, it could be 20 or 40 milliseconds later. So I can use prediction techniques to estimate where the position and orientation are going to be, to compensate for that. But if I have to predict too far, I may add more jitter to the system, and the prediction may introduce perceptible artifacts, which is unwanted. These are some of the challenges we have to deal with in these systems. Even if the tracking system itself has effectively no latency, or a small amount that is easily compensated for by very simple prediction techniques, we still need to predict further to cover the latency introduced by the next phase, which is called rendering.
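As a tiny illustration of that prediction step, here is a minimal sketch of the simplest possible predictor: constant angular velocity extrapolated over the expected rendering latency. The 30 ms latency is just an example value consistent with the 20 to 40 ms range mentioned above.

```python
import math

def predict_yaw(yaw_now, yaw_rate, latency_s):
    """Constant-rate prediction of head yaw (radians) one rendering
    latency into the future. Predicting over longer horizons amplifies
    sensor noise, which shows up as added jitter."""
    return yaw_now + yaw_rate * latency_s

# Head turning at 90 deg/s, with roughly 30 ms of rendering latency:
yaw = predict_yaw(math.radians(10.0), math.radians(90.0), 0.030)
print(math.degrees(yaw))   # 12.7 degrees instead of the stale 10
```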
In this case it will be the visual rendering part; that is what I want to cover next. But first, do we need one more sweep? The question is that it seems as if we do not have enough information: the two spinning directions give us two pieces of information, so where is the third dimension? Let us go back to the case when we were using images. Remember that we do not know how far away an object is, due to the perspective projection of the camera, and we agreed that if we see enough points, we get enough information from the PnP solution; I argued through the degrees of freedom coming down. You are right that we only remove two degrees of freedom each time we observe a point, rather than figuring out exactly where the point is from a single image. All I am arguing is that the same scenario exists with the lighthouse solution. The depth is still missing; you do not know how far away anything is. But it is exactly the scenario you had with image processing, so you again have a PnP problem to solve, and we are no worse off than that. Can you spin a laser in some other way so that you get all three coordinates in one shot? I would have to think about that; certain geometries might allow it. But even without that we are fine, or at least back to the same scenario we were in with cameras, while avoiding all the problems of imaging systems. Yes? Don't they use a spinning laser to also find the distance to other cars and obstacles? That is right; you can use lidars and other kinds of depth-measurement sensors, which is something I have not covered here. If you take the Kinect and other types of what are called RGB-D sensors, which provide color information and depth information together in a calibrated, coordinated way, you can get significantly more information. One thing we will see in the marketplace in the coming couple of years is that RGB-D sensors will become cheaper and more abundant, and we can experiment with using them in the real world; we may be able to get highly accurate featureless tracking, where you do not have to engineer the features. But it remains to be seen, because you still have the trouble of subtracting away objects that are moving in the scene on their own and therefore do not correspond to your motion, and so forth. So this may provide another way of doing it. However, there is always a temptation to rely on a very sophisticated sensor that gathers much more information than you need, and this is why I like the Lighthouse solution so much: rather than leveraging the fact that low-cost cameras are lying around, it is a solution engineered particularly for this problem, which I think is very elegant. It is a kind of depth-sensing system, if you like, designed exactly for extracting the pose of the objects placed in the environment.
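Since the claim is that the lighthouse measurements reduce to the same PnP problem, here is a minimal sketch of that reduction using OpenCV's solvePnP: each diode's two sweep angles become normalized image coordinates of a virtual pinhole camera sitting at the base station. The diode layout and the angle values are synthetic numbers chosen so that the recovered translation is easy to check.

```python
import numpy as np
import cv2

# Hypothetical photodiode positions on the tracked object, in its own
# frame (meters): four coplanar diodes on a 10 cm square.
object_points = np.array([
    [-0.05, -0.05, 0.0],
    [ 0.05, -0.05, 0.0],
    [ 0.05,  0.05, 0.0],
    [-0.05,  0.05, 0.0],
], dtype=np.float64)

# Measured sweep angles (azimuth, elevation) per diode, in radians.
# These synthetic values put the square 0.5 m in front of the station.
angles = np.arctan(np.array([
    [-0.1, -0.1],
    [ 0.1, -0.1],
    [ 0.1,  0.1],
    [-0.1,  0.1],
]))

# A diode at azimuth az and elevation el lies on the ray
# (tan az, tan el, 1), i.e. at normalized image coordinates
# (tan az, tan el) of a virtual pinhole camera at the base station.
image_points = np.tan(angles)

# Identity camera matrix: the virtual camera is already normalized.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, np.eye(3), None)
print(tvec.ravel())   # approximately [0, 0, 0.5]
```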
Now, how much more needs to be added to this kind of system so that you can track objects without going to a full RGB-D scanner? Or, coming from the other direction, will RGB-D scanners become cheap and abundant enough, and the computations simple enough, that we could use them for highly reliable tracking of 3D objects in the scene? Which of those directions will ultimately win out is hard to speculate, but it is a good question to ask. So, I did not cover imaging sensors that produce depth information directly, but I think those will play an important role in this space as well.

Yes? I think that is a reasonable idea, but even in your example you had to use two different photodiodes, right? Because you cannot tell whether the same photodiode is being hit simultaneously. So, is there a difference between one lighthouse sweep hitting two different photodiodes versus two different lighthouses hitting two different diodes? I think it is more or less the same thing: you are just creating more visibility rays, each of which adds more constraints to the problem. Either way, your idea or this one, you are eliminating degrees of freedom from the problem. So I think it is fine to do that, but I do not think it is necessarily better. In implementations of this they use multiple lighthouses anyway, but that is mainly for occlusion, to make sure that not too many of the photodiodes are blocked: if I turn my head completely away from the base station producing the laser stripes, at least I will be facing another one. I think that is the main motivation for multiple lighthouses. Any other questions? These are good questions, and they show that people are paying attention and understanding what we are doing here.