 All right. Hello everyone. So my name is Mathieu and I'm going to talk to you about point cloud processing. So it's getting pretty late in the day. I prepared for you interactive presentation, interactive meaning I'm going to interact with my computer, but I'm happy to interact with you guys as well. So feel free to you know ask questions, interrupt anything. Yes but I'll start. We'll see when that happens. So first of all all that I'm going to do right now is actually live. It may fail. Let's hope it doesn't but it's code that is available on github at that address. So there's also links on the first web page. But I'm going to also update it after a talk in case I change something in the code. All right, so let's get started. Just a few words about me. So actually it's fun to be here because I got my PhD from this university a while back. So yeah, I feel a bit at home. And yeah, and now I'm working with Casper's at the back and third person on Rockestate. It's a small Brussels-based company and we take open geo data and we transform it into in order to make it usable by companies in the energy and finance sector. For example to predict house prices or things like that. So just so my personal favorite software stack includes these things that some of you have encountered before. But this thing that you're looking at is a Jupyter notebook on Python installed using Konda as a package manager. All right, so let's start talking about point clouds. So what are point clouds? I'm going to right, here is need to tell you what where do they come from? They typically come from lighter sensors on, for example, airborne ones. So what happens is you have a plane that flies above a certain area of land, they go back and forth and they have this scanner with a laser. It sends light towards the ground and then it comes back and measures the time it takes to go back to the plane. And from that it infers for each point an X, Y and Z coordinates. So you really have for lots and lots of points you have this huge cloud of points in 3D above the ground. And what's really nice about it's really nice that actually the Flemish government, Belgium is complicated, but at least in Flanders they have opened the raw data that they got from this exercise over the entire of Flanders and Brussels. So that means there's this huge 2.7 terabytes of compressed data that's waiting for you to download and play with it. And that's what I'm going to do here. There's, yeah, so okay. So before I really start playing, here are a few things about what you need to know if you want to play with lighter data. First of all, you're going to encounter typically two file formats, LAS, which is the standard file format with certain specifications and a versioning thereof and LAZ, a compressed version of that. Some interesting software packages. So there's the point cloud library, which is C++ based. And it's pretty powerful for many standard things you may want to do with point clouds. And yeah, so that's one. There's a computational geometry algorithm library. This is more towards academic algorithms. It's really a lot of state-of-the-art 2D and 3D computational geometry things, not necessarily with point clouds, also a lot of mesh stuff, but also just 2D things. That's also C++ based. And then there's our favorite of the day, which is called PDAL. It's kind of an analog of GDAL, but for point cloud data. You have point cloud data, which is geocoded, and you want to carry along this, and you want to have a tool that robustly transforms your data in the way you want. And it actually has wrapped some functionality from the first one from the point cloud library as well. The last one I wanted to mention, it's not open source, but last tool is something for Windows users that want to try out maybe some things. And the primary guide working on that is also responsible for the last day compression algorithm, which is open source. Okay, so let's get started now with some point clouds. So I have here this code. Don't worry too much about it. It's going to disappear as soon as I run it. Okay, let's get to the next cell. I'm going to make sure that this last file is on my computer, in case it doesn't, it downloads it from the Internet. So this code is going to run on your computer as well. And I look at a map here in case, well, maybe you recognize it. Maybe you don't, but it's here. So let me just gather a big polygon. Let's take some forest as well. There we go. So we have a huge area, including our building, which is here. Yeah, so we take that. Okay, same polygon. We have the coordinates now in the Belgian coordinate system. And we, sorry, you didn't see the code, but there's this code which actually calls the Pedal library, which really is going to take the raw file I have, going to extract whatever is inside this polygon, and it's going to load that data in memory in Python. And so it tells me that I have about half a million points. That's about 4.5 points per square meter, which is not bad. You can see a lot of details already. And I'm going to put this into some standard Python structures. Don't worry too much about it. What we want to see is more what it looks like and what it looks like is like this. So actually I have higher resolution than I expected, so let me put this like this. Okay, that's too much. Sorry about that. Okay, so this is what it is. This is the point cloud we just extracted. Do you guys recognize where we are? So, right, we don't see much. It's all red. Okay, if we zoom in, we can see, yes, it's lots and lots and lots of little points, but we don't see much. And the whole point that I want to do is to interpret this and segment things and start seeing things. So first thing I can do is separate the ground from the rest. And this actually has already been done for me in the raw dataset by the government. So they have this algorithm that detects what's ground, what's non-ground. And once I have the ground, I can actually remove the points and instead turn it into a flat surface, which I just do like this. Well, it's not exactly flat, but it's just a surface. Then I want to actually see what's more of a building type, what's more of a tree. And what's the difference? Well, points in trees, you have lots of points around. Points in buildings, it's really flat around. So I'm going to compute some measure of how flat does the neighborhood of a given point look. And if I plot this as a color on the points, I can see it does already a pretty decent job at separating trees from buildings. And then I actually set a threshold on this number and I really get a nice segmentation into two kinds of points. Those that look like buildings, those that look like trees. So let's hide the buildings for now. And let's try to actually figure out where the trees are. How many trees are there? How do we do this? Any ideas? So what I'm going to do is actually I'm going to look for tree tops. I'm going to look at points that are higher than any other point around it. So I can do this very quickly. And what I end up with is lots of points which are indeed, they are at the top of their local maximum, if you want, in this point of class that are trees. So nice, I have the tree tops. Then what? Well, I can color the tree tops in various colors. And then I can look at, for each point that is still green, what is the nearest color, nearest tree top that is, let's find its nearest tree top. So I do that. It takes a bit of... Actually they just have, all of them have a different index. I just count the trees. They just go from 1 to 500. I just have only $10, that's it. And so yeah, so actually I just look at the closest tree top and this kind of gives me a segmentation into different trees. So these are the trees that you can find around here if you go out. And then I can use those tree tops and change them to actually model the tree itself, not just the tree top, so let me do that. I move those spheres around and they kind of get the size of what the tree looks like. And then I hide the original points. I put back the buildings and I color the trees in green. And that's one first idea of what, you know, we're going somewhere. We're seeing a much better picture of what's around us. Okay? Okay, good. Yeah, you can run this on your computer. It's going to do the same thing. I mean this one has 8 gigs of RAM, I think, so it's not a power horse. Okay, so that's it. Any other questions about trees? Before I move on to... Okay, so it depends on, I need to check the, so PDAL was when I import, so I read in the data, it's doing that here. I don't think, so it's doing actually a bunch of steps. So actually first I cropped according to the polygon that I drew on the map. Then I computed the height above the ground. So that's another algorithm which assumes you already have a ground segmentation and you compute the height of every point, seeing what ground points you have around. And then some eigenvalues and normal, I'm going to come back to that afterwards. But this is really something that measures something about the geometry of points around each point and whether it looks flat or whether it looks fluffy. That's that. So but I don't think it has used PCL actually. Yeah, I mean I could. Yeah, of course. So that's, I mean this is just a toy fooling around. I could be much more serious about it and use actually the points and cover them like for example a complex hull or something. But it's already interesting to see if I'm getting somewhere. Yeah, it's mostly non-py pandas. Yeah, there's nothing really special about this. And I have no hidden code except the part and I can show you it's ten lines and so it wouldn't really fit below this picture. But it's the part where I compute the local maxima for the trees. But it's, again it's all custom, fairly straightforward code, yes. Right. Well, so we haven't, I mean it hasn't been our focus to classify as many different things as possible. For example, typically the next thing, I mean of course you want to see buildings and trees. There's also lots of cars that you can see. Besides that I'm not exactly sure but we'll see another interesting task which is not segmentation. It's really being a bit more serious than these spheres but for buildings. So let's do that. Yep, building modeling. So again we're here and I'm going to pick this building. The one that's just behind maybe that one. Anyway, philosophy institute. Let's select this guy. I'm just going to take a bit of extra stuff as well. Why not? And that's not supposed to be here. And we go ahead. So here's our polygon with its coordinates. We run a PDAL pipeline and again. So here I'm a bit cheating. Just remove this. But I'm a bit cheating. I'm using multiple last files which I'm not providing in the download at the beginning of the notebook. So I'm using more data than what you can find online but you're free to always ask. So what does it mean? Well it means I have actually here a density of 28.9 points per square meter which is much higher than the four per square meter that we had before. So there we go. We have loaded whatever is above this polygon in memory. So let's have a look. Again, I'm going to put this a bit wider. Okay, so that is that building. Maybe someone can confirm by just looking outside. I can't see from here but anyway, maybe it's that. Maybe it's another one. And so what I'm going to do here is I'm going to try to make a 3D model of it. So make really surfaces around that building. And the way I'm going to do it is I'm not even going to try to find a polygon on the ground. I'm just going to use the fact that seen from the top, this building is rectangular. And once I have figured out how to look at it from the side, then it's starting to get pretty easy. Because I can look at 90 degrees from that and then I know the bounds on one direction. And if I look at this side, well I can find the bounds in that direction. Plus it's pretty easy to figure out the shape of the roof. So of course this building is pretty special, but I'm going to use those properties and build a model using that. So let's get started. I first, so remember I talked to you about normals. So this is what I mean by normals. Here I drew the normals of every point. So it's really the direction that the points around it look like. So if it is a surface, maybe it's not a surface, but if it is a surface, then it's really which direction is the surface pointing at every point. So that's what the normal filter is doing in PDAL. So you have lots and lots of such directions. And let's for a moment forget where they are in space. We just want to care about the direction itself. So we forget where they are in space and put them on a sphere of directions. And what you can see is most points or most normals are in a very specific arc on the sphere. Yeah, obviously this roof was like this. So most points are pointing always in this specific arc, not in every which direction. So this I can use now to basically infer, let me put those points back to where they were, infer the orientation of the building and basically figure out how I can rotate the building in order to align it with the axes. No, this is inferred from the row point cloud data by looking at the points around them. This is all PDAL. PDAL provides you with the normals, some eigenvalues to tell you if it's flat or flat. So you may have not noticed, but I've just rotated this thing and notice how well aligned it's now with the box around it. It's really aligned with the axes X, Y and Z. So now I can really go ahead with my plan of fitting a box around it and fitting a roof as a surface. And so this is what I get, which fits pretty neatly the point cloud that I originally had. So I need to rotate it back to its original coordinates. And as a nice cherry on top, I can go to the internet and fetch aerial images of this area. And put them on my model. So that's what I'm doing here. I put it on the ground and I can also put it on the building and I get that. So the building is supposed to look like that. So that's that for building modeling. So any questions, any more questions? No, I just cut up my X axes into small bits and then for each band I look at what's the top high that I see here. And I'm just going to fit a surface using that. So it doesn't generalize well to crazy buildings, but it works well for these kind of roofs. Sorry? Yeah, yeah. So then I mean I guess I do create a mesh with once I have these points I want to connect them with triangles and build an actual surface. So I have a few more minutes. I just want to mention a few things to look out for in the future about these very nice tools. So I actually haven't mentioned one of them, which is IPI volume, which is this thing which allows me to visualize this, change the variables and it actually updates the visualizations itself, which is extremely useful when you're developing the algorithm. So this interactivity is really key for us to be able to do these kind of things, otherwise we just wouldn't be able to see what we're doing. And so yeah, IPI volume is a key player here as well. So on the PDF side already now they have, I'm not a core developer by any means, by the already now they have implemented a custom point in polygon algorithm in order to be able to extract very fast huge amounts of points given a certain polygon using a well-known text representation. So that's very important because you typically handled millions, billions of points and then you need to go fast. Apache Arrow support is an issue right now and it would be very interesting to have. And personally I'm working on conduct packaging this so as to make it easier for all of you guys to work and play with this. And Jupyter-wise this is more of a personal flavor kind of things. I'm pretty excited about the advancement of new things in the realm of C++ Jupyter kernels which would be a nice interface to all these other libraries that I talked about in the beginning and making them interactive would be insanely valuable. And then Jupyter-Lab is this kind of next phase interface as opposed to the notebook where presumably at some point you will be able to have the visualization on one side and then the code on the other. We can play around with the code and have the visualization however we want. So it's kind of a web-based IDE in a sense. Anyway, so that's it for my talk. No, it's cheating. So that is true. The data is super high quality, super high density, so that's really valuable. Not that I'm aware. I haven't been involved in that but as I can see it's fairly raw data. I don't know about any subsetting. I suppose I could show you as well. So we are busy as well like building actual 3D models of buildings in Belgium. We do have interesting demos on our websites. So I guess I'll just show it because it's fun. Here, demo. Anyone pick a city that they like? Kent. There we go. Sorry, they may have, but the thing is that this works for the whole of Lander's and it's automated. So yeah. I mean it's missing this thing here. There's something sticking up. But yeah, I mean it's supposed to be working best for houses, so yeah. Yes. IPI volume. It's written IPY volume in one word. Just like IPI leaflet or IPI whatever is like the live interactive version of something. Yeah, so there's really some work on that. I mean this kind of LiDAR data is definitely useful for, I mean I'm no expert by any means but we have for these kind of either agriculture studies or really studying large bodies of forest and this is actually I think probably the primary, the first use that there was of such technology was flying over Amazonia or whatever and figuring out what does the, I don't know, the tree cover look like, what kind of trees are around that thing. The data set was originally made, the original, they sponsored it. All right. Yeah. So I don't exactly know about denoising, what I know is okay, you can always subset your point cloud if it has enough density in the first place then you can always only take the points that actually look flat by looking at the points around. That will actually help a lot. For example, we don't model chimneys there, the way we do it is we remove chimneys because they don't look flat. Yeah, so, but then if you really have a noisy and shaky data set, I don't know what to do. Yeah, I don't have a, sorry, go ahead. I don't have a good answer for that. Maybe just one remark for those of you who have the printed program, there is one more presentation in this room. So if you're interested, it's called Mapping Foster.