OK, let's go on. Our next speaker is Olivier Coutain, and he's going to tell us about how deep learning can help us improve geospatial data quality. Welcome.

So that is the title of the presentation, and the use case will be focused on geospatial data, mainly OpenStreetMap data and use cases. If we take the Belgian OpenStreetMap data from more than ten years ago, only the black parts were mapped; in orange are the official roads already mapped by the NGI. A few years later, a lot of roads had been mapped, and about one year ago, almost the whole country had been mapped. But that is only the case in some countries: if you look worldwide, there are still a lot of places not mapped as well as the ones appearing in pink here, and all the blue areas are still inconsistently mapped. One kind of issue is detecting where, because it is not easy for the people who work on this kind of project to fill in the white areas. And obviously it is not only about features missing from the map; it is also about features that are not well mapped: with the wrong attribute, badly placed, no longer present on the ground, and so on.

So there are quality assurance tools helping this kind of project. One of them is called Osmose, and if you look at it, there are a lot of places with open questions, and people from the project are welcome to check whether some features are still inconsistent in the dataset. But it is only based on geospatial queries: for instance, two roads that are not connected together. So at this point, it is still based on the kind of spatial analysis we have been doing in a classical way for decades.

One way to go further is to cross datasets. One example among others is to use a light pollution map, taken at night by satellite, and to check whether the light pollution map is consistent with the roads in the dataset. If we look at the correlation between light density and population density, there is a first correlation. But if we take road density instead of only population density, the correlation increases in quite a different way, and as the dataset improves, so does the correlation; you can see there is a slight difference between the two. But at this point, we are still doing classical correlation tests between datasets.

If we want to go further, we have to go deeper and analyze pixel by pixel. A few years ago, this was not available out of the box; right now it is, and you can begin to play with systems that a few years ago were still reserved for research. So a few months ago, there was a project to say: OK, we have a deep learning framework, we have satellite imagery, what can we do out of the box with this kind of system?

If we want to play out of the box with a deep learning vision system, we can use fine-tuning or transfer learning. The two terms are close: the idea is just to grab an existing trained model and adapt it to our use case.
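As a rough illustration of what that looks like in code (a minimal sketch, not the speaker's actual setup), fine-tuning an ImageNet-pretrained ResNet in MXNet Gluon, the framework used later in this talk, can be as short as this; the model variant and the five-class setup are placeholder assumptions:

```python
import mxnet as mx
from mxnet import autograd, gluon, init

# Grab an ImageNet-pretrained ResNet (resnet18_v2 here as a small stand-in).
pretrained = gluon.model_zoo.vision.resnet18_v2(pretrained=True)

# Same topology, fresh output layer sized for our own classes
# (5 land-cover classes is a placeholder assumption).
net = gluon.model_zoo.vision.resnet18_v2(classes=5)
net.features = pretrained.features      # reuse the learned ImageNet features
net.output.initialize(init.Xavier())    # only the new head starts from scratch

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

# One illustrative training step on a dummy batch of 3-band 224x224 tiles.
x = mx.nd.random.uniform(shape=(4, 3, 224, 224))
y = mx.nd.array([0, 1, 2, 3])
with autograd.record():
    loss = loss_fn(net(x), y)
loss.backward()
trainer.step(batch_size=4)
```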
That was done here with a pretrained model, a ResNet, applied to aerial imagery, with labels provided for what is on the ground in several classes: one class for buildings, another for vegetation, and so on, just a few classes. The target is to let the model predict, after training, what it sees. You can see that there is obviously a slight difference between the two, but the prediction is not that bad. So with the out-of-the-box solution, we are already able to do something quite decent: not accurate enough for automatic mapping, but accurate enough to detect major inconsistencies.

What is the first issue here? These well-known models are based on the ImageNet database, which consists mainly of classical photographs, and those photographs have three bands only. Our dataset here already has four bands, because one of them is infrared. So they had to remove the blue band and shift the wavelengths toward the red to be able to use the model, because the model is dedicated to three bands and three bands only. The point is that with satellite imagery we have far more than three bands, and sometimes that is an issue. Here you have the model code that was used: a ResNet with convolution layers. These two slides hold only the code related to the model itself.

If we want to go beyond out-of-the-box solutions, there are several contests on this, and a quite recent one was launched on Kaggle. The goal was to automatically detect certain kinds of features from satellite imagery, and they provided several datasets with several kinds of bands. Here, for example, you have the three classical RGB bands, but in the multispectral dataset you have up to 16 bands. If we sum up all the bands from the imagery alone, we get to about 20, far more than what the classical models were trained and engineered for. So this kind of dataset implies building a new kind of model, one able to deal with any number of bands.

The other issue is the size of the imagery, which is far beyond the kind of resolution you find in ImageNet. So you have to tile, splitting the dataset into small areas, because GPU memory cannot deal at once with this kind of high resolution. What we see here is that deepsense.ai won fourth place in this contest, and the results are quite good thanks to the resolution and the number of bands: it was a high-resolution and quite rich dataset that was provided as input. It is worth mentioning that the contest was run by a defence laboratory, and in that field it is quite common to be able to get high-resolution datasets quite easily.

If we want to go further, we have to deal with models able to segment the input area. By convolution, we progressively abstract the information; then we do the same in the reverse way to come back to the same resolution as output. On the input side there are a lot of pixel values; on the output side, only the pixels assigned to the different classes. So we get the same resolution between the input and the output images, but with only a few classes.
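Put together, the two adjustments discussed here, tiling the large rasters and accepting an arbitrary number of bands, can be sketched in a few lines. This is a toy illustration, not the contest models; the band count, tile size, class count, and layer sizes are placeholder assumptions:

```python
import mxnet as mx
from mxnet import init
from mxnet.gluon import nn

N_BANDS, N_CLASSES, TILE = 20, 5, 256   # placeholder assumptions

def tiles(raster, size=TILE):
    """Split a (bands, H, W) array into GPU-sized square patches."""
    _, h, w = raster.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield raster[:, y:y + size, x:x + size]

# Toy encoder-decoder: convolutions progressively abstract the
# information, then transposed convolutions bring it back to the input
# resolution, with one output channel per class instead of one per band.
net = nn.HybridSequential()
net.add(
    nn.Conv2D(32, kernel_size=3, strides=2, padding=1,
              in_channels=N_BANDS, activation='relu'),
    nn.Conv2D(64, kernel_size=3, strides=2, padding=1, activation='relu'),
    nn.Conv2DTranspose(32, kernel_size=4, strides=2, padding=1,
                       activation='relu'),
    nn.Conv2DTranspose(N_CLASSES, kernel_size=4, strides=2, padding=1),
)
net.initialize(init.Xavier())

x = mx.nd.random.uniform(shape=(1, N_BANDS, TILE, TILE))
print(net(x).shape)   # (1, N_CLASSES, TILE, TILE): per-pixel class scores
```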
The trick, then, is to ask: what happens if we add some extra layers of information, not only the imagery bands but, for instance, bands rasterized from vector data? There is a paper on this kind of extra information: by combining the two kinds of input, they improve, again, the quality of the predicted output. For instance, here is the output prediction with only the RGB input, and here is the output prediction trained with both the RGB input and the extra layer of information. For some buildings, you can see that the artifacts decrease. This is already quite something, but if you want more, the concept is simply to add extra layers to your tensor.

The real work is not the fit-and-predict part; it is the way you get your data, transform it, and so on. If we look at the whole chain, there is a first step: take your dataset, label it, create your model topology, and train it. Then, with your whole dataset and your trained model, you can get a prediction.

If we focus on the first step, what we need is the ability to label a dataset easily. There is one tool, released just a few weeks ago, to create labels automatically from imagery using OSM data on top of it, and the first quick-and-dirty model trained with it is able to detect buildings in an area. So it is brand new. But if we want to go further and not use the OSM labels directly, if we want to be able to modify them slightly or more substantially, then what we do is use classical GIS tools, PostgreSQL with PostGIS, and connect them to MXNet for all the deep learning work. The connection between the two is based on WKB raster, with a small library to convert between PostGIS rasters and NumPy n-dimensional arrays.

If we look at MXNet: it is a deep learning framework, and it lets us create our own dedicated data loader. Here is a custom data loader iterator prototype: in a few lines of Python, you can create your own data iterator. If we write a spatial query that produces both the image tiles and their labels, we can launch the tiling entirely from the user-land side in a few lines. Here you have your labels, and here the imagery, and you can produce on the fly as many labelled tiles as you want. And if there is something you want to change, for example sub-select something else or change the buffer applied to the roads in the labels, it is quite easy to change it and launch it again. So you can configure the labels pretty much as you want before trying them with your model topology.

Once you have created both your labelled dataset and your model topology, you have to train. MXNet is interesting because it can deal with multiple GPUs and, if really needed, with multiple machines at the training stage. Another interesting thing is the ability to store your dataset, both data and labels, in binary-encoded RecordIO files, so they can be parsed really fast.

For the last part, prediction over a very large coverage, the only question is: can we map-reduce it? And the answer is in the question: prediction is really easy to parallelize, because we only have to split the coverage. So the point is more about infrastructure, and since MXNet has been chosen by Amazon Web Services, it is really easy to use their whole infrastructure to do so.
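To make the custom data loader part concrete, here is a rough sketch of such an MXNet iterator backed by PostGIS. The table name, the SQL query, and the wkb_to_ndarray helper are illustrative assumptions, not the code shown on the slides:

```python
import mxnet as mx
import numpy as np
import psycopg2

class PostGISIter(mx.io.DataIter):
    """Sketch of a custom MXNet DataIter that pulls (tile, label) pairs
    straight from PostGIS. The training_tiles table and wkb_to_ndarray
    (a hypothetical WKB-raster-to-NumPy decoder) are assumptions."""

    def __init__(self, dsn, batch_size=8, tile=256, bands=4):
        super(PostGISIter, self).__init__(batch_size)
        self.conn = psycopg2.connect(dsn)
        self.provide_data = [('data', (batch_size, bands, tile, tile))]
        self.provide_label = [('label', (batch_size, tile, tile))]

    def next(self):
        cur = self.conn.cursor()
        # Hypothetical query: imagery tiles and OSM labels rasterized on
        # the same grid, both returned as WKB rasters.
        cur.execute("SELECT ST_AsBinary(img), ST_AsBinary(lbl) "
                    "FROM training_tiles LIMIT %s", (self.batch_size,))
        rows = cur.fetchall()
        if not rows:
            raise StopIteration
        data = mx.nd.array(np.stack([wkb_to_ndarray(r[0]) for r in rows]))
        label = mx.nd.array(np.stack([wkb_to_ndarray(r[1]) for r in rows]))
        return mx.io.DataBatch(data=[data], label=[label])
```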
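For the training stage, the two MXNet features just mentioned, RecordIO-packed datasets and multi-GPU training, look roughly like this; tile_blobs, net_symbol, and train_iter are placeholders standing in for the outputs of the previous steps:

```python
import mxnet as mx

# Pack the tiles once into an indexed binary RecordIO file, which MXNet
# can then read back sequentially at high speed during training.
writer = mx.recordio.MXIndexedRecordIO('tiles.idx', 'tiles.rec', 'w')
for i, tile_bytes in enumerate(tile_blobs):          # hypothetical byte source
    header = mx.recordio.IRHeader(flag=0, label=0, id=i, id2=0)
    writer.write_idx(i, mx.recordio.pack(header, tile_bytes))
writer.close()

# Multi-GPU training is just a matter of listing the device contexts.
module = mx.mod.Module(symbol=net_symbol,            # placeholder symbol
                       context=[mx.gpu(0), mx.gpu(1)],
                       label_names=('label',))
module.fit(train_iter, num_epoch=10,
           optimizer='sgd', optimizer_params={'learning_rate': 0.01})
```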
If we look at the maturity curve, the open dataset story is something that evolves. At the first step, when nothing exists, you have to do everything by yourself. Once a training dataset is available, a few months later you can get a pre-trained model, and then out-of-the-box applications. If you look at the labelled datasets available right now, the best one is SpaceNet, and this dataset only covers buildings and roads. That is something, but it is only that. It is also dedicated to big cities, for example Paris, Rio, and so on; you have nothing on the countryside or on wild areas. If we look back a few years, the ImageNet open dataset released seven years ago is what makes today's really efficient models possible. So this kind of initiative is really the key to obtaining, a few years later, something efficient from a model.

What are the next steps? The next step will be to deal with low-resolution imagery. There are, for instance, the Planet or Sentinel-2 datasets, able to provide worldwide coverage something like every day or every week, so it is a way to map every change on the ground worldwide. The other is to have a reinforcement learning feedback loop with users.

If you want to go further on this topic, there are some references; the best one is this one. And if you want to play, there are several SpaceNet challenges. The last one just closed, but we can bet there will be a round four. And there is a mapping challenge, related to mapping based on SpaceNet data, which should open within a few days or a week and is planned to run for a few months.

Thanks.

Before people start leaving, questions? Any questions for Olivier? Yes.

So I might have missed it in your talk: how do you actually deal with the different resolutions of the input images? The resolutions are very, very different. How exactly do you deal with that?

If you look there, we just choose the output resolution we want, and if the source resolution is different, we rescale, with whatever resampling algorithm from computer vision we want. Here it is bilinear, but it could be bicubic, and so on. So in fact, we rescale it.

So as input, for one image you don't just have RGB, you have those five bands, and then? Because, for example, the grayscale panchromatic one is huge and the 16-band one is very, very small. Do you use them in the same network as input?

Yeah. The operation is called pansharpening. If you look at the SpaceNet dataset, they provide both the raw data as input and the pansharpened version, so you have the choice to do it by yourself or to use this classical imagery operation to get the best of both. But indeed, it is a kind of oversampling.

Thank you. You're welcome. Any other questions? OK, if not, let's thank the speaker again.