Alright, hey guys, I'm Stefan. Hi from Freiberg University in Germany. It's a small university in East Germany, and I'd like to give you some brief insights into our Blender add-on used for generating synthetic data for deep learning. So not for generating synthetic data by deep learning: we want to use Blender for creating labelled training data for supervised learning, to train AI models for example for object detection. We do a lot of research in the field of robotics, for example, and we want to teach systems to understand their environment, their surroundings. Of course you can use cameras for this. This is the classical image understanding or image classification process: you label images and pipe them into an AI model. But it is also common in robotics as well as in autonomous driving, for example, to use depth sensing. This is, for example, a LiDAR sensor, a laser scanner scanning your surroundings and giving you a specific data structure which is called a point cloud. You can think of it like a particle system, a set of points. LiDAR is a laser-based system; we also have, for example, sonar systems working with ultrasonic waves, used for water bodies for example. And what we want to do is take this data and find certain object classes in the point clouds we receive from this depth sensing. And the problem, as always in AI, is a lack of data: we need a lot of training data. So this is the pipeline here: we want to train these deep learning models with labelled point cloud data as input. We have beautiful 3D environments, hopefully, in Blender, our mesh representation of the real world. And what we want to do now is the virtual sensing: we use our add-on to implement several depth sensors, for example LiDAR systems, create a lot of point clouds and use them as input data for our AI models. Therefore my colleague Lorenzo and I have developed this add-on, it's called BLAINDER. You can find it on GitHub and try it yourself. BLAINDER was built for version 2.93 LTS and we plan to publish a new version for 3.3, but that is currently under development. You can find a manual on how to download and install the add-on. There are also some dependencies, especially Python packages related to the output formats. BLAINDER focuses on virtual depth sensing, as you can see here with Suzanne. You have Suzanne on the left side with the plane, the mesh representation. In the middle you can see a raw point cloud with a typical shadow. As you can imagine, if you do ray casting or this laser scanning, all the area behind an object is not reachable; this is the reason for this very characteristic shadow. And what we want to do is add labels to this data so that you can distinguish between Suzanne and the plane, pipe this into the AI model and train it to find the difference between these two objects, or more. This is a very simple use case of course. Several modules are implemented in BLAINDER. We start with the virtual environments. Of course we need a 3D representation of everything we want to measure virtually. This can be done procedurally, for example, as is common for landscapes or anything natural. We can use a semi-static representation, which means we can alter our models and make use of publicly available repositories.
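Just to make concrete what such a labelled point cloud sample looks like on the data side, here is a tiny, purely illustrative sketch in Python. This is not code from the add-on; the coordinates and class IDs are made up.

    import numpy as np

    # One hypothetical labelled training sample:
    # an unordered set of XYZ points plus one class ID per point.
    points = np.array([
        [0.12, 1.05, 0.30],   # a point on the first object
        [0.98, 0.87, 0.02],   # a point on the ground plane
        [0.15, 1.10, 0.55],   # another point on the first object
    ], dtype=np.float32)

    labels = np.array([1, 0, 1], dtype=np.int64)   # 0 = plane, 1 = object (illustrative IDs)

    assert points.shape[0] == labels.shape[0]      # exactly one label per point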
For example ShapeNet, where we have a lot of different 3D objects, for example for aircraft detection, which is also a use case here. And we can of course implement animations. Then we implement our sensors. We have some predefined sensors here, well-known LiDAR sensors like the Velodyne Ultra Puck or Alpha Puck. But you can also define your own sensor by setting up a YAML file, piping it into the add-on, and then get the characteristics of your real sensor. We implemented some error models; as always, there are some errors in real measurements. For example, LiDAR is very sensitive to rain, to dust, to snow and fog. We implemented these effects as well as we could, using the formulas we could find in the literature. And we also implemented a random measurement error, simply taking a Gaussian error distribution and adding it to the measurements, just to be closer to the real sensor. Then you do the signal processing. This is done by ray casting: you emit the ray, look for an intersection with some geometry, and get the distance between emitter and sensed object. You can influence the ray casting by using different materials, for example to define reflection or refraction effects. And you can also add a water sound profile, which is important for the sonar system; for sonar you have effects of salinity or of water depth, for example, and you can set this up in the add-on as well. Then you do the labeling. We will take a look at this in the following slides and in the examples. And we can also do image annotation. This is an additional feature here; as I've mentioned before, the focus is on depth sensing, but you can also annotate rendered images with this add-on. And then you can export the depth sensing data to some well-known formats, for example LAS, CSV or the binary HDF5 file. This is why we have some dependencies on other libraries here. You are familiar with the process of ray casting, but just to show you some failures we might have with depth sensing, especially with LiDAR here: when we have a mirroring object or some reflectivity in our scene, as well as refraction, we might get a different representation of the environment in our point cloud data than we have in reality. We just cast the ray, measure the time between the start and the end of the measurement process, and then derive the distance to the sensed object. But when we have the effect of a mirror, or of refraction, for example by water or by glass, we might sense a different object, or a different position of a point in the point cloud, than we have in the real world. These effects are of course covered here, due to the way the ray casting is implemented in Blender. The labeling process is very simple. When you have your 3D objects, for example this chair, you can use the custom properties to set up an ID for your object. For object classification, as you can see here, we set up the category ID "chair" for the whole object, or you can define a specific material name and use this as an ID; this is your choice. What you can also do is not take the whole object as one class but take different parts of it and do some kind of part segmentation. This is also very popular for deep learning on point clouds. Then you can set up different parts of the object, as you can see here with the part IDs for the plate and the leg of the chair. For example, I've mentioned the aircraft before; it's very important to distinguish between the aircraft wings and the aircraft body.
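To give you an idea of what this ID setup could look like in Blender's Python API, here is a small sketch. The property names categoryID and partID follow what is shown on the slides; the exact keys the add-on expects may differ, so please check the manual. The object names are made up.

    import bpy

    # Assign a class label to a whole object via a custom property.
    chair = bpy.data.objects["Chair"]       # hypothetical object name
    chair["categoryID"] = "chair"           # object-level class for classification

    # For part segmentation, label the parts (here modelled as separate objects) individually.
    for name, part_id in [("Chair_plate", "plate"), ("Chair_leg", "leg")]:
        part = bpy.data.objects.get(name)   # hypothetical part object names
        if part is not None:
            part["partID"] = part_id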
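And coming back to the signal processing by ray casting: in principle, one simulated measurement is nothing more than the following minimal sketch, assuming Blender 2.93 and using scene.ray_cast plus a Gaussian error. The real add-on does much more (materials, reflectivity, weather effects), so take this only as an illustration of the idea.

    import bpy
    import random
    from mathutils import Vector

    depsgraph = bpy.context.evaluated_depsgraph_get()
    scene = bpy.context.scene

    origin = Vector((0.0, 0.0, 1.0))      # hypothetical emitter position
    direction = Vector((1.0, 0.0, 0.0))   # one ray direction of the virtual scanner

    # Cast the ray and look for the first intersection with scene geometry.
    hit, location, normal, face_index, obj, matrix = scene.ray_cast(depsgraph, origin, direction)

    if hit:
        distance = (location - origin).length          # ideal distance from emitter to sensed object
        noisy = distance + random.gauss(0.0, 0.01)     # add Gaussian noise, e.g. sigma = 1 cm
        label = obj.get("categoryID", "unknown")       # pick up the label from the custom property
        print(obj.name, label, distance, noisy)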
Distinguishing wings from body would be a specific use case for part segmentation. And then here is a very simple example with these three chairs. We have the 3D objects, and we have our virtual sensor; the virtual sensor is always attached to a camera in our system. We set up all our sensor specifications, all the parameters, and we have set up the different IDs for the labeling process. Then in image B we receive the raw point cloud data. This is what we would receive in reality: there's no classification, we cannot see different objects in here, it's just an unordered set of points with X, Y and Z coordinates which we still have to link to the different object classes. So now our add-on, in image C, can use the labels we defined before, attach the labels of the different objects to the single points and pipe them into our output file, or do the part segmentation in D, using the different parts of the object and their IDs and piping them into the output file. This is what we would give to the AI model at the end, so it can learn to find on a raw point cloud like B the differences you can see in images C and D. So this is the creation process of training data for the point cloud data structure. The image annotation comes for free with the add-on; it's just additional content here. You can do image annotations using, of course, the rendering capabilities of Blender, which would give much better results than in the right image here. You render your images, use the same IDs, the same object categories, and then do a pixel-wise annotation or use bounding boxes, as you can see here. This is saved, for example, in the Pascal VOC format, a well-known format for image annotation. This is not the stuff we concentrate on in our research for the robots, but it is very common for these image classification tasks. In addition to the annotated image, you get the depth image, which is sometimes important, for example, for simulating RGB-D cameras if you want to train models on these specific sensors. For sonar, it's quite different. Sonar senses the underwater ground and is often used in maritime robotics. You can receive a 3D representation, again with point clouds, this is also possible, but mostly you get these images, which look strange at first. But as you can see here, this line in the middle of image A, which is a real sonar scan, is basically the driving line of your boat. Then to the left and to the right of this side scan measurement, the first signal here corresponds to the lowest depth of the water body, and the signals just go out to both sides, right and left of your boat, and measure the depth from this single emitting point. And this is what we can also do for this representation in BLAINDER. This is a very simple 3D scene, of course, but you can receive the same type of data, again labeled using the different IDs for object classification, and then, for example, train a sonar classifier, which is kind of rare, but there are some deep learning models used for this. Okay, then let's just jump into some examples to give you an impression of the working process. The add-on itself is attached to the N panel, as you can see here. You have to set up your camera as your synthetic scanner, and you have to choose whether you want to use LiDAR, sonar, or a time-of-flight camera like the Kinect.
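Just to make the Pascal VOC output from before a bit more concrete: a bounding-box annotation is essentially one small XML file per rendered image. This sketch is purely illustrative, it is not the add-on's own export code, and the file name and pixel coordinates are made up.

    import xml.etree.ElementTree as ET

    # One hypothetical bounding-box annotation in Pascal VOC style.
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = "render_0001.png"   # hypothetical rendered image name
    size = ET.SubElement(ann, "size")
    for tag, value in (("width", "640"), ("height", "480"), ("depth", "3")):
        ET.SubElement(size, tag).text = value

    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = "chair"                  # same category as used for the point cloud
    box = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", "120"), ("ymin", "80"), ("xmax", "260"), ("ymax", "300")):
        ET.SubElement(box, tag).text = value

    ET.ElementTree(ann).write("render_0001.xml")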
Then you can use one of the presets, for example the Velodyne Ultra Puck or Alpha Puck here, and say whether it's a static or a rotating scanner. A static scanner is like the ones we have in iPhones and iPads today: it's like an array doing the ray casts. A rotating scanner moves, so you can, for example, get a 360-degree measurement. The side scan option is only relevant for our sonar scanner here. Then you can set up the parameters; this is best done by loading the preset of your scanner, for example setting up the field of view and the resolution. You can add animations, and you can alter your objects, for example by using repositories like ShapeNet, to create a high variety of different scenes instead of setting up a whole new 3D scene every time you want to scan. You could just, for example, point it at a source folder with 50 different aircraft models and they are swapped automatically for every scan, just to create this high amount of data. Swapping or random modification means the objects are, for example, randomly rotated to get some variation in the scene. You can add the noise: the random noise on the one side and the different meteorological effects like rain and dust on the other side, simulated for example by a certain rainfall rate. And then you define your desired output format, LAS, HDF5 or CSV, and of course set up your output location. And then it's done. I mean, this is a very simple example here. We add a small plane just to have a second object. What you need is a material for every single object, so that the ray casting can pick up the object ID, and you can set up parameters like the reflectivity there as well, and then do the scan. You can see this is very quick. You get a noisy scan and a clean scan if you enabled the noise here, and you can pipe this data into your AI model. This is what it looks like at the end: this is the clean point cloud without any disturbances, without any noise, and, just to show the example, the different measuring points for the plane as well as for the cube, as an object classification problem. We can, of course, do it with high-poly models like this. This is a 3D model of our research mine in Freiberg, created by 3D scanning and photogrammetry. This model uses textures, and you can of course also include the color values from the texture in your point cloud. This takes much more time; the time increases with the number of polygons of your model. But for this kind of application, time is not really the important factor, because it's just the creation of the data. Time matters at the end, when you deploy the trained AI model and want a real-time application, but data generation can take a little longer. I think this scan took about 20 seconds, so it's okay. The same thing here again for sonar. We have a model of the underwater ground, again created by photogrammetry. This is a model with 340,000 polygons, so for a very big area it's not that detailed, but it's enough for us to get some training data here and receive some side scan data, which is quite different from the rotating LiDAR scanner we saw before. Okay. The thing about synthetic data is always the question: how close do we get to the real-world data? We tried to check on this with some evaluations as well as validations.
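The idea behind the object swapping is roughly the following sketch, assuming a source folder of OBJ files and the OBJ importer of Blender 2.93. The folder path and the swappable marker property are made up for illustration; the add-on's own implementation will differ.

    import bpy
    import glob
    import os

    model_dir = "/path/to/aircraft_models"        # hypothetical source folder, e.g. models from ShapeNet

    for path in sorted(glob.glob(os.path.join(model_dir, "*.obj"))):
        # Remove the previously imported model(s), keep the rest of the scene (sensor, camera, ground, ...).
        for obj in list(bpy.data.objects):
            if obj.get("swappable"):              # hypothetical custom property marking swap targets
                bpy.data.objects.remove(obj, do_unlink=True)

        # Import the next model and tag it so it can be labelled and swapped again later.
        bpy.ops.import_scene.obj(filepath=path)
        for obj in bpy.context.selected_objects:
            obj["swappable"] = True
            obj["categoryID"] = "aircraft"

        # ... here one scan would be triggered before moving on to the next model ...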
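And to give an impression of what the exported data can look like, here is a minimal sketch of writing labelled points to HDF5 with h5py. The dataset names and the file name are made up, the add-on's actual layout may differ, and the same information could of course go into LAS or CSV instead.

    import h5py
    import numpy as np

    points = np.random.rand(1000, 3).astype(np.float32)   # placeholder XYZ coordinates
    categories = np.random.randint(0, 2, size=1000)       # placeholder per-point class IDs

    with h5py.File("scan_0001.h5", "w") as f:             # hypothetical output file name
        f.create_dataset("points", data=points, compression="gzip")
        f.create_dataset("categoryID", data=categories, compression="gzip")

    # Reading it back for training is just as simple:
    with h5py.File("scan_0001.h5", "r") as f:
        pts, labels = f["points"][:], f["categoryID"][:]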
For example, here we compared the depth images of a real Kinect and of our time-of-flight camera in Blender. We created the same scene in 3D as well as in reality, just by stacking up some boxes, rotating them, getting some different depth values, and then compared the depth images in C and D. And the results were very good. The biggest differences were in the vignette of the image, some of the surroundings, which might be linked to some errors in our camera, I don't know, but in the main part of the sensor the results were very similar. We did the same with a point cloud measurement for a LiDAR scanner, again in the 3D model of our research mine. Image A is a photo of the real mining gallery, this is the 3D model, and we used the part with the fewest errors. As you can see, doing photogrammetry or 3D scanning, there are also some difficulties in creating the 3D model at the end; we want a reasonably clean model, so we compare a real scan in the research mine with a virtual scan in the 3D model obtained by 3D scanning. This is the part of the mine we compared, using CloudCompare with a cloud-to-cloud distance metric to compare the two point clouds. A point cloud comparison is quite difficult, because even if you use the same LiDAR scanner in the same environment twice you get some differences in the point clouds, as there are always some measurement errors. So a comparison between a virtual scanner and a real-world scanner is also difficult, but to get a general impression of the differences and of the gap between virtual sensing and real sensing, it's quite a good starting point. Especially for this high-resolution point cloud it was a very good and very close result we could achieve. But the biggest point will be to train AI models with this virtually sensed data and then transfer them to reality, and this is what we need to do in the future. And just to show some metrics here: of course the computation time increases with the number of polygons as well as with the number of rays. If you set up a very high-resolution LiDAR sensor, the computation time increases, and it also increases with the number of objects you have in your scene. If you have a very complex environment, we also had some virtual city environments for example, the computation time is quite high compared to the simple example we saw. But again, time is not a critical factor here. So what we could do with Blender: we could create this data generator, especially for depth sensing data, and we could validate it in very simple use cases. What we still have to do is train AI models using just these data, go into reality, and then detect similar objects to the ones we had in our Blender 3D environment, using only this virtually sensed data. And this is the big question here: how well will the model perform, and how good was our representation of the sensors in Blender? And that's about it. I hope you got some impressions. Feel free to use the add-on, it's quite easy to install, and if you have any questions, don't hesitate to ask. Thanks.
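If you want to reproduce such a cloud-to-cloud comparison outside of CloudCompare, the basic idea is just a nearest-neighbour distance between the two point clouds. A minimal sketch, with made-up file names and assuming both clouds are already registered in the same coordinate system:

    import numpy as np
    from scipy.spatial import cKDTree

    # Hypothetical inputs: real and virtual scans as N x 3 arrays, already aligned.
    real = np.loadtxt("real_scan.csv", delimiter=",")      # made-up file names
    virtual = np.loadtxt("virtual_scan.csv", delimiter=",")

    # For every virtual point, find the distance to the closest real point.
    tree = cKDTree(real)
    distances, _ = tree.query(virtual, k=1)

    print("mean cloud-to-cloud distance:", distances.mean())
    print("95th percentile:", np.percentile(distances, 95))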