The objects in your daily life are about to become smarter and to interact more with their environment. A key capability for this is enabling them to see, which is known as computer vision. With the advance of artificial intelligence, this capability is now possible on small microcontrollers, enabling many new use cases. Traditional computer vision requires careful algorithm design, which limits accuracy when conditions vary in the field. The deep learning approach has proven far superior for more complex tasks: it learns optimized models from examples rather than relying on handcrafted rules meant to apply in all possible conditions. For instance, on the famous ImageNet data set, deep learning allowed top-1 accuracy to jump from about 50% to nearly 90%. However, porting computer vision AI algorithms to simple microcontrollers with limited memory and performance can prove difficult. Today we are going to show how the ST ecosystem has radically simplified this task for all embedded developers.

Hi, I'm Thibault. In this video, we are going to see how to develop a complete computer vision application, from camera data acquisition to a neural network classifier running on an STM32 target. I'm going to present the boards and the software needed, and then I will jump into the demo. For this, we are going to use some of the ST AI solutions, such as STM32Cube.AI, the Function Pack FP-AI-VISION1, and the STM32H747 Discovery Kit. You can take a look at our wiki for complete step-by-step guides. Let me present these tools. First, STM32Cube.AI is our well-known tool to convert trained neural networks into optimized C code for microcontrollers. Next, the Function Pack FP-AI-VISION1 is a great way to jumpstart your computer vision project. It contains code examples to run AI-based computer vision applications, such as food recognition or person presence detection. Today, we will take advantage of one of its recent features to develop our application.
Indeed, FP-AI-VISION1 now contains a way to use your STM32H747 Discovery Kit as a UVC webcam. What you need to know about this board is that it is based on our high-performance STM32H7 clocked at 400 MHz, offering 1 MB of RAM and 2 MB of flash. It also embeds a capacitive touch LCD display as well as a ZIF connector to attach a camera module. Speaking of camera modules, let me present the new B-CAMS-OMV. This vision bundle contains everything you need to transform any ST Discovery or evaluation board into a complete computer vision system. Out of the box, it contains an ST camera module with the OV5640 5-megapixel image sensor, a camera module adapter, and a flex cable. The camera module adapter can also be used with other camera sensors, such as a Waveshare or an OpenMV camera. You can experiment with other kinds of sensors, such as a grayscale HDR global-shutter camera from ST or the OpenMV FLIR adapter camera, which is a thermal camera. This enables many new use cases, like the detection of people, gas leakage, heat loss, or motor overheating, and many more.

Before starting a computer vision project, it is very important to assess the use case. What is the expected performance? Which MCU do you target? Do you have low-power constraints? And so on. Answering all these questions will help you choose your data set, your model topology, and your input size. Then, when you build your data set, you need to take a few guidelines into account. For instance, the data set should be representative of the runtime conditions: you should capture your whole problem space, taking into account rotation, scaling, translation, light changes, and so on. Also, it is always better to distribute the examples equally across all categories; this is called a balanced data set. And finally, using the same sensor for data set collection and at inference time is always a good idea. Now let's see how it goes.
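As a quick sanity check on the balanced-data-set guideline, a small Python snippet can count the images per class before training. This is a minimal sketch, not part of the function pack; the folder-per-class layout, the file extensions, and the 20% tolerance are my assumptions:

```python
from pathlib import Path
from collections import Counter

def class_distribution(dataset_dir):
    """Count images per class, assuming one sub-folder per class."""
    counts = Counter()
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in {".jpg", ".jpeg", ".png", ".bmp"}
            )
    return counts

def is_balanced(counts, tolerance=0.2):
    """Flag the data set as imbalanced if any class deviates by more
    than `tolerance` (20% by default) from the mean class size."""
    mean = sum(counts.values()) / len(counts)
    return all(abs(n - mean) <= tolerance * mean for n in counts.values())
```

Running `class_distribution` on your capture directory before training makes it easy to spot a class that needs more examples.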
In order to collect some images, we start by opening the Function Pack FP-AI-VISION1 and navigating to the USB webcam application. We can take the binary and drag and drop it onto the board, which flashes it. We need to make sure to press the reset button and to connect the board through the USB OTG port. Then, we open the Windows Camera application and select the STM32 webcam. The image is available in VGA and QVGA resolutions; we will stick with QVGA for this demo. We can now start taking images, varying the camera orientation and the pose. Once we are done, we just move the files to a dedicated directory and move on to the next class.

Now that the data set collection is finished, I have created a zip file containing all the images and uploaded it to Colab. Colab is a free Jupyter notebook environment running in the cloud that gives access to GPU resources. I have already run this notebook, so I will comment on the main steps. First, we import the necessary packages and unzip the data set. Next, we load the data with the help of the image_dataset_from_directory function. This creates (image, label) pairs and automatically resizes our images to 128 by 128 pixels. We can now visualize our data set. We split it into a training and a validation set in order to monitor the training process. We add data augmentation to the training set (random contrast, random zoom, rotation, translation) and visualize the transformations. The last step before training is to normalize the data. Now we can move on to the creation of our model. We will use a pre-trained MobileNet V1. This technique, called transfer learning, lets us train a model without starting from scratch, because a model pre-trained on a large and general data set serves as a generic model of the visual world. We can see here that the training took place, along with the plots of the loss and accuracy. The last step is to quantize the model.
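The Colab steps described above, loading with image_dataset_from_directory, splitting into training and validation sets, augmenting, normalizing, and adding a new head on a frozen MobileNet V1 backbone, can be sketched in Keras roughly as follows. This is an illustrative reconstruction, not the actual notebook; the function names, the 80/20 split, and the exact augmentation factors are my own choices:

```python
import tensorflow as tf
from tensorflow import keras

IMG_SIZE = (128, 128)

def load_datasets(data_dir):
    """Load (image, label) pairs from a folder-per-class directory,
    resizing to 128x128 and splitting 80/20 into train/validation."""
    common = dict(validation_split=0.2, seed=42,
                  image_size=IMG_SIZE, batch_size=32)
    train_ds = keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common)
    val_ds = keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common)
    return train_ds, val_ds

def build_transfer_model(num_classes, weights="imagenet"):
    """MobileNet V1 backbone with a fresh classification head."""
    augment = keras.Sequential([          # applied only during training
        keras.layers.RandomContrast(0.2),
        keras.layers.RandomZoom(0.1),
        keras.layers.RandomRotation(0.1),
        keras.layers.RandomTranslation(0.1, 0.1),
    ])
    base = keras.applications.MobileNet(
        input_shape=IMG_SIZE + (3,), include_top=False, weights=weights)
    base.trainable = False                # freeze the pre-trained features
    inputs = keras.Input(shape=IMG_SIZE + (3,))
    x = augment(inputs)
    x = keras.applications.mobilenet.preprocess_input(x)  # normalize
    x = base(x, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training is then a single `model.fit(train_ds, validation_data=val_ds, epochs=...)` call, whose history provides the loss and accuracy plots mentioned above.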
This is a very important thing to consider when working with embedded devices, as it improves both the latency and the model size. We use the TensorFlow Lite converter to do this, and the output is a quantized model in the TFLite format. Now that we have our model, let's import it into Cube.AI. STM32Cube.AI allows you to import a wide range of models from popular frameworks such as Keras or TensorFlow. It also supports the ONNX format, which lets you use MXNet, PyTorch, MATLAB, etc. The goal of the tool is to produce C code that is optimized in terms of inference time and memory footprint for ST microcontrollers. I have opened STM32CubeMX and created a new project with the H747 Discovery Kit. All I need to do now is to load the STM32Cube.AI plugin. Let me import the model and show you its capabilities. With a TensorFlow Lite model, we can choose between two runtimes: Cube.AI and TensorFlow Lite for Microcontrollers. Once the model is analyzed, STM32Cube.AI gives you a quick overview of its memory footprint, complexity, and parameters. In the embedded world, these figures are critical for choosing the right model for the right target. We can go more in depth with the Show Graph button: we can see the model topology, the C graph, and the RAM usage layer by layer for the so-called activation buffer.

In the Advanced menu, we find options to deal with external memories: the allocate inputs/outputs option, the split weights option, and the relocatable network option. The allocate inputs option enables you to reuse part of the activation buffer for the input buffer. Let's check this one; as our input buffer is an RGB image, it takes a lot of space. The split weights option gives you finer-grained memory placement of the different layer weights; you could choose, for instance, to place some weights in external memory. The relocatable option enables the creation of a binary object which can be installed and executed anywhere in memory.
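The quantization step with the TensorFlow Lite converter might look like the following sketch. Full-integer (int8) conversion calibrated with a representative data set is one common recipe for microcontroller deployment; the helper name quantize_to_int8 and the calibration details are my assumptions, not the actual notebook code:

```python
import numpy as np
import tensorflow as tf

def quantize_to_int8(model, rep_images, out_path="model_quant.tflite"):
    """Convert a trained Keras model to a fully int8-quantized .tflite
    file. `rep_images` is a small, representative sample of training
    images used to calibrate the activation ranges."""
    def representative_dataset():
        for img in rep_images:
            # The converter expects a list of batched float32 inputs.
            yield [img[np.newaxis].astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8   # camera pixels arrive as uint8
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()
    with open(out_path, "wb") as f:
        f.write(tflite_model)
    return out_path
```

The resulting .tflite file is what gets imported into STM32Cube.AI in the next step.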
This binary contains a compilation of the C files, including the forward kernels and the weights. This provides a flexible way to upgrade an AI-based application without regenerating the whole firmware. Let's analyze our model again. We can see that the RAM footprint was reduced thanks to the allocate inputs option. It is also possible to validate that the generated code yields the same output as the reference model; you can validate on PC or on target without having to write any code. Cube.AI can do a lot more: you can quantize your model through Cube.AI, or use the compress option to compress the fully connected weights of floating-point models. Once you are happy with your model, you can generate the code for STM32CubeIDE, Arm Keil, or IAR Embedded Workbench. It is also possible to use STM32Cube.AI through the command line.

Now let's move on to the integration into the function pack. I have opened the Person Detection project in STM32CubeIDE. I replaced the network.c and network_data.c files with my new model and updated the names of the labels. We are now ready to compile the project and flash the board. The function pack handles camera acquisition, available in both VGA and QVGA; it also handles the resizing, the pixel format conversion, and the quantization. Note that both floating-point and quantized models are supported by the function pack. If your model doesn't fit in internal memory, you can find examples of external memory placement in this pack. OK, let's see the demo. I can put this board in front of the camera, and I can see that it is correctly detected by the board at a frame rate of 13 frames per second. I can try with other sensors, for instance, and we can see it's working fine. Creating this application was very easy; the ST AI ecosystem helps you a lot in this process. The STM32H747 Discovery Kit and the B-CAMS-OMV are great tools to start your proof-of-concept computer vision application.
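To give an idea of what the on-PC validation compares against, here is a small sketch that runs a .tflite model with the TensorFlow Lite interpreter on the host to produce reference outputs. The helper name and usage are hypothetical; STM32Cube.AI performs this comparison for you without any code, so this is only for readers who want to cross-check by hand:

```python
import numpy as np
import tensorflow as tf

def tflite_reference_output(tflite_path, image):
    """Run a .tflite model on the host and return its output vector,
    e.g. as a reference to compare against on-target results.
    `image` must match the model's input shape and dtype."""
    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp["index"], image[np.newaxis])  # add batch dim
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])[0]
```

Feeding the same image to this interpreter and to the board makes any numerical divergence in the generated C code easy to spot.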
This solution can be integrated on any STM32 MCU with a Cortex-M7, M4, or M33 core. The use case is not limited to the classification of boards; it can be extended to any other object in industry. More classification applications can also be addressed, such as smart kitchen appliances (a fridge that recognizes the products available, or an oven that stops cooking when the meal is ready), smart agriculture (recognizing ripe fruits and vegetables), smart farming (classifying and counting animals), or smart cities (recognizing cars at traffic lights), and many more. We have developed FP-AI-VISION1 in such a way that you can easily port this firmware to your own use case with your own data sets: simply regenerate the files and you are ready to go. STM32Cube.AI and FP-AI-VISION1 are already available on st.com. Please visit our webpage to find out more about the ST AI solutions for microcontrollers.