Hello everyone, my name is Krishna, and I come from Intel. A brief intro about myself: I work on the Intel machine learning and embedded tools team. Today I am going to talk about a new toolkit from Intel called OpenVINO. Has anybody heard about it or have any idea what it is? I will introduce you to the tool, and I also have some demos for you today: a couple of object detection samples that I have prepared using OpenVINO. Once we go through the flow, I will show you how it is put into practice by running a couple of models. A good thing about this tool is that it is free; you can download it from the Intel website and try it out on your own as well. The basic outcome of this session should be a good idea of what the tool is and how you can get started by running some of the inference samples.

So, what is OpenVINO? Most people are familiar with classical computer vision applications built with OpenCV. Intel wanted to add machine learning capabilities on top of that, especially for computation at the edge, and OpenVINO is a toolkit aimed mainly at computer vision applications at the edge. The typical use case is this: you have a trained model, say for object detection, which you have trained separately. You take that trained model as the prerequisite for OpenVINO and deploy it on an edge device like this one. This is an Intel NUC, and I also have a Movidius compute stick here, plugged into the USB 3 slot. You can think of this setup as a typical edge device. It can run inference at the edge continuously, so you do not have to send your video stream or images to a server for inference. That is one typical use case; we will see more details on how it works shortly.

These are some of the benefits you get by using OpenVINO. At the lowest level it uses highly optimized Intel libraries such as the Math Kernel Library. Every year Intel has new platforms coming out, and these libraries take advantage of the new architectures to help you accelerate performance. Deep learning solutions based on convolutional neural networks are incorporated as well.

So, what is inside OpenVINO? There are two broad parts. On the left side you can see the DLDT, the Deep Learning Deployment Toolkit. The second half is traditional computer vision. Previously this was called the CV SDK, the Computer Vision SDK, but it has been rebranded as OpenVINO to cover the whole tool chain. On the Deep Learning Deployment Toolkit side there is a Model Optimizer and an Inference Engine; we will see how they work in the next slide. On the classical side we have optimized OpenCV and OpenVX libraries, plus the Media SDK for encode/decode applications. The OpenCL drivers are used for heterogeneous computing. For example, there is a CPU here, an i7, and a Movidius compute stick which uses a VPU, and you do not have to recode for each of these hardware targets. You write your code once and deploy it on multiple hardware targets using plugins that are built into OpenVINO. In the demo, I will show you how you can use those plugins.
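To make the plugin idea concrete, here is a minimal sketch of what that looks like from the command line. The sample name and paths below are hypothetical placeholders, not the exact ones from my demo; the point is that the same binary and the same IR files target different hardware purely through the -d device flag:

```bash
# One binary, one model; only the device plugin changes.
# Sample name and paths are illustrative placeholders.
./object_detection_sample -m model.xml -i input.mp4 -d CPU     # loads the MKL-DNN CPU plugin
./object_detection_sample -m model.xml -i input.mp4 -d GPU     # loads the integrated-graphics plugin
./object_detection_sample -m model.xml -i input.mp4 -d MYRIAD  # loads the Myriad plugin (Movidius VPU)
```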
This slide just shows the target platforms we support. OpenVINO supports most current-generation CPUs, from the 6th generation (code name Skylake) through the 8th generation. From Skylake onwards, most Intel CPUs and integrated graphics are supported, and it runs on the common OSes: Ubuntu, Windows, and CentOS. Today I will be demonstrating on Ubuntu 16.04 LTS. There are some prerequisites, such as Python 3.4 and OpenCV, which you need to install before installing OpenVINO; of course, the installer will show you the prerequisites and dependencies, and you can also find them on the first page of the OpenVINO download website.

Now let us look at a typical workflow; I think this is one of the most interesting parts of OpenVINO. We saw the first part of the OpenVINO block diagram, the DLDT, the Deep Learning Deployment Toolkit; this slide is a more detailed view of what is inside the DLDT. These are the standard frameworks supported by OpenVINO: Caffe, TensorFlow, MXNet, and recently Kaldi was added as well. Training happens on these frameworks, typically in a high-compute server environment. Then you take the trained model, for example a Caffe model plus its prototxt if you have it, and use it as the input to the Model Optimizer. The Model Optimizer converts the trained model into an intermediate representation, which is what the Inference Engine uses. The intermediate representation is a set of two files: an XML file and a bin file. The XML file holds all the topology-related information; the bin file contains the weights and biases. This is fed to the Inference Engine, which is a common API that can execute your inference on CPU, GPU, FPGA, and the Movidius stick. There is also the Gaussian Neural Accelerator (GNA), a recent addition to OpenVINO used mainly for speech processing tasks, but those are the four standard plugins we support. The GPU here means the Intel integrated GPU, which is part of the CPU die itself.

So what happens in the Model Optimizer? You take a trained model, but not all layers of a trained model are needed for inference: only the forward pass is used, we do not require the backward pass, so many of the layers are not required. The Model Optimizer has intelligence built in to remove such layers so that performance is optimized; only the layers necessary for inference are kept. And it is an offline activity; you do not have to do it live. You can run it separately in some environment, maybe the same environment where you did the training, and then send only the XML and bin files, which also take less memory. It is a one-time activity: put them on an edge device like this and your inference runs there directly.

And no, this is not exactly doing the training. Training happens as a separate first step, and the trained model is the input fed to the Model Optimizer; this toolkit does inference only.
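As a concrete illustration of that conversion step, here is a minimal sketch of running the Model Optimizer on a Caffe model. The model file names, paths, and output directory are illustrative assumptions, not taken from my demo; mo.py and these flags are part of the toolkit:

```bash
# Hedged sketch: convert a Caffe model to the IR format (XML + bin).
# Install path matches the 2018-era layout; model paths are hypothetical.
cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer
python3 mo.py \
    --input_model /path/to/bvlc_alexnet.caffemodel \
    --input_proto /path/to/deploy.prototxt \
    --data_type FP32 \
    --output_dir /path/to/ir
# Output: bvlc_alexnet.xml (topology) and bvlc_alexnet.bin (weights and biases).
```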
If you have a dataset, say the Pascal VOC dataset or some other dataset you want to use for, say, object detection, you train it on the framework of your choice, maybe Caffe or TensorFlow. Then whatever such a model outputs, the deploy prototxt and the model file, you take as input and feed to the Model Optimizer. It is a Python-based tool: you just run the Python script, give the path to your Caffe model as an argument, and it outputs the XML and bin files, which you then take forward for inference.

For TensorFlow, yes, you need to freeze it: you freeze the TensorFlow model first and then you can feed it as input to OpenVINO. The samples I have today are based on Caffe; I recently did AlexNet as well, the BVLC AlexNet, where you can directly input the Caffe model. But for TensorFlow you need the freezing step first.

The training part is actually a separate topic. For that we usually recommend the Xeon line of processors, and you can use Intel-optimized Caffe, Intel-optimized TensorFlow, and we also have an Intel distribution of Python. So instead of taking the standard available version of Caffe, you can choose the Intel version; it is open source, and you can just search for the Intel version of Caffe on GitHub.

As for custom layers, it depends. If you download the OpenVINO toolkit, it comes with detailed documentation listing which Caffe layers and which TensorFlow layers are supported. If your custom layer uses only what is listed as supported, then it is fine; but if it is something new, OpenVINO treats it as unknown and extra work has to be done. So yes, it depends: OpenVINO documents a detailed list of the operations it supports internally, and if your custom layer stays within that list, it is directly supported; otherwise it is a longer process and we need to do some extra work.

I can show an example now; in the interest of time I will just connect to this machine. This is an Ubuntu 16.04 LTS system. Installing OpenVINO is very straightforward: you download it from the Intel website, you get a tar file, you unzip it and run the install script. It takes /opt/intel as the default directory, so you get a folder like this, computer_vision_sdk with a version number; I am using the 2018 R2 version. These tools get updated regularly, so you can expect another two or three versions to come out soon, and the latest version can always be found on the website. Inside you get a lot of directories; there is a directory called deployment_tools, which is related to the DLDT, plus a samples directory. The toolkit also ships models pre-trained on frameworks like Caffe and TensorFlow, with the XML and bin files ready in FP16 and FP32 formats, so you can just use those and get started with your inference. These are some of the samples we have. Just to give an example, I will run a face detection sample and quickly explain what it means.
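The command I am about to run looks roughly like the sketch below. The exact sample name and model path vary between OpenVINO releases, so treat them as illustrative; the -m and -i options are the ones I will walk through:

```bash
# Hedged sketch of the face detection demo invocation (2018 R2-era
# layout; the exact sample name and model path may differ by release).
./interactive_face_detection_sample \
    -m /opt/intel/computer_vision_sdk/deployment_tools/intel_models/face-detection-adas-0001/FP32/face-detection-adas-0001.xml \
    -i /path/to/input_video.mp4
```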
We have this interactive face detection sample as the first argument, and then the -m option is the path to the model file. Note that the model here is no longer tied to Caffe or TensorFlow; it is just the IR format, because once the Model Optimizer has converted it, it is all XML and bin files for the Inference Engine. If you look here, it points to an FP32 model, the face-detection-adas XML file. Then -i is the input: where does it get its input from? It can work on an image, on a video, or even on a stream coming from a webcam; we have tried all three. I will show one example on a video.

You can see here it is doing face detection; it has only one label, and at the top you can see the inference time and the FPS as well. This is a fairly powerful i7 processor, so it gets up to around 40 to 50 FPS, but we have tried the same on a low-end UP Squared board and we can get up to 15 to 20 FPS there. It is quite straightforward; you just need to know the options.

And now the interesting part. By default, if you do not do anything, the Inference Engine uses the CPU, but there is an option to tell OpenVINO to run the inference on the GPU as well. The way we do it is with the -d option for the target device: we set it to GPU and execute, and you will see the plugins are different. If it is CPU, it loads the MKL-DNN plugin; now this is happening on the GPU. We can also look at layer-by-layer information if we want, but the main point is that you do not have to redo any code here. You write the code once and leverage all the different hardware with the same code; you just need to know which plugins to load. Let us execute it again. By GPU here I mean the Intel integrated GPU, which is part of the same CPU die. It is coming up close to around 40 to 45 FPS.

The other option is this Movidius stick. The Movidius stick is a USB form factor device; you can see it here, I am just holding it up. It fits into one of your USB 3 slots; it just needs a USB 3 slot. There is also a PCIe-based variant, but for now we just have the USB 3 Movidius stick. If you want to use Movidius, note that it supports FP16 models only; it does not support FP32. Let me run one on this now; I tried it a little while ago. It is a different model: a security barrier demo which uses three models, one to detect the vehicle, one to detect the number plate, and one to read what is written on the number plate. It uses the three models in series, and all of these are inputs in the same IR format. The command looks long because three models have to be fed, but it is all the same pattern: -i stands for the input, and we give the XML file for each model. Once I run it, you can see "loading plugin MYRIAD"; MYRIAD stands for the VPU. You can see from this image that it can detect the vehicle and the color of the vehicle, which is written here in white; it is a little hard to see, but it says it is a black car, and it detects the number plate as well. The FPS, I think, is very high. There are three models here, and the metrics are given for each of these models.
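For reference, the security barrier command follows the pattern sketched below. The model names and the per-model flags are recalled from the shipped demo as best I remember, so treat the exact names as assumptions; MYRIAD requires the FP16 variants of the models. Adding -pc to a command like this prints the per-layer performance counters I discuss next:

```bash
# Hedged sketch of the three-model security barrier demo on the
# Movidius stick; model names and -m_va/-m_lpr/-d_* flags are
# recalled from memory and may differ by release.
./security_barrier_camera_demo \
    -m     FP16/vehicle-license-plate-detection-barrier-0106.xml \
    -m_va  FP16/vehicle-attributes-recognition-barrier-0039.xml \
    -m_lpr FP16/license-plate-recognition-barrier-0001.xml \
    -i /path/to/car_image.png \
    -d MYRIAD -d_va MYRIAD -d_lpr MYRIAD
```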
So this is on MYRIAD. If you want more in-depth metrics, there is another option called performance counts, -pc, which gives you layer-by-layer information on which layer is executing where, as you can see here. There are many options supported by OpenVINO, and since it is a free tool, once you start using it you will encounter a lot of them. It has very detailed documentation in this folder here; this folder has detailed documentation for each of the models used and for how you can play around with the different options.

Yes, it supports Windows, CentOS, and Linux. On Windows it is quite similar: instead of the script you have an .exe installer, but everything else is similar, and the folder structure is the same.

Yes, modifying some of the layers is possible, but it is handled on a case-by-case basis; it is a little elaborate and not straightforward.

Again, it depends on the use case. A couple of weeks back we had interactions with some customers who were fine with 10 FPS; that was their requirement. Their main problem was that their deployment was in a rural place where internet connectivity is very patchy, so getting data back and forth between the server and the node was very difficult. In such cases FPS was a factor they could compromise on, but they needed this edge capability. They could train the model separately, transfer just the XML and bin files to the edge device, run the inference locally, and only transfer the important results to the server for further analysis.

As for whether the image is automatically downscaled before being fed to this object detection SSD model: yes, it is an SSD model, but I would need to check that detail, so maybe you can try it out. For video input, I am not sure offhand, but we have some sample datasets we can send you if you want. So, thank you.