So, it's time. Let's start again, this time with the embedded state of the union. Please welcome Olivier.

Hello everyone. So this is a bit of a follow-up to Tim's talk, for those who were there an hour ago, but this time I'm going to really focus on what's new for embedded use cases.

First, who am I? Quick introduction. My name is Olivier, hi. I've been a GStreamer contributor and developer for 12 years now, at Collabora since 2007. First I started doing video calls using Telepathy for the Maemo and MeeGo platforms with Nokia, and since then we've diversified into basically everything that you can do with GStreamer, from video editing to embedded systems to transcoding cloud systems. But a lot of the work that I've been doing in the last couple of years has been on embedded systems, which is why I'm going to talk about them today.

Just a quick introduction: what kind of embedded systems use GStreamer? And the answer is a lot. Everything that does video or audio on embedded systems; you will find GStreamer in many, many products, and sometimes we're really surprised to find it there. For example, in the TV space it's pretty big. On my slide you have some LG TVs and some Samsung TVs; a lot of these actually have GStreamer on the inside, on the smart TV side. On the top left you have the Xfinity box from Comcast. For those who are not from America, Comcast is the biggest cable company in the world, and every set-top box that they ship is a Linux box with GStreamer for all the playback. And then there's a bunch of others; on the bottom left it's YouView, which is a British company, more set-top boxes. A lot of the TV space has GStreamer on the endpoint, but recently it's also been growing on the production side, as they've been transitioning from hardware and FPGAs to software-based workflows, and GStreamer has seen a lot of traction there.

In-flight entertainment is another big one that people interact with all the time. In a lot of planes, almost all modern planes these days, and by modern I mean from the last 10 to 15 years, you get in, and in the in-flight entertainment, every time you play a movie, that's GStreamer. Even in the space station. This is a really cool one, one of my favourites, I show it every time: it's a little camera that floats inside the space station, made by the Japanese space agency, and it has GStreamer inside. I spoke about these, but there's also a bunch of other devices, like security cameras; a lot of the high-end ones have GStreamer on the inside.

Yes? (Question from the audience about which GStreamer versions these devices run.) So it depends. All the new generations are generally on 1.x. The ones that fly in the sky in the planes are probably from 10 years ago, so that's probably whatever version was deployed back then. But this one, for example, is on a pretty recent version. Smart TVs, some of them are actually quite up to date: the Comcast boxes are on 1.14 as far as I know. A lot of these actually keep up, because they want to deploy new features, right? They need the newer DASH and the newer HLS features, so they have an incentive to keep up, and they have rapid deployment and all these things in this industry now. Especially the TVs and the cable operators:
they really want to deploy new features very quickly. So, I was saying there's all of these, but there's also a bunch of others, like all the industrial equipment that you don't think about, but it processes video, and it uses GStreamer very often, because on many embedded chips, when you buy them, GStreamer is the framework that comes already working. If you buy a Xilinx FPGA that has a video encoder in it, the framework that Xilinx enables is GStreamer, so it's the quickest way to get a working product.

So I'll give a little summary of the things that we have done that Tim hasn't covered, things that are really specific to embedded.

First, a lot of work has happened again this year around Video4Linux (V4L2) codec support. Video4Linux is the Linux kernel API used for devices that work with buffer queues: video encoders and video decoders among other things, capturing from a camera, some display devices, and some other video processing devices, scalers and things like that, which are not in the display path and are used in memory-to-memory mode. There's a lot of these, but a big part is video encoders and decoders. In the last year we've merged HEVC encoder and decoder support, we have a JPEG encoder plugin, and vicodec, which is kind of interesting because it's not a useful codec at all: it's only there to test the kernel infrastructure. It's a fake codec, implemented in software in the kernel, that lets you test the whole codec infrastructure inside the kernel without having to deal with the actual hardware.

Another nice feature: especially on the camera side, some cameras took seconds to probe, because we enumerated a bunch of things that we didn't really need to enumerate. Nicolas has done a lot of work on this, and now the device probing is instant on almost all hardware, so you can get the list of all the devices and their relevant capabilities.

Last year I talked about stable element names for encoders and decoders. Originally, for the Video4Linux elements in GStreamer, we were generating a new element name every time a device would pop up: you would connect a camera, or a new encoder would appear, and the element would show up with a new name. Which means that on some systems, every time you rebooted, the elements would have different names. That's okay if you're using something like playbin and an auto-generated pipeline, because you don't care about the name, but if you're writing pipelines by hand it was a bit annoying. So now, we still have those, but we also have a set of elements that have static names, and then you give them the device, so you can control it more manually. We had this for encoders and decoders last year; this year we've also added the same thing for transformation elements. Transformation elements are things like scalers or colour converters, elements that convert raw images between formats.
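To give a rough idea of the probing side: the usual way an application lists devices and their capabilities is through GstDeviceMonitor, which is what benefits from the faster probing. A minimal C sketch, with error handling omitted:

    #include <gst/gst.h>

    int
    main (int argc, char **argv)
    {
      GstDeviceMonitor *monitor;
      GList *devices, *l;

      gst_init (&argc, &argv);

      monitor = gst_device_monitor_new ();
      /* Only look at video capture devices; NULL caps means any format. */
      gst_device_monitor_add_filter (monitor, "Video/Source", NULL);

      devices = gst_device_monitor_get_devices (monitor);
      for (l = devices; l != NULL; l = l->next) {
        GstDevice *device = l->data;
        gchar *name = gst_device_get_display_name (device);
        GstCaps *caps = gst_device_get_caps (device);
        gchar *caps_str = gst_caps_to_string (caps);

        g_print ("%s: %s\n", name, caps_str);

        g_free (caps_str);
        gst_caps_unref (caps);
        g_free (name);
      }
      g_list_free_full (devices, gst_object_unref);
      gst_object_unref (monitor);
      return 0;
    }

On a typical board this simply prints each device's display name and the caps it supports.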
One thing that is being discussed, but isn't there yet because the kernel isn't there yet, is good support for stateless codecs. With most codecs that people traditionally use on this embedded hardware, you would just give the hardware the H.264 bitstream and it would give you the decoded stream; it did all the parsing of the bitstream and everything on the hardware side, which was often just firmware running on a different chip. But now the new trend is to make the hardware cheaper by moving all of this parsing onto the CPU side, doing it in user space, in software. So you need slightly different APIs in the kernel. There's a lot of work happening with the Request API in the kernel to do exactly that, and GStreamer will support it once it's there.

On a completely separate subject: we've merged a plugin called ipcpipeline, which I like to talk about because I think it's really cool. It allows you to split a pipeline into multiple processes, but have a single master pipeline that controls all the others. The typical case is that you have one process that talks to the network, that downloads the DASH stream from the internet and is exposed to it, and then it passes the data to another process which actually talks to the hardware decoder. So you can separate out the hardware decoders, which are often not as secure as you would like (they do a lot of bitstream parsing and things like that), and make sure that the part that might be compromised by an incoming stream is not connected to the network, and vice versa. You can actually split the pipeline into multiple levels: one stage that talks to the network, one stage that does the parsing, and then a separate stage that talks to the hardware. This is also useful to implement DRM, sadly. But it really allows you to have multiple levels of security. Right now you have to create everything manually, it's not automatic at all: you have to actually build your pipelines by hand. But for these kinds of devices you want to control exactly what happens in which process anyway.
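To make the split a bit more concrete, here is a rough sketch of what it can look like, assuming the element and property names from the ipcpipeline plugin in gst-plugins-bad (check the plugin documentation for your version; the stream details here are made up, and a software decoder stands in for the hardware one):

    #include <gst/gst.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int
    main (int argc, char **argv)
    {
      int fds[2];

      gst_init (&argc, &argv);
      socketpair (AF_UNIX, SOCK_STREAM, 0, fds);

      if (fork () == 0) {
        /* Child: the process that touches the decoder.  Its top-level element
         * is an ipcslavepipeline, so the master can drive its state remotely. */
        GstElement *slave = gst_element_factory_make ("ipcslavepipeline", NULL);
        GstElement *chain = gst_parse_bin_from_description (
            "ipcpipelinesrc name=ipcsrc ! h264parse ! avdec_h264 ! autovideosink",
            FALSE, NULL);
        gst_bin_add (GST_BIN (slave), chain);
        g_object_set (gst_bin_get_by_name (GST_BIN (slave), "ipcsrc"),
            "fdin", fds[1], "fdout", fds[1], NULL);
        /* The slave just waits; state changes arrive from the master over the fd. */
        g_main_loop_run (g_main_loop_new (NULL, FALSE));
      } else {
        /* Parent: the master process that talks to the network. */
        GstElement *master = gst_parse_launch (
            "souphttpsrc location=http://example.com/stream.ts ! tsdemux ! "
            "h264parse ! ipcpipelinesink name=ipcsink", NULL);
        g_object_set (gst_bin_get_by_name (GST_BIN (master), "ipcsink"),
            "fdin", fds[0], "fdout", fds[0], NULL);
        /* Setting the master to PLAYING drives the slave through the same states. */
        gst_element_set_state (master, GST_STATE_PLAYING);
        g_main_loop_run (g_main_loop_new (NULL, FALSE));
      }
      return 0;
    }

The point is that the application only ever manages the master pipeline; the slave follows its state changes over the socket.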
Another completely separate thing: we've implemented a new mode of interlacing that actually exposes how some of the hardware works. Traditionally, with interlacing, in analog TV you have one frame that contains every odd line, and then the next frame, or rather the next field, contains every even line. So instead of having 30 frames a second you have 60 fields, and these fields are really half a frame each: one odd, one even, odd, even. When this came to digital, mostly what you did was put both of these fields in the same file, in the same frame, in the same buffer, so you have the odd lines and the even lines taken at slightly different times; this is why, when you look at it without proper deinterlacing, you see a jagged image. That's the traditional way to do it. Some hardware, and H.265, actually do it in a different way, where you have the fields separately, in separate buffers, at separate times. This allows for a bit lower latency, because you don't have to wait for both fields to have been captured before you start processing the first one. We've implemented this alternate-fields mode in the GStreamer framework and in the elements that needed it.

We've also done a bunch of work to reduce the latency in RTP pipelines. When we actually tried to measure the latency of the actual data, we realized that the latency GStreamer claimed to have was not really true: there were a bunch of little bugs all around the different elements that made the data stay there longer than it should have. So we did a bunch of little bug fixes, and now it can actually push buffers with the latency it claims to have, which can be almost zero if you're not doing any queuing.

We've also done a bunch of work on the GStreamer OpenMAX elements. A lot of this is for the Xilinx Zynq MP platform. We fixed a lot of bugs; we've added support for 10-bit video formats for HDR; we have a lot more DMABuf and zero-copy modes, so that you can connect them in different ways and have one side or the other do the allocation, depending on what's better for your specific use case. We've added region-of-interest support, which is a really cool one, I think. Basically, it allows the application to say: in my image, this bit is where the really important thing is, so put more bits there when you compress. For example, you can have a face recognition algorithm that says there's a face there, so in my video everything else is not that important, but the face has to be recognizable. Or the car plates, or things like that, or the subtitles: if you have subtitles burned into the video, you might want to put more detail there, so the text is readable even though you compress the rest of the video a lot. Another little thing is dynamic frame rate in the encoder: now, if you change the frame rate, the encoder doesn't re-initialize everything, it just changes the frame rate, so you can change it really quickly.
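As a rough illustration of the region-of-interest part: the generic mechanism is the region-of-interest meta from the GstVideo library, attached to buffers on their way into the encoder. Whether and how a given encoder honours it, and whether it wants extra per-region parameters, depends on the encoder implementation. A minimal sketch:

    #include <gst/gst.h>
    #include <gst/video/video.h>

    /* Pad probe that marks a fixed region as important; in real code the
     * coordinates would come from a detector (faces, plates, subtitles...). */
    static GstPadProbeReturn
    mark_roi (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
    {
      GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);

      buf = gst_buffer_make_writable (buf);
      /* Tell the encoder that the 320x240 box at (200, 100) matters most. */
      gst_buffer_add_video_region_of_interest_meta (buf, "face",
          200, 100, 320, 240);

      GST_PAD_PROBE_INFO_DATA (info) = buf;
      return GST_PAD_PROBE_OK;
    }

    /* Installed on the encoder's sink pad with something like:
     *   gst_pad_add_probe (enc_sinkpad, GST_PAD_PROBE_TYPE_BUFFER,
     *       mark_roi, NULL, NULL);
     */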
Another nice thing: we've done a bunch of little improvements to the DMABuf support. One of them is that we now do explicit DMABuf synchronization. It turns out that with the Linux kernel, when you use DMABufs and access them from the CPU, you actually need to tell it "I'm going to access this now" and "I've stopped accessing it", so that it does the appropriate cache invalidations; otherwise you get corruption, especially on Intel hardware, which is one that lots of people actually use. And we have a direct GL DMABuf uploader. Traditionally, when you would import a YUV image, you would have to use a shader to convert it to RGB to be able to put it on the display, but some hardware, in particular the Vivante GPUs in the i.MX6, can actually take the YUV and display it directly, entirely in hardware. So there's a direct uploader now that bypasses the conversion shader.

So that's a bit of a summary of the things I found by looking at the Git log and trying to remember what happened in the last year. Now, these are some things that are being worked on, more or less actively, but that I think are kind of interesting.

There's a lot of activity around neural networks; neural networks are all the rage this year, and a lot of it is about processing video, recognizing things in video, and GStreamer is really good with video. For example, NVIDIA has released something called the DeepStream SDK, which works both on the embedded Tegra side and on the cloud side, and which uses CUDA to do the actual neural network processing, but all of the video elements are actually using GStreamer. This is largely proprietary, sadly; GStreamer itself is not, but all of the interesting bits that NVIDIA has done use their proprietary pieces. But there are also a couple of open-source projects. One that has been released is called NNStreamer, and there's another one, GstInference, meant to be released next month, they promised, to bring neural networks to GStreamer. So I expect that in the next year or two we'll probably have something upstream that everyone can collaborate on, to integrate neural network frameworks with GStreamer. And another thing that's coming up now is that different companies are coming up with specialized hardware, so instead of using GPUs...

Yes? (Question from the audience about using TensorFlow.) A lot of these frameworks actually integrate TensorFlow or TensorRT with GStreamer; that's the main one, but I know some people are using other frameworks too. And yes, so, specialized hardware is also coming. I saw a lot of these: there are specialized accelerators that are not like GPUs but are really designed for AI workloads, and these will require some integration, and GStreamer is probably going to be a key piece there.

Another branch that has not been merged in a long time, but that I have promised to review, is the Android Camera2 API. This is a new camera API that Android has had for a couple of years now and that is much more modern than what we're using right now. This new API is exposed in the NDK, so you can use a native C API to access it instead of going through Java and coming back to C. It does things like letting you record a video and take a picture at the same time; it allows you to capture multiple streams at the same time, like the front camera and the back camera at the same time on some phones, and so on. All the modern features that the camera application on your phone has, they're all exposed through that API.
So there is a plugin in GitLab that is meant to be reviewed, and hopefully we'll merge it in the near future.

There's also a bunch of work around remote tracing. GStreamer has a tracer framework that allows you to write tracers to trace things, which is especially nice to find performance bottlenecks and figure out what exactly is going on while the pipeline is running. Now there's some interest in doing this remotely, so having some infrastructure to forward this information to a separate computer. You can have the tracer running on your embedded device, but a nice UI on your computer, to know what's going on in my pipeline and why my frames are being dropped even though all the indicators say they shouldn't be. So where's the bottleneck? What is the bug? What have I been doing wrong?

Another next step: we've just switched to GitLab, where we gain amazing new technologies such as doing a build before you merge the branch and not after, and running tests before you merge instead of after. In this move, right now we build for Linux x86-64, we build for Android, and we build for Windows, and next we would really like to build for an embedded platform, to reflect where GStreamer is most used. And the step after that will be to actually run tests on an embedded platform. We've built a prototype using a Raspberry Pi; the prototype used Jenkins and LAVA, Jenkins to do the build and LAVA to actually run it on a device. Our goal would be to replace Jenkins with GitLab CI here, so that we can integrate it nicely in our CI. We have a vague plan on how to integrate GitLab CI with LAVA, but I don't know if anyone else has done it, so we'll have to see if it works to use the GitLab CI APIs to drive LAVA and get all the details to fit together.
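As a small aside on the tracers mentioned above: the existing tracer framework can already be enabled locally today through environment variables; the remote forwarding is the new part. A minimal sketch of turning on the latency tracer from code, assuming it runs before gst_init():

    #include <gst/gst.h>

    int
    main (int argc, char **argv)
    {
      /* Equivalent to running the application with:
       *   GST_TRACERS=latency GST_DEBUG=GST_TRACER:7 ./my-app ...
       * The measurements come out on the debug log. */
      g_setenv ("GST_TRACERS", "latency", TRUE);
      g_setenv ("GST_DEBUG", "GST_TRACER:7", TRUE);

      gst_init (&argc, &argv);

      /* ... build and run the pipeline as usual ... */
      return 0;
    }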
So, this is basically what I had to say. Do you have any questions? Yes?

(Question from the audience about the NVIDIA neural network support.) So the NVIDIA part is already there: it's called the DeepStream SDK, and it works on Jetson and on whatever the server ones are called.

(Question from the audience about region-of-interest support on other encoders.) No, no, the region of interest, as far as I know, is only available on Intel, and it needs support in the encoder; I don't know what the NVENC API can do. Hopefully in a few years it will be everywhere. Well, yeah, but this region-of-interest thing especially really depends on the actual encoder implementation, so I don't know what they have in their implementation. Yeah, go for it, other questions?

(Question from the audience: I have two questions. You mentioned this ipcpipeline plugin; there's another plugin called interpipe that also lets you make separate pipelines and link them together. What would be the advantages of one or the other?) So the main thing that ipcpipeline does differently from interpipe is that ipcpipeline actually controls the pipelines entirely: when you put the master one in the PAUSED state, it puts all of the other ones, in the other processes, in the PAUSED state too, so the control is unified. Whereas with interpipe, the point of it is kind of the other way around: they're really separate. You have a sender and a receiver, and you can stop the sender and change it, and stop the receiver, independently. ipcpipeline is really meant for things that appear to the application as a single pipeline, even though bits of it might actually be separate.

(Follow-up from the audience: If I wanted to build a very dynamic application, where you stop and start different pipes, would the intervideosrc/intervideosink elements, or interpipe, be more suitable?) Yeah, interpipe is probably more suitable for that. ipcpipeline is more meant for playback-type cases, where you want to separate things, not because you want them to run separately, but because you want them to appear as one thing to the application.

(Second question from the audience: This is more on the embedded side, but the same ipcpipeline: if I'm using a very low-powered i.MX6 CPU and I have some hardware source of video, and I split my pipeline into parts, how does it pass the buffers between the pipelines? Would it be a memcpy?) Yes, at the moment, yes. Right now ipcpipeline uses a socket, so it actually copies everything through the kernel. For the use case that we were looking at, it was fast enough, but I had a plan to pass memory buffers instead, either by passing the file descriptor or through shared memory. We never implemented it because it was not required for our use case.

Last question. (Question from the audience regarding the debugging tools: are you actively working on them, or is something already available?) So there is something available, from the guys from Samsung; I can't remember the name of it. It's open source, it's on GitHub. Does someone remember the name of the thing that Marcin worked on? Yes, HawkTracer. So that's one of the efforts; I would like to see it more integrated. Right now you need to patch things to actually use it.

Okay, thank you, Olivier. Thank you.