Thank you, everyone, for staying for the last session. It's my first day at DevConf; there have been a lot of nice sessions and a lot of nice people, so I've gotten a feeling for what it's like, and it's a pleasure to present here. I hope you'll enjoy what I have to share with you. This talk is about our experience trying to run video streaming and processing in a service mesh, as the title already says. I'm Nikoi Nikoi, I work at VMware's Open Source Technology Center in Bulgaria, and we do some cool and interesting stuff there. This is a pet project I started a couple of months ago, when I was trying to figure out what interesting things you could do with a service mesh; we'll see that later.

The agenda for this afternoon: first an introduction to the problem, then the initial approach we took, then how we did it with Kubernetes, then how we applied a service mesh on top, and finally what the future holds for the project.

First, who am I? I've done a lot of things in my professional career, and the icons here show some of the projects I was involved in, one way or another. But the main topic has always been networking and telco-domain problems, from a programming point of view. So when I started digging into service meshes, maybe six months ago, I approached them from the point of view of a telco-domain guy, and that's what we tried to do here. I wanted to find an interesting use case: we have these cool new technologies, containers, orchestration on top of them, and a service mesh on top of that, so what could be interesting to do with them?

This is a project we started together with a couple of other people; I was the main one developing it, but it was discussed internally with several others. During those discussions we figured out that edge computing could be an interesting angle. Everyone knows about it; it's a hot topic today. But for the telco guys specifically it's very interesting, because it enables new services, new things to sell to their customers, and with the upcoming 5G networking standards and initiatives these things become more and more important.

So we said: fine, for a telco company it would be interesting to stream and process video starting from a central office, going through an edge data center, and then down to the customers. What you see here are three different types of customers: your phone, your notebook, and your VR gear, and they would eventually require different video sizes. Today's phones may have larger screens, higher DPI, and higher resolution than notebooks, but for VR you can imagine the difference.

Then we asked: if we want to build a video processing pipeline that satisfies this use case, what would the requirements be? And here they are. We decided we would not process live video; we didn't want to complicate the setup. We just have one container, one service, that streams video by looping a predefined file.
It's a full-HD MP4 video, and we want to be able to apply different filters on top of it. The actual filter functions don't really matter, so as examples we said: apply a timestamp, apply a logo, and scale the video up and down (only down, in this case, but different sizes). And we wanted to be able to select an arbitrary set of these functions: I should be able to ask for a 480p video with only a timestamp applied, not always the full set of what's available.

For reasons you'll see later in the presentation, we chose HTTP as the transport within the mesh, so all the services communicate and the video is transferred over HTTP. You could argue, and this is a preview of the conclusions, whether that's the most optimal choice, but that's what we have today. As the encoder, of course, we're using H.264.

The next topic is what we did in the beginning. The very first version: we took VLC and containerized it, with a shell script on top just to process the arguments and prepare the proper command line for VLC. Then we launched two containers, connected them to talk to each other, launched VLC as a client, and the video got resized. It was a start, but we could already see it wasn't the ideal solution; we'll see why later.

This is a screenshot of htop while the videos were running. We had an older machine there running a virtual instance with 16 virtual CPUs; as you can see, the RAM isn't really used. It was a good proof of concept that with that machine, with Kubernetes deployed on it, you can actually do some video processing, but it was only the start; there was a lot more to do.

This is actually an old slide, which is why you can probably notice that the source video is in a different format and the details aren't exactly as they are now. But this is what we did as a start, and we got some numbers: the source video has one bandwidth, and as you scale it down you get different bandwidths. So over Wi-Fi you can serve one video size with one bandwidth; over 3G you serve a different one, and so on.

We said, okay, that's good, but something is missing: this very first setup was composed of just two stages, the source and the scaling. It's not really what we wanted, because we wanted to apply additional functions, and this should be arbitrary. If the client asks for, or gets the right through authentication to have, a logo applied on top of the video, that should happen dynamically, and that wasn't possible with the setup we had at the time. So we started thinking about the next step in our design.

This is an overview of the script as it looked back then. Looking at it, you can easily see that it just parses some arguments and prepares the command line for VLC. It kind of does the job, but it's not really extensible; if you want to do more complex things, it's not there.
So we sat down, said okay, let's draw a different design, and this is what we came up with; essentially, this is what we have today. We wrote a small application in Go that terminates the HTTP requests, and for each HTTP request it spawns an FFmpeg process; FFmpeg pipes back the processed video, and that gets streamed back to the client.

The first difference you'll probably notice is that there is FFmpeg instead of VLC. We found that with VLC, at some point, preparing all the arguments got too complex. I'm not saying VLC is a bad choice; it's a great tool, it does its job perfectly. It's just that at some point we couldn't achieve what we wanted with it. So we started looking at alternatives, started playing with FFmpeg, and found that, at least for us, it was a lot easier to achieve what we needed: things like the logo overlay and the timestamp that you'll see later were just a lot easier for us to do. I guess you should be able to do them with VLC as well.

One thing we wanted was for the container to wrap all the video processing details and take only generic arguments: this is the source of your video, listen on this port, scale to this size. The purpose of this shim layer, or adapter layer, which we call the streamer, is to translate those generic arguments into the specific engine's arguments or calls or whatever is needed there. That's something we preserved from the script approach.

In the current project, this is what we call the local pipeline: you can run it on your localhost, it spawns four of these applications chained together, and you can see how the different filters, scaling and so on, get applied.

And it's demo time, so I'll show you how this works now. Is it big enough? This one, probably not. If you just run the script, it shows these four applications running. Then, with MPlayer against localhost, we can select the port. Port 1003 is the full pipeline, so let's do the full pipeline, and this should... hopefully it's working. Yeah. So this is the video scaled down; you can actually see down below that this is HD resolution, not full HD. And to verify: if I run this against port 2000, this is the original video playing, the full-HD video without any scaling or filters applied.

That's not really impressive, I know what you're thinking, but it's what allowed us to do the development on localhost, without having to deploy and go back and forth every time.
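Since the slides only show the design, here is a minimal sketch, not our exact code, of what a streamer like this can look like in Go: an HTTP handler that spawns one FFmpeg per request and pipes its stdout straight into the response. The file name, port, and FFmpeg filter arguments are assumptions for illustration; a logo overlay would add a second input and an overlay filter in the same spirit.

```go
package main

import (
	"log"
	"net/http"
	"os/exec"
)

// handleStream spawns one FFmpeg per request and streams its stdout back
// to the client. Tying the process to the request context means it is
// killed automatically when the client disconnects (see the Q&A later).
func handleStream(w http.ResponseWriter, r *http.Request) {
	// Hypothetical arguments: loop a predefined file forever, scale down
	// to 720p, stamp the local time, and emit fragmented MP4 for piping.
	cmd := exec.CommandContext(r.Context(), "ffmpeg",
		"-re", "-stream_loop", "-1", "-i", "source.mp4",
		"-vf", "scale=-2:720,drawtext=text='%{localtime}':x=10:y=10",
		"-c:v", "libx264",
		"-movflags", "frag_keyframe+empty_moov",
		"-f", "mp4", "pipe:1")
	w.Header().Set("Content-Type", "video/mp4")
	cmd.Stdout = w // pipe the encoded video straight into the HTTP response
	if err := cmd.Run(); err != nil {
		log.Printf("ffmpeg exited: %v", err) // expected when the client goes away
	}
}

func main() {
	http.HandleFunc("/", handleStream)
	log.Fatal(http.ListenAndServe(":1003", nil))
}
```

The point of the shim layer is visible here: the handler accepts the request generically and translates it into whatever the chosen engine, VLC or FFmpeg, actually needs.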
Okay, so the next thing: with these applications already working, we just took them and put them into Kubernetes. We had to write the descriptors, deployments plus services. We had one source container that streamed the video and a couple of containers doing the processing, but as you can see, it's again essentially a static pipeline. I don't have a demo for that, because I don't have a pure Kubernetes deployment; I only have a demo with the service mesh. But as you can imagine, if you connect to one of these ports (these are exposed NodePorts, which is why I have the screenshots here), you can see the different sizes, but you cannot actually select what you want done. And although it's a simple example, I want to reiterate that these filters could be whatever you want; you can have a big pool of filters and do different interesting things on top.

The next topic is the service mesh. I don't know how many of you are actually using a service mesh or know what it's about; who has an idea what it is? Can you raise your hands? Okay, good, perfect. I have a couple of slides just to quickly say why we wanted a service mesh.

Stepping back a little from the current project: back in the old days we had physical servers with all the physical infrastructure connecting them together. Then we moved to virtualization; we all know this. From a networking guy's point of view, virtualization just replicated whatever we had before. Then we started using containers: isolated processes that communicate over sockets, need layer-3 connectivity, and not much more; that's what the CNI does. And at some point we arrived at a situation where we were writing applications that care only about endpoints; they don't want to know whether there is any layer 2 or layer 3 underneath. We just wanted to manage endpoints, so our infrastructure should understand all the layer-7 protocols out there, all the REST and gRPC and whatever the databases are doing on top of that. And that's where the service mesh came in.

I guess everyone familiar with service meshes has seen this specific picture; it's very popular. If you look at it, and imagine this is a pod in Kubernetes terms, the green thing is your application and the blue thing is the so-called sidecar proxy. The purpose of the sidecar proxy, and of the whole mesh built here, is to provide the underlying infrastructure so that your application can run on top, and you get all the nice visibility and debuggability, and of course you can manipulate all the requests flowing within the service mesh. It's not a really complex concept, but it turns out to be pretty powerful.

What we used is a project called Istio, and this is the basic layout of its architecture. You can see it's split into a data plane and a control plane. The data plane is essentially a set of Envoy proxies intercepting all the ingress and egress traffic of your application; they are controlled by Pilot, and there are other components which, I guess, are not very interesting to discuss here.

Taking this as the base architecture to build our solution on, we moved on to defining how to leverage the service mesh to get our dynamic video pipeline. We took a rather simple approach, but I think it brings out the strong sides of the service mesh, specifically Istio.
We defined an HTTP header, which we call Process-Video, and it has three fields that essentially define the stages of the video processing; you program your stages through this HTTP header. It's not really complex, as I said, but it does the job. In practice, at each hop Envoy checks this header and decides: are you applying the scaling, are you doing something different, are you passing it on to the next stage?

One thing to say here is that this is why we chose HTTP. Envoy and Istio can manipulate TCP, so layer 4, and HTTP; you can do matching and rewriting on those. But with plain TCP you don't have these custom fields that you can pass along with your request, which means you can't really program the service mesh with what the client wants and propagate it. That's why HTTP was chosen in the first place.

What we also figured out is that the service mesh doesn't propagate the request for you. The job of the service mesh ends when the HTTP request gets to your service; after that it's your job to propagate it down the mesh. This is how the picture looks in real life: once the request gets into the pod, it goes to Envoy; Envoy checks it and decides whether this function should be applied, and if yes, it passes it to the streamer. The streamer terminates the HTTP request, and then it has to read the HTTP header and pass it down to the next guy. So you have to propagate this header from the moment the request enters your service mesh all the way down to the last level, and that's your job. If you think about it, the promise of the service mesh was that it should be invisible: you develop your application on top and you don't care. But the fact is that if you want to do more complex things, service chaining if you want to call it that, your application has to be designed for it. It's not like you just deploy it and the service mesh does everything for you. At least that's the current state; that's what we found.

And here's the full picture, with all the Envoys, all the containers, and all the data paths, at least the request paths. It's quite complex, I would say. It does its job, but if you want to scale, if you want to have, I don't know, a hundred functions, it would become really complex, and that's not great.
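To make that propagation duty concrete, here is a small sketch in Go, with a hypothetical service name, of the one extra thing each streamer has to do: copy the incoming Process-Video header onto its own request to the next stage.

```go
package streamer

import "net/http"

// fetchNextStage requests the video from the next service in the chain,
// forwarding the Process-Video header so that the Envoy sidecar at the
// next hop can route on it. "next" would be e.g. "http://scaler:1003/".
func fetchNextStage(r *http.Request, next string) (*http.Response, error) {
	req, err := http.NewRequest(http.MethodGet, next, nil)
	if err != nil {
		return nil, err
	}
	// Istio only matches and routes on the header; forwarding it is the
	// application's job. Without this line the header dies at this hop.
	req.Header.Set("Process-Video", r.Header.Get("Process-Video"))
	return http.DefaultClient.Do(req.WithContext(r.Context()))
}
```

The response body then becomes the input of the local FFmpeg stage, and its output is streamed back upstream, just like in the local pipeline.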
Okay, so it's time for a live demo; let's see how the wireless network is today. I'll stop the slideshow like this and run this script. Why a script? Because apparently Envoy doesn't speak HTTP 1.0, and MPlayer speaks only HTTP 1.0, so you cannot enter the service mesh with MPlayer directly. The script essentially runs curl and pipes the output to MPlayer so it can play it. Of course, we could use VLC as well, but let's say we're fans of the FFmpeg guys today. The script takes three arguments; if you omit one, that function stays disabled. So if I say low, then dash, then enable, it should play the low-scale video. This is connecting to our office's data center, over VPN, wireless, and a lot of technology in between; I hope it will be able to connect. If the video comes through, we'll be lucky.

Yeah, so this is the lowest version, a 240p video with only the logo applied. Now, if I say, I think it was, high and then enable, you can see the header generation shown here. I guess this will take some time, because it does some caching before it starts playing the video, but I hope you'll at least get the picture and see that the filter is applied. I'll let it run for a while, and then we can discuss a little.

Yeah, okay, so here's the video; it came through. It's a little slow, as I said: wireless, VPN, then a data center doing things. So that's the result; that's what we have today. Of course, you can also request the original video directly, and you can request any combination, which was our initial goal.

If you think about it, though, there are a lot more components than you might imagine: all these proxies, the ingress gateway here, then all the Envoy sidecars. For the full pipeline to work, end to end, there are a lot of components involved.

This is probably the time to show how our virtual service looks from Istio's point of view. Virtual services are essentially the description of the configuration of the Envoy proxies, whatever the proxies should do: matching the headers, making routing decisions. That's what we see here.
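To give a concrete idea, here is a minimal hand-written sketch of such a rule in Istio's v1alpha3 API, not our actual manifest; the service names and the Process-Video value syntax are assumptions for illustration. The idea is that a stage's sidecar delivers the request to the local streamer only when the header asks for that function, and bypasses the stage otherwise.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: scaler                 # hypothetical pipeline stage
spec:
  hosts:
  - scaler
  http:
  - match:
    - headers:
        process-video:         # our custom header; value syntax illustrative
          regex: ".*scale=low.*"
    route:
    - destination:
        host: scaler           # header asks for scaling: apply this stage
  - route:
    - destination:
        host: source           # otherwise bypass straight to the source
```

Pilot compiles rules like this into Envoy configuration, which is where the per-hop header checking described earlier actually happens.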
But I want to move to the conclusions; we have some time here. If you think about it, there's a lot of processing involved in this solution. It does what we initially wanted, but it's not really efficient. First of all, we use HTTP, which means you get all these encapsulations. Then, at each stage, in order to apply each function, you have to decode the video, apply whatever you want, like a logo on top of it, and then encode it again. If you have to do that five times, and you consider the edge video-processing use case again, that's not great, right? It just eats resources, and at the edge you're not full of resources: you're power constrained, and there are heat constraints and so on. There could be better solutions, and we're looking into some of them, as I'll show in a while, but for now this is the state.

So our conclusion is: if you want to use Istio to do these kinds of things this way, or the service mesh approach that's available out there, you have to go with what it gives you. It gives you flexibility, you can do nice things, but you just have to have the resources. You're also bound to HTTP as the transport if you want to pass all these headers around and manipulate them, and that's not really great.

One thing this slide reminds me of is HTTP/2. We looked into it a little; it could be a solution to some of the problems. Instead of opening and closing connections all the time, maybe HTTP/2 can help with its streams; it's binary in nature, so maybe it could ease things a little. But from what I saw, at least, the new VLC has some mention of it in the documentation, though I'm not sure how I'd use it if I wanted to, and FFmpeg, of course, is old school and doesn't really have it today.

We're a little ahead of time, but let me share our future plans. All the work we did is shared in a GitHub repository; I uploaded it this morning, so it's public now, and whoever wants to play with it can. These slides are going to be available, I guess, on the site, or you can reach out to me. The repository is on my own GitHub account, but it's public for everyone to work with.

Some of the things we want to do in the future: in the last couple of months I was also involved in a new, interesting project called Network Service Mesh. Some of you may have heard of it, some not. Essentially, it doesn't carry the burden of the sidecar proxies the way the traditional service meshes do; instead it just wires up the connectivity between the containers, and it doesn't rely on CNI. These are all interesting things, and I hope we'll have the chance to grow this project further. I was actually trying to prepare a demo based on it and got about halfway through; I'm sorry I wasn't able to finalize and show it. I hope I'll have the chance to do that, maybe next year; I think it could be an interesting demonstration on top of this work.

The whole purpose of this project, as I said, was a pet project I was doing on the side of my main activities: to demonstrate, and push the limits a little of, service meshes and this whole container technology stack, and to see how telco-like workloads behave there and what you can do with them. Of course, it would be really interesting to measure the real impact of the sidecar proxies, because right now we're just asserting things; we know Envoy is pretty fast and optimal. I don't know if someone noticed, but I have a couple of patches in it, and its icon is on my first slide, so I know some of Envoy's internals; I know it's really good. Still, when you chain five of them, you'll probably get some impact; there's no way not to. And everything we did up to now was with a single replica: no real scaling up and down, no multiple clients. I think I can run up to three clients at once and it kind of works, but to get the real picture you should scale it out, have a multi-node deployment, and see how all these things communicate together in a bigger environment.

And this was actually my last slide. We have at least ten minutes, maybe more. So, questions.

Yeah, so that's actually one of the goals of Network Service Mesh. I'm part of the Network Service Mesh team now, and one of the things we want to do there is to enable telco types of use cases. Multicast is the perfect example, thanks for asking; it's a case where you don't want to rely on these higher-level protocols, you just want to do the most optimal data streaming you can. And you're absolutely right: if you want to scale this, with all these proxies, to hundreds and thousands of users, it would be impossible; it would simply consume too many resources.
As I said, we just wanted to push the limits and see what we can do with the available technologies. I know the result is maybe not really optimistic, but it's a reality check: that's what we can do today. You're right that the standard way, at least up till now, to distribute video in the most optimal fashion is multicast, but with modern technologies things are shifting away from that a little. With Network Service Mesh we're trying to bring the telcos, and the telco types of approaches, back in.

Yes? So the question is what happens when the client closes the connection. Essentially, once the whole service mesh is brought up, the containers sit idle, waiting for a connection, and the FFmpeg that does the processing is spawned as a sub-process; we just pipe its output into the HTTP reply. When the connection is closed, that spawned process is terminated. And no, it's not just the last stage: everyone stops working. If you look at this, once this connection here is established, the process is spawned; but when the connection gets disconnected, the server notices that its client went away, because this guy is a client of that guy. So when it gets the notification that the socket is gone and the connection is terminated, it terminates the spawned process.

You mean the different scalers at the same time? Currently the HTTP implementation of the streamer doesn't put any limit on the number of accepted connections, so whatever limit you might hit in this scenario would come from Kubernetes, if it applies limits on the amount of CPU resources you can use. If you spawn five FFmpegs in a single container, you'd just use too much CPU, so you'd get restricted if such limits are applied, of course.

Okay, I guess that's the end of it. So thank you again for staying with me.