Hi again, I'm Jan. I work at Showmax on our CMS team, and I will talk about a little video going a long way. What I mean by that is that I will try to show you how you can compress your video quite considerably without losing any perceptual quality, mainly by using what are usually called next-gen codecs, and, almost as a by-product, how you can deliver content to your platform a bit faster.

First, a short introduction. At Showmax we serve video on demand across most of the African continent, and being mainly in Africa is a huge reason why we actually care about video size, because you cannot really compare African network infrastructure to what you are used to in, let's say, Europe or the United States. It's not very stable, it's not very quick, and network outages are quite frequent. For that reason many streaming companies don't launch in Africa, or they do but it's basically just a checkmark: yes, we launched another territory. But it's quite possible, with a bit of effort, to deliver good quality video on those connections, and by doing that you will get a quite loyal and quite enthusiastic customer base. Another reason to care is that mobile data prices in Africa are really, really expensive; we will get to that in a short while. And I mentioned that we are now able to deliver content quite fast, which is important, because sometimes your platform gets swarmed at, let's say, 9:55 p.m. because at 10 a new episode of a favourite show comes out. You usually get the file only that same day, and video encoding takes a lot of time, so you basically have no room for error there. A 90-minute movie usually takes about 6 hours, so you can easily fail to make it at all.

Let me dig a bit more into connectivity in Africa. We have our main data center in Germany, and just pinging Johannesburg, where a lot of our customers live, takes almost 4 seconds. That's just down to the distance; there is not much you can do about it. The only thing you really can do is cache absolutely everything, be it responses from your API or the videos themselves, and cache it preferably in the country where the customer lives, because there is no peering between countries, for various reasons. Also, most of Africa is connected via undersea cables, and recently, on three separate occasions, three of them were damaged. Basically all of the traffic then went over Seacom, one of the larger cables, but it caused about an 80% degradation of internet speed. And even inside a country, with no cable severed, things happen. I love the example from Kenya: their biggest data center has part of its connectivity handled basically via microwave links, and it works most days, with one exception. When Kenyan birds migrate and a huge flock passes over the area, it actually causes interference. And to stay with Kenya for a short while: the biggest Czech exchange point peaks somewhere around one petabyte, while the biggest Kenyan exchange point peaks at 18 gigabytes. That means just one show of Kazma, quite popular on Seznam TV, would by itself saturate that exchange point ten times over. And that is for a country of 53 million people. And yeah, that's about it. Even if you live in an area with a good connection, you may find yourself unable to afford the data. In the Czech Republic we love to rant about the price of mobile data, and it's fair: we have the fourth or fifth most expensive data in the European Union.
If we take the Czech average monthly income, we spend about 1.3% of it to purchase one gigabyte of data. In Africa it would be nearly 9% for the same one gigabyte. So those are the main reasons why we decided we need to compress our videos more.

There are a few methods to do that. The first one is quite obvious: we can always deteriorate quality. Instead of full HD we can serve some mediocre resolution, but you still have customers with big screens, be it TVs or laptops with big monitors, and for them it's not really watchable. Or you can do what we actually did and implement what are usually called next-gen codecs; according to the white papers they save up to half of the video size, in reality it's more like one third. Or you can use something called constant rate factor, and there is another 10% to 30%.

I will talk a bit about constant rate factor, but really just a bit, because covering even the basics would take the whole talk. For the sake of simplicity, we can imagine a movie as a set of images, or frames, played back at some speed, usually 24 frames per second. A lot of those frames are actually really easy to compress without the naked eye noticing, either because the detail is minor or because the subsequent images already contain it and nothing changed. Normally, when you launch FFmpeg, which is the main tool for processing videos, you set a bitrate and an output file. With CRF you instead set some boundaries, and FFmpeg will try to find the best overall bitrate itself. And since the bitrates most streaming companies use are really conservative, you usually end up saving quite a lot of space. Here is a quick example. The CRF parameter means "this is how much I am willing to compress": the higher the number, the more compression, but at some point quality starts to deteriorate. And maxrate just says "try not to overflow this bitrate", so we can tell the customer they will spend at most this many megabytes on that movie.
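To make that concrete, here is a minimal sketch of how such a CRF-plus-cap encode could be launched; the CRF value, the maxrate/bufsize numbers and the file names are made-up illustrations, not our production settings.

```python
import subprocess

def encode_crf_capped(src, dst, crf=23, maxrate="3000k", bufsize="6000k"):
    """Encode with a constant rate factor, but cap the peak bitrate.

    CRF says how much compression we are willing to accept (higher number,
    more compression), while -maxrate/-bufsize keep the output from
    overflowing the bitrate we promised to the customer.
    """
    cmd = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264",   # same idea works for libx265; VP9 needs slightly different flags
        "-crf", str(crf),
        "-maxrate", maxrate,
        "-bufsize", bufsize,
        "-c:a", "copy",
        dst,
    ]
    subprocess.run(cmd, check=True)

encode_crf_capped("mezzanine.mp4", "movie_1080p.mp4")
```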
And the next thing, the main thing for this talk, is next-gen codecs. We have H.264, which is not really a next-gen codec; it's basically the baseline for video encoding. It has been with us since 2003 and it works basically everywhere, the files are just really huge. Then in 2014 two others came out. H.265, also known as HEVC, is the main successor to H.264 and was endorsed by Apple, by which I mean you can play it on iOS, on MacBooks and on some TVs. In the same year VP9 came out, developed directly by Google, so you can play it on Android, in the Chrome browser and on some other TVs. And in 2018 AV1 came along, offering another 75% compression. It's the successor to VP9, but there are two main issues with it. Encoding something to AV1 takes a lot of time: a 15-minute animated movie encoded to H.264, HEVC and VP9 takes about two hours on one machine, and if we add AV1 to that stack, it takes 14 hours. And not many users would actually be able to play it, because hardware decoders for AV1 are not on many devices yet. But I think we will get there. Some companies are trying to use it, most notably YouTube; they use it for their small resolutions, and it's basically not for bandwidth saving, it's more of a production test to see customer adoption of the codec. And as I mentioned, it's still quite lively in development. We tried to use AV1 back in 2018 on, I think, a 10- to 15-second animated clip, and it took nine hours. Also, about two months ago a new version of the standard came out before the previous one even got widely implemented, and it will not be backward compatible, so there is still a lot of work to be done there.

So, as I mentioned, you can save at least one third of the video size using HEVC and VP9, and you need both of them to cover both mobile platforms. Why is H.264 still the baseline, then? Well, HEVC and VP9 take up to 10 times longer to encode than H.264, and you still need H.264 anyway for the other TVs and phones. Also, when you present a mobile device or TV with a playlist of codecs, it will not pick the one it can play; it will pick the first one and then crash. And setting up the constant rate factor, especially for VP9, took a few weeks of trial and error.

So if we wanted to implement this, we needed to update our encoding pipeline somehow, and to explain that I would like to spend a short while on how our stack works. We use three main technologies. FFmpeg I have already mentioned. We have our own CMS, which is the place where the content team sets up things like audio tracks and cropping, and which also serves as the service that hands out work to the actual encoders. And there is the Unified Streaming Platform, better known as USP. What is that for us? It's an Apache or Nginx plugin: when we upload a video to client-facing storage, USP splits the video into what they call chunks, usually between 2 and 10 seconds long, and generates something they call manifests, which are basically playlists for the video. This is just an example for Smooth Streaming, one of the streaming formats: you can see some metadata, how many chunks the video has, what codec it uses, what resolution, and then a long list of the actual chunks.

So what did we do with that tech stack? When a video comes in from a provider (the biggest one I remember had 300 gigabytes), it has a really absurd bitrate. That makes sense for, let's say, a cinema, but really not for us. So we don't keep it like that; we created an automatic job that downscales it and normalizes the audio tracks and the video ratios. After that we have a standard output, the mezzanine file, plus some attributes we can rely on during the encoding process. Then one encoder picks the job up, downloads the mezzanine file from storage and sequentially transcodes it into every single resolution we have. We currently have eight, from a really small one up to full HD, so full HD has to wait until the other seven are done. After that we set up copy protection and upload everything to the client-facing storages.

It worked like that for three years. It's quite straightforward, but it has a few flaws. Everything is done on one machine that downloads the file, does the whole encoding and uploads to the client-facing storages, so if we hit a network hiccup during the upload, it throws away all the time we spent on encoding. And that time can be about six hours for a 90-minute movie. I mentioned that HEVC and VP9 can each take up to ten times more, so it would not be six hours, it would be 126 hours, and customers waiting for the next episode are not going to wait that long, I think. And as I mentioned, everything was transcoded sequentially, one resolution after another, and we can save a lot of time by doing that differently.
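For illustration, the old one-machine flow boiled down to something like the sketch below; the resolution ladder, file paths and CRF value are hypothetical placeholders, and the copy protection and upload steps are only hinted at in comments.

```python
import subprocess

# Hypothetical rendition ladder; in reality there were eight renditions up to full HD.
LADDER = [(256, 144), (640, 360), (1280, 720), (1920, 1080)]

def encode_sequentially(mezzanine, workdir):
    """The old flow: one machine encodes one rendition after another,
    so the full HD output has to wait for all the smaller ones."""
    outputs = []
    for width, height in LADDER:
        dst = f"{workdir}/out_{height}p.mp4"
        subprocess.run([
            "ffmpeg", "-y", "-i", mezzanine,
            "-vf", f"scale={width}:{height}",
            "-c:v", "libx264", "-crf", "23",
            "-c:a", "aac",
            dst,
        ], check=True)
        outputs.append(dst)
    # Copy protection and the upload to client-facing storage happened after
    # this loop, and a single network hiccup there threw away all the hours
    # spent above.
    return outputs
```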
And for those hot episodes we wanted to start implementing something called priorities: the content team, when they are setting up a video, can mark it as priority number one, and when the CMS hands out work to the encoders, that video gets picked first. That's a good idea, but if every asset takes six hours, it can take a long time for that priority to actually apply. And if we were looking at 126 hours of encoding, we needed to update the tech stack quite a bit.

So we decided to parallelize the whole encoding process. We added just a few things to the tech stack: a Gluster cluster, which we use as file storage, and instead of one encoder that does all the work, we use what we call custom-built workers, each doing just one specific job, be it downloading or one of the other steps. Now to those other steps. The first worker takes the mezzanine file, downloads it and splits it into a few hundred chunks, and every single one of them is uploaded to Gluster. Next, a few hundred workers each download just one chunk, encode it and upload the transcoded version back to Gluster. And the last one downloads all the chunks and muxes them back into one video (there is a rough sketch of the split and the final mux below). When we first thought about doing it this way, my first guess was that we would spend at least 30% of our time just talking to Gluster, because the first worker uploads a few hundred chunks, every worker after that downloads one and uploads one, and then we download all of them again and upload them somewhere else. In fact it added only about 4%, partly because everything sits in the same data center, and after some improvements to how we use FFmpeg there is no real loss there. So we spend only about 4% of the time downloading from and uploading to Gluster, which really surprised me.

By doing that, we gained a few advantages. I mentioned that in the original pipeline we spent about 4 times the video length on one video. With the parallel pipeline and a single worker, which is not really parallel, we are already down to about half of that, and that's solely because we now transcode into every single resolution in parallel: we launch one FFmpeg instance that takes care of every resolution at the same time. If we use four workers, we are already quicker than the video is long. And if we use 50 workers, which is the size of our encoding farm, we are at 0.2 of the video length. Below that you can see some real-life scenarios: five videos, the quickest at basically 0.3, all the way up to 0.5, which is still much quicker than four times the video length.

We also have much more control over the pipeline. This is basically a tree of all the jobs we generated, and every single one of them will be consumed by some worker. That means that if I come to this page, I can actually see the progress and predict when the video will be done; I don't have to SSH into the machine, and I don't want the content team SSHing into the machine either. And the coolest thing: if encoding one video chunk fails, in the initial pipeline I lost those six hours, whereas now I lose just that one job, which usually takes about two minutes. So, again, we are about 20 times quicker, my priority asset gets picked up in two minutes instead of six hours, and we have more control over the pipeline because it can restart itself.
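Here is that rough sketch of the split and the final mux, assuming FFmpeg's segment muxer for the cutting and the concat demuxer for the rejoin; the chunk length, the paths and the Gluster upload step are placeholders, not our actual job code.

```python
import glob
import subprocess

def split_into_chunks(mezzanine, chunk_dir, seconds=30):
    """First worker: cut the mezzanine into short chunks without re-encoding.
    With -c copy the cuts only happen on keyframes, which is exactly what we
    want so the chunk boundaries don't mess up the video."""
    subprocess.run([
        "ffmpeg", "-y", "-i", mezzanine,
        "-c", "copy", "-map", "0",
        "-f", "segment", "-segment_time", str(seconds),
        f"{chunk_dir}/chunk_%04d.mp4",
    ], check=True)
    # ...then upload every chunk to Gluster so the encoding workers can each grab one.

def concat_chunks(chunk_dir, dst, list_file="chunks.txt"):
    """Last worker: glue the transcoded chunks back into one video."""
    chunks = sorted(glob.glob(f"{chunk_dir}/chunk_*.mp4"))
    with open(list_file, "w") as f:
        f.writelines(f"file '{c}'\n" for c in chunks)
    subprocess.run([
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-i", list_file, "-c", "copy", dst,
    ], check=True)
```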
And at the start we used to have 250 encoders, and they were sitting idle most of the time. That's because you usually have about 10 videos in flight at the same time at most, with an exception, let's say twice a month, when a new season of some show brings in something like 1,000 episodes. For those two days the whole farm is working, and then for the next 12 days it sits idle. So the first thing we did after going parallel was to scrap all of those 250 machines and buy slightly bigger ones, but only 50, and they now get used for every single video. And we can scale much better: not only can we just throw more metal at the problem, we can scale both horizontally and vertically. Let's say I find a machine that is supposed to be good at encoding high resolutions to VP9, which matters because, as I mentioned, VP9 takes a long time. I can set a capability on it and it will get only those jobs, so I use the strengths of that machine all the time.

With that, we manage to produce the video in all of the codecs that are needed. But as I mentioned at the start, client devices usually don't know what they want to play. They will pick the first codec from the manifest I serve, and if it doesn't load, too bad for the user. And it's all the same video, so it doesn't really make sense to publish it in many separate manifests; that would hurt the cache hit ratio, and it just doesn't make sense. What that means is that the client needs to tell me what codec it should play, and my API needs to take care of that and clean up the manifest somehow.

And how does the client itself find out what it wants to play? That's based on the platform. If the platform is iOS, it's easy; it's a very tight environment, so you just need to know your iOS version: if it's above X you play H.265, if it's below you play H.264. With Android it's problematic, because it's much looser: there are many manufacturers and a lot of different hardware, and especially the low-end devices sometimes lie, meaning they will claim they can surely play VP9 hardware-accelerated. And, well, the video will actually play, because on Android that's handled by the player itself, but VP9 takes a lot of CPU, and without hardware acceleration your battery will run out in 15 minutes. And we found out on that hardware that if anything is so CPU-heavy that it burns the battery in 15 minutes, you can imagine the temperature of that phone; it's really uncomfortable. So after applying some band-aids we discovered that while there are tens, maybe even a hundred decoders across the phones, only four of them actually know how to hardware-accelerate VP9, so we just whitelisted those four, and since then there has been no issue there.

So after that the client knows what it wants, and it's up to my API to clean up the manifest so that it serves only the things the client wants. This is an example from another streaming format, HLS. If, let's say, the client wants HEVC, that would be hvc1 in this format, so I take the comment line with avc1, which is what H.264 is called in HLS, because that makes sense, and I also need to delete every single line under that comment until the next comment, because those are the URLs to the chunks themselves. It's doable, it's not really hard; I wrote it and it worked. But it doesn't sound clean, it sounds quite messy, and it's string comparison, which is not the quickest solution you can have.
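For illustration only, that string-based cleanup of an HLS master playlist could look roughly like the sketch below; it's a simplified take on the approach just described, not our production code, and real playlists can list several codecs per variant.

```python
def strip_codec(master_playlist: str, unwanted: str = "avc1") -> str:
    """Drop every #EXT-X-STREAM-INF entry (and the URI line right under it)
    whose CODECS attribute mentions a codec the client should not play."""
    out, skip_next = [], False
    for line in master_playlist.splitlines():
        if line.startswith("#EXT-X-STREAM-INF"):
            if unwanted in line:
                skip_next = True      # also drop the variant URI that follows
                continue
        elif skip_next and not line.startswith("#"):
            skip_next = False
            continue
        out.append(line)
    return "\n".join(out)
```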
So instead of that, we ended up using something the Unified Streaming Platform already offers, called filters. Those are basically query strings, admittedly rather obfuscated ones, where you say which codec you want, both for video and for audio, and USP automatically takes care of the whole cleaning process. So with that I have the video encoded, I know what the client can accept, and I'm able to send it.

So we ended up with a video one third smaller with no quality deterioration, which mostly means I'm saving one third of the money for the user. The whole encoding is about 20 times faster, but that's really a by-product; it's something we needed to do to ever implement HEVC and VP9. And it's really, really quick at picking up a priority, which is not a by-product at all, we needed that. If something fails, I can often restart just the job itself, and lastly I can tell the clients what to play so they don't crash right at the start. With that, that would be all from me. If you have any questions, please ask.

They are not big; we base that on the number of available encoders we have, so we utilize the farm and have some prediction of how fast it will be done. Yes, those are just abstract workers, and you set up capabilities saying "I'm only accepting jobs for copy protection" or "I'm only accepting jobs for splitting videos into chunks", so it's really based on that. I can buy a machine and set capabilities based on what it's good at. That's our job, we don't do that automatically. Yeah, it could be, it could very well be. Anyone else? Okay. Sure, sure.

So the question, and I just remembered I'm supposed to repeat them, is how do we... could you come again? Sure. So, how do we actually split the video? We download the mezzanine file we have and, with FFmpeg, we split it at certain points that will not mess with the video, because that might be an issue, upload all those chunks to Gluster, and then later mux them back together. FFmpeg can do a lot of that; it's quite a universal tool, it can do basically anything. The joke is that it can even make you coffee, but there are about two people in the world who know the parameters for that.

Keyframes. Keyframes, because you are supposed to split there, otherwise you will mess up your video. We actually messed that up once, and what it resulted in: video playlists have those chunks, and audio playlists have them as well, and if they don't match because you didn't manage it correctly (the term is GOP), then when they disagree at the end it will break the whole application. And yeah, that was a week of work to discover. Anyone else?
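To illustrate that keyframe answer a bit: one common way to keep the GOPs aligned across every rendition (and in step with the audio segmentation) is to force keyframes at fixed timestamps during the encode. A small sketch, with an assumed 4-second interval and libx264; the exact options differ for other encoders.

```python
import subprocess

def encode_with_aligned_keyframes(src, dst, interval=4):
    """Force a keyframe every `interval` seconds so that chunk and GOP
    boundaries land in the same place in every rendition."""
    subprocess.run([
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-crf", "23",
        # keyframes exactly at 0s, 4s, 8s, ...
        "-force_key_frames", f"expr:gte(t,n_forced*{interval})",
        # disable scene-cut keyframes so nothing extra sneaks in between
        "-sc_threshold", "0",
        "-c:a", "aac",
        dst,
    ], check=True)
```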
Whether we use accelerated encoders, do you mean GPU? No, because we are serving up to full HD, not 4K, the GPU would only be used a little, and it costs much more. We made some calculations, we even tried a few of them, but it wasn't worth the money. Did we try a dedicated card? I'm missing my colleague from the team who would know for sure; I think we did, but we didn't use it in the end. I'm sorry, I think we tried it and it didn't work that well, but I don't remember why. Do you want to talk about it afterwards? Sure. Anyone else?

The short version: Netflix for Africa. I hate that one, but it's the most comparable description. We serve video on demand, some sports, some live events, and being mainly in Africa means we don't do just the Hollywood things; we actually care that, okay, you are in Kenya, so here is the latest Kenyan TV hit, here you have it. And it actually pays off quite well.

Do people subscribe to the service? Yes, it's monthly or weekly based, depending; we have a few plans for that.

Is any of it documented officially, or in the best case released as open source? We have a lot of things documented internally, but it's proprietary software, and with how licensing works in the entertainment industry you would never get away with open sourcing anything; the license agreements with the Hollywood companies are quite a read, something like 70 pages. I don't think I can tell them apart, actually. That's a good question; the answer is no, but maybe it should be, thank you. Anyone else? Okay. Yeah, yeah, okay.

It's all just based on Django. We are running Postgres behind Django, which means that when the video comes in from the provider and the mezzanine file is created, we generate all of those jobs and save them in Django, every encoder that comes in takes one, and we mark it back in Postgres. That's how we can make that graph; it's just part of the administration, so you can see the video, whether something is corrupted, and you can also check how far along it is.

How do we do live TV? No, no, I'm not going that far back into the past. I mentioned USP, and it actually has another format that can handle live streams really well. Or you can transcode them: there is what they call an OB van, those big ones with the antenna, and on your end you transcode the feed and you get not-quite-live. When we did that in 2018 we managed to be, I think, 10 to 12 seconds behind. Of course any connectivity hiccup is then huge, and debugging a live stream is the most thrilling thing you will ever do; you have half a minute to fix it. Okay, anyone else? Thank you for making this longer with your questions, because I forgot some things, I guess. Thank you a lot.
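Coming back to that Django answer for a moment: a minimal sketch of how such a Django-plus-Postgres job table could be claimed safely by workers. The model and field names here are made up for illustration, not the actual Showmax schema.

```python
from django.db import models, transaction

class EncodeJob(models.Model):
    """One unit of work (split, encode one chunk, mux, ...) for one asset."""
    asset_id = models.CharField(max_length=64)
    kind = models.CharField(max_length=32)        # e.g. "split", "encode", "mux"
    priority = models.IntegerField(default=0)
    state = models.CharField(max_length=16, default="pending")

def claim_next_job():
    """A worker grabs the highest-priority pending job.  SELECT ... FOR UPDATE
    SKIP LOCKED keeps two workers from taking the same row on Postgres."""
    with transaction.atomic():
        job = (EncodeJob.objects
               .select_for_update(skip_locked=True)
               .filter(state="pending")
               .order_by("-priority", "id")
               .first())
        if job:
            job.state = "running"
            job.save(update_fields=["state"])
        return job
```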