I'm going to talk to you about how to reach the multimedia Web from embedded platforms with WPE WebKit. To give an outline of the talk: after introducing myself quickly, I will present the WPE architecture, giving at least a high-level overview of how it works. Then I will present some of the W3C specifications related to multimedia that WPE implements. I will briefly explain how to use all those features from a browser called Cog, and then how to deploy all of that on an embedded platform, if you have to. And then I will move on to a slightly different topic, which is how to integrate WPE inside GStreamer applications.

By the way, I'm a WebKit committer and reviewer, and I've been working on WebKit for about 10 years now. It's part of my daily job at Igalia. Speaking of Igalia, we have about 110 Igalians nowadays, spread around the world, working remotely. We provide consulting services around web engines, graphics, the kernel, and of course media, among other areas.

So, talking about the WPE architecture. WPE is a WebKit port. And what is WebKit? It's a web engine that was initially started by the KDE people and then was forked by Apple and renamed to WebKit; they wanted to use it to build their own web browser, Safari. Nowadays it's mainly maintained by Apple and Igalia. WebKit allows you to embed web pages in your application, usually through a widget provided by the different WebKit ports, each targeting different windowing systems or rendering engines.

WebKit nowadays relies heavily on a multi-process architecture. That means your application, usually called the UI process in WebKit, is going to spawn quite a few processes. You see only one in that graph, but there are more. There's the web process, in charge of content rendering and of the JavaScript compilation and runtime. There's another process, the network process, in charge of all the networking. And more recently another process was added, the GPU process. It's currently used only on Apple platforms, and it's in charge of talking to the GPU; I think they even do media stuff there nowadays.

WPE itself is one of those WebKit ports, and it's maintained upstream in webkit.org; we did the initial upstreaming in 2017, I think. It has a six-month release cycle, the same as WebKitGTK. That's because we maintain WebKitGTK as well, so we decided to synchronize the two cycles. The big difference with other ports is that WPE doesn't depend on any widget toolkit. That means the application is in charge of the final rendering of web pages and of handling input events. That is provided through third-party modules we call backends, and there are mainly two backends nowadays. The one we call FDO, because it relies on various libraries from freedesktop.org. And the one called RDK, which is maintained by Comcast and the RDK community; that backend specifically has support for the custom hardware used in set-top boxes, so you can use it if you have really specific hardware. But usually what we recommend is the FDO backend, because it's the best maintained upstream and the community around it keeps growing; there are more and more users, so that's where most of the work is going nowadays. With that backend you usually need the Wayland EGL extensions provided by the Mesa drivers, even though nowadays you can also do pure software rendering: initially EGL was required, but now shared-memory backends are available too, if needed.
As this talk is about multimedia, I wanted to say a few words about GStreamer, because that's the framework we use in WPE. I guess you know it already, but it's a framework that relies heavily on pipelines interconnecting elements together, forming a data-processing pipeline that can be used to build media players, video editors, streaming servers, any kind of media application. The cool thing about it is that it has a wide range of plugins and supports quite a lot of platforms nowadays: Android, macOS, Windows, they are all supported. I think it's the main framework to use, and the API is quite nice as well.

So, I will talk about some of the specifications implemented by WPE. Not all of them; there are hundreds of them, and you can check the WebKit website if you want the full list. I will focus only on the multimedia ones in this talk.

The first one I want to mention is Media Source Extensions, MSE. It's used for adaptive streaming on the web, and it's used quite a lot by websites like YouTube, Netflix, Dailymotion, Vimeo; all those websites use MSE nowadays. A nice thing about it is that there's a library called dash.js that allows you to play any DASH stream in an MSE web page. That's really good because DASH is widely used in the broadcasting and streaming industry, so I would recommend trying it in order to recycle your DASH streams through MSE. In WPE we have a mature MSE backend nowadays, so we have enabled it by default, at build time and at runtime.

I will say a few words about how we implemented that in WPE. The web page appends media chunks into objects called SourceBuffers. Internally, WPE builds a pipeline that is going to demux those chunks into separate audio and video tracks, and those elementary streams are injected back into WebCore for further processing. When playback is needed, we have built a custom GStreamer element that is able to inject those video and audio buffers into the pipeline of the player. So we have quite a big number of GStreamer elements inside WPE now; for each backend we usually have at least a custom source element.

Another spec I wanted to talk about is Encrypted Media Extensions. That's about bringing protected content playback to the web; it was meant to replace the content decryption that was usually provided by Flash using custom frameworks. Now it's formalized into a specification, and it's widely used by a vast number of content providers: Netflix, Disney, Apple, they all use that spec to serve protected video, and it's usually combined with MSE. In WPE we have to disable it by default in releases, because it highly depends on the platform you need to run it on. Usually what you need is a backend called a content decryption module (CDM) that is going to be in charge of the decryption. We can't integrate those directly in WPE: a CDM is a third-party module, and how it gets integrated depends on each product.

What we did in WPE is this: when the GStreamer demuxer detects protected content, which in MPEG-4 is usually signaled with a "pssh" box, we receive an event on the pipeline in WPE. We are then able to probe the platform for the supported CDMs using an API called OpenCDM.
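To make the detection part concrete: outside of WPE, a GStreamer application can observe those protection events itself. Here is a minimal sketch, not WPE's actual implementation; the file name test.mp4 is a placeholder:

```python
#!/usr/bin/env python3
# Minimal sketch: watching for GStreamer protection events, which demuxers
# like qtdemux emit when they hit a 'pssh' box in protected MPEG-4 content.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)
pipeline = Gst.parse_launch("filesrc location=test.mp4 ! qtdemux name=demux")

def on_pad_added(demux, pad):
    # Attach a probe so we can see downstream events, including PROTECTION ones.
    def probe(pad, info):
        event = info.get_event()
        if event.type == Gst.EventType.PROTECTION:
            system_id, data, origin = event.parse_protection()
            print(f"protected stream: system-id={system_id}, origin={origin}")
        return Gst.PadProbeReturn.OK

    pad.add_probe(Gst.PadProbeType.EVENT_DOWNSTREAM, probe)
    # Terminate the new stream in a fakesink so the pipeline keeps running.
    sink = Gst.ElementFactory.make("fakesink")
    pipeline.add(sink)
    sink.sync_state_with_parent()
    pad.link(sink.get_static_pad("sink"))

pipeline.get_by_name("demux").connect("pad-added", on_pad_added)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```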
OpenCDM has two different runtime implementations. The first one is maintained by the RDK people, at Comcast, and it's called Thunder. The second one is called Sparkle-CDM, and it's maintained by us at Igalia. Although I have to say it's only a wrapper, so it relies internally on plugins as well, and we don't currently publish any of those plugins. That might change in the future, but currently you have only the wrapper, and you need to write your own plugins if you want to use it. Then, in the playback pipeline, those CDMs are used by decryptors, which are custom GStreamer elements again; they are able to use that API to decrypt and render video. In the case of secure video rendering, you usually also need a custom video sink in order to guarantee that video frames remain in GPU memory. So it's a really platform-specific setup most of the time; we can't really enable that by default.

Moving on to Media Capabilities. That spec is used by web pages to probe the web engine's features related to decoding and encoding, usually in order to provide a better user experience around playback. For instance, one aspect of the spec is that a page can know whether the platform supports hardware-accelerated playback, or whether the decoders are power-efficient; that's part of the spec. We have enabled that in WPE, though not at runtime yet, because the spec is not finalized yet; maybe soon we will enable it by default. The way we implemented it in WPE is by probing the GStreamer registry, the component in GStreamer that keeps track of all the plugins available on the platform; we are able to look for decoders and encoders there. When a specific media type is queried using a MIME type, we have a mapping to GStreamer caps that we use to look up the supported encoders and decoders. That's the way we implemented it; it's maybe not the best way to do it, but it's the best way we came up with.

Another spec I wanted to talk about is Web Audio, which allows you to do low-latency playback, for instance targeting games and music applications, or DJing; any kind of web page that needs to provide audio feedback to the user should definitely use Web Audio. That backend has been in WebKit for a few years already; it's quite mature and enabled by default. We implemented it in WPE using another pipeline that can optionally decode audio from a file source, for instance, and then provide the decoded float32 PCM samples to WebCore. That's because the Web Audio nodes implemented in WebCore do the internal processing; then, when those samples need to be played, we have another pipeline that injects them back towards the audio device. That's how we implemented that in WPE.

Another thing I wanted to talk about is media capture. That spec defines how to access webcams, microphones, even screen capture, in order to expose those devices to the web; it's mainly used for WebRTC. It's currently enabled only in developer builds; it's still a bit new. We plan to enable it by default in WPE 2.36. We implemented support for it in WPE using an API called GstDeviceMonitor, which is able to list capture devices and render devices. When we have settled on one device, we capture its video frames or audio samples with a special sink that collects the data and injects it into WebCore, for consumption or for playback; playback is done with another source element, as you might guess. For screen capture, we implemented support quite recently using PipeWire.
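Going back to GstDeviceMonitor for a second: outside of WPE, device enumeration with that API looks roughly like this. A minimal sketch, not WPE's actual code:

```python
#!/usr/bin/env python3
# Minimal sketch of GstDeviceMonitor, the API WPE uses to enumerate
# capture devices; this just lists what is available on the platform.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

monitor = Gst.DeviceMonitor.new()
# Only look at camera and microphone sources, like getUserMedia would need.
monitor.add_filter("Video/Source", None)
monitor.add_filter("Audio/Source", None)
monitor.start()

for device in monitor.get_devices():
    print(f"{device.get_device_class()}: {device.get_display_name()}")
    # device.create_element() would hand back a ready-made source element
    # (e.g. v4l2src or pipewiresrc) for the selected device.

monitor.stop()
```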
If you have a system running PipeWire, you should be able to do screen casting by calling the getDisplayMedia capture API from JavaScript. That's working quite well, at least on my desktop. It's going to be released in WPE 2.34 behind a setting; that's a runtime setting.

Once you have access to capture devices, you want to have interaction between browsers, basically, using WebRTC. Another use case of WebRTC is one-to-many broadcasting. That WebRTC feature is implemented in WPE using LibWebRTC, where we also integrate the GStreamer encoders and decoders to provide hardware-accelerated encoding and decoding, depending on the device. The problem with that backend is that it's not enabled by default, because bundling LibWebRTC in releases is just too big. There's another issue related to the licensing of BoringSSL; that's kind of preventing the usage of that backend in most applications. To fix this situation, we have a plan to provide a GstWebRTC backend. We have a prototype already, and it's going to be upstreamed hopefully soon in webkit.org; we plan to perhaps ship it from WPE 2.36. Hopefully. Another thing you might want to do when you have WebRTC working is to record it. I have a prototype for that as well; it's not upstream yet because it's using unreleased GStreamer 1.19 features, namely the GstTranscoder library. So when that's released, we should be able to upstream that backend.

So that's it for the specs. There are more, but I think this already gives a good overview of what WPE can do with multimedia, which is a good amount of things.

Now, when you actually need a browser, we do provide one. It's called Cog, and it has various backends that can be used depending on how graphics are set up on your platform. For instance, if you're running a Wayland compositor, we have a Wayland client that is able to talk to the compositor. There's also an X11 backend, and GTK4 as well. For full-screen rendering we have a dedicated backend called DRM, and we also have a use case for headless rendering. Cog is able to select the right backend, but if you want, you can still override the default behavior using a command-line option. And by default, currently at least, Cog provides only one WebView, so there are no tabs, for instance. There might be support for multiple WebViews at some point; we have been working on that, but it's not ready to be upstreamed yet. Another cool thing about Cog is that it can be controlled over D-Bus, so that means you can do page navigation using a dedicated D-Bus API. We'll talk more about that in the coming slides.

I wanted to highlight one use case related to full-screen rendering where you don't really need a Wayland compositor: for kiosk applications or set-top boxes, for instance, you could use Cog and its DRM platform plugin to do full-screen rendering of the web page. If you have a working DRM/KMS setup on your device, that should work out of the box, and you can easily check with kmscube: if kmscube works, Cog should work as well using the same approach. What we do is usually import Wayland buffers, or even better, if DMA-BUFs are available, we can import them as GBM buffer objects and do the rendering directly with that backend. For input events we use libinput: keyboard, mouse and touch are supported, even multi-touch; it's quite nice.
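As a quick illustration, launching Cog that way could look like the sketch below. It assumes a Cog build that ships the DRM platform plugin, and the flag spelling follows Cog's command line at the time of writing, so double-check `cog --help` on your build:

```python
#!/usr/bin/env python3
# Minimal sketch: launch Cog straight onto a DRM/KMS display, with no
# Wayland compositor in between. The --platform flag name is an assumption
# to verify against your Cog version.
import subprocess

subprocess.run(
    ["cog", "--platform=drm", "https://wpewebkit.org"],
    check=True,
)
```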
Another use case I wanted to talk about is headless Cog. We recently had a use case for that related to Apple Music, because the SDK provided by Apple only works in web engines; it's a JavaScript API, basically, so there's no easy way to do a native application, and you need some kind of web engine to support that streaming service. And it had to run on a device that has no GPU. We had to come up with a new backend that fakes the rendering and is clocked at 30 FPS by default; that means the internal clock of WPE still behaves more or less normally. So yeah, it's kind of a fake output, because we don't do anything with the frames in that backend, we just notify them to WPE. And specifically for that Apple service, we also designed a dedicated D-Bus interface that allows us to remote-control the audio player and make calls to the SDK. That's quite nice; I think it's a good showcase of Cog in a really constrained environment.

So now that we have a browser and a web engine, you want to deploy them on some embedded platform; I will use Yocto as a showcase for that. We provide a layer called meta-webkit that provides all you need to enable WPE in your BSP: we have recipes for the backends, for WPE, and for Cog. I would recommend using the Poky reference distro, or any distro that provides a recent version of GStreamer; that's actually important for multimedia. Then you need the meta-freescale layer if you want to use NXP chips, for example the i.MX family. But there I would recommend first trying the etnaviv driver, because nowadays it's working quite well, provided you have a recent enough kernel, and it works quite well on i.MX6 and i.MX8 with meta-webkit. I would also recommend using the FDO backend by default, either with a Wayland compositor or without any compositor, if you have a working KMS setup on your device. And regarding multimedia playback you are also covered, using the Video4Linux decoders: we have hardware acceleration support as well, and that's working quite well in WPE, as I mentioned in the previous slides.

Specifically about the i.MX6, I was able to enable the CODA driver, and I was able to reach full HD in WPE, even though in some cases, depending on what the web page is displaying, there was a bit of stuttering in animations. That can be avoided if you use a lower-resolution video: for instance, YouTube provides both full HD and 720p, so you can force 720p by default using a special environment variable. That provides a better user experience, even if the resolution is a bit lower.

Another SoC I wanted to talk about is the i.MX8M. Here you have to use the Hantro G1 driver that was developed on mainline Linux, mainly by the Collabora people; thanks to them. Most of that work was released in Linux 5.12. The status I was able to assess on the device I have here: full HD was working as expected, because it's quite powerful, and 4K as well, for VP8 and H.264. I was not able to test VP9 and HEVC, because at the time I tested, that support was not in mainline yet, but hopefully that can be assessed as well now.
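If you want to sanity-check that V4L2 decoding path outside the browser first, a pipeline along these lines helps. A sketch, assuming an H.264 test file and a platform where the decoder is exposed as v4l2h264dec; both are assumptions to adapt:

```python
#!/usr/bin/env python3
# Minimal sketch: exercising a V4L2 hardware decoder (e.g. the CODA VPU on
# i.MX6) outside of WPE. v4l2h264dec and test.mp4 are placeholders to adapt
# to your platform.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    "filesrc location=test.mp4 ! qtdemux ! h264parse "
    "! v4l2h264dec ! videoconvert ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)

# Block until the stream ends or the decoder errors out.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)
```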
As I was saying before, we had to make a few adjustments in WPE. One of them was related to video rendering: we have a video sink in WPE, and that sink initially supported only RGBA. We had to make some adjustments there, because the hardware decoders usually output YUV, for instance NV12 or I420, so in order to avoid conversions we added support for those formats in our video sink; that's enabling better zero-copy rendering. The last format listed there is A420. It's related to VP9 with alpha, which currently is decoded by the VPx software decoder, so it's a bit of a special format, but we now handle it in WPE as well. And if you can't use the CODA driver and have to use the proprietary NXP driver instead, you are at least covered in WPE as well: if you use the i.MX decoders, we added support in the video sink for the custom video-convert element that has to sit behind the GStreamer i.MX decoders, because the YUV format output by those decoders is a bit unconventional; it requires that video-convert element in order to have good zero-copy support.

I wanted to jump to a slightly different topic now: using WPE in GStreamer applications. Why would you want to do that? For instance, in the broadcasting industry nowadays, at least in software like OBS, you can do transitions between scenes using so-called stinger animations, and those are usually videos encoded in VP9 with an alpha channel; in some projects we would like to use WPE to render those videos. Another use case I had in mind is related to live TV: it's quite common nowadays to see extra information displayed at the bottom of the screen, so-called lower thirds, and those kinds of overlays are quite common nowadays and could be rendered with a web engine. And there are more use cases along the same lines: again overlays, and HTML overlays can be used there too.

So you might wonder: why would you move to WPE just for that, if you have a working solution already? Well, usually those tools are not free software, and they use a custom format for overlays, so if you want to migrate to a different tool, in some cases you have to rewrite your overlays. Using HTML instead, which is an open, specified format, would be better for interoperability. You also have to take into account that there is a wide pool of people doing web design nowadays; it represents a huge workforce that could be used to produce those overlays, and they are quite competent, they can create nice websites and animations. So I think it would be good for the industry to move in that direction. Another aspect is portability: since WPE supports many platforms nowadays, it's quite easy to integrate into various kinds of environments and platforms. And if you want to build an overlay editor, that's quite possible; actually, I've already made one as a demo, so you can write new products with that combination of frameworks.

So how did we make WPE usable from GStreamer? We have a custom source element, wpesrc, that was upstreamed in gst-plugins-bad. The use case for it is overlays, but even cloud browsers could be designed with that element, perhaps. Internally it's using a WebView, and the source element is able to make a video stream out of that WebView; we even recently added audio support, so you can also inject audio from the WebView into the pipeline if needed. There are two runtime configurations: if you have a GPU, you can use a zero-copy approach, which underneath relies on EGL images; and if you don't have a GPU, you can use the software rasterizer, usually LLVMpipe. Here are two examples for those two use cases, sketched right below. The first one, if you have a GPU: you can render a web page, basically, with GStreamer, and you can even interact with the page, click and scroll, so it's quite fun to play with. The same can be done without a GPU, although that's something I fixed in GStreamer only recently, so you need to have a quite recent version nowadays.
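Both variants, plus the compositor-based overlay I'll describe in a moment, can be sketched with GStreamer's parse_launch. The pipeline strings are illustrative: wpesrc, compositor and the draw-background property are real, but the exact caps you need may vary with your GStreamer version:

```python
#!/usr/bin/env python3
# Minimal sketches of the wpesrc element from gst-plugins-bad.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# 1. GPU path: zero-copy, the WebView is exported as GL memory.
GL_PIPELINE = 'wpesrc location="https://wpewebkit.org" ! queue ! glimagesink'

# 2. No-GPU path: request raw system-memory video instead of GL memory.
RAW_PIPELINE = (
    'wpesrc location="https://wpewebkit.org" ! video/x-raw '
    "! videoconvert ! autovideosink"
)

# 3. Overlay: the web page mixed on top of another video source.
#    draw-background=0 keeps the page background transparent, and the
#    zorder pad properties put the page above the test video.
OVERLAY_PIPELINE = (
    "compositor name=mix sink_0::zorder=0 sink_1::zorder=1 "
    "! videoconvert ! autovideosink "
    "videotestsrc ! mix.sink_0 "
    'wpesrc location="https://wpewebkit.org" draw-background=0 ! mix.sink_1'
)

pipeline = Gst.parse_launch(GL_PIPELINE)  # swap in RAW_ or OVERLAY_PIPELINE
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```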
The difference between those two pipelines is the video sink being used: in the first case GL video caps are used, and in the second case raw video caps are used.

Another example I wanted to show is how to do overlays. It's quite simple: you need to build a pipeline with compositor, which is kind of a video mixer, and if you also want to mix the audio samples you can use audiomixer; in most cases you may only need the video part, but you can do audio too. We have two sources there: the web page and a file source. You could have a video file source there, for instance, or any kind of media source. And you have to do two things, mainly, to make this work, because compositor is going to be configured to overlay the web page on top of the other media source. You need to make sure the background of the web page is transparent, and there's a property for that on wpesrc. And you need to configure the zorder value on the pads of compositor, so that one stream is on top of the other, basically. Then you can do rendering, like in that pipeline, or you could do encoding and streaming over RTMP or RTSP, or even WebRTC.

That's actually what I did in a demo I wrote, where I was able to do dynamic overlays and control them with a Node.js application. I had a kind of admin interface in that application, able to edit overlays on the fly, and that was mixed with another video, in this case a webcam, but it could be any kind of video source again. The resulting video was encoded and streamed to a Janus SFU WebRTC server, and then it could be consumed by other people in their browsers using WebRTC. It was quite cool, and that demo is explained in a little video; you can watch it if you want.

So yes, that's it, that's the end of the talk. I listed there some links: wpewebkit.org for all the tarballs you need for WPE WebKit; the GstWPE docs on the GStreamer website, useful if you want to see how you can use wpesrc, how to configure it and use it in pipelines; and the overlay demo we maintain in our GitHub space. That's it. If you have any questions, I'll be happy to answer.