Hello everyone. Welcome. I'm George. I work at Collabora. I'm a senior software engineer working on multimedia things like GStreamer and PipeWire. And today I'm going to talk about what I've been doing in the past year with PipeWire: enabling PipeWire in the automotive industry.

So let's start by defining automotive. Basically, I've been working on a project called Automotive Grade Linux. I'm pretty sure some of you have heard of it before. If you haven't, we're around: there is a booth in Building K as well, where you can see a live demo of Automotive Grade Linux. And my task in AGL has been to make an audio system for cars.

Now, the immediate question you might have is: what's so special about the audio system? I mean, we've had audio systems for years on Linux. The driver layer works, ALSA works, we have PulseAudio on top that works really well for the desktop. So what's missing? To explain that, I need to explain what the hardware looks like, because that's basically the difference.

On a desktop, we're used to having this kind of scheme where there is a single CPU and a single audio card. There might be multiple cards, but it doesn't matter: you basically choose one audio card that you use. You have your speakers connected there and you have your microphone connected there, and that's the whole thing that exists. And you have things like PulseAudio that can manage this really, really well.

On a car, now, things are quite different, because you have different nodes inside the car: multiple CPUs. You have maybe one CPU that is the in-vehicle infotainment (IVI) system, and then another CPU somewhere else which is maybe a hardware radio device or a CD player. There is a dedicated DSP somewhere else which is doing the filters and echo cancellation and finally amplifies the audio to the speakers. And there might actually be multiple DSPs: one for the front speakers, one for the rear speakers. There are some cars that even have speakers in your headrest. There are also microphone arrays on your seat belt, on the roof, on the doors, on the sides, everywhere basically. So this has become a very complex network that you no longer manage in the way that you manage the desktop.

So, some requirements. I mostly talked about this already: you have a lot of different hardware. The other thing that I didn't say is that you might have streams that go around without passing through the main CPU. You may have, for example, a dedicated hardware CD player that puts audio directly on the DSP, and that audio never passes through the main CPU. Although the main CPU still has control over what plays, you don't actually get the audio data routed through the main CPU, for performance reasons. And you need low latency, because you might want to implement echo cancellation and things like that.

And then on the software side, on the IVI system, when you want to play something, there are some requirements there as well, because you basically have apps that have a different context. There are apps that, for example, play music, which are in the multimedia context, so to say. And there are other apps that may be doing something special, like your navigation app, which every once in a while says "please turn left in 100 meters" or something like that. And that sound needs to be treated specially, because you want to make sure that the driver can hear it.
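To make the idea of these stream contexts concrete: in PipeWire, a client can attach arbitrary properties to its stream, and the policy side of the audio system can key off them. Here is a minimal sketch assuming the PipeWire 0.3 pw_stream API; the "Navigation" role name is an illustrative choice, not something mandated by AGL or PipeWire:

```c
#include <pipewire/pipewire.h>

/* Sketch: create a playback stream tagged with a role. Whatever
 * manages policy can see these properties and decide, for example,
 * to route this stream to the front speakers and duck the music. */
struct pw_stream *make_navigation_stream(struct pw_loop *loop,
                const struct pw_stream_events *events, void *data)
{
        struct pw_properties *props = pw_properties_new(
                        PW_KEY_MEDIA_TYPE, "Audio",
                        PW_KEY_MEDIA_CATEGORY, "Playback",
                        /* role names are integrator-defined */
                        PW_KEY_MEDIA_ROLE, "Navigation",
                        NULL);

        /* pw_stream_new_simple() takes ownership of props */
        return pw_stream_new_simple(loop, "nav-prompts", props,
                        events, data);
}
```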
For a prompt like that, you want to play it on the front speakers, you want to play it amplified, you want to lower the music while it's playing, and things like that. So you have to treat it in a different path, and in a different way, than how you treat the music.

We want security, because that has become a hard requirement nowadays. We want to have containerized apps that can play music or something else, but you don't want them to be able to do something that they are not intended to do. Like, you don't want the music app to be able to take over the navigation's special stream, and things like that.

And then you have emergency signals: when there's something wrong with your car, it needs to play some sound to warn the driver. That's also something that needs to be treated specially. It's usually coming from dedicated hardware that has to be certified, and it won't actually send that through the main CPU either; it will just play directly on the DSP, and there is a path there that is guaranteed to work because it's certified. But then the IVI needs to understand that this is going on, and stop the other streams at that time, and things like that.

So the real problem in building an audio system in a car is routing and policy management: how do we manage the routing, and how do we make policies?

There have been some projects that have tried to address this. OK, I'm starting with PulseAudio. PulseAudio never actually tried to address this use case; it's really meant for the desktop. It makes its own policy decisions, so you can't actually tinker with it unless you build a module, and some people do that: they build a module that tries to circumvent how PulseAudio is internally routing streams, but it's not very nice. It is fairly resource-intensive compared to other multimedia daemons, and it doesn't implement any kind of security, so whichever app connects has control over everything.

Then there have been things like the GENIVI Audio Manager, which builds on top of PulseAudio; it uses PulseAudio as a backend, optionally, and it can use other backends as well. It basically builds an API for policy management on top of that, but it has a couple of shortcomings. You must call the Audio Manager API, so the applications need to be aware that this exists, and things like that. And people that have used it generally agree that it's complex and not very nice.

And then in AGL we also had another API called 4A, which tried to solve some of these issues of the Audio Manager, but it didn't succeed in addressing all of them, and people still didn't like it so much.

So then there was PipeWire. What is PipeWire doing differently to address these issues? When I started looking at it, PipeWire of course didn't support any of these use cases, but it had something that looked like it could be developed to address them. So let's start with explaining a little bit of the architecture of PipeWire. What does it look like?

There is a daemon, the PipeWire daemon, that basically handles all the media routing between applications and devices. And everything is a different process. So you can have applications sending data through memfd or DMA-BUF to PipeWire, or capturing data. And then you can have devices either accessed directly through a plugin, to connect, let's say, to ALSA or to Video4Linux.
But you can also present devices through external processes, like the Bluetooth manager process that we have, which does all the networking stuff with Bluetooth and provides the device to the PipeWire daemon. And then there is another process called the session manager, which is the most important piece: it basically looks at the whole graph of applications and devices and decides what is going to be routed where, and it makes all these connections. I'll talk more about that later; that's actually what I've been doing myself.

Inside that blue box, the PipeWire daemon, things look like this. There is actually a media graph with a lot of little objects called nodes that connect to each other with links. Every application is represented by an application node. Every device is represented there as well: ALSA, Video4Linux, Bluetooth. And it can also load internal plugins that also provide nodes, which can do filtering and processing.

PipeWire actually started out by providing the PulseAudio experience for video; it was initially called PulseVideo. So it can handle video data very well, that's its original purpose. Nowadays it also implements routing audio data, and it's evolving to become a PulseAudio replacement.

It is modular, so you can customize it a lot. It has built-in security, something that PulseAudio didn't have: there is access control per application, per device, so you can give certain applications access to do certain things and not other things. That's what I mean by security. It's very clean code, and much more efficient than PulseAudio. It's capable of doing low-latency audio, and it actually implements the JACK API, so it can also work as a JACK audio server.

And the single most important thing is the external policy management. Applications connect to PipeWire, but they don't actually get linked to anything; they are not able to do anything until the session manager acts, gives them permissions, and gives them links to something else, to a device.

So what was missing from PipeWire when I started last year? There was no session manager. So I started developing a session manager; basically, that was all my work. I developed a session manager called WirePlumber. That was the first session manager implementation. Like PipeWire, it's modular and extensible. The target is to provide a reusable session manager for all kinds of use cases: automotive, other embedded, and also a desktop session manager. And a session management API as well: it provides a library that allows you to write your own session manager if you want, or to write your own policies that decide what to link where, to write modules for WirePlumber to extend its features, and tools around that.

It's written using GObject, and it provides a GObject-based API on top of PipeWire, for ease of use and for enabling bindings in other languages, which would be super useful. And it also provides an API called the endpoint API, which is very useful for implementing policies; I'm going to explain that.

Endpoints are a concept that I brought into PipeWire; it didn't exist before. PipeWire itself routes media through nodes, from an application to a device or vice versa. But the thing is that in a car you have all these kinds of streams, hardware streams that maybe are not passing through the main CPU, and you want to control them too. So I thought: how can we make an abstraction over all of these?
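Before we get to that, let me make the "session manager looks at the whole graph" point concrete. With the public PipeWire 0.3 client API, watching the graph looks roughly like the sketch below. This is the generic registry pattern from PipeWire's examples, not WirePlumber's actual code:

```c
#include <stdio.h>
#include <pipewire/pipewire.h>

/* Called once for every existing object, then again each time a new
 * one appears: an app stream, a hotplugged device, a Bluetooth node. */
static void on_global(void *data, uint32_t id, uint32_t permissions,
                const char *type, uint32_t version,
                const struct spa_dict *props)
{
        printf("object: id=%u type=%s\n", id, type);
}

static const struct pw_registry_events registry_events = {
        PW_VERSION_REGISTRY_EVENTS,
        .global = on_global,
};

int main(int argc, char *argv[])
{
        struct spa_hook listener = { 0 };

        pw_init(&argc, &argv);

        struct pw_main_loop *loop = pw_main_loop_new(NULL);
        struct pw_context *ctx =
                pw_context_new(pw_main_loop_get_loop(loop), NULL, 0);
        struct pw_core *core = pw_context_connect(ctx, NULL, 0);
        struct pw_registry *reg =
                pw_core_get_registry(core, PW_VERSION_REGISTRY, 0);

        pw_registry_add_listener(reg, &listener, &registry_events, NULL);
        pw_main_loop_run(loop); /* keep reacting to graph changes */
        return 0;
}
```

A session manager builds on exactly this pattern: for each interesting object it binds a proxy, inspects its properties (application name, media role, device class) and decides what to link to what.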
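And the per-application access control works through a permissions API: until the session manager raises a client's permissions, that client can do nothing. Here is a hedged sketch, assuming the pw_permission API from pipewire/permission.h; a real session manager would first bind the client's proxy via the registry:

```c
#include <pipewire/pipewire.h>
#include <pipewire/permission.h>

/* Sketch: let a client observe the graph but only operate on its own
 * node, so it cannot take over someone else's stream. */
static void sandbox_client(struct pw_client *client, uint32_t own_node_id)
{
        struct pw_permission perms[] = {
                /* default rule: all objects readable, nothing more */
                PW_PERMISSION_INIT(PW_ID_ANY, PW_PERM_R),
                /* read/write/execute only on the app's own node */
                PW_PERMISSION_INIT(own_node_id,
                                PW_PERM_R | PW_PERM_W | PW_PERM_X),
        };
        pw_client_update_permissions(client, 2, perms);
}
```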
That abstraction is what the endpoints are. Basically, endpoints are also little objects that look like nodes: they can be linked together and unlinked. But they don't actually route media; they are something more abstract.

So, for example, a pipeline in the endpoints graph could look like this: there is a media player that gets connected to the car amplifier, and these are the two endpoints with one link, and there is nothing more in that graph. While the same thing on the nodes graph looks like this: there is a media player that goes into a filter that does some processing, which then goes into a network sink that pushes things out onto the car network, and then there is a car network distribution daemon or whatever, which eventually ends up at the amplifier. That is a much more complex path that the policy management would otherwise have to know about, and handling it would become very hardware-specific. So this is why I thought we need to abstract this.

And these two graphs are meant to run in parallel. So you have both the endpoints graph and the nodes graph, and whenever you make a link on the endpoints graph, the session manager translates that into links on the actual nodes graph, and on the actual car network or whatever else is there.

The endpoints API is also useful to bring the graph closer to what PulseAudio has. PulseAudio has only sources and sinks; there are only two kinds of things that you can link together, and there is no concept of a much more complex graph. And this is exactly what endpoints give you. If you were to implement endpoints on a desktop, you could still have things like an application being represented as one endpoint, and your speakers or your microphone as another endpoint, and then you can link them together, no matter whether there are filters in between or whatever else.

And this is how it looks in the endpoints graph. You have your applications, your media player, your voice assistant, and they make links to the endpoints: there is an amplifier endpoint there, a speech microphone. At the top of this graph we basically have streams that are in the main CPU, where the main CPU actually routes the media data. But we can also represent a hardware player or something else that is only routed on the car network and not through the main CPU; the main CPU can still have this representation in the graph, so the policy management code can still look at it and decide.

For example, there is a voice assistant there that wants to play something, and it starts playing. As soon as it makes the link and starts streaming, the policy management understands that this happened and enables some effects on the amplifier endpoint. That could mean that more nodes get linked at that point to enable some effects, or it could mean that something on the hardware is triggered to enable this effect where the voice is louder than the music.

And then the user goes and clicks a button on the hardware, and there's a radio hardware player that starts streaming to the amplifier without passing media through the main CPU. But because we have this graph representation, the policy management understands that it needs to unlink that media player at the top, which was previously linked, and which is in software. So this is how it works.
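The endpoint API was still settling at this point, so rather than quote it, here is a purely illustrative sketch of the idea, with made-up names (endpoint, endpoint_backend and so on are not the real WirePlumber API). Policy code only ever links and unlinks endpoints; each endpoint's backend translates that into node links or car-network commands:

```c
/* Illustrative pseudo-C; all names here are hypothetical. */
struct endpoint;

struct endpoint_backend {
        /* software backend: creates PipeWire node links
         * (app node -> filter -> network sink);
         * hardware backend: sends a command on the car network */
        int  (*link)(struct endpoint *src, struct endpoint *sink);
        void (*unlink)(struct endpoint *src, struct endpoint *sink);
};

struct endpoint {
        const char *role;       /* "Multimedia", "Speech", ... */
        int streaming;          /* currently active? */
        const struct endpoint_backend *backend;
};

/* The reaction from the example above: when the hardware radio player
 * starts streaming to the amplifier, unlink the software media player
 * that was previously linked. */
static void on_endpoint_started(struct endpoint *started,
                struct endpoint *music_player, struct endpoint *amp)
{
        if (music_player->streaming)
                music_player->backend->unlink(music_player, amp);
        started->backend->link(started, amp);
}
```

Either way, the policy code stays hardware-agnostic; only the backend knows whether a link means creating software nodes or sending a message to the amplifier.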
And then, how do we make the connection between the endpoints and the actual media nodes? So maybe you have, again, two endpoints: an application that plays something, and an amplifier. And you want to implement volume controls per stream; you have the multimedia stream and the navigation stream which, as I said earlier, need to be treated differently. So the software-DSP output endpoint that I've written creates a couple of DSP nodes that implement volume controls per stream, and as soon as you make this virtual link on the endpoints graph, it goes and creates the actual links on the media routing graph. And in the case where there is a dedicated hardware DSP that can support these streams in hardware, and you can trigger the different volumes per stream by sending some commands to the hardware, it's still the same externally: you make a link on the endpoints graph, and then the session manager, which has a module that implements this endpoint, understands that it needs to go and send a message to the hardware amplifier to enable that per-stream volume effect.

So what's the status of WirePlumber right now? It currently works nicely for AGL; in the demo that we have in Building K, we use PipeWire and WirePlumber at the moment. The API is settling down; we are doing some more refactoring at the API level in the library, but it's settling down. And we just recently started generating documentation using hotdoc, and I'm in the process of documenting; I started like last week, and I pushed some docs.

There are some shortcomings. It doesn't have flexible policy: although the policy is configurable, it's not configurable enough, and it works only for the AGL demo and nothing else right now. And there is no security management implemented; we just give access to all the applications, which is bad. But we have a plan.

So the next steps, what I want to do next, is to experiment with a scriptable policy: provide an API so that scripts can be written to influence how these decisions are taken about what to link where. There is a runtime called Anatoli, written by Bastien Nocera, basically for this purpose, for managing PipeWire policies. So my next step is to experiment with that, see if that's something that makes sense, and if we can write nice scripts like that. But other ways of doing it would also be acceptable and welcome. The other thing is to improve desktop compatibility and make a drop-in replacement for PulseAudio; that currently doesn't work on the desktop. Another next step is to enable management of video nodes: enable camera inputs and screencast inputs. And to implement security management; there is a design for that, I just haven't gotten around to implementing it.

The source code is all on the freedesktop.org GitLab, under gitlab.freedesktop.org/pipewire. You can find both PipeWire and WirePlumber there. You can make merge requests there, you can file issues there, the wiki is there, everything is there. And on the AGL front, I maintain a branch of PipeWire with the AGL-specific commits that I had to make, although I'm constantly pushing them upstream; as soon as they are merged upstream, I remove them from the branch and rebase. And then there is the meta-pipewire Yocto layer; AGL is based on Yocto, so these are the recipes to build PipeWire and WirePlumber inside AGL.

Thank you very much. Any questions?
The question is whether I have any benchmarks comparing PulseAudio to PipeWire. Yes, I had a benchmark, but it's from last year, so it's not up to date. I'm sorry, I don't have the slide with me, but we had a comparison where PipeWire was basically taking a 5.1 stream, six channels, transforming it in a couple of ways, resampling, changing the format, and playing it out. That was using something like 6% CPU at 64 samples of latency, while PulseAudio in the same configuration was using 100% CPU; it was failing. So that's what the comparison looked like. I'm sorry I don't have the latest statistics, but yeah, it's looking really, really good.

[Audience] In terms of flexibility: you said you were looking to do more flexibility. Were you talking about boot-time flexibility or runtime flexibility? Because with the adoption of things like A2B in the audio space now, you're getting nodes that are coming on and off the buses all the time, which is making audio harder to manage.

Yeah, PipeWire is very dynamic in how it manages hardware, so it can detect things coming up and going away all the time.

[Audience] Does that include firmware? Because you could have A2B nodes with DSPs in them, where the firmware on the DSP, which implements different functionality, gets changed dynamically as well.

Yeah, I don't see why it wouldn't be supported, but I've never looked at this kind of hardware.

OK, there's no time for more questions. Thank you very much for listening.