Okay, I guess that's it, we can get started. The topic of today's presentation is the state of audio on Linux: how did we get to where we are at the moment, where are we at the moment, and where are we going in the future? And of course the question in the title: are we in a golden age right now, and is it coming to an end? My name is Lars-Peter Clausen. I work for Analog Devices, where I, among other things, work on audio drivers for audio codecs.

So the agenda for today: we will start with the history of audio, not just audio on Linux. We will look at the major transitions in both hardware and software, and we will look at what drove those transitions, because major transitions usually don't just happen; there's usually a reason triggering them. We will look at what triggered them and maybe what we've learned from them. Then we will look at the present: what is the current situation, and, as I said, are we in a golden age? And the last step is a look into the future: what's coming up next, which transitions are going to come, and how are we going to deal with them?

But before we start, I quickly want to mention the concept of interdependent versus modular. You find this in architectures, not just software and hardware architectures; architectures of systems in general can be grouped into interdependent architectures and modular architectures. In an interdependent architecture you have a module, or something like a device, which provides a function, but there is no clear separation into different sub-modules. Everything is fluid; the different sub-modules know about each other and about each other's internals, which of course creates dependencies between them: if you want to upgrade one part, you have to upgrade the other parts. In contrast to this there is the modular architecture, which is clearly separated into different sub-modules with clearly defined interfaces over which those sub-modules communicate. This means you can easily exchange parts or sub-modules without having to update all the other sub-modules in the same system. One downside is that you are of course constrained by the interface: you have to comply with the interface, and if you don't, things no longer work. If you look at modular versus interdependent, there is no clear "this one is better than the other one"; it all depends on the environment you're working in and the kind of application you want to run. But the concept of interdependent versus modular runs like a common thread through this whole presentation. I'm not necessarily going to point out "hey, this is interdependent, this is modular", simply because you can't always tell; there are many hybrid states in between. What I want you to do, as we go along through the presentation, is look at things and think about: is this more modular or more interdependent? Could we change it to be more modular or more interdependent, and how would this affect the system as a whole?

So let's start with the history, because, as they say, to understand the future you first have to understand the past. How did we get to where we are right now? We will really just focus on the major transitions, and we will focus on Linux and on ALSA, not so much on the other things.
And we will leave out the minor transitions, like from mono to stereo, because those are gradual things that just happen; for the major transitions there's usually a reason.

So, humble beginnings. In the 1980s, roughly, the personal computer space emerges: the Apple II is released, the IBM PC is released. What you find in those systems is the so-called PC speaker, sometimes known as the beeper. It was part of the very first IBM PC, which was released in 1981, and what you could do with this device was really limited. It essentially has two states, on and off, and by switching it on and off at a certain frequency you can create a tone. But it was very limited: there is only one volume, the output is either fully on or fully off, with no intermediate states. People used it anyway; some games used it to play a victory melody, a little "beep-beep-beep" or something like this. These were implemented either as a magnetic speaker that can be turned on and off, or as a piezoelectric plate on more modern systems. On Linux, interestingly, these are supported not in the audio subsystem but in the input framework, probably because input predates audio. And this device is so simple that it's often used for giving feedback: if you enter a wrong command, you still get a beep to this day, and that was its main use. What we have to remember is that in those days the CPUs were really, really slow; we're running at five megahertz or so, so you couldn't really play back complex audio like wave files anyway.

But eventually people wanted more features, and this was primarily driven by the games industry. Probably many of you remember the Sound Blaster. It was more or less the first widespread consumer sound card; the first version was released in 1989. There was actually another sound card, of which the Sound Blaster was a clone, which was released two years earlier. The Sound Blaster had some additional features but was otherwise fully compatible, and the other company went bankrupt three years after the Sound Blaster was released. These sound cards were primarily synthesizer based: you would program them to generate some kind of sound effect rather than, like today, just playing back a wave file. The differentiating feature of the Sound Blaster was that it had one mono 8-bit PCM channel, and this essentially made it the de facto sound card for consumers. Applications like games and audio or media players were written against the interface of the Sound Blaster, which meant other hardware manufacturers who wanted to enter the market had to support the same interface, had to make themselves compatible with the Sound Blaster interface. So basically you had the first standard for consumer sound cards. Later on, of course, lots of new features were added; it was made 16-bit. At some point people wanted to listen to music and at the same time get a notification when they got an email, so mixers for multiple streams were added in hardware, and so on. It slowly progressed, getting more and more features.
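Before moving on: to make the "beeper lives in the input framework" point a bit more tangible, here is a minimal sketch of how user space can drive it on a current Linux system. The device path is an assumption (it depends on udev naming and on the pcspkr driver being loaded), and error handling is omitted.

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <linux/input.h>

    int main(void)
    {
        /* Path is an assumption; it depends on udev and the pcspkr driver. */
        int fd = open("/dev/input/by-path/platform-pcspkr-event-spkr", O_WRONLY);
        struct input_event ev;

        if (fd < 0)
            return 1;

        memset(&ev, 0, sizeof(ev));
        ev.type = EV_SND;
        ev.code = SND_TONE;
        ev.value = 440;                 /* start a 440 Hz tone */
        write(fd, &ev, sizeof(ev));

        sleep(1);

        ev.value = 0;                   /* frequency 0 switches the tone off again */
        write(fd, &ev, sizeof(ev));

        close(fd);
        return 0;
    }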
So, the next step: audio on Linux. In the beginning there was the Open Sound System, OSS, which was the default in Linux until version 2.4. You had the /dev/dsp interface, and it was really simple: to play back audio you would just use a normal write(), treating it like a normal file, and to capture you would read(). There were a couple of ioctls for management tasks like changing the volume and so on. But one of the major drawbacks, which became a problem quite early on, is that, as you can see here, it's just /dev/dsp, not /dev/dsp0, /dev/dsp1: there can only be one sound card per system, and there were some other major restrictions. OSS is still supported on today's Linux, mostly through emulation on top of ALSA, but there are actually still, I think, 15 native OSS drivers which have seen pretty much no changes in the last years. They're still around in case somebody is using them, although most distributions don't ship them anymore. So practically it's dead, but we keep it around because we don't break user-space API, in case somebody is still using it. As long as the code is not really becoming a burden there's no need to remove it, but eventually it will probably be removed from the kernel, especially since there's user-space emulation available, and most distributions make use of the user-space emulation rather than the kernel-space emulation. What we should mention, maybe, is that OSS is still used on the BSDs, for example.

The next step is ALSA, the Advanced Linux Sound Architecture, which is the main focus of this presentation. ALSA development actually started quite early on, I think 1996 or so, and it was developed out of tree, outside the mainline Linux kernel, in parallel while OSS existed in the mainline tree. The first official release was in 1998. There were some issues with OSS, as I already said, in terms of which features it could support, but there was also the fact that the company who created OSS decided to make core components of OSS closed source, which didn't go down well with some developers of the Linux kernel. I think this was the final reason why OSS was replaced with ALSA, which was done during the 2.5 development cycle and shipped with the 2.6 kernel. And even the first release of ALSA had many of the major concepts that we still see in ALSA today.

The basic architecture looks like this. ALSA is not just limited to the kernel; it's split between the kernel and user space, there's a user-space component and a kernel-space component. I personally think this is a good idea in general for hardware abstraction layers, because the interface at the boundary between kernel space and user space is very restricted in what you can do, which makes it cumbersome to use for end users, so you want to provide some kind of layer on top of it in user space which simplifies things. ALSA did this right from the start, and all applications are really developed against this user-space API, not against the kernel-space API. Another core concept of ALSA is that, at the kernel level, it only describes the hardware, as accurately as possible. It says: this hardware can support two channels and maybe 16-bit, and this other hardware can support two channels or four channels, and 16-bit and 32-bit. There is no emulation like resampling or mixing inside the kernel; this is all handled in user space. And ALSA also implies a client-server architecture, where clients and servers communicate over a well-defined interface.
The server offers audio services, playback and capture, and the client uses them. But this is, again, not limited to drivers and end applications: ALSA uses a modular plug-in architecture, and there are plug-ins which can be stacked on top of each other, where a plug-in is a client to the underlying layer but a server to the layers above it. This is how, for example, mixing was implemented in the early days: you had a plug-in in user space which would talk to the hardware driver, implement mixing, and on top of that provide an interface to user-space applications.

From an organizational perspective, ALSA has the sound card as its top-level component. There can be many sound cards in the system, unlike OSS where there's only one. A sound card has so-called devices, which can be really anything that implements some kind of function: PCM playback, capture, a mixer, MIDI, a timer. In this presentation we will not really be talking about the last two, MIDI and timers, but more about the others. Each device can have sub-devices, which specify specific endpoints within that device; if you have mixing capabilities in hardware, your PCM device would have two sub-devices. This is what you typically see on a modern sound system, on your laptop or your desktop: you have one sound card, one playback device, one capture device, and then a mixer device.

What ALSA also introduced early on is the concept of controls. Controls are used for configuring the hardware, rather than using device-specific ioctls as was done with OSS. There are different types of controls. There are volume controls, and the volume controls typically also provide a gain table, so you know that changing this volume by so much corresponds to a change of so many dB in your sound output. There are switch controls for turning things on and off, like muting an output stream. And there are enumeration controls, which are primarily used for routing: if you have two microphones but can only record from one microphone at a time, you would use an enumeration control to switch between those two microphones. Each control also has a name, which is used to uniquely identify it, and in the beginning there was a definition of a standard naming scheme; you can see it over here, things like headphone volume, speaker volume, master playback volume. Drivers were encouraged to follow this standard naming scheme so applications could figure out which function a given control actually controls. But as we will see later on, as things grew more complex, not so many drivers actually follow the naming convention anymore, or rather there is now so much functionality which is simply not covered by the naming scheme, so drivers just come up with their own names. This of course means that user-space applications are no longer capable of knowing what a control does; they need intimate knowledge about the sound card itself and can't work in a generic way.
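As a small illustration of the control concept just described, the sketch below lists the mixer control names of the first card using alsa-lib's simple mixer API; whether those names mean anything to a generic application is exactly the naming-scheme problem mentioned above. The card name "hw:0" is just the usual first-card address, and error handling is left out to keep it short.

    #include <stdio.h>
    #include <alsa/asoundlib.h>

    int main(void)
    {
        snd_mixer_t *mixer;
        snd_mixer_elem_t *elem;

        /* Attach to the first card and load its mixer controls */
        snd_mixer_open(&mixer, 0);
        snd_mixer_attach(mixer, "hw:0");
        snd_mixer_selem_register(mixer, NULL, NULL);
        snd_mixer_load(mixer);

        /* Walk the controls and print their names; a generic application
         * can only guess what they do from names like "Master" or "Headphone" */
        for (elem = snd_mixer_first_elem(mixer); elem; elem = snd_mixer_elem_next(elem))
            printf("%s\n", snd_mixer_selem_get_name(elem));

        snd_mixer_close(mixer);
        return 0;
    }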
Another very important part of ALSA is the so-called constraint system, and I think this is actually the most differentiating feature of ALSA. If you have ever written an ALSA application, the constraint system is probably what you've looked at and thought: what is this? I just want to play back some wave files, I don't want to send a spaceship to the moon. Because if you look at it, it's really complicated to use, but it's also extremely powerful. The way it works is that you start out with a configuration space, which includes all possible configurations that a device can have. Here is a simplified example: you have a device that can play back audio with either one or two channels, at a 16k or 32k sample rate, but it cannot play back two channels at the 32k rate, for whatever reason. If you initially query the device, you will get this whole box, because the configuration space is reported as a bounding box around all possible configurations. Then the application goes through a negotiation phase with the driver, slowly refining what it actually wants to do (see the sketch below). For example, the application can say: I only want to play back one channel, what rates can you offer me for that? And the bounding box shrinks. If the application had instead said: I want two channels, which rates can you offer me for two channels? We would only have the bounding box around this other part. This is extremely powerful, but if you just want to play back a single audio file, it's very complicated to use. We'll get to this a bit later, but I think this is the main differentiating factor which has allowed ALSA to be versatile enough to be used in professional as well as consumer audio. What was maybe overlooked in the beginning is that people wanted a simpler interface; not everybody cared about extracting the maximum performance from their hardware, which is what this system really allows you to do. But we will get to that later.
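To make the negotiation described above concrete, here is a rough sketch of what it looks like through alsa-lib's hw_params API: start from the full configuration space, refine it step by step, then commit. The device name "hw:0,0", the format, and the target rate are arbitrary choices for the example, and error checking is mostly omitted.

    #include <stdio.h>
    #include <alsa/asoundlib.h>

    int main(void)
    {
        snd_pcm_t *pcm;
        snd_pcm_hw_params_t *params;
        unsigned int rate = 48000;

        if (snd_pcm_open(&pcm, "hw:0,0", SND_PCM_STREAM_PLAYBACK, 0) < 0)
            return 1;

        snd_pcm_hw_params_alloca(&params);

        /* Start with the full configuration space: everything the hardware can do */
        snd_pcm_hw_params_any(pcm, params);

        /* Refine it step by step; each call shrinks the space */
        snd_pcm_hw_params_set_access(pcm, params, SND_PCM_ACCESS_RW_INTERLEAVED);
        snd_pcm_hw_params_set_format(pcm, params, SND_PCM_FORMAT_S16_LE);
        snd_pcm_hw_params_set_channels(pcm, params, 1);

        /* Ask what the remaining space offers for a one-channel stream and
         * pick the rate closest to 48000 Hz */
        snd_pcm_hw_params_set_rate_near(pcm, params, &rate, NULL);

        /* Commit the refined configuration to the driver */
        snd_pcm_hw_params(pcm, params);

        printf("negotiated rate: %u Hz\n", rate);

        snd_pcm_close(pcm);
        return 0;
    }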
So the next transition in the audio hardware world was about reducing cost. CPU processing power became a lot greater; we are no longer running at 5 MHz but rather at 500 MHz, so the CPU power required for audio processing is now relatively small compared to the overall available processing power. Some clever folks came up with the idea that we can save a lot of money if we remove all this hardware from our sound card and just do it in software. So the hardware was greatly simplified during this transition: no more synthesizers, no more mixers, just one input stream and one output stream, and that's it. The features moved to software and are emulated there.

What also happened during this phase is the standardization of audio interfaces. The first such standard was the Audio Codec '97 (AC'97) standard, named this way because it was introduced in 1997 and it's a standard specifying audio codecs. One concept it came up with is that it cleanly splits things into a controller device and a codec device, with a clear interface between the controller and the codec. On this bus there is transport for audio data as well as for control data for making changes, like changing the volume; it's all handled over the same bus, and how this bus works is well standardized. So anybody can come and implement a codec if they know how to make a codec, anybody can come and make a controller if they know how to make a controller, and if you put those two together it would, hopefully, magically work.

In reality it wasn't quite like this. If you have a standard and multiple implementers of this standard, you will always end up with slight variations and slight incompatibilities. But for the software world it really simplified things, because it meant we could write a single driver which basically takes care of all those devices, and if a device is not fully standard-compliant, we add a small quirk or patch to the driver which takes care of the device-specific incompatibilities. The other thing the standard specified was a standard register map: there was a standard register layout specified in the standard, and all drivers and applications could rely on it being there. If you want to change the sample rate, it is always done in the same register, whereas with older sound hardware every device had its own register layout. What it also did, which I think is really important, is that it added the ability to have discoverable features: there were one or two registers which specified which functions are implemented by this device. The standard did not require you to implement all of these functions; it left it to the codec implementer to decide which functions they thought were relevant.

But AC'97 was also quite limited, and it quickly became clear that, for example, vendor extensions were necessary. In the beginning there was a reserved set of registers for vendor extensions, but as vendors ran out of these vendor extension registers they just ended up reusing standardized registers, and at this point everything breaks down, because you no longer have a standardized interface: everybody uses the standard register map but applies their own meaning to specific registers. It was also limited in terms of audio performance: in the beginning it was only 16-bit, later extended to 20 bits, and it maxed out at 6 channels and 96 kilohertz. So, as we'll see in a moment, it was rather quickly replaced again, but it's still around today in some niche applications for which its capabilities are enough.

The other thing that came around the same timeframe is the USB audio class. The Universal Serial Bus was basically the idea to standardize things like keyboards and mice, but also audio, and the standard defines so-called device classes. One device class is for audio; it's actually the first one, device class one. It defines how a USB device which wants to be an audio sound card has to behave, which allowed people to write a standard driver which again covers all USB audio devices, rather than having to write one driver per device. Of course there were again some device-specific quirks, but they were handled in a similar way as with AC'97. The first USB audio class standard was released in 1998; it was interestingly called 1.1, not 1.0, because nobody used 1.0. And it took around four years to add support for the audio device class, which I guess is in part down to the kernel taking a while to implement USB at all.

So now we are at a point where the hardware has been greatly reduced and we have to emulate things in software, and people came up with the concept of sound servers. I call this period the sound server wars, because what really happened is that the different desktop environments adopted different sound server solutions.
For example, there was aRts, which came from the KDE project, and there was ESD, which was used by the Enlightenment desktop environment and GNOME. Each sound server of course implemented its own client API, and applications had to choose which client API to use. So you had half of the Linux desktop applications working with this one sound server and the other half working with the other sound server, but your sound card only supports one stream, so only one sound server can be active at a time. If you want to use a KDE application that plays audio at the same time as a GNOME application that plays audio: bad luck. It's not really a good solution; if your solution means only 50% of things are working, it's not really a solution, it's more of a problem.

But luckily PulseAudio came along. Its development actually started in 2004, which is just two years after ALSA had been integrated into the upstream Linux kernel, and it was quickly adopted by distributions, shipping as the default sound server as early as 2007 with Fedora and 2008 with Ubuntu. The one thing that PulseAudio really did right, which set it apart from everybody else, is that it provided compatibility layers. An application that was written against the ESD sound server would still function with PulseAudio; an application written against the raw ALSA interface would function with PulseAudio; an application written against aRts would function with PulseAudio. So this was really a unifying solution; it won the war. There were of course some issues in the beginning; many people complained about PulseAudio, and some still continue to complain about it today, but it's always easy to complain if you don't remember what a mess it was before. I personally think that PulseAudio is probably the best thing that happened to the Linux audio system in the last 10 years, because it resulted in things simply working. Maybe not at the start, it took a few years, but now things essentially just work, and with the approach of aRts and ESD and so on it would never have worked. The other thing PulseAudio did is that it offered a simplified audio API: if you want to write a simple audio application you no longer have to use the complicated ALSA API, you can use the PulseAudio one (a small sketch follows at the end of this passage), and this helped application adoption of PulseAudio as well.

When we look at PulseAudio today, it's really a modern sound server, and many innovations, at least in Linux desktop audio, came from PulseAudio. There's timer-based scheduling, which allows us to have low latency and power saving at the same time; usually with more traditional approaches you have to choose between the two, but this allows both. You have per-application volumes. PulseAudio is network capable: you can play audio on one machine and output it on the speakers of another. Maybe you have a dedicated media PC and want to play back audio from your laptop; you set it up so it streams the audio to your media PC and it comes out of its speakers. It's also multi-user capable, which means one person can log out or lock the screen, another person can log in, and automatically all the multimedia streams of the person who logged out will be muted while the other person can use the audio hardware. With more traditional approaches, if an audio application was running, even in the background, it would basically hog the audio interface and nobody else could use it.
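Coming back to that simplified API for a moment: the sketch below shows roughly what playback looks like with PulseAudio's "simple" API, compared with the hw_params negotiation shown earlier. The application and stream names are placeholders, the buffer is just a second of silence standing in for real sample data, and error handling is minimal.

    #include <stdint.h>
    #include <pulse/simple.h>

    int main(void)
    {
        /* Describe the stream we want: signed 16-bit, stereo, 44.1 kHz */
        pa_sample_spec spec = {
            .format   = PA_SAMPLE_S16LE,
            .rate     = 44100,
            .channels = 2,
        };
        static int16_t buf[44100 * 2];   /* one second of silence */
        int error;

        /* Connect to the default server and sink; the server picks the device */
        pa_simple *s = pa_simple_new(NULL, "demo-app", PA_STREAM_PLAYBACK, NULL,
                                     "playback", &spec, NULL, NULL, &error);
        if (!s)
            return 1;

        /* Write samples and wait until they have been played */
        pa_simple_write(s, buf, sizeof(buf), &error);
        pa_simple_drain(s, &error);

        pa_simple_free(s);
        return 0;
    }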
There's also things like Bluetooth integration, so lots of stuff going on there. These days none of the other sound servers are around anymore, and PulseAudio is the default on all major distributions.

Another thing that happened around the same time is that embedded started to become more important and more powerful. This is when we saw the introduction of ASoC, ALSA System on Chip, which was merged in 2006, actually on October 6, so basically today is the 10-year anniversary of ASoC being integrated into the upstream kernel. The innovation of ASoC is that, from a framework perspective, it split things into different categories. There is the platform device, which usually takes care of the DMA, copying data from memory to the audio pipeline or the other direction. The next component is the CPU DAI, which represents the audio controller; back in the day this was mostly either I2S or AC'97, today it's sometimes also S/PDIF. And the third component is the audio codec, an external component which does not live on the SoC and which is connected to the audio controller through the standard audio interface; on the codec you typically implement all the mixing and controls and so on. What ASoC requires you to do, if you want to create an ASoC sound card, is to write a so-called fabric or card driver, usually called a machine driver, which combines the different components into one sound card (a minimal sketch follows after this passage). In addition you need to define the board-level wiring, like which speakers are connected to which ports on the codec, which microphones, and so on. This greatly improved code reuse, because in the old days, if you had similar components on different sound cards, you basically copied the driver over and did something similar there; now we have the ability to really share code between different sound cards which are made of the same components.

The other thing ASoC introduced is dynamic audio power management, DAPM. This is a graph of all the power-relevant nodes in the system, and it allows really fine-grained power tracking, because on an embedded system you always want to be in the lowest possible power state. With older sound cards, if you were lucky, what they did was power everything up when you opened the playback device and power everything down again when you closed it. With DAPM you have really fine-grained tracking of what needs to be powered up; in some cases you don't need to power up all the amplifiers because, for example, only one audio path is active, and DAPM tracks this. DAPM also introduced a cross-device interface for managing this, so different codecs can communicate with each other over DAPM, which is again great for usability.
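As a rough sketch of the machine driver idea just mentioned (not any particular board's driver), this is the shape such a driver had around this time: one DAI link tying a CPU DAI, a DMA platform and a codec DAI together, plus DAPM widgets and routes describing the board wiring. All device, DAI and widget names here are made up for illustration and would have to match the actual drivers on a real board.

    #include <linux/module.h>
    #include <linux/platform_device.h>
    #include <sound/soc.h>

    /* Board-level wiring for DAPM; "HPOUT" and "MICIN" stand in for whatever
     * pins the codec driver actually exposes. */
    static const struct snd_soc_dapm_widget board_widgets[] = {
        SND_SOC_DAPM_HP("Headphone Jack", NULL),
        SND_SOC_DAPM_MIC("Onboard Mic", NULL),
    };

    static const struct snd_soc_dapm_route board_routes[] = {
        { "Headphone Jack", NULL, "HPOUT" },      /* codec output -> jack */
        { "MICIN", NULL, "Onboard Mic" },         /* mic -> codec input */
    };

    /* One DAI link combines a CPU DAI, a platform (DMA) and a codec DAI. */
    static struct snd_soc_dai_link board_dai_link = {
        .name           = "HiFi",
        .stream_name    = "HiFi",
        .cpu_dai_name   = "example-i2s.0",
        .platform_name  = "example-dma.0",
        .codec_name     = "example-codec.0-001a",
        .codec_dai_name = "example-codec-hifi",
        .dai_fmt        = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
                          SND_SOC_DAIFMT_CBS_CFS,
    };

    static struct snd_soc_card board_card = {
        .name             = "example-board",
        .owner            = THIS_MODULE,
        .dai_link         = &board_dai_link,
        .num_links        = 1,
        .dapm_widgets     = board_widgets,
        .num_dapm_widgets = ARRAY_SIZE(board_widgets),
        .dapm_routes      = board_routes,
        .num_dapm_routes  = ARRAY_SIZE(board_routes),
    };

    static int board_probe(struct platform_device *pdev)
    {
        board_card.dev = &pdev->dev;
        return devm_snd_soc_register_card(&pdev->dev, &board_card);
    }

    static struct platform_driver board_driver = {
        .driver = { .name = "example-audio" },
        .probe  = board_probe,
    };
    module_platform_driver(board_driver);
    MODULE_LICENSE("GPL");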
So let's quickly talk about more modern hardware. As I said, people quickly figured out that AC'97 wasn't working out so well, so a successor was introduced: High Definition Audio, HDA. It's quite similar, but it made some major changes in the way things are organized. In AC'97 you have a flat register map; here it's organized in hierarchical function groups, like a root node with many sub-nodes, so it's much more extensible, because vendors can just add a new node if they have to. Otherwise it follows a similar scheme in that there is a split between the host controller and the codec. What HDA also introduced is that it's more or less self-describing: there's a table in the BIOS which describes which input port on your device is connected to which pin, which speaker is connected to which pin on the codec, and so on. This eventually allowed the HDA driver to be greatly simplified by parsing this table, so there is just one driver that works for everything. Of course, again, quirks are necessary for some exceptions, but it's really just one driver in the kernel which handles all of those devices, every laptop, every desktop, and it was later also used for things like HDMI or DisplayPort audio output.

The other thing that happened is that mobile became a lot more important. These days mobile is very multimedia driven: you want to play back audio and video on your device while on the road, and audio has really become a differentiating factor. Vendors go out and advertise: hey, this device has better sound quality than that other device, so buy my device. So the hardware in the mobile segment became a lot more complicated than what we saw before; with ASoC it became highly specialized. When you have, for example, a phone running Android, it's not like PulseAudio, where one generic audio daemon takes care of everything; today there need to be device-specific components in the audio server, sometimes abstracted away using configuration files. You cannot create a generic distribution like Debian: you can take Debian, run it on your desktop or your laptop, and it works out of the box, but you can't run it on your phone, or at least you won't get audio, because the distribution has to be aware of what the hardware looks like. That is a big problem, as we will see in a moment.

So now we are in the present, and let's quickly look at what the consumer audio stack looks like. As I said, pretty much everybody is using on-board HDA; some people still use USB, but to a lesser extent. In the kernel there is a well-tested HDA driver and a well-tested USB driver, both of which see lots of updates. On top of that we have the ALSA library, and PulseAudio talks to the ALSA library and through it to the devices. On top of PulseAudio we have a couple of different categories of applications. Multimedia applications generally use GStreamer as the audio library. Then there are desktop applications which either use libcanberra for notifications, like a ding when you get a new message (a tiny example follows below), or, if they want to do direct playback, typically use libpulse. And in the last category there are games, which typically use the ALSA library directly. You sometimes see very convoluted graphs of what the audio stack on Linux looks like, and of course those graphs are correct, but in reality it looks more like this; everything else is an exception. There is a small niche for professional and prosumer audio, which is mostly FireWire based, but everything else uses either HDA or USB. All the PulseAudio issues have basically been solved today, and PulseAudio is also aware of HDA and USB: it has special code to take care of those two kinds of drivers and devices and knows what to do when you have an HDA card or a USB card plugged in. If you want to go further into the professional audio world there is the JACK audio server, and PulseAudio and JACK also know about each other; they don't fight over the sound card, there is a protocol where they can negotiate its usage, so PulseAudio will give up the sound card when JACK wants to use it and vice versa. So it's all very well handled.
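As a tiny illustration of the libcanberra path in the stack above, this is roughly what a desktop application does to play an event sound from the sound theme; the event id shown is one of the standard freedesktop sound names, and completion handling is skipped, so this is just a sketch.

    #include <canberra.h>

    int main(void)
    {
        ca_context *ctx;

        /* Create a context; libcanberra decides how to reach the sound
         * server (usually PulseAudio) behind the application's back */
        ca_context_create(&ctx);

        /* Play a named event sound from the installed sound theme */
        ca_context_play(ctx, 0,
                        CA_PROP_EVENT_ID, "message-new-instant",
                        CA_PROP_EVENT_DESCRIPTION, "New message received",
                        NULL);

        /* A real application would keep the context around and wait for
         * completion instead of tearing it down immediately */
        ca_context_destroy(ctx);
        return 0;
    }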
But at the same time, embedded has changed a lot and has become a lot more convoluted. You typically have DSPs on the SoC now which do a lot of signal processing; it's no longer just one codec, it's multiple codecs, auxiliary codecs; you have amplifiers, lots of inputs and outputs; there is Bluetooth, which wants audio support so you can talk on your Bluetooth headset; and there's the modem. You want to have voice calls while the SoC is shut down, so the modem is typically connected directly to the codec, which lets you shut down the SoC to reduce power while having a call.

This is also reflected in where the development happens. If we look at the last five years, audio in total is around 14,700 commits. About 500 of those were for the ALSA core, so there's not that much going on there. About 2,100 were for HDA, while all the other PCI devices we support combined only got about 500, roughly 20 to 100 each; and you have to consider that a lot of those patches are actually auto-generated clean-up patches, so we can pretty much say there is no development going on for the other PCI devices. Then we have USB with about 600 patches, which is also pretty good; it's the second most developed area after HDA. Then the FireWire drivers, 300 in total, 10 to 80 each. And then there is of course ASoC, which has about two thirds of the total commit count, so embedded takes up two thirds of the audio commits in the upstream kernel. What is not reflected here is that there are many, many more drivers involved: the split within ASoC is roughly 50-50 between codecs and host drivers, and of course there are some platforms which haven't seen many updates in the last five years because they have become outdated. But there is not, like on the desktop side, one system that takes it all; there are roughly five, six, seven platforms which each have a similar commit count and are under active development, so it's much more spread out.

So now let's look at the future, what's lying ahead of us, and the next transition, which has really already started one or two years ago. What is happening is that the concepts that were pioneered in the mobile world are now being applied to all battery-powered devices, and such devices are already shipping right now. The reason is that power has become such a differentiating feature: people are no longer happy with 3 or 4 hours of battery runtime, they want 10 or 12 hours, and at the same time they want to watch movies and listen to audio while getting this runtime. Another thing that happened over the last couple of years, the last 10 years maybe, is that silicon has become a lot cheaper to produce. So features that we at some point moved from hardware to software to save cost are now being moved back to hardware to save power. In a sense we have come full circle, and the hardware is now again implementing synthesizers, mixing, DSPs for audio quality improvement, and so on.

So how are we keeping up with this? I just want to introduce a few technologies that have been developed over the last couple of years to handle this, and some ideas that are floating around which might be used to improve things. One is the use case manager (UCM). As I said, those codecs have become a lot more complex; they implement a lot of controls, and the controls no longer follow the standard naming scheme, so a sound server doesn't really know what to do with them. The use case manager allows you to write configuration files which group the control settings by function, for example a phone call or hi-fi music playback, and all the sound server has to do is select a specific profile; in the background, UCM will then configure things the way they should be for this specific hardware, though of course this requires writing a configuration file for each device.
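A sketch of what "selecting a profile" looks like from a sound server's point of view, using alsa-lib's UCM API; the card name, the verb and the device name are placeholders that would have to exist in that card's UCM configuration files.

    #include <alsa/asoundlib.h>
    #include <alsa/use-case.h>

    int main(void)
    {
        snd_use_case_mgr_t *mgr;

        /* Open the UCM configuration for a card; "MyCard" is a placeholder
         * for whatever name the card's UCM profile was shipped under. */
        if (snd_use_case_mgr_open(&mgr, "MyCard") < 0)
            return 1;

        /* Select a use case ("verb") and enable a device within it; the
         * configuration file decides which mixer controls this touches. */
        snd_use_case_set(mgr, "_verb", "HiFi");
        snd_use_case_set(mgr, "_enadev", "Speaker");

        snd_use_case_mgr_close(mgr);
        return 0;
    }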
And there are similar concepts elsewhere: the use case manager is part of ALSA, but for example Android, with its own sound server, as well as PulseAudio, have their own kind of use case manager. What we should probably try to do is unify this at some point, so you only have to write those configuration files once, and users can choose which distribution they want to install and which sound server they want to run.

The other thing that has been under development is ALSA topology, which is basically a firmware-like description provided from user space that describes how the different things are interconnected; the stuff that, for example, HDA gets out of the BIOS comes, with topology, out of a topology file. It was initially intended to just describe the flow graph of a DSP firmware, because there's a lot of processing going on in DSPs these days, and we want to know: if we insert audio into one side of the DSP, where is it coming out on the other side? But it was slowly extended to really also cover the hardware description. One of the issues with topology is that it has become very dependent on the internals of ASoC; it basically directly exports the internals of ASoC, and we already said that this model of platform, CPU DAI and codec no longer really applies to current hardware configurations. That makes things a bit more complicated, because it would be really great to clean this up, but topology has now basically made these internals an ABI, which means they need to stay around for a long time.

So what does this mean? Is it time for a major overhaul? Should we just throw it all away, maybe go back to OSS, because there's OSS4 now, which is open source again, or maybe write a different solution from scratch? Sometimes you hear demands from people who say this ALSA stuff is all crap and we need to replace it, but I disagree with this assessment. I think ALSA and ASoC are a really good base, but ALSA has been around for 18 years, and there are simply things that have happened that nobody could anticipate 18 years ago. So it's maybe time for an upgrade to make sure that we can handle what has happened in the hardware landscape, because the core concepts of ALSA are really good: the idea of exposing the raw hardware capabilities allows us to extract the maximum, and then on top of this we build layers that simplify things.

One thing I think we should probably do is extend the component model. Currently ASoC flattens the component tree; everything is exported to user space as just one sound card, and user space is no longer aware of the different components that are present in the system. This makes it really difficult to identify which component takes care of which function, because if you don't know which components are present, you don't know which functions those components can take care of. So I personally think we should make components a top-level concept in ALSA and then allow applications to discover the topology. Because the reality is that most future sound cards will use ASoC; as I said, the way the hardware is built has changed, it's really made up of lots of different components, and we don't want to create many monolithic drivers which each duplicate the same code. So we should probably also make ASoC a first-class citizen inside ALSA.
The other thing is that we really need to get rid of this platform / CPU DAI / codec concept and replace it with something else. I think a good idea is to introduce something like a domain and bridge model, where each domain has a certain set of parameters. The way it currently is, everything runs with the same set of parameters: if you configure your playback stream for 48 kHz playback, every other component in the system will think that it's running at 48 kHz, even though there might be sample rate converters in between. Something where you split things up into multiple domains handles this a lot better. And we should probably completely remove the distinction between the different component types, because lots of functions that were typically implemented on the codec side in the past are now also implemented on the host side. Some systems, like the example over here, don't even have a codec; everything is done inside the SoC, and there's just a simple audio controller with a DAC and an ADC directly connected to a DMA.

The last thing people always talk about is exporting the audio flow graph. There are many controls, lots of mixing going on, many volume controls, and at the moment audio servers and applications don't really know which control controls what. In this example there's a PCM device going to a DAC and then to a speaker, with a digital attenuator on this part of the path and a gain control on that part. Exporting this information allows the sound server to be aware that if it makes a change here it will affect this path, and if it makes a change there it will also affect the path. Typically, for example, you first want to minimize the digital attenuation before you start increasing the analog gain, to minimize the noise that's introduced.

So, to summarize: it's not really a happy end. We are for sure at the end of a golden era. Things are changing, the hardware has already gone through the next transition, and on the software side we haven't really kept up, so the next few years will be rough. A few people here in the audience have probably bought a laptop recently where audio is simply not working, and this will become a bit more of a pain point for people. But if we implement those ideas, I'm confident that we can solve it, and maybe in five years everything will be good again and will work out of the box; in the next few years it will be difficult. So with this, thanks for your attention, and I think we have a little room for Q&A, maybe ten minutes. Anybody, any questions?

If I understand correctly, until now everything about the sound configuration is in the kernel, isn't it? So you don't have anything like what happens, for example, in Android, where you have an XML file that describes the device connections. Is this something that you are going to do in the future?

Yes, so the use case manager is the same concept as the XML files in Android, and this is already implemented in ALSA, in user space.

Oh, it's already implemented in ALSA?

It's in user space, not in the kernel, but it's not very widespread; it hasn't really been adopted yet.

Do you think that this will be the final solution, or is it going to move inside the kernel again, so you don't have a dependency on what is described inside the root file system? Because usually it's better if the device is described in the kernel, or in the device tree, or whatever.

Yeah, that's true. The device tree and ACPI are one part of the solution, but those descriptions are not always complete,
especially with ACPI we can't really fix this, because the way it ships it is in the hardware and we can't upgrade it, so we need to supply augmenting information from user space. The bulk of the HDA driver in the kernel, by lines of code, is probably quirks for BIOSes that lie.

Regarding the sound server wars: do you know what the motivation was for developing sound servers, rather than just having an extension of the ALSA user space for mixers and synthesizers?

Yeah, I think during that time ALSA went through a period where it did not have that much development. Ideally the sound server itself would be included in the ALSA distribution, would be a component of ALSA. But to be honest, these days I think PulseAudio can be counted into the extended family of ALSA, because PulseAudio developers talk to ALSA developers and PulseAudio requirements drive changes in the ALSA framework, so it has become more or less part of ALSA.

One general question: can you comment on how much support you get from the vendors when they introduce a new device? Do they push patches by themselves, or do they give access to the documentation? What's the state currently?

I don't want to say the wrong thing. There are lots of vendors; some vendors are actively participating upstream, sending patches. These days it's really hard to actually get documentation because it's all proprietary, so we are a bit reliant on vendors actually sending the patches. I'd say that in general silicon vendors are doing a moderately good job, though it varies with the silicon vendor; system integrators less so. If you make a device, you would expect the device maker to supply a patch for their device, but the only ones supplying patches at the moment are really the chip makers.

Hello, I have a question. I would like to know if the use case manager would be something similar to, for instance, the GENIVI audio manager, where you can set rules for high-level application behavior in a more or less easy way with config files, like setting the behavior of a source, for example.

The GENIVI audio manager? Sorry, I don't know it, but I can describe a bit how the use case manager works. The use case manager in general is just a library, so it's not an application, and any sound server that wants to implement this kind of function can use the use case manager library. You have configuration files in which you describe typical use cases for your audio hardware, which contain control settings, mixer settings, routing settings specific to this configuration, and the idea is to provide a common base platform.

So it's more oriented to the hardware, not to the high-level application behavior?

Yes, definitely on the hardware level.

I have a question more for the embedded use case. You said that basically hardware vendors or system integrators don't provide patches, but I'm one of those who actually try to provide some patches, and I then got confronted with this simple-card stuff, and it's kind of limited. What is the status there?

Mark, do you want to take this? Morimoto-san is actually at this conference, I don't know if... I don't see him in this room. But yes, simple-card is intentionally simple, so if it's too limited for you, there's a reasonable chance that it might actually be a good idea to write your own machine driver. You're likely to get the pushback "do you really need to do that, can't you use simple-card?", but it's perfectly possible that the answer is "no, I can't use this, I need to do something more complicated". So yeah, it's there, and
a lot of people should use it, but that doesn't mean everybody has to use it.

One over there. Maybe this is a bit of a noob question, but can you say some words on ALSA on Android and how that compares to ALSA upstream? Do they use the newest version? Is there something like the use case manager on Android, does that go back to upstream, or is there some kind of integration going on?

When it comes to the ALSA core, the ALSA core is usually unchanged in those distributions, but there are a lot of hardware vendors which are still out of mainline, so there are lots of additional drivers. Some of these vendors have also introduced changes to ASoC because they require them, but the thing is that upstream ASoC is of course also going through evolution and changes, so there's a bit of divergence in terms of what the vendors are shipping and what's upstream, and the longer it takes for them to get the code upstream, the larger the divergence will be. And regarding the question about the use case manager: yes, the Android audio server implements something very similar. The reason Android doesn't use the upstream ALSA user space is that Android has a strong desire to ship only Apache/BSD-style licensed code, and all the ALSA user space code is GPL or LGPL, so they went and reimplemented it for that reason.

Okay, I think that's it. Thank you very much.