Hello and welcome to my talk. I'm Alexandre Belloni and today I'll be speaking about supporting audio on an embedded system. This is actually an update of my 2016 Embedded Linux Conference Europe talk, and since then quite a lot has changed. I'm an embedded Linux engineer at Bootlin; Bootlin is a consulting company providing development services, consulting and training to our customers. We are heavily focused on the Linux kernel itself, but also on bootloaders and build systems. I'm also an open source contributor: I'm the maintainer of the Linux kernel RTC subsystem and I'm also the co-maintainer for the Microchip ARM and MIPS SoCs. I've been doing quite a lot of audio work for customers, and that's why I'm speaking about that now. So the anatomy of an embedded audio system would be: on one side you have the SoC. The SoC has two different kinds of connections to a codec, and that codec will either output analog audio directly to a connector, or maybe through an amplifier. The first kind of connection is the configuration connection: that would typically be I2C or SPI, but it can also be, for example, memory-mapped I/O directly on the SoC. The second kind of connection is the digital audio connection, and we'll see a bit more in depth what this is about. So like I said, the codec configuration happens on a simple serial bus, usually I2C or SPI. And the SoC digital audio interface, also sometimes called a synchronous serial interface, is the one that provides the audio data to the codec, and it has multiple formats; I'll go through those formats a bit later. Examples of those SoC digital audio interfaces would be the Atmel SSC, the synchronous serial controller, the NXP SSI or SAI, and you also have the TI McASP; some SoCs also have a separate S/PDIF controller. The amplifier itself is totally optional; you can have some line-in or line-out connections.
Some SoCs, like the Atmel SAMA5D2, have the codec and the amplifier directly on the SoC; that will typically be a class-D amplifier. So the signals for the DAI are two different clocks, BCLK, the bit clock, and WCLK, the frame or sync clock, and then one or multiple data connections: that will be TX and RX, or, on the codec side, data in and data out. Like I was saying, the digital audio interface is a serial interface. It uses two clocks: the bit clock is usually called BCK or BCLK in the datasheets, and the frame clock is often called FCK or FCLK, or sometimes it is referred to as the left-right clock, so that will be LRCK or LRCLK, or you can even see the word clock, WCLK in that case. The rate of the frame clock is really the sample rate, and it's also called FS; that will really be the sample rate of the audio samples that you are trying to play or to record. There is a relationship between BCLK and FCLK: the bit clock is the frame clock multiplied by the number of channels and multiplied by the bit depth, the sample size. Then the DAI also has one or multiple data lines, and we'll see that you can possibly get more than two data lines. You also have a separate, third clock: MCLK, which can also be referred to as the system clock (SYSCLK), and that is a clock that goes to the codec and that the codec needs to be working. Some codecs will also require that MCLK or system clock to be able to use the control interface; just know that sometimes you really need to be able to provide that clock before even playing an audio file or a sample. That clock can be provided by the SoC when it has that kind of output, or, if you are using a system-on-module, the module may provide a pin with that clock, or you can also use a crystal.
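As a sketch of that last option (the node name and rate here are illustrative, not from the talk), a crystal used as MCLK is simply described in the device tree as a fixed-rate clock:

```dts
/* On-board crystal providing the codec system clock (MCLK).
 * 12.288 MHz is a common audio rate: 256 * 48 kHz. */
mclk: clock-mclk {
	compatible = "fixed-clock";
	#clock-cells = <0>;
	clock-frequency = <12288000>;
};
```

A codec node can then reference it with `clocks = <&mclk>;`, using whatever clock name its own binding expects.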
If you don't have any of those, some codecs are able to use BCLK or the frame clock as their system clock, and that makes MCLK optional. Usually the codecs will expect MCLK to be a multiple of BCLK, and it will usually be specified as a multiple of FS, so you may see that MCLK should be 256×FS, for example. Some other codecs have plenty of dividers or PLLs to be able to adjust that system clock relative to BCLK; it really depends on the codec. One DAI, either the SoC DAI or the codec DAI, will be responsible for generating the bit clock: it will be the bit clock master. And one DAI will be responsible for generating the frame clock: that will be the frame master. The bit clock master and the frame master don't have to be the same DAI; that's usually the case, but you can see some setups where, for example, the bit clock is provided by the codec and the frame clock by the SoC, and that should work perfectly. The codecs usually have a great set of PLLs and dividers, as I was saying just before, and that allows you to get a precise BCLK from many different MCLK rates, so it's usually more flexible to have the codec as the bit clock and frame clock master. However, some SoCs now have specialized audio PLLs; that should be the case for modern SoCs, it's the case for the i.MX6, 7 and 8 for example, you also have that on the SAMA5D2 now, and TI also has that on some of their SoCs. If you have that, then it's perfectly fine to use your SoC as the master, as long as you can generate the proper bit clock, and we will have some examples later.
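To put concrete numbers on these clock relationships (a worked example, not from the talk): take stereo audio in 32-bit slots at a 48 kHz sample rate, with a codec that wants MCLK = 256 × FS:

```latex
\mathrm{BCLK} = FS \times N_{ch} \times W = 48000 \times 2 \times 32 = 3.072\ \mathrm{MHz}
\mathrm{MCLK} = 256 \times FS = 256 \times 48000 = 12.288\ \mathrm{MHz} = 4 \times \mathrm{BCLK}
```

So whichever side is bit clock master has to derive 3.072 MHz from the clocks it has available, which is exactly why the codec PLLs, or a SoC audio PLL, matter.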
So I said that the codecs may have multiple data-in and data-out lines, and they may have up to one line per channel pair; usually that's done when you have multichannel output, let's say more than two channels. The example I have here, the Analog Devices AD1937, has 8 outputs in 4 pairs, and that means that it can actually have 4 data-in lines, which allows you to simply have 4 different left/right pairs on 4 different data-in lines. This multiplies the number of pins that you need and the routing to the SoC, but it allows you to avoid doing TDM, time-division multiplexing. Some codecs may also have multiple DAIs, meaning they have one full interface for data in and one full interface for data out, so they have two times the bit clock and frame clock. That's also the case for the AD1937: you can see that it has DBCLK and DLRCLK; the D there means digital, so that is the digital bit clock and digital frame clock, and they go with the 4 data-in lines, so you can get up to 8 analog outputs from that codec. The codec also has 4 analog inputs, and in that case you would use ABCLK and ALRCLK, the analog bit clock and analog frame clock, together with up to 2 data-out lines; so that will be analog in and data out. Now, the digital formats that go over those lines are, for example, left-justified: in that case you have the bit clock running, and then the left/right clock, here with left level high and right level low on the ALRCLK, the frame clock, and the data in or out starts as soon as there is a change of level on that clock. You can notice there, and this is usually what is done by the codecs, that you can have more bit clocks than actual data bits, meaning that usually what you can do is have 32 bit clocks and only, for example, 16 or 24 bits of data to transmit; then what the codec will do is
basically just ignore the superfluous bits that you sent. It doesn't matter, because thanks to the frame clock, which, as you remember, is also named frame sync, it will always be able to sync again and to know when the MSB of the data is starting. Then you have right-justified, which does the opposite: it will first ignore the starting bits and then take the last bits as actual data. This is more convenient for the codec, because it has more time to prepare the DAC or ADC, but it's also less convenient to configure, because then you actually have to configure the number of bits to ignore. You also have I2S, which is the usual format that you will see when you are doing only left and right; it really looks like left-justified, but everything is just shifted by one bit clock. You have to remember that: it's not that important when you are reusing existing codec drivers or SoC DAI drivers, but it may be an issue when you are writing a codec driver, so be careful with that. Also, left and right have a different polarity: in that case you have left low and right high. Then there is DSP A, which syncs for only one bit clock: you then wait one bit clock and the data starts; that data has the exact length of your sample, and the right channel just follows the left channel. Then you have DSP B, which looks a lot like DSP A, but without that one-bit-clock delay; also, for DSP B, the pulse on the frame clock may be longer than one bit clock, you just have to have one bit clock low between two samples. And then you can combine that with TDM: usually you will get TDM left-justified or TDM I2S, which are the most common configurations that you will find. What you do there is that you still use the frame sync as your main sync, but then you, let's say, cram multiple channels into the same side; in that case the left/right clock doesn't really matter much, it can also be only one
bit clock long, and then you have as many bit clocks as you need to be able to send all your channels. In that example you see exactly that: you have 32 bit clocks per channel, but you actually have only 24 bits of data, and in total, for TDM8, you have 8 times 32, that's 256 bit clocks for one frame clock. And then finally you have one last format that is completely different: in that case you have only 2 signals in total. It's not a usual DAI, but we are starting to see it more and more, especially with the kind of assistants that you get, like Alexa or the Google assistant, that use microphone arrays: PDM, which encodes all the bits differently; the pulse width actually depends on the analog level. So it's completely different, but you can encounter that kind of format, which until now was quite rare but which we are starting to see. So, to drive all of that, we have a kernel subsystem, which is part of ALSA; ALSA is the Advanced Linux Sound Architecture. And then we have ASoC, which means ALSA System on Chip; it was created a while ago now to provide better ALSA support for systems-on-chip, and to actually be able to reuse codec drivers across multiple architectures, instead of each architecture or each SoC vendor having their own driver for similar codecs. It provides two different APIs: one API to write codec drivers and one API to write SoC interface drivers, and then obviously you have to glue those two drivers together. So the components of the ASoC subsystem are, first, the codec class drivers, which define the codec capabilities: what is available as the DAI, the digital audio interface; then the analog inputs and analog outputs that you have; and finally the audio controls. Basically, the most basic controls that you will get will be muting and changing the volume of your
analog inputs and outputs, but some codecs have more advanced controls, like muxing, actually muxing channels, saying OK, I want channel 0 of my digital input to go to channel 4, for example; or maybe some codecs are able to mix their inputs to an output, so you will get controls to do that. Then you have the platform class drivers, which define the SoC audio interface, what I usually call the CPU DAI, and they define the same kind of audio interface capabilities; obviously, at some point ASoC will match the capabilities of the codec with the capabilities of the platform to understand whether it is able to play a given kind of audio sample. The platform class drivers for the SoC will also set up DMA when applicable, and usually you will be using DMA, because you don't want to copy all the audio samples by hand from memory or from a file to the DAI FIFO; so usually DMA will be involved. And finally you have the codec-to-platform integration: nowadays it's usually done through the device tree; previously it required a machine driver in C, and that machine driver would register a sound card. That was the topic of my 2016 ELC talk, so if you really need to write a machine driver, I will let you have a look at that talk; it has been recorded and it's available online. And there is a note there that the codec can be part of another IC: usually, on real embedded systems, phones, that kind of thing, you will find it as part of the PMIC, or some will have it as part of the Bluetooth or modem chip. So the codec can be a separate chip, but it can also be part of another IC. So let's have a look at how to support that using the device tree. Nowadays most sound cards can be described as a simple audio card, using the simple-card driver. The device tree bindings are documented in simple-card.yaml, it has been converted to YAML recently, and the driver is in sound/soc/generic/simple-card.c, available since 2017
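As a rough sketch (the node names and referenced DAIs are illustrative, not taken from a specific board), a minimal simple-audio-card description looks like this:

```dts
sound {
	compatible = "simple-audio-card";
	simple-audio-card,name = "my-sound-card";
	simple-audio-card,format = "i2s";

	/* The codec generates the bit and frame clocks. */
	simple-audio-card,bitclock-master = <&link0_codec>;
	simple-audio-card,frame-master = <&link0_codec>;

	simple-audio-card,cpu {
		sound-dai = <&sai2>;	/* the SoC DAI */
	};

	link0_codec: simple-audio-card,codec {
		sound-dai = <&codec>;	/* the codec DAI */
	};
};
```

The format, clock-master and routing properties all live in this node, so the description entirely replaces what a C machine driver used to do.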
You also have a graph-based binding that is available; it's documented in audio-graph-card.txt, and the driver handling it is sound/soc/generic/audio-graph-card.c, which is a bit different. It's still less common currently, but yeah, it's definitely usable. Both of those required a few changes in the SoC DAI drivers to be actually usable as-is, and the changes that have been made are, for example, selecting the audio mode on the Atmel, now Microchip, SSC: the SSC can work as a simple serial interface, but it has to be set to audio mode, and before, you always had to write your own machine driver just to be able to change that mode. And the i.MX, for example, has kind of the same issue, where you have a separate muxing interface for the audio signals, where you say OK, the audio from that controller goes to those pins; this has to be configured, and now the AUDMUX driver is able to do that when you are using a simple audio card. So let's see an example: let's say that we have an ADAU1372 connected to an i.MX6UL SAI. The first step is just to enable the SAI and the codec, and in that case it's quite simple: you just enable SAI2, you use the usual pinctrl setup for SAI2, and that's pretty much it. The codec, in that case, is connected over I2C, on the I2C1 controller, at address 0x3c, and it takes an input clock, and that input clock is MCLK, the system clock; it has to be named "mclk", so that's what we did there. In our case what we have is a crystal, so the crystal is also defined there: it is a fixed clock running at 12.288 MHz. So that's pretty much it; this defines and enables our CPU DAI, the SAI in that case, and our codec. Now let's describe the sound card: the sound card in our case will be a simple-audio-card, and you can give it a name, so that's basically what we did there, simple-audio-card,name set to "imx6ul-adau1372". And now we have to define
how the link is made between our two DAIs: that will be a simple-audio-card,dai-link subnode. In our case we only have one DAI link, and you should know that you can then avoid having that subnode, but it's better practice to actually have it. We'll be using the I2S format, and the bit clock master is the ADAU, so the codec; that's more convenient because we know that it will be able to generate both the bit clock and the frame clock from MCLK. So it's as simple as that: we have a cpu subnode in the DAI link, and that one refers to the SAI, which definitely means this is our CPU DAI; and then we have a codec subnode, and that one refers to the codec node. They have been labeled, but the labels don't really matter; what really matters is the cpu name for the CPU node and the codec name for the codec node. So that's perfectly enough to have audio playing: if you run aplay with that, audio will be perfectly working, it will be using I2S, everything will be fine. Let's go a bit further, because that codec actually has 4 input channels and 2 output channels, and if you want to be able to use those 4 input channels, then you have to do TDM. In that case it's quite simple: we just say dai-tdm-slot-num = <4>, and we also set dai-tdm-slot-width to 32, so we will always be using a 32-bit slot width, and the codec will happily ignore all the superfluous bits, whatever you send that is not part of your sample. So that's pretty much it: in that case you just have 4 channels of 32 bits, it's still using the I2S format, and that will be perfectly working, until you realize, for our third example, that the codec has a hardware issue and will not be able to generate the proper bit clock when doing TDM4 at a 32 kHz sample rate. So in that case we change everything: instead of having the codec as master, we just switch to have the
CPU as master. It should be as simple as changing the bitclock-master and frame-master properties, and that's what I did there: now they refer to the SAI2 DAI, so it should be working, right? But unfortunately the result is not really what you expect: when you try to play an audio file at 32 kHz, it's still complaining that it's unable to set the hardware parameters, and looking at the kernel log will show you what's happening: the SAI driver is complaining that it's failing to derive the required bit clock, which is 4.096 MHz. Right, so let's have a look at the clock tree: after mounting debugfs, clk_summary will show you something like that; I removed all the clocks that were not interesting, but the clock tree up to SAI2 looks like that, and basically you see that, given the clock that is feeding the SAI, there is no way for the SAI to divide it to get the proper BCLK, because the SAI only has a small number of dividers and nothing fits to divide that clock down to the 4.096 MHz clock that we want. So let's solve that; it's actually possible to solve it using the device tree, and in that case we want both to reparent the clock, using the assigned-clock-parents property, and to set the clock rate, using assigned-clock-rates. Just so that you know, the i.MX6UL has an audio PLL: PLL4 is the audio PLL, and it can generate all the rates that you would want for 32 kHz, 44.1 kHz or 48 kHz audio, for example, so that's the one you definitely want to use. So in that case, what we do is that we say: the SAI2_SEL clock, the mux selector for SAI2, gets PLL4_AUDIO_DIV as its parent, and its clock rate is set to 196.608 MHz; and then we have the SAI2 clock itself, which we want to set to 24.576 MHz. That rate is selected because the SAI divides only by ratios like 2, 4 and 6, not by 3, so it cannot divide, for example, 12.288 MHz down to the 4.096 MHz clock that we want, while 24.576 MHz can be divided by 6. And, well, after
doing that, with all of this clock assignment done in the SAI node, you will get audio working; that solves your clock issues. Now, for the fourth example, there is a possible cost reduction: the SAI is able to output its clock to feed the codec, so if you want to remove the crystal from the board, it's possible, and then you save a few cents or a few dollars because you don't have to have that crystal on the board. In that case you will be assigning the clocks in the codec node, and this is the exact same assignment that we had before. You have to do it in the codec node because you have to ensure that, when the codec is probing, it can get its MCLK rate, so the rate has to be set before the codec probes; and, because of probe ordering, you are not sure that the SAI will actually probe before the codec, so you now have to assign the clocks in the codec node. The other thing that changes is that the clocks property doesn't refer to the crystal anymore, but to the actual SAI clock, and that's pretty much it: now you are feeding the SAI clock to the codec. There is one property that is specific to Freescale, fsl,sai-mclk-direction-output, to say that that particular pin is used as an output. Now, there is something that we did there: we have replaced a 12.288 MHz crystal with a 24.576 MHz clock, and this actually works because that particular codec has a configurable divider for MCLK and can divide it by 2. So, well, we found a solution with the SAI as master and without a crystal, so that's fine. Now, there is something you can do that is not mandatory with a simple audio card, while it was mandatory when you were writing your machine driver: routing, describing the actual audio connections that are present on the board. The first step there is to define the board connectors; you use the simple-audio-card,widgets property, and then you have
some pairs of strings there: the type and the name that you want to give to each particular widget. Then you have to route your audio from the codec to your board connectors, and this is done using the simple-audio-card,routing property. This is also done with pairs of strings, and these are sink and source. So in that case you have the analog inputs of the codec, IN0 and so on, and the source for IN0 will be "Line 0"; you have it twice because Line 0 is a stereo jack, so its two channels go to two different inputs on the codec, and that's the same for "Line 1", which is also there multiple times and also goes to two different analog inputs of the codec. Then for the output you have the headphone jack, so that will be your sink, in that case my widget, and HP OutL and HP OutR come from the codec: that's basically the headphone output, left and right channels. If you want to know what the codec provides and what names you should use there, you just have to look for the SND_SOC_DAPM_OUTPUT and SND_SOC_DAPM_INPUT macros and definitions in your codec driver. So, what about the amplifier? If you have an amplifier on your board, well, it is supported using the auxiliary devices, and there is a driver for a simple amplifier driven by a single GPIO, which is the usual case for amplifiers. Let's have a look at that with an example; I took an example from the mainline kernel. In that case the simple-audio-amplifier driver is used, that's the first thing that is defined, and it has something interesting: a sound-name-prefix property, which gives the name of the widgets that you can use later when routing, and that's what you see in the sound node, in the audio-routing property there reusing the "au2" prefix. In that case you have two inputs and two outputs: "au2 INL" and "au2 INR" are the inputs, "au2 OUTL" and "au2 OUTR" are the outputs, and you can see that the inputs of the amplifier are connected to the codec outputs,
and the outputs of the amplifier are connected to the jack, or in that case the line-out widgets from the board; that's how you route audio through that amplifier. So let's talk about troubleshooting. What you can have is that audio seems to play for the correct duration: you can use time, for example, to know whether aplay ran for the correct amount of time, and yet you don't hear any sound. Usually it's an issue with the ALSA controls, that's usually what you will get: you need to unmute Master and the relevant controls, that may be Digital for example, and then you want to turn up the volume. You also want to check your codec analog mixing and muxing; maybe you are listening on one channel and that channel has actually been muxed to another output inside the codec, so that's something you have to check. You also want to check the amplifier configuration, and obviously you want to check your routing. Just know, about the controls, and that's the case for the ADAU1372 codec that I used as an example, that the default values for the controls may not do anything useful: they basically have some invalid muxing, so you definitely want to set up your own muxing when using that codec. You can also have no sound, with aplay stuck until it times out: then you definitely want to check the pin muxing, because that usually means that there is an issue in your DAI configuration. So check your pin muxing, check your configured clock directions, who is the master and who is the slave, and whether the pinctrl configuration for the SoC pins is the correct one. You definitely want to check the clocks using an oscilloscope: is your BCLK at the correct rate, is your frame clock at the correct rate, are they in sync? Then you want to check the data: is the data what you would expect with respect to the sync and the bit clock? Usually data and bit clock are pretty much OK, but what you really want to check is whether the data is in sync with the frame
clock. Then check the pin muxing again, because it's always the pin muxing, and, like I said just before, some SoCs have extra muxing, so check the AUDMUX or the McASP muxing, because you may not be sending your audio signal to the correct controller. You can also get write errors, input/output errors; that's usually caused by an issue in the routing. Something that I hit, even with an upstream driver, is a driver that was not exposing a stream named "Playback", and that's basically the stream name that ALSA looks for to be able to play audio; if it's not there, then nothing will work. You can use the DAPM debugfs interface, which provides a graph of all your widgets: you get a nice graph with everything that is enabled and disabled, and you can debug whether your routes are working or whether you have an issue in your codec driver, for example. Finally, you may want to troubleshoot overruns and underruns: they may be caused, obviously, by storage that is too slow, whether when recording or when playing audio, but they can also be caused by an imprecise BCLK. In that case you will want to try to find a better PLL and divider combination, just so that your BCLK is more precise, which is not always easy or possible, or maybe you will just have to live with your underruns. Troubleshooting a bit further, you will want to have a look at what the CPU DAI driver is doing. The callbacks of interest there are set_sysclk and set_clkdiv, because that's where the clocks and clock dividers are set up; then the main interesting one is hw_params, because that's where basically the whole setup for your stream is done, so that's what you want to look at; set_fmt will also be of interest, because that's where the format, left-justified, I2S or whatever you are using, is set, so if you have an issue with your format you want to look at that. The same thing happens in the codec driver callbacks, where you have set_sysclk, and you want to look at how the clock is set. If
you have an issue with your codec not generating the bit clock properly: like I said, some codecs are able to generate the bit clock from their system clock, or maybe from BCLK or the frame clock, and in that case that's something you will have to configure, through the clk_id parameter, which is definitely codec-specific. Remember that using the codec as slave is uncommon, and sometimes it's just untested: you may have an upstream driver that maybe nobody ever used as a slave, and so this will just not be working. When in doubt, and that's what I usually do when bringing up a platform, I use devmem to set the DAI configuration that I want, and i2cget or i2cset to set the codec DAI configuration, which allows you to do that while playing a stream. It's convenient, because once you reach the proper configuration that way, you just have to look at what's needed in the driver to set that same configuration. Going further, once your kernel and device tree are working, usually what you will want to do is split, let's say, TDM8 or TDM4: when you have 4- or 8-channel input, you will want to split that into individual channels, or maybe pairs of channels, because usually what you want in the end is stereo or mono audio. This can be done efficiently in user space using alsa-lib, and I will just send you to our blog post that describes exactly that: how to do it efficiently, without too much mixing happening. For references: the documentation in the kernel is quite OK, though it's not always obvious what to look for, but there is some information there; you have a document from Analog Devices describing the common digital audio interface formats; and you have the I2S specification, which is kind of the basis of everything. So, I'm available if you have more questions, I will be able to answer them, thank you. Yeah, so I actually got two questions. The first one was: what is the DAI? And I guess this is basically
what I answered at the beginning of the presentation: the DAI, the digital audio interface, designates the actual signals, so those two clocks plus the data, going from the SoC to the codec or from the codec to the SoC. The other question was whether you can use simple-card or audio-graph-card on NVIDIA SoCs, and I actually don't know; seeing the question, I think it would be the same case as the NXP AUDMUX or the Atmel SSC, where we now have a separate driver that does the correct thing to mux the audio from the controller to the actual DAI. But yeah, I don't know if this is possible with NVIDIA; I guess, seeing the question, that it would require some work; you will probably need to ask on Slack, I don't know. So, one question I have is: "we use the codec as slave with 2.6 on thousands of devices with no issues; should issues be expected with 3.x or 4.x?" I would say no: ALSA and ASoC have been quite stable; some changes happened, but that's mainly the internal representation, well, the internal settings of the codec and that kind of thing, nothing too impacting, I would say. The main thing that can impact you is whether you may need to change your machine driver, or whether you want to move from a machine driver to a DT-based description of your hardware. So, yeah, I don't get any other question, so thank you for your time, I hope you appreciated this presentation, and I will be available on Slack if you have more questions. Thank you.