[unintelligible] ... product development and embedded Linux distributions. I am a huge fan of projects like Yocto, and I also do device drivers. Over time, in my experience with device drivers, I saw some patterns which keep repeating in device driver development, and this is what this talk is all about.

[unintelligible] This is not a silver bullet. [unintelligible] If you use this mechanism, you can easily upstream your driver.
And of course, there are pros and cons to this solution. For example, it's a Linux-only solution, and maybe you want your drivers to also run on other systems and don't want to tie them to Linux. The main idea with this is that it can [unintelligible]
[unintelligible] But usually you only have incremental breakages: each new version might shuffle some registers, or move or add features which require new register fields, and typically this is compensated for in the driver software. So this is a problem which driver developers encounter quite a lot. [unintelligible]

So here we have regmaps. Regmap is a subsystem in the Linux kernel. [unintelligible]
Initially it was for non-memory-mapped buses like SPI and I2C, and later it also gained MMIO support. [unintelligible]

Here you can see the registers; each register is 32 bits wide. [unintelligible] ... that you can define with the regmap field API, which is what we are interested in here. As you can see, the same field sits at a different offset in version 1 and in version 2 of the hardware. The advantage of defining regmap fields like this is that the driver can program the hardware by targeting the regmap field API. The driver doesn't need to know at which offset a register or field has moved, or how big it is. This is very important because fields can also change size. While you call the API, the regmap subsystem does all the bit manipulation you need in the background, which is pretty cool, because the regmap subsystem is very well tested.
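
As a rough illustration of what "the subsystem does the bit manipulation for you" means, here is a minimal userspace sketch in C. It is not the kernel API: field_desc, fake_regs, field_write and field_read are hypothetical names made up for this example, and the register array merely stands in for a device. In the kernel, the corresponding pieces are struct reg_field and regmap_field_write().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a register field: register index plus the
 * least and most significant bit of the field (both inclusive). */
struct field_desc {
	unsigned int reg;
	unsigned int lsb;
	unsigned int msb;
};

/* Stand-in for the device: a small array of 32-bit registers. */
static uint32_t fake_regs[128];

/* Build the mask covering bits lsb..msb of a 32-bit register. */
static uint32_t field_mask(const struct field_desc *f)
{
	assert(f->msb >= f->lsb && f->msb < 32);

	uint32_t width = f->msb - f->lsb + 1;
	uint32_t ones = (width == 32) ? UINT32_MAX
				      : ((UINT32_C(1) << width) - 1);

	return ones << f->lsb;
}

/* Read-modify-write: only the bits belonging to the field change. */
static void field_write(const struct field_desc *f, uint32_t val)
{
	uint32_t mask = field_mask(f);
	uint32_t old = fake_regs[f->reg];

	fake_regs[f->reg] = (old & ~mask) | ((val << f->lsb) & mask);
}

/* Extract the field value, shifted down to bit 0. */
static uint32_t field_read(const struct field_desc *f)
{
	return (fake_regs[f->reg] & field_mask(f)) >> f->lsb;
}
```

A caller only passes the descriptor and the value; moving the field to another register or widening it means changing the descriptor, not the calling code.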
When converting certain drivers to regmap I encountered quite a lot of bugs, simply because each driver had elected to implement its own bit manipulation logic. They have a header in which they define all the registers, the fields within the registers and all the shifts, and that's a really common source of bugs which regmap can help you avoid.

OK, so this was a bottom-up view. Now we switch to a top-down view, starting from user space. User space can of course be different, but in this example I used a multimedia user space built on the Video4Linux subsystem. User space calls into Video4Linux, which has bindings for the driver, and then there is the core driver logic, which implements whatever the driver supports. In our case it's a video decoder, so it has H.264 and H.265 codec functionality: it decodes a bitstream of compressed frames in those formats. The idea is that the driver core programs the hardware using the register fields we looked at before. This is very good for the driver because it can reuse all the core logic, and developers can focus on writing the core logic instead of worrying about different hardware revisions. Having this abstraction layer between the core driver logic and the hardware lets you add new hardware revisions very easily, because most of the driver code doesn't change; only the register layout, or very little of the actual interface, changes.

OK, let's see how we define some register fields. You can see that we have a configuration for the regmap itself. It says that the registers are 32 bits wide and that the stride is 4. We can set a maximum register offset, which can be used for bounds checking. We also disable locking, because in this case we don't need it. But if you have concurrent accesses, or accesses from atomic contexts like IRQ context, then you can enable locking.
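
The configuration just described (32-bit registers, stride of 4, a maximum offset for bounds checking, locking disabled) can be sketched in userspace C as below. This is an illustration under assumptions, not driver code: map_config, example_cfg and offset_is_valid are hypothetical names, and the maximum offset is a placeholder value. The kernel's struct regmap_config has similarly named members (val_bits, reg_stride, max_register, disable_locking).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of a few regmap configuration knobs. */
struct map_config {
	unsigned int val_bits;     /* width of each register in bits */
	unsigned int reg_stride;   /* valid offsets are multiples of this */
	unsigned int max_register; /* highest valid register offset */
	bool disable_locking;      /* true when the caller serializes access */
};

/* Example values: 32-bit registers with a 4-byte stride. The maximum
 * offset is a placeholder, not taken from any datasheet. */
static const struct map_config example_cfg = {
	.val_bits = 32,
	.reg_stride = 4,
	.max_register = 0x1fc,
	.disable_locking = true,
};

/* Bounds check: reject unaligned offsets and offsets past the end. */
static bool offset_is_valid(const struct map_config *cfg, unsigned int offset)
{
	assert(cfg->reg_stride > 0);

	if (offset % cfg->reg_stride != 0)
		return false;

	return offset <= cfg->max_register;
}
```

With these example values, offsets 0x0 and 0x1fc pass the check, while 0x2 (unaligned) and 0x200 (past the maximum) are rejected.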
OK, so we also define a field configuration and populate the fields. The naming is a bit unfortunate: the field configuration uses a structure named reg_field, while the API exposed by regmap is called regmap_field. So you have reg_field versus regmap_field, which can be a source of confusion. The shorter one, reg_field, is for configuring the register layout, and regmap_field is for the API, which we will see.

If you look at these two fields, the AXI bus read and write IDs, we configure them at different offsets in the register map for different versions. The first version is named G1 and the second is named VC8000D. You can see that the fields have different sizes and live at different offsets within the device register space. In the first version they live in registers 16 and 30; in the second version they both live in register 77: they were put together and increased in size. You can define any number of register fields, and this is how it works in general.

It's quite verbose, and I think this is a good thing, as we will see in the pros and cons discussion. When you define the register fields very explicitly like this (this is the address, and the field runs from bit 0 to bit 15), the code stays very close to the datasheet. These values were taken from a publicly available datasheet, in this case the NXP i.MX8 datasheet, and they are very easy to compare against it because they are exactly the same numbers: software register swreg77, bits 0 to 15. You can do a one-to-one comparison. This is very hard to do if you have bit shifts and all the bit magic, because you have to decode that code mentally. So this is one of the biggest advantages.

The second advantage is that you can define an abstract API on top of the register fields. We define the regmap_field.
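
To make the idea of per-version field layouts behind one abstract API concrete, here is a small userspace sketch in C. All names (field_desc, layouts, bind_layout, field_write, core_set_axi_ids) are hypothetical. The register numbers follow the talk (16 and 30 on G1, 77 on VC8000D, with one field at bits 0 to 15); every other bit position, and which ID sits where, is a placeholder and not taken from the datasheet. In the kernel the equivalent building blocks are struct reg_field with the REG_FIELD() macro, devm_regmap_field_alloc() and regmap_field_write().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: register index, lsb and msb (inclusive). */
struct field_desc {
	unsigned int reg;
	unsigned int lsb;
	unsigned int msb;
};

/* Abstract field names used by the core driver logic. */
enum field_id { AXI_RD_ID, AXI_WR_ID, FIELD_COUNT };

enum hw_version { HW_G1, HW_VC8000D, HW_COUNT };

/* One layout per hardware version. Bit ranges are placeholders. */
static const struct field_desc layouts[HW_COUNT][FIELD_COUNT] = {
	[HW_G1] = {
		[AXI_RD_ID] = { 16, 0, 7 },
		[AXI_WR_ID] = { 30, 0, 7 },
	},
	[HW_VC8000D] = {
		[AXI_RD_ID] = { 77, 16, 31 },
		[AXI_WR_ID] = { 77, 0, 15 },
	},
};

static uint32_t fake_regs[128];        /* stand-in for the device */
static const struct field_desc *bound; /* layout chosen at probe time */

/* Associate the abstract fields with one concrete layout. */
static void bind_layout(enum hw_version v)
{
	assert(v < HW_COUNT);
	bound = layouts[v];
}

/* Write a value into an abstract field using the bound layout. */
static void field_write(enum field_id id, uint32_t val)
{
	assert(bound != NULL && id < FIELD_COUNT);

	const struct field_desc *f = &bound[id];
	uint32_t width = f->msb - f->lsb + 1;
	uint32_t ones = (width == 32) ? UINT32_MAX
				      : ((UINT32_C(1) << width) - 1);
	uint32_t mask = ones << f->lsb;

	fake_regs[f->reg] = (fake_regs[f->reg] & ~mask) |
			    ((val << f->lsb) & mask);
}

/* Core logic: identical for every hardware revision. */
static void core_set_axi_ids(uint32_t rd, uint32_t wr)
{
	field_write(AXI_RD_ID, rd);
	field_write(AXI_WR_ID, wr);
}
```

After bind_layout(HW_G1) the core function touches registers 16 and 30; after bind_layout(HW_VC8000D) the same call packs both values into register 77. The core logic itself never changes.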
You can see that we use the full name regmap_field, and these fields can be named differently from the configuration. In the configuration you can use names which resemble the hardware, but it is well known that different hardware revisions use different names for the same thing, or almost the same thing. You can use this API to virtualize a lot of that and create common terminology for registers.

What you need to do is associate, at runtime, each field in the API with the configuration for that specific field. You can see that we associate the AXI write ID here with the configuration for the AXI write ID. We use v1 here, but in the previous slide we used g1, so sorry about that: v1 is g1. To do this, you need to determine at runtime what hardware you are running on. This can be quite hard sometimes because, for example, some hardware versions totally shuffle the registers, and if all the registers are shuffled it might not be obvious that the hardware you're using is very similar or perhaps mostly the same.

Then we have a usage of the API, where we simply write some values via the regmap field API. This can be used by the core driver logic, which no longer cares about the specific layout of the hardware. I have written a blog post about this which goes a bit more in depth and links some more resources. So if you want to learn more about how to define these regmaps, how they work and how the association works, that blog post is good further reading.

So we have pros and cons. As I've said, this is not a silver bullet. You decide, because I think each of these points can be interpreted as a positive or a negative thing depending on the use case. For example, regmap is a Linux-only mechanism; it's a Linux-only subsystem. This can be a pro or a con for you.
It can be a pro because it's very easy to upstream code that uses regmap, so if you want to upstream your drivers, it's easy. But what I typically saw in vendor drivers is a unified driver with a layer for each operating system: one layer for Linux, one for Windows, one for whatever. Using a Linux-only mechanism might be a disadvantage in that case.

There are quite a lot of optional features which are very nice. You can have caching on the CPU side to speed up your register writes and avoid going to the hardware to get each value. You can have callbacks for a lot of things, so you can very much control what happens when you read or write registers. You can do a lot of checking. You can have out-of-bounds checks. You can explicitly define which registers are read-write, read-only or write-only. A nasty bug that usually happens in drivers is writing read-only registers or trying to read write-only registers. With regmap you can get an error or a warning, depending on how you configure it, telling you: look, you're trying to access a register in a way in which it is not supposed to be accessed, at least according to the documentation, assuming the documentation is correct.

So a good advantage is that regmap can remove boilerplate from driver code. And as we saw, the layout closely resembles the datasheet, which is quite a nice feature to have. You avoid the bit manipulation. Of course, defining all the registers in this way and associating them at runtime is a bit more verbose. What you usually do is split the regmap code into a regmap.c file, with a regmap.h file for the interface, and keep them next to the core driver code. It's not that big a deal: you put all the regmap logic into that file.
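
The access checks mentioned above (read-only and write-only registers, plus bounds checking) can be sketched in userspace C as follows. This is only an illustration: perms, checked_read and checked_write are hypothetical names, the permission table is invented, and the error value -1 is chosen for the example. In the kernel, struct regmap_config offers readable_reg and writeable_reg callbacks for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 8

enum access { ACC_NONE = 0, ACC_RD = 1, ACC_WR = 2, ACC_RW = 3 };

/* Hypothetical permission table, one entry per register.
 * Registers not listed default to ACC_NONE. */
static const enum access perms[NUM_REGS] = {
	[0] = ACC_RD, /* e.g. a read-only version register */
	[1] = ACC_RW,
	[2] = ACC_WR, /* e.g. a write-only command register */
};

static uint32_t fake_regs[NUM_REGS]; /* stand-in for the device */

/* Return 0 on success, -1 if out of bounds or not writable. */
static int checked_write(unsigned int reg, uint32_t val)
{
	if (reg >= NUM_REGS || !(perms[reg] & ACC_WR))
		return -1;

	fake_regs[reg] = val;
	return 0;
}

/* Return 0 on success, -1 if out of bounds or not readable. */
static int checked_read(unsigned int reg, uint32_t *val)
{
	assert(val != NULL);

	if (reg >= NUM_REGS || !(perms[reg] & ACC_RD))
		return -1;

	*val = fake_regs[reg];
	return 0;
}
```

Writing register 0 or reading register 2 fails immediately in this sketch, instead of silently doing something the documentation says the hardware does not support.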
The performance impact also depends on the use case: typically it depends on the speed of the registers you're trying to write and on how often you access them. But speed is not such a huge problem. In one of the use cases we'll see, performance was a concern because it's about video decoding and you want to maximize the number of frames you can decode in parallel. The performance impact of regmap is in general in the low microsecond range, and that's mostly due to the regmap field API. It increases the number of register accesses, because when you write a sub-register field you don't want to lose the values already present in the register: you need to read the register back, update it and then write it. In general this leads to more register accesses. But again, if register accesses are very expensive, you can use a caching layer so you don't have to go to the hardware as often.

OK, so let's study some use cases. The first case study is the Synopsys MIPI DSI host controller. A quick introduction: MIPI DSI is a hardware technology for connecting displays over a very simple serial interface. It's very popular in mobile, mobile gaming, automotive, IoT and the like because it's small, cheap and simple. You have a high-speed clock, typically in the megahertz range (I happen to remember that on i.MX6 it's 27 MHz), but it can also reach into the gigahertz range. That clock drives the display data on multiple serial lanes. You need at least one data lane. That data lane can be bidirectional, so the panel you connect to the host controller can also send data back. And you can have multiple data lanes: typically two or four.
Access over these data lanes is also serialized: you send multiple bits in parallel across the lanes, but it's still a serial transfer. You send bits 0, 1, 2, 3, then 4, 5 and so on.

OK. There are silicon IP vendors which implement host controllers, like Synopsys, whose host controller I worked on to make the driver more generic, and there are SoC vendors like NXP, STMicroelectronics or Rockchip which integrate the IP into their SoCs. What typically happens is that you end up with a lot of drivers: a reference driver from Synopsys and a lot of vendor drivers, one for each SoC, and of course each driver is a bit different because each has its own special magic.

MIPI DSI is standardized and there are four major versions. What we're interested in is the difference between 1.3 and 1.01. There was an upstream kernel driver which implemented version 1.3 of the MIPI DSI spec, but we wanted to add support for the host controller present on i.MX6, which is older than the ones on Rockchip and STM. The register interface breakages are mostly driven by the spec version. The registers for versions 1.30 and 1.31, which were supported by the upstream kernel driver, are very close to each other, so it was very easy to do bit manipulation and some quick checks to support those two hardware versions in the driver. But once we wanted to add support for the host controller on i.MX6, the register differences became quite big and it was hard to handle all of that with bit manipulation. So we turned to regmap and added regmap fields, and it worked quite well. The patch series has reached version 9 now and hopefully it will get merged soon. I also have a blog post on the subject, the same one linked earlier; I used this as a case study there.

Looking at the challenges and results: implementing the regmap was very simple in this case,
but testing on multiple systems-on-chip was harder, because when you do the initial regmap implementation you risk breaking the other SoCs whose register accesses you also abstract. In my case I only had i.MX devices, so I also needed testing on STM and Rockchip, because I had modified the register accesses for them, and I did introduce some regressions. They were fixed with the help of the community, so thank you very much to everyone involved, and especially a big shout-out to the STMicroelectronics developers who tested the patch series and provided very valuable feedback and even patches with fixes. That was the hardest part: you need good testing for all the SoCs which you support and for which you introduce the abstraction.

There was also the problem that this driver needs some more attention. Certain parts of it, like the DRM bridge layers, need more work, and it's quite unfortunate when you send a patch adding a very specific new feature which is quite clear in scope and maintainers ask: hey, can you also fix that? So the patch series saw a bit of feature creep, which is partly responsible for it reaching version 9, but hopefully it can be picked up again and driven to inclusion. I think regmap was definitely a success story for this driver, and you can look at the code in the patch series and decide whether this pattern makes sense for you or not. The main idea in this example is that a standard, MIPI DSI, drove the hardware IP, so you could test for the version of that standard to detect any breakages it caused in the hardware interface layout.

Moving forward, we're going to take a look at the VeriSilicon video decoders. These are driven by the Hantro driver, which is currently in staging, but only because the user space decoder API used for Video4Linux was not stable. That API has been stabilized for kernel 5.10, so the driver is being moved out of staging.
The problem with video decoders is that they're very, very hard to implement driver-wise; at least for me it was hard. This hardware is quite complex, especially compared to something like MIPI DSI. Video decoders typically have hundreds of registers, and these registers are used to program the hardware to decode a specific bitstream, so the registers are very specific to an H.264 or H.265 bitstream. Only the bitstream is standardized: you have the H.264 spec and so on, and accordingly the hardware also tends to be relatively stable, but of course there's no standard for how to program this hardware.

This is the same slide we used before, but I populated the lower part with what we actually used for regmap fields. The Hantro video decoder had multiple versions. There were two different hardware cores, used to decode different bitstream formats, but starting with the new version, which has totally different naming (VC8000), the two hardware cores were merged into one. This heavily affected the register layout, because all the registers were squashed together, and there were some quite non-trivial changes: for example, to decode H.264 high profile you needed to program the HEVC, that is the H.265, registers. You have breakages like this, so we introduced the regmap layer to try to compensate for all of it.

The upstream driver only supported parts of the G1 and G2 decoders, and we wanted to add support for the newer unified chips. Performance was critical for this use case: we needed to decode as many frames as possible at a high resolution, so naturally we were concerned about the impact of regmap, and we measured it. With the hundreds of register writes done when setting up decoding for each frame, we had a performance hit of about 20 microseconds or a bit less, and this was
relatively constant because the register access patterns are mostly the same when decoding frames: you program the registers for each frame and then set a bit which tells the hardware it can start decoding, because everything has been set up to properly decode the frame. Now consider the time to decode each frame, from the moment Video4Linux user space hands over the frame information to the moment the hardware sends an interrupt to the processor, so the entire time it takes to decode the frame including the regmap programming of the registers: that was in milliseconds. So it was about three orders of magnitude, microseconds versus milliseconds, and the performance impact of regmap in this case was very much acceptable.

What we did have a problem with was that the driver was using the relaxed versions of the MMIO API, like writel_relaxed, to avoid memory barriers. When you do normal reads and writes to memory-mapped regions, memory barrier instructions are inserted to guarantee ordering. For these video decoders it doesn't matter in which order you write the registers; the only thing which matters is that all the registers have been written when you set the start-decoding bit. So at least in theory you could achieve some performance improvement or battery saving by avoiding useless memory barriers. One of the challenges for this project was that we couldn't measure the impact of those memory barriers. We tried on both loaded and unloaded systems, and the difference between relaxed and non-relaxed MMIO writes was in the noise range, at least in this use case.

The problem is that upstream regmap's MMIO back end only has normal writes; it doesn't have a relaxed API. It's very simple to add one: you set a config option and do relaxed writes. But the problem is you need a good reason to
justify that API addition, because it's not in the upstream kernel for a reason: nobody uses it, and we would be introducing the first use case. You need a very good reason, and at the least you need some measurements. In theory it's a good reason (we want to avoid those useless memory barriers), but that might not be enough. I haven't sent the patch series at the time I'm recording this video, but by the time this conference happens it will be public, and its first patch will be an RFC asking whether it makes sense to add this relaxed MMIO interface to regmap.

Another problem, and the biggest one, was figuring out the hardware differences between the older and the newer revisions of the core, but that's separate from regmap. The problem there was that the new core is programmed differently from the old one and we had to guess; for example, for some H.264 high profiles you needed to use the H.265 registers, which was very non-trivial.

OK, so that's about it. The regmap part of this patch series was also simple and very similar to the MIPI DSI series, so between these two series you can see a good example of how this API can be used. One problem Hantro has is that it defines its own abstraction in parallel with regmap: the driver has hantro_reg, and we haven't yet removed all the hantro_reg references so that the structure can be deleted, but that's a work in progress. Ideally drivers should not try to implement their own regmap, because you already have regmap in the kernel.

OK, so the way forward from here. As I said, regmap is quite widely used in the kernel, but not necessarily to abstract register layouts or hardware details like this. It can be used like this, and a lot of boilerplate can be removed from a lot of drivers, but of course this needs driver developers to actually convert the drivers to use regmap and define the APIs. And you can
have init helpers. If you look at the patch series, especially when you have hundreds of regmap fields, you have to initialize each one and define a configuration for it, which is quite a lot of boilerplate code. Helpers can be added for that, especially if multiple drivers use them, and you can remove a lot of code that way. We have an upstream maintainer [unintelligible], so hopefully these will hit regmap soon.

Maybe in the future, and this is just an idea (I am not proposing hardware abstraction layers in the kernel), we could have some standardized regmap field APIs for various types of hardware, for example video decoders. Video decoders all have a lot of common logic and a lot of common registers they have to set, for example the bit depth of the bitstream: is it an 8-bit or a 10-bit stream? You could have a unified API to do that via the regmap field API.

So basically this is it. Thank you very much for attending. I hope this regmap mechanism for selecting registers is useful to you, or at least an interesting use case: hey, look how we can implement that, or how we can avoid re-implementing that. If you have any questions, please feel free to ask or send an email. Thank you very much for your time. Bye bye.