So it's 12:20, my clock tells me. Welcome everyone to my talk. I'm happy and kind of surprised that so many people showed up here to listen to a talk about technology from the 1980s. I think this Gregorian monk music was quite fitting. It felt kind of like we're the conspiracy crowd here or something like that.

Okay, so as you can tell from my title, I2C is damn old, but it's still around and it will be for a while. I wanted to talk a little bit about what's going on in the I2C world, what's up in the 21st century. I have four parts. When I submitted the talk, I was aiming to get my regular 45-minute slot. Now I have 30. I'm not Steven Rostedt, but I'll go more in that direction. I can't go into the detail I originally planned to, but I still want to mention those four things.

One is the workflow thing. It's about the I2C subsystem: what happened in the past years, and what consequences does it have? I think it's useful for you as developers to know that. Then I want to show one complex setup which I am personally facing these days, to see what weird kinds of I2C setups you can have, what the consequences are, and what we will be dealing with in the near future. Coming from that, there are already some API changes I want to introduce to you. And to make sure this and other crazy stuff works: a few years ago, I introduced my fault injector framework for I2C, and this has some additions. I think it's good if you know about them, to have robust setups.

So, the workflow thing. It's nothing completely new. I have talked about it for a while now, but I have to repeat it somehow. With I2C, its simplicity is a problem. Take a complex example, a GPU driver.
There are huge teams working on it, and they stay with it, and they are there to keep things working. Whereas I2C is super simple: hardware makers just create just another IP core, and there's someone writing a driver. They pass it to my subsystem and move on to develop the next driver, which is not I2C, but maybe SPI or something like that. So this is what I coined the term "fly-by" or "fly-through" subsystem for. It's very hard to have people who are consistently working on I2C. Lots of people are coming by and working hard to get their driver upstream, and once it's upstream, if I'm more or less lucky, they're around to handle the bug fixes which users may find later. But sometimes people disappear completely, and then the usual idea is that the maintainer takes over the maintainership even for those drivers.

But with I2C, the subsystem itself is largely maintained in my spare time, and there are two consequences of that. Because my spare time is very limited, I can't handle all these drivers. And you really have to focus on the things you like to do. I was there, I was running in danger of a burnout. I like to help people get up to speed with kernel development and all that. But reviewing the 121st driver, which just follows a basic pattern?
This is not really fun to do. So this is where I have to do a little bit of pushback and say: yes, I'm a maintainer, I'm a gatekeeper, but reviewing is actually a community effort. This means community as individual people, but I also think companies should engage more in reviewing other people's patches. We're getting so many patches that we have a scalability problem there, and to solve that we need more people.

Just to show that: the left side of this graph is when I took over the I2C subsystem. Back then we already had 70 bus master drivers for I2C. For such a simple, ancient, not much changing bus, already 70. But over the years we grew to more than 120, and there are no signs that this will stop increasing. It's so simple that people say: oh yeah, well, we will do this custom controller for the GPU and need another driver. And I can tell you, even on the hardware side, it's always the same mistakes happening, and then we have to have another quirk structure and whatnot. I would really suggest: just focus on a few IP cores, and we make sure that this thing works. But yeah, this is my dream. This is reality. I think you can understand this is way too much for me to handle.

So the one response I often got, mainly last year, was: hey, what about group maintainership? Just let go, don't try to keep everything to yourself. But for me, it's a fly-by subsystem. There is not really a group I can share the work with. So I think it's a different setup, and it became pretty visible this year, when there was a period of three weeks when I was ill and really couldn't do I2C maintenance work. So I wrote this mail saying: I think it's annoying, no catastrophe, but it shows that I am the single point of failure, and this is of course not a good thing to be. And I explicitly asked: I'm open to group maintainership; if you think you're a reliable candidate, please get in touch with me.
I got one single response. Sebastian, thank you very much, because that made me feel like I'm not completely ignored. It was: "Get well soon." So this is kind of funny, but it's also a bit cynical, you know. So I can only repeat: I'm open to group maintainership, but which group?

So what I did a while ago, and I think this is important to know: I divided my maintainership into two parts, first the I2C core and then the I2C drivers. As you can see from this excerpt from the MAINTAINERS file, I'm explicitly listed for the I2C subsystem, for the core, and the status is "Maintained". And I'm not explicitly listed for the host drivers, and the status is set to "Odd Fixes". This is very, very intentional. I'll be there if I catch some bug or if there's something for some ancient driver. I try my very best to keep up with that. But what I really, really want is a maintainer for every driver. From my fly-by subsystem perspective, this is the only thing I can see that will work. And I tried that; it works okay-ish so far. By chasing people, 66 drivers have a dedicated maintainer now, which is 50% if you do the math. I'll keep following that road and see what happens. Some per-driver maintainers are really responsive, some disappear again; that was kind of expected. But still, for me as a subsystem maintainer, it took off some pressure, which prevented me from burning out. So I think in general it's a good move. It's not perfect, but I think it's the best we can do.

I have a few companions. I want to mention them because they really help me. I'm not very familiar with ACPI, I'm totally an embedded guy, so I thank Mika for stepping up, and also... Andy, are you here?
Okay, and there's also Andy and Jarkko from Intel who assist him, to take all this ACPI stuff away from me. Peter Rosin is very good for I2C muxing and complicated setups, for getting the locking right. And Jean Delvare, who was previously maintaining the I2C subsystem, is still around for the PC hardware, which I'm not super familiar with. So thanks to those guys I'm not completely alone, but I still think I2C is undermaintained.

And I also want to call out Renesas, the company which is largely contracting me and which is really driving I2C development. I mean, you know, these days Linux can be an I2C slave, not only a master. And the upcoming changes, which I will talk about in a bit: it's largely due to their funding that this exists today. So on the one hand, I'm very thankful to them. But on the other hand, as you can see, it's also a single point of failure if there's some management decision to cut that.

So that's where we are from a subsystem point of view: things are working.
Okay-ish, but it's not ideal with this single-point-of-failure thingy. I am the maintainer of the core, and I'm looking for driver maintainers. Also, when you send a new driver, I will likely ask you if you want to be the maintainer. Otherwise it will stay orphaned or whatever. That's the best I came up with so far. Not an entirely happy situation, but that's the way it is.

Let's get more technical. It's not always the case that you have this one master and some clients and they're talking to each other. I learned the term GMSL, and it became a bit scary to me. This is a serial link which transfers mainly graphics data, either from a camera or to a display. But in this high-bit-rate stream you can also encapsulate I2C. So here are all the other interfaces needed for that. Here's a regular I2C bus with some kind of deserializer, and it has four outputs where you can connect, in our case, cameras. This here is the high-speed link, and there's I2C encapsulated in it. Then you have the serializer here, reverting all this high-speed thingy, and after that you have plain I2C again.

So you see four channels there, which can have cameras and have I2C. The thing is now: the cameras have some I2C devices with their addresses, and they are all the same. By default they are all the same, but you can reprogram them to another address at runtime. So basically the hardware people said: we will let software sort out the problem of how to give them unique addresses. And this is something entirely new. We don't have support for that. We need that. And there are some more gory details: here on the camera side, there's also a small microcontroller which talks I2C to the other client.
So we have a multi-master setup. Now add to that that, for some reason, if you power up this device, all channels are open and not closed. So if you start the system and plug everything in, everyone is talking I2C to everyone with basically the same addresses, and it's a huge mess.

The first step we need: we need the I2C core to do something like "give me a dynamic address I can use to reprogram this client there". Which is not super easy, because it's a non-probeable bus. So the first idea was: okay, let the user specify a range which is safe to use. (This is all very high level; there are many more details to that.) But then we found out this is pretty fragile, because if the setup is slightly different, with another device using exactly the address you reserved for the pool, then it might happen that you run out of addresses in the pool and the system won't work. So the next thing was: we kind of force people to describe the buses fully and consider every unused address as available. Which is kind of a paradigm shift, because currently, especially in device trees, you just describe the devices you need and leave the others undescribed. But I'm still not super happy with it, because with device tree we can do a lot of things, but then there's also ACPI, and then you have to rely on broken firmware. Relying on it to describe the buses fully, so you can assume the other addresses are free to use, is getting dangerous again. So work needs to be done in that area. We are working on it. Luca is there.

GMSL is not the only candidate which needs that. There's also a technology called FPD-Link, which is basically the same except that it's very different. Yes, it has a high-speed link and it encapsulates I2C over that, but it's not here that the devices are reprogrammable, but here in this guy.
They have an address translation table you have to reprogram, so that technically the devices have another address on this side than they have over there. Despite all this being very confusing, it also means that the time when you need a dynamically assigned address is different. In our case, when we want to reprogram the device, we need it at device probe time. In the FPD-Link case, if this guy wants to reprogram the address table, it needs it much earlier, at device creation time. You don't need to understand that fully. It's just showing you that people are... oh, and FPD-Link is hot-pluggable. Awesome, because I2C is so designed to be hot-pluggable.

Yeah, so this is what I2C in the 21st century is kind of looking like. People are using it because they know it and there's technology available around it. But they get very creative in how they use it, and there are clearly cases Linux is not prepared for yet. We are working on that step by step. It's a bit painful, as you can probably guess.

Oh, by the way, that was only half the truth. The setup is much more complex because we have two of them. And as I told you, at power-up they have all the gates open, so two of them have all the gates open. So there's much more communication, and you have to ensure that both have their gates closed first before you can do anything one by one. So you have potential race conditions here, or you have to make sure both are handled already. So, not much fun.

So we're starting slowly with API changes.
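A side note on the address pool idea from the GMSL part, before the API details: the "describe the bus fully and treat every undescribed address as free" approach can be sketched in a few lines of plain userspace C. This is only a toy model, not kernel code. The names toy_bus_map, toy_describe_device and toy_get_dynamic_address are made up for illustration, and picking the lowest free address is an assumption, not necessarily what the I2C core does or will do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usable 7-bit I2C addresses; 0x00-0x07 and 0x78-0x7f are reserved. */
#define TOY_ADDR_FIRST 0x08
#define TOY_ADDR_LAST  0x77

/* Hypothetical model of one fully described bus: true means "in use". */
struct toy_bus_map {
	bool used[128];
};

/* Record a device that the bus description (e.g. a device tree) lists. */
void toy_describe_device(struct toy_bus_map *map, int addr)
{
	assert(map != NULL && addr >= 0 && addr < 128);
	map->used[addr] = true;
}

/*
 * Hand out the lowest address that nobody has described or claimed yet.
 * Returns the address, or -1 if the bus is full.
 */
int toy_get_dynamic_address(struct toy_bus_map *map)
{
	assert(map != NULL);
	for (int addr = TOY_ADDR_FIRST; addr <= TOY_ADDR_LAST; addr++) {
		if (!map->used[addr]) {
			/* Claim it so it is not handed out twice. */
			map->used[addr] = true;
			return addr;
		}
	}
	return -1;
}
```

The fragile part mentioned in the talk shows up directly in this model: the result is only safe if every device really on the bus has been passed to toy_describe_device first. An undescribed device would be handed out as "free".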
The first thing is that, usually, if you declare a new I2C device or a new dummy device or whatever, these functions return a NULL pointer if something goes wrong. But if we want to go in the direction of dynamically assigned addresses, we need to make a difference. We need to know: does it fail because the address is already busy, or because there's no memory? Kind of obvious. But that means that all these i2c_new_* functions need to be converted from returning NULL to returning an error pointer. Of course, doing this in the I2C core is pretty easy, but I have to convert each and every driver, make sure it still works, and do some mass conversion things. This is what I'm currently working on. Of course, I use Coccinelle to do the main work with semantic patches, but there are still a lot of things to handle manually, because I'm touching very, very old code which is very creative in handling things. And this needs a lot of manual review.

To make sure people understand that there's an API change, I want to rename all of the functions. I will of course convert all the in-kernel, in-tree users, but if there are out-of-tree users... no, no, no, no, I want those drivers to break at build time, so they know they need to do something. This is why, if you're used to i2c_new_dummy, it's now called i2c_new_dummy_device. i2c_new_secondary_device is now i2c_new_ancillary_device, i2c_new_device is i2c_new_client_device, and i2c_new_probed_device I haven't done yet.
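The difference between returning NULL and returning an error pointer can be sketched in plain userspace C. This is a toy model under stated assumptions, not the kernel implementation: toy_err_ptr, toy_ptr_err and toy_is_err only imitate the idea behind the kernel's ERR_PTR, PTR_ERR and IS_ERR helpers, and toy_adapter and toy_new_client_device are hypothetical stand-ins for the real structures and functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* The top 4095 pointer values are treated as encoded negative errnos. */
#define TOY_MAX_ERRNO 4095

static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-TOY_MAX_ERRNO);
}

struct toy_client {
	int addr;
};

/* Hypothetical bus model: one slot per 7-bit address. */
struct toy_adapter {
	struct toy_client *clients[128];
};

/*
 * Unlike a NULL-returning variant, the caller can tell *why* this failed:
 * -EINVAL for a reserved address, -EBUSY for a taken one, -ENOMEM otherwise.
 */
struct toy_client *toy_new_client_device(struct toy_adapter *adap, int addr)
{
	struct toy_client *client;

	assert(adap != NULL);
	if (addr < 0x08 || addr > 0x77)
		return toy_err_ptr(-EINVAL);
	if (adap->clients[addr])
		return toy_err_ptr(-EBUSY);

	client = malloc(sizeof(*client));
	if (!client)
		return toy_err_ptr(-ENOMEM);

	client->addr = addr;
	adap->clients[addr] = client;
	return client;
}
```

A caller that wants a dynamically assigned address could retry with another address on -EBUSY but give up on -ENOMEM. With a plain NULL return, those two cases look the same.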
I will figure out some fancy name. A kind of side effect: what a lot of developers asked me for, or not a lot, but quite some, is managed versions of these functions. We only have that for new dummy device already, but when cleaning up with my mass conversion, I could already see that it removes quite some code handling these new devices. So the other ones will come; look forward to removing quite some boilerplate code.

There's one thing I also want to tackle while being here. One thing I never liked about new dummy device: it basically says to the bus "reserve this address", but you never know who requested that. So in the sysfs entry you also see some addresses blocked, but you have no idea which driver owns them and for what reason. We already had a better function, and I really want to encourage you to use it, especially in the future: this is i2c_new_ancillary_device, which has a lot of benefits. First, you can be more flexible, because you can provide an address in your device tree in case your chip doesn't use the default address. And even if it just has a default address, look at the parameters: we have the client which is requesting the dummy device.

Do you know what a dummy device is? Who doesn't? Okay, sorry. There are I2C devices using more than one address. Maybe they have two register sets, or they are directly presenting information from the monitor, or whatever. But still it's one driver using these multiple addresses. And the old way was: you can have only one address per driver. So the dummy devices are used in the driver to say: I need this address, block it, so no one else, especially user space, can access it or mangle with it. It's mine. So you can have one driver occupying multiple addresses.
That's what a dummy device is for. But as I said, you never knew who requested that address. If you find out that your driver needs that, I'd suggest you use this new function, because then you know who requested it, and you even have a name which gives you a rough idea what this address is used for. There is no infrastructure yet to present this at runtime. I'm quite unsure if I want to put this into sysfs or debugfs, but I can tell you for sure that I will hook in here to present you this kind of information. I strongly recommend using that in the future, because if you have a complex driver, this will make debugging a lot easier, I think.

Another API change which came recently (it does not have so much to do with the complex setup I showed you) is that we now have special callbacks for atomic transfers. Sometimes you want to shut down the system using a PMIC or some such device, and then you need to send an I2C transfer to actually do that, but interrupts are gone by then. This was never specially handled in the core. It mostly worked by accident, and this was not a very good idea. So if you write a bus master driver, we now have a special callback for that. If it does not exist, it will fall back to the method we had until now, which is "it works by accident". But I strongly suggest, if you need that... oh, and the core will print out a warning now if you try to access the bus when interrupts are disabled. It will say: ooh, this is dangerous.
I try my very best, but it is dangerous. And if you ever see that warning, you should go into the bus master driver and try to implement an atomic callback which does not use any interrupts, so you can have I2C transmissions very, very late. Because I think you want to securely shut down or reboot your machine. We have that, and yeah, please make use of it.

During all this development, I of course needed to create test cases and verify that it actually works, and some of the time I hooked up what I presented two years ago, where I described the possibility that you can have GPIOs connected to the same I2C bus under test. These GPIOs then do crazy things to the bus, so that the I2C bus master should get confused or should flag errors or whatever, so you can handle special situations. I see I'm running a bit out of time, so I will make it a bit fast. I hope you can still follow.

The first thing I found out was that there was a condition where you could have an unintentional write on the I2C bus if things go wrong. "Things go wrong" means in that case: either the bootloader does some I2C, and it's not properly terminated, and Linux takes over in the middle of it, and then you have an unstable condition on the bus. Or the other way around: you have a panic in the Linux kernel in the middle of an I2C transaction, and then you reboot and the bootloader gets an I2C bus which is in an inconsistent state. These fault injectors are there to reproduce this inconsistent state, or to make it possible to have it. And this one is especially dangerous because it can trigger an unintended write on the bus. It is basically a standard write, as you will see on the scope, except that the last clock cycle is not issued. That means the device is all set up, prepared to do a write, and then left alone. And the bus is in an inconsistent state because the data line is held low by the client, because it says: I'm ready for getting
a write. There is a technique called bus recovery mentioned in the I2C specification, and we had that implemented in our driver. That basically means: toggle the clock nine times. That was the state of things when I was implementing this fault injector. The problem is: the device here gets this missing clock cycle for preparing the write and says: okay, now I'm ready to write. And then we clock again, and as the data line is not driven, we basically write 0xff somewhere, don't know where. This is of course something you really, really, really don't want to have. So the solution was to fix up the mechanism in the core which does this toggling, and you see it here. Here it looks a bit more complicated. What we basically do: we send a STOP after every toggle to tell the client "whatever you've seen, forget about it", and so get a consistent state on the bus again.

So this is why, if you're working on a bus master driver, you really have to get your logic analyzer out and do stuff like this to see if things look proper. My plan is to someday have references on the I2C wiki. So my dream is to have a script which driver authors can use, which triggers some communication, and then on the wiki you have scope pictures like this where you can see how it should look. Then you take your scope and pray that it really looks like this. And if not, you have to do something. But as I said, if you're not paying attention, you can have an unintended write, which was pretty scary, I think.

Another fault injector we have is inject_panic. It basically waits for some communication to happen by checking the clock line, and then it just interferes with a panic, which is exactly for the situation
I mentioned before with the atomic transfers. With that, you can hopefully create a situation where an I2C transaction gets interrupted, and then you can see if your atomic transfer works as expected. Or, if you don't need that, whether your bootloader can handle this inconsistent state on the bus after the reboot. This is what this fault injector is for.

And the last one: I2C is by definition multi-master. It's rarely used, but the usage is increasing; I see that from the mails going to the I2C list. And it's not so easy for bus master drivers to simulate that. So this fault injector basically also waits for a communication and then pulls the data line low, like another master would do. So your master under test will surely lose the arbitration, and then you can see if the interrupt fires correctly, and if your interrupt handler handles this correctly and waits until the bus is free again. I was kind of happy, because hardware-wise, getting a true multi-master setup can be a bit of a hassle, but this is super simple. It's really just writing a parameter to this fault injector file, which is the duration of the interference: how long do you want to keep the line low to simulate the multi-master situation? And then you trigger a transaction to an address which has a lot of ones. The good thing is, there doesn't need to be a device responding to it, because we're interfering with it anyhow. You send out a request with lots of ones, we're pulling the data line down to zero, and your I2C master should detect: there's something wrong, I don't see the address on the bus that I was sending out. It doesn't really matter if there's a device listening. And with that you can also simulate this kind of rare setup. But of course we want that to work for a driver, because more and more setups use multi-master.

Okay, this was my rush through the half an hour
I got today. Like I said, I would have liked to go more into details, but certainly not this year. If you have questions, I think we can have five minutes of questions, if you're not too hungry. You can ask a few right now. Otherwise, I'm here at the conference, there's my email address, and I think you all know how to reach me if that's really needed. So thank you so much so far.
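As a footnote to the bus recovery part of the talk: the unintended write and the "STOP after every toggle" fix can be reproduced in a small userspace C model. This is a sketch under simplifying assumptions, not the kernel's recovery code. The struct toy_stuck_client and all toy_* functions are hypothetical, the client is assumed to be exactly one clock cycle away from being ready for a write, and every further group of eight clocks with an undriven (high) data line is counted as one byte of 0xff being written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client that was left alone in the middle of a transfer. */
struct toy_stuck_client {
	int clocks_until_release; /* SCL pulses until it lets go of SDA */
	bool armed_for_write;     /* it believes a data byte follows */
	int bits_latched;         /* bits taken from the undriven SDA line */
	int bytes_written;        /* complete unintended bytes (0xff) */
};

static bool toy_sda_is_free(const struct toy_stuck_client *c)
{
	return c->clocks_until_release == 0;
}

/* One SCL pulse as seen by the client. */
static void toy_scl_pulse(struct toy_stuck_client *c)
{
	if (c->clocks_until_release > 0) {
		/* The missing clock arrives: client is now ready to write. */
		if (--c->clocks_until_release == 0)
			c->armed_for_write = true;
	} else if (c->armed_for_write) {
		/* SDA is not driven, so every further clock latches a 1. */
		if (++c->bits_latched == 8) {
			c->bits_latched = 0;
			c->bytes_written++;
		}
	}
}

/* A STOP only reaches the client once SDA is no longer held low. */
static void toy_send_stop(struct toy_stuck_client *c)
{
	if (toy_sda_is_free(c)) {
		c->armed_for_write = false;
		c->bits_latched = 0;
	}
}

/* Nine recovery pulses; returns how many bytes were written by accident. */
int toy_recover_bus(struct toy_stuck_client *c, bool stop_after_each_pulse)
{
	assert(c != NULL);
	for (int i = 0; i < 9; i++) {
		toy_scl_pulse(c);
		if (stop_after_each_pulse)
			toy_send_stop(c);
	}
	return c->bytes_written;
}
```

In this model, nine plain pulses on a client that is one clock short latch one unintended byte, while with a STOP after each pulse no byte is latched. A real bus still needs the logic analyzer check described in the talk.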