OK, thank you for coming here, everyone. The microphone is a bit too low; let's do it like this. Welcome to my talk here at OSS Japan. I'm so happy to be in Japan again. It's a lovely conference so far, and here's my contribution to it: a talk about object lifetimes, and whether they are a threat to safety certification, which is an ongoing trend now; some people are aiming for it, and here's what I want to say about it. First, let me briefly introduce the problem. This is not so much a technical talk; if you want the technical talks, see this slide. From this slide you can also see that it's a current trend. Laurent Pinchart started this discussion at last year's Linux Plumbers. Bartosz picked it up and contributed more, refining what Laurent said: it's more about managed devices. And then we found out it's not only about managed devices, it's actually worse: it's about general lifetime issues. Then I investigated further and found out that not only the subsystems we knew of, but some more are affected. And just a few weeks ago, Bartosz went to Linux Plumbers again, saying this is basically a problem with a lot of the provider-consumer setups we have within the kernel. Someone provides something, like an I2C bus, and someone consumes that I2C bus; it can be a GPIO, it can be an MTD device. There are lots of these, and some of them are vulnerable. You see, Bartosz and I are playing a bit of ping-pong here; he called it the Netflix mini-series. So it's mainly the two of us who have a keen interest in this and are pushing it further, because this needs to be fixed; some of the issues are very, very old. So, a simple overview of the problem. Let's assume you have some SoC, and here's a platform device with an I2C controller. I'll be talking mainly about I2C because I'm the I2C maintainer, so I know this space.
But it could be a GPIO controller or whatnot. Let's assume it's a platform device describing the real hardware on the chip, and then, to expose this functionality to user space, we have a logical, intermediate device: the I2C adapter. The platform device instantiates it, so we have a refcount of one; there's a lot of refcounting here. This is the good case, this is how we want it, and this is how it usually works. In most cases, let me emphasize this first, it may sound like Linux is totally broken, but it is not. In most cases, Linux does a very good job with all these things. But now, as we're approaching safety certification, we really need to get more of the corner cases right, especially the corner cases we already know about, and I think now is the time to tackle that. So, in the good case, here's a consumer; it comes from user space, it could be kernel space, it doesn't matter. It wants to talk I2C, so it connects to this device, and we have a refcount of two, because both the platform device and the consumer hold a reference to the adapter. This is the normal case. Normally, when the user space consumer is finished talking, it goes away, this connection goes away, and the refcount drops to one. And then, if the platform device is going away, it will remove the adapter, because it instantiated it, so it can also remove it. The refcount is zero, the adapter can go away, and finally the platform device can go away. Everything's good; this is how we want it. Now let's go back to the case where we have the provider, again the generic case with the intermediate device and user space talking to it. One of the problems we have is: what if, before the user space consumer says "I'm finished", the provider device goes away while the consumer stays around? We have more cases, but I'll describe two problematic cases here.
The first one, which is sadly true for I2C as well, is when the physical device embeds the logical device; they are not separated. That means whenever the physical device goes away, the logical device goes away too. It's gone. Then we have a huge problem, because the consumer is still there and wants to talk to it. It even holds a reference. It's actually a prime task of an operating system to ensure that refcounting means something, but in this case it's meaningless. So this is really bad. It's not so common in the kernel, but there are some subsystems which are still affected. This is the really bad case. The second problematic case is when the logical device is not embedded, but there's still no synchronization between the two: the provider goes away, and the logical device doesn't know about it. The consumer can still talk to the logical device, because it did not disappear, but when it actually wants to reach the hardware, there's nothing there, and you will also have problems. So we need to make sure that when the provider goes away, the logical device somehow knows about it and can report an error, so we can handle the error gracefully. With these two problematic cases, we're not there yet. And to show that I'm not making this all up, I have a small demo. I'll be running a QEMU instance now with kernel 6.7-rc4, so it's from this week, and I will show you what I do. I will open a file from the MTD subsystem; it could be a file system wanting to read from it. I just used a one-liner here to open it. And now I'm going to unbind the driver. When I prepared the presentation, it already crashed at this point, so some things have happened since, which is good. But what I do now is: I still have this user space process with the file open, which could be a file system trying to read whatever, and now I try to close it, and bam, that's a crash. And it's not a super bad panic.
We get back to the login prompt, but still, this is not anything where you can handle the problem gracefully. And this is not some rare subsystem; MTD is important, right? If I showed you the same thing with I2C, you would see it's not crashing, but blocking, because it has some protection: it sees there's still an open file descriptor and refuses to remove the device, waiting for the file descriptor to be closed. But that means it's kind of a deadlock, you know? One side is waiting for the file descriptor to be closed; the side with the open file descriptor wants to do something, but cannot do anything anymore. So this is still not graceful. It's still not an error saying: hey, my provider is gone, I can't do anything anymore. So this is also bad. So we have that. This is what Bartosz and I have been working on, across various problem spaces. And my question, since I got interested in safety recently: does this affect safety certification? Can you get safety certification for Linux if we know we have such problems, especially when we have had them for such a long time? If you look at the problem in the MTD core, there's a comment saying, basically, "we should do this better", from 2009. And there's a comment in the I2C core, "this is old code and should ideally be replaced", from 2015, but the code itself is pre-git. Nobody ever took up the challenge to fix this properly. So you wonder. When people try to address safety, they often argue with a safety-by-process argument: Linux has proven that it can fix issues over time; it could even include PREEMPT_RT after a certain amount of time, so it can do basically anything, right? With that kind of argument, they approach safety. But how does this hold up if you know we've had this problem for, what, 18 years, and nobody took care of it? Will this affect it?
I asked a few people around, and the answer is like from a lawyer: maybe. I cannot give you a definite answer, because it really depends on the assessor, on whether the assessor knows about it and considers it relevant or not. My gut feeling, but don't quote me on it, is that it's not so much of a problem. But it might hit you if you go specific, because another approach, the main approach, in certification is that you define a subset: you get a set of requirements, "I need this from the Linux kernel, and I want that part of the Linux kernel to be safe." What you could now do, in theory, is say in your requirements: we are not unbinding devices, and all of this goes away. Right, I will just boot it, everything is there, and it will never, ever be removed. There might be cases where that works, but I really want you to think about the following slides to see whether that approach is really good for you. On top of that, this is not a defensive approach. If you want safety, you want a safe product, and if you know you have problems, I think the right attitude is: let's fix them. We have such complex systems that even if you can set up your requirements so that the problems I'm showing you don't affect you, there might be other problems we don't know about yet which do affect you. If you solve the problem at the core, I think you're in much better shape. But let's still go down this path: can I have a specification where I declare that I'm not unbinding? What you definitely lose is all hot-pluggable buses, because with hot-pluggable buses, an unbind event can always happen, and you don't know when. And even if you say: well, I don't have a USB port where users can plug in stuff...
I have been in a railway-related project where they had hard-soldered USB devices, some kind of USB stick, and with a combination of a not-perfect USB design, the hardware, and a bit of electrical noise around the device, devices were sometimes sporadically unbinding, although they were soldered on. So you can still have that. Hot-pluggable buses are the big problem. And see here: I got this slide, thankfully, yesterday from my friends from Renesas. They have a great demo at their booth, and in their talk yesterday they were explaining their setup, which I think is quite nice; go check it out. What I immediately saw was this. The exclamation mark is from me; it was not in the original slides. They have an alcohol sensor here, and it says "plug in". So even in complex systems today, with all the functionality you also want in cars, not only for consumers, plugging in is a thing. And when you have plugging in, you're bound to these lifecycle problems. Another case is when you want to recover from bad states or from stalled sub-devices. I talked at Plumbers to a Qualcomm engineer, so this is a Qualcomm-based phone. He said they connect in their phone to another processor, use remoteproc to talk to it, and have a watchdog to make sure the firmware on that remote processor is alive. When this watchdog fires, Linux needs to take measures so the processor comes up again. What they originally wanted to do was unbind the device, power-cycle it, restart the clocks and everything, so they could get into a clean state and have that coprocessor again. And they really had to struggle to get the unbind correct; they were running straight into these issues. You could try to work around it in a driver, but I think a clean unbind and bind is the cleanest solution, because your driver should be able to clean up properly, right?
In any case, a corner case, for now theoretical for me, is when you want to reboot. Usually reboot is pretty safe, because all user space processes are killed and file systems are unmounted, so if you have a provider, you probably won't have any users at the time it is torn down. But while working on all this, I found a theoretical problem, at least in the I2C subsystem, maybe in more: a rare race condition where I2C could stall when trying to shut down the device or reboot. I'm still working on the proof of concept; I'm sorry, I'm not there yet, maybe it's all wrong, so let's take this with a bit of caution. But if the race is real and I have a proof of concept, I will tell you for sure. This all was started by Laurent Pinchart from the media subsystem, and I know a lot of people in the media subsystem complain about lifetime problems, because they have complex device structures and interconnections between them. I haven't researched that yet, because they use a lot of building blocks like I2C and GPIO, and those building blocks are not proper yet; I think the building blocks need to be safe first before we talk about media. And the media people have their own set of problems in that regard which they need to fix first. But I think what we are seeing right now is just the tip of the iceberg. As I said, Bartosz and I are playing ping-pong with our talks and our research, and usually when we meet after a while, one of the first things we say is: oh, it's actually worse than we thought, because we found something new again. So I think it's a good thing to address this issue. So now I'm telling you about lots of corner cases and that your specification might be vulnerable. But how do you find out if your specification is vulnerable?
What I've been working on, I call it the Linux Lifecycle Issue Test Suite, is a set of tests which will trigger all the problems I mentioned, with a description of each problem, a path to a solution where one exists, and the source files which are affected. So why did I need another test suite? There are plenty out there. The problem I had is that, for example, kselftest tests within the kernel are expected to pass, because we want to do regression testing. Tests are not expected to fail, especially not when a failing test means the kernel crashes, the kernel blocks, or you get dropped back to init. So I clearly need a test suite which works with a device under test, and I was not able to find one so far. If you know one, I'm all ears. There used to be something like LTP-DDT, but it has been stale for 10 years now. If you know a test suite with lots of tests for devices under test where I could just add my tests, I'd be more than happy. So far, the plan is: I'll get this test suite running, and once something turns green here and passes, I convert it to a kselftest and submit it to the kernel, so we then get the regression testing. What it basically is: a collection of configuration plus the test scripts themselves, that is, how you trigger the things I've just shown manually, like the MTD crash you've seen; a specific kernel configuration; and a specific root file system, so we have a Buildroot configuration. As the test framework, I didn't write one from scratch; I use Avocado from Red Hat, and then some scripts to run the test suite. And it will all be open source. I want to have a repository for the whole test suite, and a website where all of this is published, so if you don't want to run the tests, you can just check the results there. I really wanted to present it here, but I could not get it done, because it was much more work than I anticipated.
The first thing was: I started testing with real hardware, but to make the tests easily accessible, I switched to QEMU and had to find devices suitable for my test cases. I wanted to use a generic root file system. ELISA provides one, and I really wanted to use that, but it has too many features I don't need and lacks some features I do need. So I ended up building a simple root file system with Buildroot. And I think Avocado is a nice and fun project, but you can really see that their use case is very different from mine, so I was hacking quite a lot to get it suited to my needs, which was kind of fun, but on the other hand, I would actually prefer to write the tests instead. So again, if you know a test suite which works with devices under test, I'm all ears. But I can show you a little demo of the result of such a test run. It's not super much, but it's just a test suite, right? Here's the test: UART is the known-good test case, showing how we want things to be. The test cases for I2C and MTD all run in QEMU with the latest kernel. The final website should iterate over different kernels, so you can probably see the development and improvements there. And here on the website, you can see which files are affected; those are the files causing the problems. The good thing about this is that the ELISA project is now working on a mechanism where you can map your specifications to source files and tests. So if we combine their efforts, someone having a specification of what they want to do regarding safety, with this test suite, then you immediately know whether you're affected by these lifecycle problems or not. Another idea is the database called VulnerableCode, which might be an interesting place to publish all this information. Currently, from a glimpse, they can say that this or that package is affected.
So if I just say "the Linux kernel is affected", I think that's a bit too coarse. We need a finer granularity, so we can say that in this kernel, this subsystem is affected, and you know whether you're affected or not. And what are the next steps? The good news is: for all the issues I presented here, we have a path, we have potential solutions, we have drafts. At Linux Plumbers a few weeks ago, Bartosz talked about his idea for solving a lot of these issues in one go. The main idea is that you add a kind of wrapper to the subsystems which protects the physical device, and then you use RCU to make sure you have critical sections. If user space is going to read from the provider, the subsystem makes sure it's not going away while it's being accessed. And the other way around: if you're removing the provider, RCU makes sure that nobody is reading from it anymore. So this is an approach where you can put the protection into the subsystems, away from the drivers, and still have good guarantees. We're also thinking about adding another abstraction, making a layer out of it, so that other subsystems can just use this extra layer to get the safety and don't all have to reinvent the wheel from scratch. At Linux Plumbers, quite a few people attended the talk, and we actually came to a consensus. It was said: yes, that's how we want to try to fix these issues. After all these years, we were all very happy to see a path forward, I think. You know, some issues are pre-git. And the roadmap we have is: we fix two or three subsystems first. The proposal is I2C and GPIO, because those are what Bartosz and I maintain; Greg, for some reason, mentioned USB, I don't know why. Then we see how this works out and how we can extract the generic layer from these two or three existing implementations. And when we have that, we can go fix all the other subsystems.
And from that point on, it would be really great: that's where a lot of people can chime in and say, okay, I'll take this layer and fix my subsystem. This is quite some work. The problem is that when work takes such a long time, it's a moving target, of course, because the subsystems we want to fix keep evolving regardless of our intentions. And the changes will be intrusive, so backporting will be pretty hard; it's probably a job of its own. But we have proven in Linux that we can do stuff like this. Still, my call-out here is: the sooner, the better. Let's not wait until the technical debt increases any further. Yeah. As I said, it needs serious effort to go upstream, but after all these years, Bartosz and I really want to see this fixed, because we can't stand that Linux has problems in this area; that's against our pride, I don't know. And also, I want safe products out there, and if I know there's a problem, I want to fix it. But currently we're working on this very, very part-time. Renesas supports me a bit, Bartosz is supported a bit by Linaro, but with the amount of work we can currently put into it, this is a matter of years until we have this generic layer. So the hint: I work for Renesas, Bartosz works for Linaro, but at the basic level, we are both contractors. So, my conclusions: the kernel definitely has long-standing object lifetime issues. These might or might not hurt the general safety-by-process argument; that was the lawyer-like "maybe" from before. Depending on your use case, it will affect your specific safety process. If you know you're using a vulnerable subsystem, I think you should do something about it. To find out whether you're affected or not, there's a test suite under development, and I hope to release it as soon as possible. And drafts for solutions to these problems exist and want to be implemented.
And yes, I will be honest: we are looking for funding to drive this forward, so that we can finally put a check mark on this and have the issue solved once and for all. That was what I wanted to talk about. Thank you very much for your attention, and if you have questions, I'm eager to hear them. Thank you.

So, I'm kind of a dumb-dumb at this level, so forgive me if this is a dumb question, but why do you think this problem has lingered for so long? You talked about it predating git. It seems not great. And then, kind of related to that: from people who have embedded deployments, and we have a lot of Automotive Grade Linux here, have you seen any creative hacks to work around this sort of problem? Or is it just that most people aren't affected by it in practice, or they just reboot their computers?

That's not a dumb question. I think one reason is that for part of the industry, these really are corner cases, right? Mostly it works; Linux is working great. And some industries do not care so much about it; my stepson is used to his phone crashing once in a while, so he just reboots. The other part is that this is a lot of work, and it's dirty work. I have heard experienced Linux kernel developers say: oh no, I'm not making my hands that dirty. There be dragons, basically.

Fair enough. I think Greg wants to add something.

Hi, I wrote that comment about 28, no, 25 years ago, when we first converted I2C to the driver model. When we created this, we did not have removable I2C devices; we did not have removable anything but USB. So when we did this, we put it off as something for whoever cares in the future. And back then, bind and unbind was a debugging aid; the fact that people use it in production is so scary. So that's why. It just grew into something, and it works for most everybody.

Although unbinding devices is not necessarily tied to unbind via sysfs.
You can just use a hot-pluggable device: remove a hot-pluggable device. Yes, hot-pluggable devices emulate that. But I2C has never been on a hot-pluggable bus before; now it can be. And to add to that, what Greg also said at Linux Plumbers: the solution Bartosz has now proposed is a pretty up-to-date solution, because back then, SRCU wasn't available. So we couldn't have had that solution then, but we can have it now. Yeah, thank you for the insights.

I would like to add something on that, just to make you even less happy. I think there are more race conditions like this. Specifically, the reboot case comes to my mind, because I ran into something like this on, well, a nasty downstream kernel from some AI IP vendor. The point is, you may also have races over the hardware. In that case, we currently pushed it to the vendor and said: okay, do you have a problem there on reboot or power down? I think it's a case of a race around the IOMMU, with the shutdown of the IOMMU coming before the transactions are done. Interrupts also come to my mind. Is your test suite covering these cases? Basically, you're also shutting down certain hardware bits; if there are in-flight activities, can that be covered as well? These are all the nasty corner cases we can have in shutdown scenarios. And this is actually a real case: we have to be able to power down or reboot the device, and if it gets stuck there, we have a major issue.

What you describe is another kind of provider-consumer problem: the provider is going away, and the consumer has a problem. And what I said about reboot, I think this exists, theoretically, with I2C as well, so I'm just working on writing a test case for exactly that. The thing is, initially I wanted to go with a vanilla kernel; the kernel I was showing you was 6.7-rc4.
If you really go for those test cases, I'm quite sure you need additionally modified kernel code, or additional drivers, or whatnot. So probably, when I release something, we will also have a kernel tree with extra code to make these rare race conditions visible.

Isn't there already something for error injection in the kernel, to reproduce these kinds of things artificially, to simulate what happens if something goes wrong? Would that help here?

I think race conditions are super hard to hit that way. So I would prefer to write code which extends the critical sections, which in a usual reboot are very small. Error injection does not give you a large time frame in which the race can happen. So that's how I would approach it.

Sorry, I hope I'm not becoming that guy. Do you know if anyone has done any security research in this area? I'm just curious if there are any cases of people using this to get access to systems or something, via these hanging devices or anything?

I'm really speaking about safety here; I haven't researched security. I'm not an expert on that, sorry. Greg has something to say.

For issues like bind and unbind, you have to have root already. And otherwise, if you have physical access, that's another issue; physical access we don't really care about. Sometimes you do, but this isn't really a security issue. It's a crashing issue.

The good thing about this is that it's really inherent to Linux, so it's pretty hardware-independent. This is why I switched from a board to QEMU: as long as you have some MTD device, you can show it's broken; it doesn't need to be a specific MTD device. So this is a good thing.

Talking about Rust for Linux: Rust has a very different lifetime approach. So in Rust code, should we care about this new approach in new Rust code, if we write the bindings to the media subsystem or the I2C subsystem?
I understand the question, but I am not enough of a Rust expert to comment on how Rust manages this. It manages lifetime issues very well, but I don't know how that maps when it connects to existing code in the Linux kernel. Obviously, if you wrote the whole of Linux in Rust, you would be in a much better position, but that is, of course, a very different story. So I'm not enough of a Rust expert to comment on that. But maybe Greg is.

So the big problem with Rust in the kernel is this: C has its lifetime rules, Rust has different ones, and the interaction between them is something nobody has attempted yet. That's what I keep saying is going to be the hard part; look at all the drivers, they're not even getting close to that. That being said, the Rust object lifecycle is totally different. It's dealing with code and copies, while this is the real lifecycle. So Rust code is going to have to be modified to deal with the C side; Rust is not going to solve this problem at all. But that being said, we've done a lot of driver core changes to make the Rust side easier, and like I said at Plumbers, this work is going to make it easier. So this is actually going to make it possible for Rust to work; right now, Rust can't solve this.

Yeah, that's somewhat what I said: it sits on top of existing C code. The only solution would be, if you had everything in Rust, it would be different, but we don't have that.

Okay. So I'm wondering what to take care of when the new approach comes in, since we are doing very difficult in-kernel lifetime handling in Rust. Does the new approach affect the current Rust lifetime handling?

You don't know that yet. So, thank you. Well, if we have the new scheme, it will be better for everyone, I can say that: Rust or C. Okay, time is over. Thank you very much for coming, thank you very much for the great questions, and I hope you still have a great conference.