Okay, our final keynote this morning is Kate Stewart. Kate is the Vice President of Dependable Embedded Systems at the Linux Foundation. Today, Kate will discuss building dependable systems with open source. Please welcome Kate Stewart.

Konnichiwa. You guys are getting me? Okay, great. My name's Kate Stewart, and for the last couple of years at the Linux Foundation I have been focusing on what we need to do to help embedded open source systems become dependable. In systems engineering terms, dependability is a measure of a system's availability, reliability, maintainability, safety, and security.

And so, let's consider the car. I'm sure all of you want your car to be dependable, and Dan's just been talking to us about it, so this is a nice little segue. More of the components in a car are built on open source every year, such as those running Automotive Grade Linux (AGL). Sorry about the noise. Today's car is likely to also have cameras embedded into it for the driver to see back views and side views. There are also radar sensors to measure distance and to provide alerts when you're getting too close to something. Let me pass through a lot of slides here. But anyhow, back to this. So there are radar sensors to measure the distance and provide alerts when you get too close to another object. This helps keep drivers and passengers safe, and that's a good thing. In addition, there's also proximity information coming in now. You'll see dashboards that incorporate navigational assistance to help the driver get to a destination, using GPS technologies and communicating with external services to understand such things as traffic conditions. We're all using this today, either on a phone or in the car. And all of this information is helping to keep the drivers and passengers safe and arrive at the destination sooner.

But as we're moving more towards autonomy, LiDAR and other sensors are now being incorporated, and we need to filter this information and react appropriately to what's being detected if you don't have a human evaluating it. That is going to be the challenge. So trained AI models are being connected to these sensors; they're doing some of the filtering and working it through. And the data sets being used to train these models are now important. The input from all these multiple sensors needs to be coordinated, and we need to take our analysis beyond what we've been using to date.

So modern products are clearly now more than just hardware and software. Cars were just hardware from 1910 up to the 1970s and 80s, then they became hardware and software, and now they're evolving to incorporate AI, with models trained on data sets, and they expect external services to work. So from that safe-driving-car example, we now have more ingredients that we have to consider in our analysis: hardware, software, training data sets, and communication to remote services. So we need to leverage systems engineering. The whole discipline of systems engineering was created for solving problems of complex interacting systems, and we need to pull all of these factors into our analysis now.

There's an AI Incident Database that's public; anyone can go look at it. What it does is track social media references to incidents with AI. So I looked at what was happening with cars over the last year; these are all from this year.
And so we have various incidents and case numbers. If we start drilling down on one of them, there was an article in the Washington Post, and it pulled more data together too. When you start looking at it, you see things like: there are concerns that, quite frankly, the AI didn't recognize the school bus stop sign, and that it hasn't been trained to handle interactions with motorcycles and emergency vehicles. Over 800,000 cars in the US today have the autopilot capability, and almost half of them had to be recalled to fix the fact that the autopilot wasn't recognizing traffic lights, stop signs, and speed limits. Last year, the self-driving capability went from being fully enabled on 12,000 to over 400,000 of these cars. And in this last year as well, we've been living through this experiment where two thirds of all the incidents reported to the National Highway Traffic Safety Administration in the US were from Tesla. All the other car makers have had a less aggressive rollout of these technologies and are only a fraction by comparison. But we are living with this, and we need it to change and be safe.

And that's just what's happening with the AI. What happens when someone says, oh, we need to save some money here, we've got to remove a sensor? Well, this actually happened in some of these cases, and they had to reintroduce some radar sensors. So there are cars out there without some of this, and that comes with consequences. We've got a lot of factors that are going to come into play, and we're going to need to analyze them to understand whether we're working with something that's safe, or at least trying to make it as safe as we possibly can. And certainly when I read these sorts of things, living in the US and living near a Tesla dealership, it worries me.

So, as you can see, there are more ways things can go wrong right now. What we need to figure out is: how can we get all that information together and expand from a software bill of materials to a true system bill of materials? We've had hardware BOMs forever, okay? We've been adding software BOMs recently because of security. We really need to get more of our ingredients incorporated so that we can do these safety analyses properly.

To do this, we're going to need standardized metadata from all of the supply chains. We need it coming from the hardware, the software, the data sets and their whole slew of data provenance, and we need it from the services. And we need to pull this together in a way that we can connect it and make reasoned choices about it. To do that, it also needs to be accurate. The one thing we've been learning over this last year with SBOMs is that you capture the data when it's created in the product's life cycle, and that gives you the most precision, okay? When you're actually looking at your sources or when you're looking at your design, there's information there that the engineers know, and that should be surfaced up. Then what we need to do is connect the design to the code, to the build image, and so forth, and create knowledge graphs.

And this isn't just for cars. We're going to want this for our entire critical infrastructure. We need to get to this level of analysis so we can do safety properly. So we're going to need to look at how we can get this metadata to work in the ecosystem. Today, when I talk to people who've been working on this, it tends to be collections of papers, files, reports.
It tends to be spreadsheets. It tends to be manual processes. This doesn't scale to where we need to go. It just flat out doesn't scale, okay? When we're talking about change on the order of a bug fix an hour going into the Linux kernel, manual processes don't scale. And we want to make sure that when there's a security issue, we've dealt with it and things stay safe.

So what we've been looking at for about two years now is evolving SPDX into profiles, so that we can provide a framework for connecting this metadata. It's about the components, the processes, the requirements, and the evidence to support product line management at scale. That's what's been missing. SPDX 2.2 is an ISO standard, and it supports exchanging this metadata between systems today. It's got everything you need for a software bill of materials today, and it also supports traceability between requirements, code, tests, and evidence. That's there today. However, what we're doing with 3.0 is transforming it so that it can be used effectively in a database, so that we can pull all these elements in as they emerge over time and understand the current state at a point in time. That's what databases are for.

We're also bringing in profiles to capture domain-specific information and extend beyond what we've got today towards AI and models, as well as dataset provenance. Because, okay, whether this model has been trained on school bus stop signs or not is kind of important for kids coming off of buses. The other thing is, can we extend it to support product life cycles? The information you have when you're building is different from the information you're going to get when things are running in the field or deployed in a test environment. All of these things need to come together.

So we're looking at 3.0, and we're already working on it in the SPDX community. We've got working groups active today starting to articulate hardware, because we need to tie the software to the hardware, to those specific chips, as Dan was saying, and put these pieces together. And quite frankly, we also need to deal with virtual hardware and virtualized environments; things like digital twins are going to be needed here, especially when you don't have easy access to the real thing. Services are also coming in and are being used today; we need to be able to represent them and tie them into the reasoning, so that when there are dependencies you can actually check them. And all of this is being combined in the safety profile.

The SPDX 3.0 model repository has all the profiles for 3.0 in it today, and we are prototyping the serializations. We've got sample JSON serializations emitting, Python libraries working, and so forth. Anyone who wants to go kick the tires, we would welcome input right now. If you use the Core, Software, Licensing, and Lite profiles, you have effectively all the functionality we have today in 2.3. On top of that, we're adding Security, Build, AI, and Dataset profiles. That is what is going to let us expand out beyond just the software space into the system space, so we can start to reason about what we've got and what the real dependencies are. And we're starting to make sure that we can say where this information has come from in the life cycle. Okay.
Because that information tells you whether someone tried to reason about it with a third-party tool, or whether the people who were working on it generated it and checked it with a valid test. This brings in some of the things we're doing in OpenSSF that are going to be playing into this as well. We've worked very hard to align with the SBOM types coming out of the CISA working groups, efforts, and publications. So we're lining up with that, making sure all the relationships in 2.3 are supported, and also adding in concepts from some of the prototype work that's going on: expanding relationships to contain life cycle information, as well as one-to-many relationships, so we can be more precise when expressing things.

SPDX is very much focused on component modularity, and putting relationships between components allows us to create the knowledge graph we're going to need for doing efficient and accurate safety and security analysis. We can take the safety artifacts right now and map them into these SBOM types and say, okay, put these things here, this is the type of thing it belongs with, and so forth. We're prototyping this right now as paper exercises, and we're starting to see some tools emerge. With this type of model, working with safety people, we can represent that kind of analysis today with all the artifacts for a safety plan. Then you can connect it up to the code, connect it up to the tests, and connect up the evidence. You can take your design SBOMs, go to your sources, take the source SBOMs, and say they generate a certain set of executable images. This works on paper, okay.

Then what happens when you start to see a problem with a dependency? Well, all of a sudden you have a way of reasoning about it: oh, there's a problem with this executable image, okay. Did it come from my supply chain? Was there a problem in the build supply chain? Or was it something that was generated from the source code? And in some of the cases we've been talking about, has this come from the training data too? Is that where the issue might be? And if it's coming from the source code, was it an issue with our coding guidelines, or an issue with the requirement for the code? Those kinds of "reason why" questions. And as you work your way back, potentially there are problems with the requirements themselves that you need to analyze. This ability to reason at the component level, as well as across the full system, is what's going to be needed to automate all of this and get away from the paper processes, okay. And quite frankly, to get rid of the false positives on the security side.

So at the component level, we can do component-level traceability today with SPDX 2.3. All of those relationships exist and are working; you can use it as it is and play around with it. But one of the things that's going to be really great for getting rid of false positives is that you can go inside the components and say, hey, this file actually made it into my image. So if there's a bug in this component, which files are actually affected? Are they actually in my image or not? Were they built into that image? You can reason about exactly that.
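To make that concrete, here is a minimal sketch of the kind of relationship reasoning just described. The SPDX IDs and relationship triples are hypothetical, not taken from any real SBOM; the point is only that SPDX-style CONTAINS and GENERATED_FROM relationships form a graph you can walk to ask whether a file with a reported bug actually ended up in a shipped image.

```python
# Minimal sketch: reasoning over SPDX-style relationships as a graph.
# The SPDX IDs and relationships below are made-up examples, not real SBOM data.

# (subject, relationship, object) triples, as they might appear in an SBOM
relationships = [
    ("SPDXRef-Image-rootfs", "CONTAINS", "SPDXRef-Package-libfoo"),
    ("SPDXRef-Package-libfoo", "GENERATED_FROM", "SPDXRef-File-foo.c"),
    ("SPDXRef-Package-libbar", "GENERATED_FROM", "SPDXRef-File-bar.c"),
]

def reaches(start, target, rels):
    """Return True if 'target' is reachable from 'start' by following relationships."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(obj for subj, _, obj in rels if subj == node)
    return False

# Is the file with the reported bug actually part of the shipped image?
print(reaches("SPDXRef-Image-rootfs", "SPDXRef-File-foo.c", relationships))  # True
print(reaches("SPDXRef-Image-rootfs", "SPDXRef-File-bar.c", relationships))  # False
```

In a real system the triples would be extracted from the SBOM documents themselves, but the query pattern is the same: walk the knowledge graph from the image down to the files.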
And we're doing exactly that today in this effort. Come talk to me afterwards and I'll show you. If we can do that, then we know, okay, these tests have to be run to satisfy this requirement, and running them generates evidence that says the requirement is satisfied. So when a bug fix happens, you can figure out: for these requirements, I need to rerun these tests. Oh, I might need to generate a new requirement here and add a new test. And then that becomes part of this big database we're keeping on behalf of a product line, so that you can stay compliant, okay? We need to start thinking beyond where we are right now to how we do this properly and make it efficient for everyone. The other side of this is: okay, I'm using this foo.c, and I've made a fix there. Which requirements are actually using that source file? We can make that traceability happen. And then we can say, okay, for these other tests, I want to check and make sure I haven't caused a regression. These are the pieces that will let us trust that we are done after we apply the security fix; we have it crisp and clear exactly what has to be rerun.

So how can we establish the requirements for open source code that systems engineering and safety analysis need? Well, all these pieces of code were put in with whys; there's a reason why each piece of code was accepted. That's part of it. There are also man pages that say what it should be doing. These things are all there. So there are four projects I work with that are starting to look at how we can find our way through this and effectively surface the requirements that are already implicitly in place. Everyone has a sort of cultural knowledge of what is happening in certain areas, but it is not written down in a way that you can connect to requirements for a system yet. So how do we get there?

First, Yocto. Yocto is not itself an embedded system; it creates one for you. It creates your toolchain, and when it does that, it generates an SBOM for it. Then when that toolchain runs to build the rest of the pieces, it creates SBOMs for those pieces too. Today in Yocto, reproducible binaries are supported, which is one of the key things we're going to need here. And Yocto generates the SPDX SBOMs by turning on an option; that's all you need to do, and you'll get a lot of SBOM data showing up (a sketch of that configuration follows below). The system-level view I said we need to get to is done through UUIDs today, through a master index. And the Yocto folks are participating in the creation of SPDX 3.0; they came in wanting to get that system level right, so they've been working with our community on the Build profile. There's product-line BOM generation with SPDX that they're prototyping, and they're doing more work on linking their tests in with the components. The nice thing about working with Yocto is that any feature we land in Yocto scales throughout the entire ecosystem.

ELISA is looking at, for Linux, how do we start working with this tremendously fast-changing code base? What are the requirements? How do we start to get our heads around this elephant? There have been groups of people working in various areas for the last couple of years.
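As a concrete illustration of the Yocto point above: if memory serves, turning on SBOM generation is a matter of inheriting the create-spdx class in the build configuration. Treat the exact class name as something to verify against the Yocto documentation for the release in use; this is only a sketch.

```
# conf/local.conf  (sketch; verify against your Yocto release's documentation)
INHERIT += "create-spdx"
```

With that in place, a build should emit SPDX documents for the individual recipes and for the image alongside the normal build output.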
Coming back to ELISA: one of the groups that just formed last year is putting these systems together as concepts and looking at the analysis of Linux as part of a system. One of the things we're doing there is working with some of the folks at AGL, and we're looking at integrating all of this into a reference system that anyone can take, download, and swap pieces in and out of, so that we can then start reasoning about it. You'll see that we're using the Xen Project, we're using Linux, and we're using Zephyr, and we're going to put a simple application on top of it all, so we have the safety constraints, and we'll use that to do the reasoning.

And I would like to announce that a brand new open source tool just became available for requirements tracing. It's called BASIL, and Red Hat has contributed it to the ELISA project. What it does is let you trace requirements to code to tests, and do it in an open fashion so that people can review it and peer review it. So we can finally start to break this apart and build a community where people care about the pieces they care about most, contribute the requirements they care about, and crowdsource things together. We've been missing this. Everyone in every industry has been doing it for themselves when using Linux, and we had no way to share. So we've created a way to share this effort (a toy sketch of this kind of requirement-to-code-to-test linkage follows below).

The Zephyr project is participating here too. For those who aren't familiar, Zephyr is an RTOS, and it started with safety and security for resource-constrained devices in mind from the very beginning. We have our initial certification focus; we're working with TÜV SÜD on IEC 61508 right now, and we are also using Zephyr to prototype requirements traceability. We're using StrictDoc in this project, which is another open source tool, and we're working on putting together the hierarchy of requirements and matching it to the code. We're doing that exercise in Zephyr, which has a much smaller footprint than the Linux kernel, to see how it's going to work and whether it'll work well for the developers and the maintainers in that project.

And then there's the Xen Project. Xen is also focusing on safety and security, and as a hypervisor it has a working group that is focusing on safety as it goes along. People are using Xen for safety-critical systems today, but they're working at the maintainer level on improving the coding style with MISRA C, adopting rules one by one, and they're looking at features to improve real-time behavior and reduce interference. So they're working right now on that fundamental piece that everyone's going to be needing. And there are project members actually looking at getting IEC 61508 and ISO 26262 certification. They have a functional safety working group that I participate in, and others are welcome to join.

So if you want to continue the discussion, there'll be a BoF tomorrow afternoon, and I've got links in the slides to the relevant working groups on the systems side. Come join us there; I'm happy to talk about Zephyr, Xen, or Yocto, all of those, tomorrow in the BoF.
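As an illustration of the requirement-to-code-to-test linkage described above, and of the earlier question of which tests to rerun after a fix lands in a file like foo.c, here is a toy sketch. The requirement IDs, file names, and tests are made up, and this is not the BASIL or StrictDoc data model; it only shows the shape of the traceability query.

```python
# Toy traceability sketch: made-up requirements, source files, and tests.
# Not the BASIL or StrictDoc data model; just the shape of the query.

requirements = {
    "REQ-001": {"files": {"foo.c"}, "tests": {"test_foo_basic", "test_foo_limits"}},
    "REQ-002": {"files": {"bar.c"}, "tests": {"test_bar_timing"}},
    "REQ-003": {"files": {"foo.c", "bar.c"}, "tests": {"test_integration"}},
}

def impact_of_change(changed_file, reqs):
    """Which requirements touch this file, and which tests must be rerun?"""
    affected = {rid for rid, r in reqs.items() if changed_file in r["files"]}
    tests = set().union(*(reqs[rid]["tests"] for rid in affected)) if affected else set()
    return affected, tests

# A fix lands in foo.c: these requirements need re-checking, these tests need rerunning.
affected, tests = impact_of_change("foo.c", requirements)
print(sorted(affected))  # ['REQ-001', 'REQ-003']
print(sorted(tests))     # ['test_foo_basic', 'test_foo_limits', 'test_integration']
```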
And please, if you have other open source projects that are working towards safety, come talk to us and see if we can help bring this all together and build upon each other's work. As you've seen, we've now got a framework for connecting all of this at the requirement level, which was missing before, so we can start to compose systems from the requirements and do the systems engineering properly. And join us on the SPDX side if you want to start talking about that and integrating things.

So I guess I'm going to leave you with this: everything old is new again. Systems engineering practices need to be applied to today's systems. They are probably being applied within companies very diligently, but we're not doing it out in the open, and so we're wasting a lot of time and a lot of work and getting everyone frustrated. Manual processes aren't going to suffice for the scale of change in open source, or for the pace of features and functionality. We've got to integrate open source efficiently into systems engineering; it's overdue. And we're going to need a community to participate here. I'm just going to leave you with a hint: don't expect the upstream project maintainers to take the lead here. We've got to find new people to do it. And if we're lucky, they'll tell us we're wrong. Thank you.