Hi everyone, my name is Stefano Stabellini, I work for AMD, and I'm one of the Xen maintainers. And my name is Bruce Ashfield, I am the maintainer of Lopper, I work on the Yocto Project for my other open source work, and I am also working for AMD. All right, so I would like to talk to you a little bit about system device tree. Next slide, please. So in today's environment, there are often cases where there are multiple heterogeneous CPU clusters, such as Cortex-Rs, MicroBlazes, and Cortex-As, where more traditionally Linux runs. And on the Cortex-A processors, there are usually multiple execution levels: there is user space, there is kernel space, there is a secure execution level like Secure EL1 where OP-TEE runs, there is Trusted Firmware running at EL3, and all of these components are completely independent from each other. So it's natural in this environment to think about multiple execution domains: there is an execution domain for Linux, there is an execution domain for Zephyr, there is an execution domain for OP-TEE, and so on. It's natural to have a full system that is divided up into multiple domains, with each software component being the owner and the user of a specific set of hardware. In such an environment, so rich in terms of multiple heterogeneous clusters as well as heterogeneous software components, we need a way to describe and configure the full system, and system device tree is the answer to this problem. System device tree is an extension to device tree to describe everything there is. Traditionally, a device tree only describes what is available to one component, to one OS, such as Linux. So you end up having one device tree for Linux, one device tree for Zephyr, one device tree for OP-TEE, and all of them will be different.
So in a system device tree, you can describe everything there is with a multi-view angle, meaning that you can describe multiple different address spaces, and you can also express differences in what Linux can see compared to what OP-TEE can see compared to what Zephyr can see. In addition, there is a configuration aspect of the problem: the configuration of what is the set of devices that we want for U-Boot, what is the set for Linux, what is the set of devices that we want to enable for Zephyr in another domain, and what is the set of devices that we want to make sure are only usable from the secure world. So system device tree is about the description of a full SoC, including multiple heterogeneous CPU clusters as well as OSes, and the configuration of them, that is, which domain can access which devices. You can imagine really taking a full SoC design and carving out domains, with each domain having a subset of the hardware, and which subset exactly is also expressed by system device tree. Next, over to you, Bruce.

My part of the presentation is something I've been working on in collaboration with Stefano and the system device tree specification, if you will: a tool called Lopper. I think we've been working on it for almost three years now. I wouldn't say we've been working on it flat out in the sense of continuous development, but we introduced the idea and the concept at the same time we were talking about system device tree, because we knew that, since some of the extensions and some of the information are globally described, if you will, we would need some sort of tool to take the system device tree and produce other device trees in the format that an existing RTOS or Linux would expect. We couldn't expect these operating systems to immediately be able to consume a system device tree, or we might not expect them ever to consume one.
There are lots of device tree libraries and little tools, and of course common things like sed, and you can script and pull together a lot of ad hoc ways to manipulate the DTS source files or even the DTBs, but we wanted something with a defined interface, something that would be able to analyze and manipulate device trees in a controlled way. So that is what it has now transformed into. It's not just for pruning device trees, which was our initial idea, to prune a system device tree down to the information that one OS needs; it's now sort of a framework for manipulating system device trees and transforming the information contained within a device tree: splitting it, joining it, and doing other things. We can produce multiple different types of output, whether it be configuration files, YAML files, DTS, or DTBs, and it's flexible in the sense that those are what we currently support, but we could add more, and we actually plan to add more input and output format types in the future. So it is flexible enough to integrate into different types of workflows, which we'll see in our demo, and it is completely data driven. Lopper itself does not understand what a system device tree is; it doesn't understand what you may or may not want in a device tree. The details of what it needs to do are inputs to the tool, and it executes what's asked; that is where the actual intelligence or the semantics live. Just some quick details about Lopper itself: it is open source, BSD-3 licensed. I have links to where you can find it on GitHub, it's part of the devicetree.org GitHub, and it is released as a PyPI package as well, so you can install it quite quickly and do a test run.
As I mentioned, it's written in Python, it has pluggable back ends to manipulate DTBs, and it also has the ability to add additional logic components in the future. As the data inputs I mentioned, it supports these unit operations, if you will, which are simple operations that say rename a property, delete a node, add a new node. Those are called lops, and we can feed them in as an input and they will be applied to the tree. Or we can implement complex Python functions as assist modules, which is what will be part of the demo today. And of course, depending on what you're doing, you can use one, both, or none, whatever you need. One of the things that it always does is a level of validation and consistency checking on the output tree, so if you have an invalid phandle or things like that, it is always running in a mode that will do some level of consistency checking, and we plan to add more checking from a system point of view in the future. So, just to go in a little bit closer on the framework part of it: we kind of use it now as a common base for tooling that needs to query, inquire about, or manipulate device trees. As I mentioned, you can do a lot of this with sed and awk and whatever your favorite tool is, but it's very one-off at that point, and it's hard to extend to some of the more complex logic that we envision and will even show in the demo. Built into Lopper itself, it can do these core tree manipulations: node merging, merging whole trees, and it understands what the DTS syntax should look like, what a property is, these sorts of things, but it doesn't understand how they really interrelate; that's done in the assists. It can handle DTS, DTB, and YAML, and it parses with either libfdt or dtlib from Zephyr.
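To make the idea of lops concrete, here is a minimal sketch, in plain Python, of unit operations applied in order to a tree. This models the concept only; it is not Lopper's actual API, and the node and property names are hypothetical.

```python
# Illustrative sketch: lops as simple unit operations (rename a property,
# delete a node, add a node) applied in order to a tree. The tree here is a
# plain nested dict, not Lopper's internal representation.

def lop_rename_property(tree, node, old, new):
    """Rename a property on a node, if present."""
    props = tree[node]["props"]
    if old in props:
        props[new] = props.pop(old)

def lop_delete_node(tree, node):
    """Remove a node from the tree entirely."""
    tree.pop(node, None)

def lop_add_node(tree, node, props=None):
    """Add a new node with optional properties."""
    tree[node] = {"props": dict(props or {})}

# A "lop file" is then conceptually just an ordered list of operations.
tree = {"timer@ff110000": {"props": {"extracted,path": "/axi/timer@ff110000"}}}
lops = [
    (lop_rename_property, ("timer@ff110000", "extracted,path", "xen,path")),
    (lop_add_node, ("passthrough", {"#address-cells": 2})),
]
for op, args in lops:
    op(tree, *args)
```

The point is that each lop is small and mechanical; the sequencing and consistency checking are what the tool provides.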
It manages these assists and lop files to make sure they're applied in order and to make sure they do sane things to the tree; this is the consistency checking that I mentioned. And there's a growing set of manipulation library routines that we can use in the lop files or in the assists. These are part of the core of Lopper: they're always there, they're always available. So there is the framework portion, which means there are optional plugins, or assists if you will, and this is where they provide logic, semantics, and sort of context awareness of what we want Lopper to do. This is where we have things like front ends for the domains and partitioning that Stefano was mentioning, in that SoC picture where you might need to do some partitioning. It can do YAML expansion on a front end as a description of the domains. It has a bunch of back ends that are built in, which you can find in the tree right now: device tree pruning, some example RTOS and bare metal back end generation assists. It has a security and partitioning back end for subsystems that can generate some firewall information; that one is in progress, it's just started. And there are any number of them; you can find them in the tree, I won't bother going through them all. Another back end assist does the more detailed device-level tree analysis and modification, which is what we're going to be showing in the demo. And there is a little REST API that we plan to use in the future to support other tooling interacting with the information that Lopper reads, pulls, and can represent from the system device tree.
Visually, these are the components that I just mentioned. This is what it looks like in a normal flow, where we start from the left, where we take inputs, which are all standards-based inputs, nothing that we have invented: normal files, DTS files, lop files. That would be a system device tree, a board device tree, and overlays. They're read in through these front ends, they go through an importer, and they become part of Lopper core. At that point, we apply assists or lops or translations to that internal representation of all of that information. And then on the back end, we go through an exporter, which means we can write out YAML, DTS, DTB, or something custom, which is what we use to write out device drivers for bare metal configuration, or Yocto build information that a specific device might need. And the path through all of this depends on how you run Lopper on the command line or how you wrap it. And now we're back to Stefano to talk about Xen partial device trees.

So, I would like to discuss the Xen partial device tree a little bit, because it introduces a solution to a problem that is actually very widespread in these static partitioning configurations, a problem that goes way beyond Xen, beyond any single hypervisor, and also beyond hypervisors in general, because you will have the same problem with multiple domains running on different CPU clusters. The problem is how you configure the devices that you want to be assigned to each of these domains, with the idea of a domain being something that includes software, like Linux, the CPU cluster it is running on, like the Cortex-As, and then the devices it is accessing, such as an Ethernet or a timer. That means you need to be able to specify individually which devices are assigned to each of these domains.
And this domain, as I mentioned, could be, say, Zephyr on the Cortex-Rs driving something in programmable logic, or it could be a Xen VM on the Cortex-As. And that's where we get into the Xen specific part. So I'll show you how Xen handles this problem today, with all the Xen specific details, and then I will abstract out to discuss the more generic problem and how Lopper can help solve it. Today, what Xen does is use what we call a partial device tree. It's something that predates overlays and is kind of similar in intent to device tree overlays. You put the description of the device or devices that are accessible from the guest into their own special device tree with a top-level node called "passthrough". And like you see on the slide on the right, everything under that top-level passthrough node, everything underneath it, will be copied into the guest device tree. It's done for two reasons. The primary reason is the description, so that you can describe for the guest what's available. You want to expose a timer to your guest: you put it in there, it gets copied into the guest device tree, so the guest at boot can read it. It's not just that, though; there are also a couple of special Xen specific properties for configuring device assignment. Like you see in the example, there are xen,force-assign-without-iommu, xen,reg, and xen,path, the last three that you see under the timer node. xen,reg is maybe the most important one, and it specifies the memory mapping, the address mappings for the device. The mapping can be one to one, like in the example you see on the screen, but you can also map a memory region of a device to a different location inside your guest if you want to. And xen,path is used for IOMMU configuration and is a link to the corresponding node in the host device tree.
xen,force-assign-without-iommu is for the cases where an IOMMU is actually not needed, for instance because the device is not a DMA-mastering device; in this case, the timer is not a DMA-mastering device, so there is no need to configure the IOMMU. So how do you go about generating something like that? Up until now it was done entirely by hand: somebody had to go and take the timer node in the host device tree, copy-paste it into this new device tree under the passthrough node, and start making edits, like removing things that are not wanted and adding the Xen properties. The problem is that it's a bit difficult. So next slide, Bruce. First of all, the Xen specific properties are not few, and second of all, the other modifications are even harder. This is the full list of Xen specific modifications that need to be done. Like I said, xen,reg, the first one, is for the memory mappings; xen,path points to the corresponding host node; xen,force-assign-without-iommu is for when IOMMU configuration is not necessary. interrupt-parent also needs to be changed to 0xfde8, this magic number, and that is because interrupt-parent points to the interrupt controller, and if you copy-pasted the node from the host device tree, you would get a link to the host interrupt controller, which is not the one that's going to be available in the guest. The guest is going to get a virtual interrupt controller node, newly generated by Xen, so in order to point to the right interrupt controller, you need to change interrupt-parent to be 0xfde8. iommus properties, if any, need to be removed. That's because when an IOMMU is available, Xen is going to use it, and typically the guest cannot use it, so you need to hide the IOMMU from the guest. And the last detail is that in the host device tree, not in the guest partial device tree, you need to add xen,passthrough under each of the nodes for the devices that you want to assign.
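As a rough illustration of how mechanical these edits are, here is a hedged sketch in Python that applies them to a device node copied out of a host tree. The property names (xen,reg, xen,path, interrupt-parent = 0xfde8, dropping iommus) follow the talk; the node representation, the 1:1 address mapping, and the helper itself are assumptions for illustration, not Xen or Lopper code.

```python
# Sketch of the "mechanical" Xen edits applied to a copied device node.
# The node is modeled as a plain dict of properties.

GUEST_VIRTUAL_GIC_PHANDLE = 0xFDE8  # magic phandle for Xen's virtual interrupt controller

def xenify(node, host_path):
    guest = dict(node)
    # xen,reg can be derived from reg; here a 1:1 mapping (guest addr == host addr).
    if "reg" in guest:
        addr, size = guest["reg"]
        guest["xen,reg"] = (addr, size, addr)
    # Point at the guest's virtual interrupt controller, not the host's.
    if "interrupt-parent" in guest:
        guest["interrupt-parent"] = GUEST_VIRTUAL_GIC_PHANDLE
    if "iommus" in guest:
        # Hide the IOMMU: Xen uses it, the guest must not.
        del guest["iommus"]
        guest["xen,path"] = host_path  # link back to the host node for IOMMU setup
    else:
        # Not a DMA-mastering device: no IOMMU configuration needed.
        guest["xen,force-assign-without-iommu"] = True
    return guest

# Hypothetical timer node copied from a host device tree.
timer = {"compatible": "cdns,ttc", "reg": (0xFF110000, 0x1000), "interrupt-parent": 4}
guest_timer = xenify(timer, "/axi/timer@ff110000")
```

These per-property edits are exactly the part that, as noted below, could even be done with a shell script; the hard part comes with the external dependencies.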
The reason is that, if you don't do that, dom0, the first guest, will always get all devices by default, for simplicity of configuration. So if you want to instead give one of the devices to another guest, first you need to tell Xen not to give it straight away to dom0, otherwise it's going to already be in dom0 and you cannot take it away. And to do that you just add xen,passthrough under the host device tree node. So if you look at all of these changes, they're not trivial, but they're also not extremely difficult. All of them are kind of straightforward: xen,reg can be automatically calculated from the reg property; the change to interrupt-parent is static; removing the IOMMU is something you need to do every time; adding xen,passthrough is something you do every time. So it's easily scriptable. Like Bruce was mentioning earlier, it wouldn't be that hard to write a script, even a bash script using sed, that makes these changes. The real problem is the other changes required. If you start from the host device tree node, there are a few other things that need to be done, because often there are pointers to external dependencies such as clocks, power domains, and reset lines. How do you deal with those? They definitely have an effect on the driver behavior; they cannot simply be ignored. So this is a difficult problem. And it is a generic problem, because it's not Xen specific at all; it will happen with any hypervisor. Beyond hypervisors, it will happen even with heterogeneous clusters: if you have Zephyr running on a separate cluster wanting to drive a device, how are you going to configure Zephyr for that? The clock dependencies especially are a really difficult problem to solve. So next slide please, Bruce. This is a more realistic example compared to the one I showed before; the one before was a timer, which is very, very simple, maybe trivial even.
This is an MMC controller, and it more realistically reflects how things actually are. As you can see, beyond the MMC controller, a whole bunch of clocks as well as a clock controller had to be added, all under the passthrough node, so that they can become part of the guest device tree. The reality is that pin control lines, reset lines, and power domains can be removed with a bit of a loss of functionality, but the core functionality will still work. It's different with clocks: if you remove the clocks properties from the node, often the driver will refuse to continue, will refuse to operate the device completely, so the device becomes non-working. So this is where Lopper comes in to help. This is not something that can be automatically scripted using bash and sed; it requires chasing down the dependencies of clocks and the clock hierarchy. It's not something that can easily be done by hand either; it takes several tries, trial and error, to produce something that works. Figuring out which clocks you need to import, which clock controllers, is actually quite difficult. So this is one problem where Lopper and system device tree can greatly help. Next slide.

Right, so we're back to me. We set about trying to come up with a solution using Lopper, because as Stefano mentioned it's error prone, it's trial and error, you have to know the details, and then what happens when a path changes in the host device tree and you have to redo all of this? So there's maintenance and a lot of issues around a one-off solution. The answer of how to do it with Lopper, given the information I presented about the framework earlier, is to leverage the core Lopper functionality and assists: obviously we want to take the input host device tree and bring it into that internal representation.
We need to be able to look at the devices, the clocks, the Ethernet, and their dependencies and phandles, in order to get an understanding of what needs to be considered. We do tree manipulations: change that interrupt-parent to the magic number, drop properties, rename properties, create nodes, that sort of thing, and then create an output DTS that is specific to the passed-through device. And also, as was mentioned, we need a modification to the host device tree, so we can ask Lopper to not only generate new information but to modify the input host device tree, in order to make sure the information is consistent between the two. And I'm going to show that it's quite simple to then create yet another assist to glue the different bits together into the image, if you will, the bootable information that we need. That's actually wrapping something that Stefano has as part of Xen called ImageBuilder; there are some scripts there that will generate what U-Boot and different parts of the system need. So it's quite trivial to take the newly written host device tree and the partial device trees, and just wrap ImageBuilder in this demo to recreate what we need to boot, avoiding any need to run things manually for the testing. But we did set a couple of boundaries; again, if we're going to make something very one-off, we might as well try to do it with bash and a sed script. So there's no hard coded Xen knowledge, in the sense of the things that Stefano was talking about, the xen,reg, or when an IOMMU is or isn't required; we're reading in inputs. What actually happens in the extraction of the device tree is not keyed on the name of a node or anything specific: it looks for properties that are present and uses that to trigger different bits of information into the extracted device tree.
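The dependency chasing just described can be sketched as a transitive walk over phandle references: starting from one target node, follow the reference properties (clocks, resets, and so on) until nothing new turns up. This is an illustrative sketch only; the node layout, the property names followed, and the phandle encoding are hypothetical, not Lopper's implementation.

```python
# Sketch of phandle dependency chasing: from a target node, transitively
# follow reference properties so everything the driver needs is collected
# for the extracted tree.

def extract_with_deps(nodes, target, ref_props=("clocks", "resets", "power-domains")):
    """nodes: {phandle: {"name": ..., <ref prop>: [phandles...]}}.
    Returns the set of phandles that must be copied alongside the target."""
    needed, todo = set(), [target]
    while todo:
        ph = todo.pop()
        if ph in needed:
            continue
        needed.add(ph)
        for prop in ref_props:
            # each reference property holds a list of phandles to other nodes
            todo.extend(nodes[ph].get(prop, []))
    return needed

# Tiny hypothetical example: an MMC node referencing two clocks, one of which
# comes from a clock controller that itself has an input clock.
nodes = {
    1: {"name": "mmc@ff160000", "clocks": [2, 3]},
    2: {"name": "clk200", "clocks": [4]},
    3: {"name": "clk_ahb", "clocks": [4]},
    4: {"name": "clock-controller"},
    5: {"name": "uart@ff000000", "clocks": [3]},  # unrelated device, not pulled in
}
deps = extract_with_deps(nodes, target=1)
```

Even in this toy form you can see why doing it by hand is trial and error: the MMC node quietly drags in three more nodes, while the unrelated UART must stay out.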
I also wanted to split the functionality into the generic problem and reusable components. So there's a generic extraction routine, and then something that turns its output into what Xen is looking for. And of course, we use and create more Lopper library routines where possible; for example, Lopper has a way to say "look at a node and return a list of all nodes that are referenced from this node". Calling those types of library routines from the assists means only a small part is specific to this solution. We want it to be data driven and have command line options for flexibility, because those can always be wrapped later on with some of the domain YAML files or other front ends. So you'll see in the demo that excluding and including properties are done with command line options to the assists. We don't hard code that pin control should never be processed: you can exclude it, or you can exclude something else if you want. And it should work for any device that is a valid passthrough device. We're running it on two devices, and we'll test it on more as time passes, but there's no knowledge specifically tied to the devices that we're passing through. What that looks like in the implementation itself is a pipeline of assists. Lopper can run one assist or many assists, and they can pass information from one assist to another, whether they stuff it into the system device tree or modify the main device tree; there are different ways they can communicate. So in this case we have a pipeline of assists that manipulate the main host device tree. There's one called "extract", which is very imaginatively named, if you will: it knows how to generate a partial or extracted device tree starting from a target node.
It does not name it as the passthrough node that Xen is looking for; it calls it an extracted node. It annotates what it's done: it pulls in the dependencies and annotates them with an "extracted,path" property, again, not exactly what Xen wants, something a little bit more generic. After it runs, we then have a plugin called extract-xen, which gets the main device tree and the extracted tree, and it can then use the Lopper routines to rename the nodes, rename extracted,path to xen,path, add the IOMMU passthrough properties and the Xen specific things; it takes that extracted device tree and makes it specific to what Xen needs. As part of its execution, it can write a DTS file, but in this case we're actually asking it to write directly to a DTB, because that's what we need to boot, and it puts that in the proper location. And then the third assist in this pipeline is the one that wraps that Xen ImageBuilder utility I was mentioning: it takes the partial device trees, the main host device tree, and a configuration file that sizes things and tells it what we need. It takes all of those in, wrapped by Lopper and Python if you will, and then it writes out exactly what we need to do the booting. I think I covered this in detail, but just to give you a flavor: each one of the assists actually has its own ability to take command line arguments. So one of the things is that we tell it, with -t, the target node; it doesn't know that you want a serial or Ethernet device, it just knows that there is a target node, which is where it's going to start in the device tree. We have a -i which says: if you are walking through a device tree path and you see this node, please always bring it into the extracted device tree. That's how we pick up the zynqmp-firmware node in our example, but not the root node and all the rest of the nodes, so we have a way to conditionally include nodes.
And then we have a -x, which excludes nodes or properties matching a regex, and this is how we make sure pin control and some of the other power domains and things that we don't want in the extracted tree never come over in this generic step. extract-xen has fewer options, because it is much more specific: we simply tell it the target node, and then it finds the extracted tree and triggers everything based on that information. And finally my simple ImageBuilder wrapper assist, which is a little bit rudimentary at this point, but it just knows whether we want U-Boot and where to look, and we give it a directory, which is where the config file and things like that are, so it wraps ImageBuilder and generates what we need. These are the three plugins that will be running in the demo to implement this logic. So at this point I will stop the share and share my terminal, so we can have a look at how all of this works. Okay, we should see my terminal now, hopefully; Stefano will stop me if we don't. So what I can show here is, if we run Lopper; I'm running it out of a Git clone that I have, because everybody has a Lopper Git clone available in their favorite location, why not. You can see Lopper is available, cloned off devicetree.org. One thing that we're going to do is take the main mpsoc.dts, which is our host device tree, and write it out into this TFTP root directory as mpsoc.dtb; that's just using the Lopper core library to read a device tree and write a device tree. And we can show that there is nothing Xen related in this mpsoc device tree; it doesn't know anything about Xen. And then, manually, to make sure we boot properly, I'm going to run the ImageBuilder script that Lopper will run automatically later; I've just rewritten the boot information that is required in our demo to boot the basic tree.
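The -t / -i / -x behaviour just described can be sketched as a simple path filter: keep the target subtree, always bring in the -i matches, and drop anything hit by a -x regex. The option names mirror the talk, but this little filter is an illustrative assumption, not the extract assist's real implementation.

```python
# Sketch of target / include / exclude node selection for an extraction step.
import re

def select_nodes(paths, target, includes=(), excludes=()):
    """paths: flat list of node paths. Keep the target subtree plus any
    always-include matches, unless an exclude regex rejects the path."""
    ex = [re.compile(x) for x in excludes]
    keep = []
    for path in paths:
        if any(r.search(path) for r in ex):
            continue  # -x: excluded nodes never come over
        if path.startswith(target) or any(inc in path for inc in includes):
            keep.append(path)  # target subtree, or a -i "always include" node
    return keep

# Hypothetical node paths, loosely modeled on the demo.
paths = [
    "/timer@ff110000",
    "/firmware/zynqmp-firmware",
    "/pinctrl@ff180000",
    "/interrupt-controller@f9010000",
    "/ethernet@ff0e0000",
]
kept = select_nodes(paths,
                    target="/timer@ff110000",
                    includes=("zynqmp-firmware",),
                    excludes=(r"pinctrl", r"interrupt-controller"))
```

Only the timer subtree and the always-included firmware node survive; the interrupt controller and pin control never come over, and the unrelated Ethernet stays behind.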
The way this works is that we have two QEMUs running: one is MicroBlaze, which is required for boot, and then we boot the main Cortex-A cluster. So I started the MicroBlaze, and we switch to another terminal, which is now going to run this arm64 main host boot, if you will. This is using the information that was just written by ImageBuilder: the U-Boot image, the sizes, the configuration, what we need to boot. So what we're showing here is the initial boot. It boots us to dom0, and that dom0 will have all of the devices; as Stefano mentioned, by default they are given to dom0. It has all of the devices, and we will show that when we log in: we have loopback, and we have this Ethernet device, which is exactly the device that we will be attempting to give to the, sorry, the domU. When we switch our serial input, we're now on dom1, which is Zephyr. It's quiet; it doesn't have the timer, which would be giving it a timer tick if the timer were successfully passed through to it. And when we switch to dom2, which is the domU, you can see that it has no Ethernet. So there's our basic system. But now, in the workflow that we're describing, and I've just terminated the QEMU session, somebody says: I want to give a timer to Zephyr, and then I want to give the Ethernet device to the domU. So we run again: we'll do two executions of Lopper. In the first one, Lopper is reading in that mpsoc device tree that we wrote and writing a new one called mpsoc-boot. It is chaining to the extract assist, telling it that the target is the timer node, and telling it to always include zynqmp-firmware and to exclude the interrupt controller, pinctrl and pinctrl-names, power-domains, and current-speed.
And again, these are things that we're leaving on the command line now to show that they're available, but this is something that you would either wrap, or that we could configure in the main YAML file or some other representation. It then chains to extract-xen, which all we really have to tell is that the timer node is the target, and we're asking it to directly write the partial device tree for the timer. The run is that fast; it's that simple. You can see that the main plugin reported that it was dropping the excluded properties, that it found the interrupt-parent and updated it, and that it updated the system device tree, which in this case means the main host device tree, with the xen,passthrough property, and the extracted tree was copied and extended for Xen. So, if we do a quick diff of the main host mpsoc.dts against the mpsoc-boot.dts that we asked it to write, you can see the only difference is that the timer node now has this required property, xen,passthrough, to indicate that dom0 should not get that node by default. We're going to do a second run, and this is the one that completes the process in this example. This run is exactly the same thing, except this time we're reading the mpsoc-boot.dts that we just wrote, because we already have one device that's been passed through, and we're going to directly write the mpsoc.dtb required for booting. We're running extract again, this time telling it to look at the Ethernet device, with the same includes and the same excludes, and we're running extract-xen, telling it that it's working on the Ethernet; it's writing the Ethernet DTB directly, we don't have to bounce through DTS, we can write the DTB directly. In this second run, we're also using that ImageBuilder assist I talked about to combine the mpsoc DTB and the serial and Ethernet partial device trees into something that will be bootable. So again, you can see it ran the same way. In this case we found the passthrough properties, and we found the SMMU.
We triggered different functionality, and then the ImageBuilder plugin generated the U-Boot information. So at this point, we haven't interacted by hand at all with the various inputs to generate those device trees. So if we start the MicroBlaze again and then attempt to boot our main image: if everything worked properly, ImageBuilder took the DTBs it created, the U-Boot boot source, the boot information, what it needed for the devices, and it was all written in the right format. So we're booting. Hopefully this finishes soon, because we are running a little bit over on our time. But when we log in this time, eventually, and we look at the devices, we have a serial and an Ethernet in dom0. And you can see Zephyr has already popped up when we switch to dom1, which is Zephyr; don't worry about the serial spewing things. It is now getting a timer tick, because we're getting its message. And when we switch to dom2, and we log in and have a look, we have the Ethernet device that was passed through. There was no human interaction whatsoever: everything that Stefano talked about was automatically triggered, generated, and constructed into a bootable image. And as part of my wrap up, I just wanted to let everybody know about the components of what I was talking about, Lopper and some use cases. There is a Yocto integration: we have a meta-virtualization layer, and we plan on doing more integration with some heterogeneous Yocto builds in the future. There are a series of features and different things that we're going to do going forward, so if you find this interesting or you have ideas, please reach out and let us know. Yeah, feel free to ask questions in the chat; we'll be online. Thank you all for listening.