All right, I'm going to go ahead and get started. So today we're going to talk about a problem that we're calling multi-build. If anybody here is an MCUboot developer, this is distinct from the multi-image feature that MCUboot talks about, so maybe these are not the droids you're looking for. We're going to do a little disclaimer, talk about what the problem is, and talk about one approach, which has been put into the PR that you can see up there, 13672. And then I want to try to get to discussion as quickly as I can, though I don't know how much time we're going to have. This is something that we've been discussing within Zephyr for quite some time, so I'm assuming there's going to be a huge range of experience with this problem in the room, from "what is this problem?" to "we were just arguing about it for the past couple of hours right over there." I'll do my best to assume as little as possible and then hand it off.

The disclaimer is that Nordic has a dog in this fight. We have a very particular solution which we're running in our downstream. We've been running it for quite some time, and it does work. However, this is an upstream-focused presentation, and I'm trying to be an impartial describer of what's going on. But everybody knows who pays my bills, so that's out there.

What's the problem? There are an increasing number of use cases that we've been running into throughout the community, really, not just restricted to my company, where we want to build multiple Zephyr binaries that are all interdependent in some sense. The sense can be very different depending on the use case, but they are all targeting one SoC. The way we can do this now is to create a separate build system for every one of these binaries, then use whatever tools we choose to validate or pass the interdependencies between them, and then get them running on our boards. There are some problems with this, which I will go into, and we have a proposed solution. That's what we're here to discuss.

The first of these is usability. What you can see listed is that you've got multiple build and flash steps. But it's a bit more than that. There are certain cases, like how much space am I allocating to my bootloader and how much is left over for the application, where it would be really nice to do it in a more integrated way: if I increase the amount of space allocated to my bootloader, the amount of space allocated to the application decreases by the same amount. Or in other contexts, like in a trusted firmware image, you have a partition of the peripherals between your secure and your non-secure world. If I use this SPI in my secure world, I cannot use it in the non-secure world. It would be great if we could produce a single system that knows that and can check it for the user. So it's not just that there are more steps; it's also that the interdependency leaves more work in the hands of the application developer, unless we have more framework-level support for this somewhere in Zephyr.

There are also some performance concerns. The main issue is that a build DAG has these serialized steps at the beginning and the end, and in between it's using all your cores to build the individual C files.
But while it's doing linking, that's inherently serialized, and generating header files like autoconf.h, for example, has to finish before you can really build any C files. So there are some steps where, if you're evaluating the DAGs for these build systems separately, you're leaving some performance on the table, especially at the beginning and the end. And if you're building them in parallel, allocating some threads to one application binary and another set of threads to another, well, that limits you from doing some cross-checking in some ways, and it's also unclear how to map threads to builds. You're also repeating some configure-time work.

But the main thing is what we already started getting into: the dependency management. Everybody's got their own application build directory, so you can't share something like a shared-memory definition between, say, a minion core and the application. If you've got a single SoC with a Bluetooth controller on it that's a separate core from the application core, you're probably going to need some shared memory area to swap your buffers around, and it would be nice to be able to say in one place, with a single point of truth, where the boundaries of this memory area are. There's another case that comes up with TrustZone, which is not just the partitioning of the peripherals; it's also that you're going to have a system-call interface between the non-secure world and the secure world. The way that works is that the secure build is going to generate a .a that the build system for the non-secure world must link in. So it gives you some function signatures and this .a that you can link into your non-secure image, which lets you call into the secure world. And if you're going to do this with multiple build directories, you have to somehow let the other one know, as an input: here's my previous output archive. And then there's the issue we've already talked about, enforcing compatibility, which comes up in the bootloader use case, for example.

OK, before going on, I wanted to talk about example use cases, and hit my backup slides, because this is something we had a question about an hour ago. Bootloader chains are one: MCUboot is a Zephyr application, and it's going to boot the main Zephyr application. It needs to know where it is, it needs to know how to upgrade it, all this other stuff. Longer chains are the same thing, so multi-stage boot, not just a two-stage boot. The minion cores we've already talked about; these are inherently AMP SoCs, where it's not just an SMP core, and everybody here understands that. Another one you might run into is a really fast, sleepy application core that wants to offload a little bit of work to a lower-power core that might be directly polling a sensor or something like that, and the application core, for power-management reasons, wants to go back to bed as soon as it can. And then TrustZone, which we've already talked about. So these are the main use cases that we'd like to try to address, at least at first, and this is where we're restricting our discussion, because this is a very broad problem space. So I'm going to rewind.

All right. Now, those of you who are familiar with CMake may say: great, it has this ExternalProject feature, why don't we just use that?
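At face value, that would look roughly like the bare-bones sketch below. This is only an illustration of the idea, not the actual upstream sample; the paths, board names, and target names here are placeholders.

```cmake
# Hypothetical top-level CMakeLists.txt that drives two Zephyr builds as
# external projects. Each sub-build gets its own build directory and its
# own, private dependency graph.
cmake_minimum_required(VERSION 3.13)
project(multi_build_sketch NONE)

include(ExternalProject)

ExternalProject_Add(remote_build
  SOURCE_DIR  ${CMAKE_CURRENT_LIST_DIR}/remote   # e.g. the minion-core app
  CMAKE_ARGS  -DBOARD=some_board_remote_core     # placeholder board name
  INSTALL_COMMAND ""                              # nothing to install
  BUILD_ALWAYS ON   # brute-force workaround: always re-run the sub-build,
                    # because the outer graph cannot see its file dependencies
)

ExternalProject_Add(app_build
  SOURCE_DIR  ${CMAKE_CURRENT_LIST_DIR}/app
  CMAKE_ARGS  -DBOARD=some_board_app_core
  INSTALL_COMMAND ""
  DEPENDS remote_build   # ordering only -- not file-level dependencies
)
```

Note that the DEPENDS here only sequences the two sub-builds; it does not tell the outer build that both of them consume, say, the same kernel sources, which is exactly the problem described next.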
And there is an OpenAMP sample upstream which does exactly that. There is a problem with it, though, which is that the dependencies between the two builds are not propagated. If I have the main application, and then my little OpenAMP minion core over here, what happens is that if you update a file which both of those applications rely on, say some file in the kernel, and then build the top-level build system again, it doesn't know about the dependencies down in the sub-build, and so it doesn't rebuild it. So it doesn't really work properly, right? At least as far as we can tell, it's not a good out-of-the-box solution for this problem, because we want to manage these dependencies. And Sebastian says no; he knows way more about CMake than I ever will. It's working as designed, basically, but it doesn't work the way we want. That's the TL;DR on that.

So this is the approach that we've got right now. And I'm just going to take a quick time check, because I do want to try to leave some room for discussion, although I expect we're going to be talking about this all week.

OK. CMake has this feature called add_subdirectory: you can say, execute the script files in this subdirectory and include them in my current build system. So right now, what we have is a way to basically call add_subdirectory on the top-level application directory of another Zephyr application. You've got one application, which has a top-level CMakeLists, and it's going to, via the mechanisms you're probably familiar with if you're a Zephyr developer, call into the main Zephyr build system. And that's going to add_subdirectory in the kernel, all the drivers, all the other subsystems. One of the new things you can do, though, is say: I'm building all of this for my main image, my app image, but I also want you to add a new Zephyr image. What that's going to do is clear out a lot of the global variable namespace that we rely on within Zephyr. For example, all our Kconfig variables that are the result of merging our Kconfig fragments become global CMake variables in the Zephyr build system, and we use that throughout to say, add this C file to the build if CONFIG_FEATURE is set. That obviously won't work if we're trying to put all of this into one big CMake build system, which is what we're trying to do. So we have this new magic thing called an image. Whenever you switch images, you can then call add_subdirectory on your new application's top-level CMakeLists, and it will add it, reset all your Kconfig variables, and do a bunch of other stuff.

Now, there are some drawbacks to this, because not everything in CMake can be namespaced. It has this long legacy of backwards compatibility, and that introduces some constraints on what you can namespace. In particular, there is a global target namespace. So I cannot call something drivers-USB-whatever-vendor and have that be the top-level target that builds that library for one particular application, and then reuse that target name for another application that may want to configure USB differently. That's a problem. What we have chosen to do is prefix the names of targets with the name of the image that we're currently building. And the same thing is true for certain global variables and the cache: there are certain CMake namespaces that we are prefixing with the name of the image. So this is not a small change.
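To make the shape of that prefixing idea concrete before going on, here is a very rough sketch. This is not the PR's exact API; the variable names, paths, and prefix convention are invented here purely for illustration.

```cmake
# Hypothetical illustration of image-prefixed targets: the same driver
# CMake code is entered once per image, and the library target name is
# built from the current image prefix, so the two configurations can
# coexist in one global target namespace.

# Top level: pull in two applications as sub-images.
set(IMAGE mcuboot_)                                        # bootloader image
add_subdirectory(${MCUBOOT_APP_DIR} ${CMAKE_BINARY_DIR}/mcuboot)

set(IMAGE app_)                                            # main application image
add_subdirectory(${MAIN_APP_DIR} ${CMAKE_BINARY_DIR}/app)

# Inside a driver-level CMakeLists.txt, the target name then goes through
# the prefix instead of being hard-coded:
add_library(${IMAGE}drivers__serial STATIC uart_driver.c)
```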
It's a pretty big change that affects many, many files throughout the build system. One thing we've been trying to do with this PR since the first time it was proposed upstream is to move as many global variables as we could into target properties of a new target that we indirect through a variable; that's the ZEPHYR_TARGET item on the next-to-last second-level bullet. So there's this variable, ZEPHYR_TARGET, and it expands to something like image-zephyr. You say: set a property on the current Zephyr target. And you, as the author of a driver-level CMake file, don't need to know which image you're building for; it's just a property that's global to the image. By moving as many things as possible over to properties, we're trying to decrease the scope of this. But there are certain things, like targets, that we just can't change, because that's a CMake thing. So that's a twenty-second overview of what we have proposed. I didn't write the CMake parts of this; all I did was make it work with West and tweak it a little. So I'm looking at this as independently as I can, and I'm trying to make lists of pros and cons that I think there is broad consensus on. And of course, again, this is heavily asterisked: if you disagree that there is consensus on anything in this list, yell at me, and I will be wrong.

One of the pros is that this exists and it works. We've shipped this downstream; it has been part of official releases. It works with Segger Embedded Studio, which is the IDE that we use. So one major advantage is that within a single CMake build system, which you can point an IDE at, because IDEs understand CMake, you can get an overview of your entire build. It is easy to use from the perspective of building and flashing, because you can use the same tools, whether you're using CMake and Ninja directly or west build and west flash: you do it once, and it builds all of the applications and knows how to flash them all together. It does solve the performance problems: it's one big build system, which means there's only one job server, and since the dependencies are tracked properly, different linker steps can be performed in parallel. You get the fine-grained target and dependency sharing, and configure-time work is only done once, by definition, because there's only one build system. That's what CMake does; that's its purpose.

There are some cons. One is that in its current form it doesn't allow multiple toolchains, so if you're targeting two different architectures, for example, that's an issue. We have done some early prototyping which at least convinced us internally that this can be overcome with another layer of indirection, like every problem: you point your toolchain at a script that then knows some image-specific context for which toolchain to call out to. So that's solvable. The other two are that it is complex and that it won't go away. You will need to know about this if you develop at least driver- or kernel-level Zephyr code, and that will cost some maintenance, because you might forget to put the image there, or you might use the old target name without the image prefix. And then it's going to work when you're only building one application, but subtly break when you want to build that same application as a sub-image.

There are some differences of opinion, where some people believe that certain things are pros and are true, and other people do not believe that those things are true at all, and likewise for the cons.
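Before getting into those differences of opinion, here is a minimal sketch of the property indirection mentioned above. The variable name, the per-image target, and the property name are illustrative assumptions; the PR's actual names may differ.

```cmake
# Hypothetical illustration: per-image global state lives in properties on
# a per-image target, and driver/kernel CMake code reaches it through a
# variable, so it never hard-codes which image it belongs to.

set(IMAGE app_)
add_custom_target(${IMAGE}zephyr)            # per-image "property holder"
set(ZEPHYR_TARGET ${IMAGE}zephyr)            # expands to app_zephyr here

# Driver-level code, written once, image-agnostic:
set_property(TARGET ${ZEPHYR_TARGET} APPEND PROPERTY
  ZEPHYR_LIBS drivers__serial)

get_property(libs TARGET ${ZEPHYR_TARGET} PROPERTY ZEPHYR_LIBS)
message(STATUS "libraries registered for ${ZEPHYR_TARGET}: ${libs}")
```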
Some people think that a non-recursive build simply is best practice, and we should always do it. I think the main opposition to that is that it's too complicated or intrusive, and there seem to be different perspectives on that. You can kind of read it on the slide. People on the left, I guess I would sum it up as, feel that it's worth it: given that this works, given that it exists, given that it's been deployed in production for a long time, the complexity costs it introduces are worth it. On the other side, people are saying, no, we should do this elsewhere; CMake is not the right place to be doing this. And again, this is just my read on the discussion, so please, during the discussion, tell me what I got wrong.

OK, so let's get to it. I do have some ideas, but I kind of just want to let people talk. We'll skip to the last one, which is, obviously, that trying to guess what your users are doing is very hard. It's been really nice, actually, to have Rick here telling us as a user how he feels about it, and I would really encourage any Zephyr users that haven't had a chance to give their opinion to please, please do so, so that we're not just guessing. But yeah: is this really better? Do we want to keep doing it the same way? Is recursive CMake a viable alternative? Do we disagree vehemently enough that we're never going to be able to upstream this? Those are some ideas for discussion, but otherwise, if anybody wants the mic, or we can just talk. Now you're shy? OK, that's fun. Anybody have any questions?

Is this thing on? Test, test. OK, that sounds like it's working, right? So, I don't know how recursive versus non-recursive differs between CMake and Make. There's the famous paper, "Recursive Make Considered Harmful," and it talks about a lot of problems with doing recursive make, mostly about dependencies not being tracked between the invocations, so your builds don't work. And it seems like you've got that same notion here, that stuff that's shared between these would not be shared. I guess the question is, what are the... I didn't see it spelled out; this is the first I've seen recursive make on the slides as an alternative. Are you talking about just a top-level CMake file that invokes CMake several times, kind of thing? Separate build directories with scripts, that kind of thing? I mean, at one level, you could always, well, think of how Yocto does something like this, where we invoke CMake once this way, and then we invoke it and do it that way, and there's some external tool that then combines all of this together and just builds it. And I don't know if I have enough opinions about what's better or not on that. I don't know if I have a question, really.

So the core of the problem of recursive make being harmful is that build systems work by having a graph tracking all dependencies. But if you don't have one graph but several, and there actually do exist dependencies between the graphs, then the solutions for pretending that you have one graph don't scale very well. And yeah, we're not really sure how to...

Yeah, I can't comment specifically about the... Yeah, I think, well, Zephyr used to use nearly a one-to-one copy of the Linux kernel's build system, KBuild, and I ported that to CMake together with Anas and several people.
And at the top of that KBuild build system, it had this comment about how, because we do recursive make, you have to be very careful about the order in which you include sub-modules in the Linux kernel. So I'm not sure I agree that it works fine in the Linux kernel, but I guess someone else will have to comment on that.

It works in the... Yeah, it's even worse, I would say, then. Yeah, yeah. I mean, we are not trying to... And it's not quite, it's not quite the same. For example, if you have some Xilinx SoC where they've got a MicroBlaze and a Cortex-A, right, the kernel build system is not trying to build the two kernels for that in a single location. You're expecting to build those as two different things, right? Even with recursion, they're not one build; the assumption is that that's two separate builds. It's not quite the recursive-versus-not question here, because we're really talking about different software entities, right? It would be like saying, I'm going to non-recursively build my whole distro, right? And that's obviously very excessive, but it's kind of in that sense: my U-Boot versus my kernel, I'm not building those in the same build location, right?

Also, what is the complexity of these dependencies? Like, peripheral ownership? The ownership of the peripherals, yeah. There's the... which is sort of the question: is this stuff that, like, is in the device tree? This is something everyone was bringing up. The question was, is this in the device tree? And it's yes and no, right? Right now there's no global view. Well, the thing is, and also, why is it in the device tree? Correct, because you're saying that it has access to the SPI, the Ethernet. Actually, it kind of does, because... Yeah, because they never execute concurrently, right? They execute at boot and never come back. So it may need a SPI flash or external... Right, it needs it, but I'm saying there's a presumption there, so there are some different things, right? There are two different aspects of peripheral ownership. There's something like a UART where you're handing it over, right? So you used it, and now what's the state of it? We've talked about this, it has come up, right, as to what the next guy gets: is it in the reset state, or is it in some state that I need to know about? Which is another complication you're probably having. Versus: hey, this was a peripheral you didn't touch because you weren't supposed to, so I can assume it's in the reset state when I get to the application. So yeah, the technicality is that if it's used by one thing first and then by something else, and it's no longer in a known state, there is still work that needs to be done.

But this solves a very specific set of use cases, and there are a lot of other use cases that it just doesn't deal with. And I kind of wonder about taking on the complexity to support all of this when you can't deal with all the other use cases, where I've got, you know, the example I had earlier, the Cortex-A and the Cortex-M on the same SoC.

Yeah, this is exactly the same thing. Maybe I can shed some light more from the outside, because we are doing IVI systems, and there we have really complex SoCs with the big cores that Linux is running on, and there's a secure world, and there are some tiny cores that are running on the same SoC with an RTOS, not Zephyr in this case, but anyway.
But the problem is the same: you have to share peripherals, you have to share your flash, your device tree must be correct. And we don't have it solved at all. No, absolutely not. I mean, you have different RTOSes that have completely different build systems, but in the end you still have the problem that you have to share some common configuration between the different parts. Maybe the flash layout, maybe the memory map, which also boils down to the device tree of the kernel. And... I guess it's a challenge. It is. I mean, we also have the problem. The Morta header, basically. Yes, we also have shared headers. We also have APIs between the different cores, and they need to share some header files, some common definitions of what the API calls are. Yeah, but this one is actually simple. For example, we have some parts in the system that do the CAN communication with the car, and they have some very specific APIs that are exposed to Linux; these are some kind of RPC calls, and they have to share some common IDL in the end. So we have parts of the system that really share common source code. I think your problem is just one part of it; there's a bigger scope at work here.

The way we solved it is with our own tool. We created a meta-build tool to drive all these builds. The way it does it is, you get all your toolchains configured by it, you lay out the dependencies between the parts, and then you let the tool build the different images in parallel and stitch them together into bigger images. In the end, you have some big flash image that you put onto your flash (a rough sketch of that stitching step appears just before the wrap-up below). Yeah, you really have to build these parts separately and hand over the build results from one, I would call it package, to another one.

But you're not building them at the same time; you're building them sequentially, probably, so you can hand over information. Yeah, but it's a big graph. You have all the different entities, from FreeRTOS, which is running on one core, up to the BusyBox package and the Linux kernel; they are all different packages and they have dependencies on each other. So the IPC needs the kernel headers, and so on.

And then you let... This is the Yocto case. Yes, so we don't use Yocto anymore; we just replaced it with the same tool. And... No, no, I mean, I don't want to go off on it. But what about something like... What, BitBake? BitBake, yeah. Yeah, yeah, BitBake. So... What's BitBake? BitBake is more specific, hopefully, just like... Yeah, but BitBake is known to be... Yeah, but BitBake doesn't really cope with multiple toolchains very well. Right, and just... I'm not sure if this one is really solved. I mean, are we making... This is not a distribution, yeah.

I think making the existing code re-entrant, like, so that it can be run... But the fact is, we're not adding a huge system that you have to learn. It's just a few rules about... Right. But we're only solving a tiny piece of the problem. Yeah, exactly. What about other things? There are two processors that are different; one of the images isn't actually running Zephyr. Yeah. But this is a particular problem that can be solved relatively easily and inexpensively, and in an optimized manner, with the current CMake build system. That is a subjective appreciation, of course. The others are harder to solve like that, in my perspective, especially the combining... But you know, if some... Okay, so, I mean, we can say, okay, let's do it that way.
But other use cases, let's solve them using some meta-tool. But then we go... The meta-tool would actually address this problem as well. No, because the meta-tool would know about this functionality, so it would use it for the part that this particular PR offers. That means that if the meta-tool knows it needs to build two Zephyr images from the same code base, it could use this functionality to build those. But if it then needs to build something else that has nothing to do with Zephyr, and so on and so forth, then the meta-tool could benefit from this, knowing that the functionality is baked into the... But then you have a meta-tool to build the meta-tool thing, and this is already getting...

No, no, I actually disagree. I mean, this solves that particular problem: multiple Zephyr images from a common code base. That solves this one. Now, if your use case also includes stitching with pre-built binaries, if it includes combining with Linux, for all of that you need a layer above. But that particular problem is solved by this, and it's solved in an efficient manner. Now, you can argue that the complexity added to the build files is not worth it, and that we might as well support all the use cases with a meta-tool instead. That's fair, we can discuss that, but what I'm trying to get at here is that this change does not try to solve all of these problems.

So, and it's arguably... But I think the example is this. Forget the Linux example, right? To solve that problem, then I would have to solve the MCUboot-plus-Zephyr problem. I think that is... But isn't that logical? I mean, you've been... No, no, because I can have the same solution that deals with building, say, Kumar's bootloader plus Zephyr, and can also build MCUboot plus Zephyr. That same solution... And then you can scale it up to support Linux and others. Sure, but then you wouldn't be solving it the same way, with the same efficiency. So we didn't... How much efficiency is it, actually? Right, you're correct. You're right. What percentage of efficiency? That was exactly my question, Paul, yeah.

I mean, consistency, whatever the software may be, which I think is more beneficial, because I think that the spectrum of cases of Zephyr plus Zephyr plus Zephyr is smaller than the number of TF-M plus Zephyr plus some other bootloader. And even then, you know, you expand past the single-core MCUs, and you've got a lot more complex things going on. It's a problem, absolutely. I just don't think the benefit is big enough that solving the problem differently in this one case versus the others makes sense. Yeah, yeah, that's the trade-off we're talking about. Yeah, that's where I struggle. That's a fair argument, I think. Mostly, it's that even in the case where I can say, okay, if I'm doing Linux plus something else, yeah, there's a whole different solution; TF-M plus Zephyr, I have to do something different for that; but Zephyr plus Zephyr plus Zephyr plus Zephyr plus...

Is there anybody that has an example of how much worse the new code looks versus the old Zephyr? It's a pull request, so you can just look at it; I think the number was up on the slides. Yeah. And I can't pull it back up right now. Just search for non-recursive. Yeah, you just search for non-recursive on GitHub. So, from what you just said, you just go back to GitHub. That's the pull request, for the person who was asking. You got it? Great. All right.
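As an aside, the "stitch the build results into one flash image" step described earlier is conceptually simple; a minimal sketch might look like the following. The image names, offsets, and the use of dd are assumptions for illustration, not anyone's actual tooling.

```cmake
# Hypothetical post-build step: lay two already-built images into one
# flash-sized binary at fixed offsets, so a single file can be flashed.
set(APP_OFFSET 65536)   # application at 64 KiB; example value only

add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/flash.bin
  # Bootloader goes at offset 0.
  COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_BINARY_DIR}/mcuboot/zephyr.bin
          ${CMAKE_BINARY_DIR}/flash.bin
  # Application is written at its offset without truncating the file.
  COMMAND dd if=${CMAKE_BINARY_DIR}/app/zephyr.bin
          of=${CMAKE_BINARY_DIR}/flash.bin
          bs=1 seek=${APP_OFFSET} conv=notrunc
  DEPENDS ${CMAKE_BINARY_DIR}/mcuboot/zephyr.bin
          ${CMAKE_BINARY_DIR}/app/zephyr.bin
  COMMENT "Stitching bootloader and application into flash.bin")

add_custom_target(flash_image ALL DEPENDS ${CMAKE_BINARY_DIR}/flash.bin)
```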
I think we're actually about out of time; I started a minute late. Let's keep discussing, but for now I guess we should give the good folks in the back of the room who are running the room their time back. Thanks a lot.