Okay. Hello everyone. My name is Vitaly Wool, and today I'm going to talk a little bit about edge computing with RISC-V platforms and how it all goes together with running Linux, specifically Linux that has XIP technology enabled. Sorry, let me move it a little bit further down.

I was supposed to have a co-speaker, Maria Wool. Unfortunately, it looks like she won't be able to come over, participate, and show the demo, because we have a problem with the babysitter. We came here and hired a babysitter, but she is stuck in a traffic jam. Things like that happen. You probably should be upset not seeing Maria here, but I can promise that if this talk gets accepted to ELC Europe (hello, Tim), she will be giving it and I will be sitting with the kids. I mean, we have all the equality stuff. We're coming from Sweden, you know.

So Maria isn't exactly a software developer. She's a photographer and a video engineer, and eventually she started off doing QA for Konsulko Group, for Konsulko AB, and this is how she came into this picture and this presentation. She did help with setting up the demo and running it, so I will not be talking too much about it. But at ELC Europe, you know, we'll look forward to that. She is currently living in Malmö, Sweden, as am I, even though I have more of an embedded background. I've been doing embedded stuff since 2003, actually, working for MontaVista and then for Embedded Alley, and a massive number of Swedish companies after I moved to Sweden in 2009. Now I'm running Konsulko AB, the Swedish subsidiary of Konsulko Group, which is a global company with headquarters somewhere in the Bay Area. I think it's San Jose, but who cares about that? Especially after the COVID times, no one cares where the office is.
In this presentation, I will briefly give an overview of RISC-V: what it is architecture-wise and where we stand with that. I'll absolutely talk, hopefully not too much, about XIP and Linux XIP, because this is one of my favorite subjects, one of my points of interest. Then we'll cover edge computing a little bit, and why RISC-V and edge computing go well together. And then we pass over to XIP and Linux, and how all those things form a perfect circle, as if you're trying to solve a puzzle and this is the solution to the puzzle. This is the answer. So, optimistically, we will get to that.

So, RISC-V: what it is and what it is not. It's an instruction set architecture and it's open source, and this is to a huge extent its main advantage, because it's royalty-free and you can actually create anything based on the instruction set architecture that is specified by the RISC-V specification. That is especially important when you're working on some kind of design that is supposed to be low cost. And this is one of the keywords: we are looking for low-cost design. Since we're going to talk about edge computing, this is IoT, so we're looking at some kind of low-cost design for an IoT device or a set of IoT devices.

Since it's a RISC architecture, and command-wise it's actually pretty similar to ARM, it's absolutely unavoidable that people compare RISC-V with ARM. And I'm comparing RISC-V with ARM too. So I put together a table where such a comparison is done, with a specific stress on XIP, even though I haven't really told you guys what XIP is, but maybe you already know. Who knows what XIP stands for? I can imagine that. Okay, anyone else? No. We'll absolutely cover this. Sorry, it may be a deficiency of my slide. Anyway, yeah, we'll get to what XIP stands for.
But we can cover the rest of the table and go back to XIP later. So RISC-V is pretty similar to ARM, as you see. RISC-V is an emerging standard, an emerging architecture, so, as you can see from the table, 128-bit is not supported well yet; eventually it will be. ARM is a well-established architecture with a well-established community, while RISC-V is an emerging architecture and its community is not so well established yet. But the good thing about RISC-V is that it's a free and open-source ISA, as opposed to ARM. And that is very important. I mean, money is important, right? If you're looking into creating an IoT design, you probably want to save some money on those small devices. You don't want to pay royalties. And if you go for RISC-V, then you also want some kind of certainty that this whole thing is going to fly. Later in the slides, we will show how it can actually be made to fly. So we're not going to cover the rest of the table, the XIP part, now. We'll go back to that later.

We'll first try to cover what XIP is in a nutshell. XIP is a technology whose name stands for "execute in place", and that means exactly what it says: the code is executed directly from persistent storage. The persistent storage, the flash, has to allow this kind of execution, so it's basically NOR flash, even though now we have quite fast QSPI NOR flash; it used to be just standard slow NOR flash back in the day. So we don't have to copy the code to RAM to execute it. This is a significant advantage, because we save time on copying and we save RAM itself. We don't need to spend that much memory, basically in vain, to copy the executable code and get it executed from RAM. But it comes at a certain price: you cannot change anything. It's NOR flash. It is linear. It can only serve as a medium to execute from in read-only mode.
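
To make the trade-off described above concrete, here is a minimal Python sketch that compares flash and RAM usage for a conventional copy-to-RAM image and for an XIP image. The image sizes, the compression ratio, and all names (`Image`, `footprint_copy_to_ram`, `footprint_xip`) are hypothetical illustration values, not figures or code from the talk, and the model assumes the simplest case where an XIP image is stored entirely uncompressed.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Sizes in KiB of one binary image (hypothetical illustration values)."""
    code_kib: int  # read-only executable code
    data_kib: int  # writable data, which always has to live in RAM


def footprint_copy_to_ram(img: Image, compression_ratio: float = 0.5):
    """Conventional scheme: image stored compressed, everything copied to RAM."""
    flash = round((img.code_kib + img.data_kib) * compression_ratio)
    ram = img.code_kib + img.data_kib
    return flash, ram


def footprint_xip(img: Image):
    """XIP: code runs from NOR flash in place; only data is copied to RAM."""
    flash = img.code_kib + img.data_kib  # assumption: nothing is compressed
    ram = img.data_kib
    return flash, ram


img = Image(code_kib=800, data_kib=200)
print("copy-to-RAM (flash, RAM) in KiB:", footprint_copy_to_ram(img))
print("XIP         (flash, RAM) in KiB:", footprint_xip(img))
```

The sketch only captures the direction of the trade: XIP spends more flash to save RAM. Real numbers depend on the compressor, the flash layout, and how much memory is allocated at run time.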
That said, you cannot modify the code that you flashed, so everything has to be resolved at compile time. You should be pretty aware and pretty cautious when you build an XIP application, or an XIP kernel, or XIP user space for that matter. Speaking of XIP applications, XIP is actually more common for real-time operating systems, where you have a certain blob, a complete application, with no separation between kernel, user space, applications, or whatever. It comes as a single binary, it's flashed to NOR flash as a single binary, and it's executed. For instance, Zephyr does that. If you develop a Zephyr application, it can be XIP, and then you have everything in one application. All the addresses, all the stuff is resolved at compile time, and then you flash it to NOR flash and run it.

It's a bit more complicated when it comes to XIP on Linux, because we absolutely do have the distinction between kernel space and user space: the kernel is a separate binary, basically, and user space is a separate file system. So when it comes to XIP on Linux, we first of all think about kernel XIP, that is, a kernel that can be executed directly from NOR flash. Kernel XIP is currently supported for just a few platforms. Historically, the first platform to support XIP was 32-bit ARM, and it goes back more than 10 years, maybe 15 years, I don't remember. Most of the work was done by Nicolas Pitre. At the time there was no ARM64; then ARM64 came in, but there is still no kernel XIP support for ARM64. And then there's RISC-V, and we do have kernel XIP support for RISC-V, though not for all RISC-V platforms. The support that was merged in 5.13, as it says in the slides, is currently limited to MMU-enabled 64-bit RISC-V targets.
But the work is ongoing for other targets as well. Still, for now it's 64-bit RISC-V, MMU-enabled. We were doing a lot of things to get this merged, and there was another guy from the RISC-V community, Alexandre Ghiti, who helped a lot as well. XIP is not supported for all targets that run Linux; for instance, x86 is not supported. But we are mostly concerned about RISC-V, it's supported for RISC-V, and the support is in the mainline. It's not supported for MMU-less RISC-V, at least not in the mainline, but there is work that we've been doing, so that support is there; it's just not merged.

Going over to user space, there's nothing that prevents user space from being executed in place as well. It's basically the same idea as with the kernel, but it requires a special file system like cramfs or AXFS. There was also work to enable XIP for SquashFS; I think the patches are somewhere out there on the internet, on GitHub or something. I don't think they ever made their way into the mainline, but it's still possible to find them and use them. Anyway, it requires a special file system, but the idea is very much the same. You need to have the executable sections uncompressed, because cramfs, for instance, is a compressed file system: it compresses the data that it stores. But with a special markup saying "this is an executable section, do not compress it", it's possible to use cramfs for XIP: the executable sections of the binaries that are to be executed are not compressed, so you can run the code directly from flash, while the data sections are copied to RAM because they have to be modified.
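
The per-section rule described above for user-space XIP can be sketched in a few lines of Python. This is only a toy model of the decision as the talk describes it, not how cramfs or any other file system is actually implemented; the `Section` type and the `plan` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One section of an executable (hypothetical, simplified model)."""
    name: str
    executable: bool  # True for code, False for data


def plan(section: Section) -> dict:
    """Decide how one section is stored in flash and where it is used from."""
    if section.executable:
        # Code has to be directly addressable in flash, so it stays uncompressed.
        return {"compressed": False, "runs_from": "flash"}
    # Data has to be modifiable, so it is copied to RAM when the program loads.
    return {"compressed": True, "runs_from": "ram"}


for s in (Section(".text", executable=True), Section(".data", executable=False)):
    print(s.name, plan(s))
```

The model deliberately has only two outcomes, matching the two cases mentioned in the talk; how a real XIP file system treats other kinds of sections is outside the scope of this sketch.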
If we take a short look at the standard kernel loading scheme, we first have a bootloader which initializes RAM, and then we have kernel code that is compressed. You can see in the picture that on the NAND flash, which is on the left, the kernel code is smaller in size than in RAM, which is pictured on the right. The bootloader decompresses the kernel code and copies it into RAM, the same basically goes for kernel data, and then it passes execution over to the kernel code, which is already in RAM. That is a fine scheme, as good as any. Just as a side note, we can see that a lot of RAM is occupied by kernel code and kernel data, while on the other hand those things, being compressed, are not really occupying that much of the NAND flash.

If we move over to XIP operation, specifically kernel XIP, then we don't copy code anywhere, and in the very simple case we don't use compression for data either, so data is just copied as is to RAM. It is a prerequisite that data be copied to RAM, because we have to modify it. The code stays in NOR flash. So we can see that with kernel XIP operation most of the RAM stays available, while on the other hand we need to have a lot of, well, maybe not a lot of, but quite some NOR flash to use.

Finally, if we also deploy user-space XIP, then we have to use even more NOR flash, but on the other hand we still have a lot of free space in RAM that we can use for something else. Or we may be very scarce on RAM, or not have any DRAM at all. I think I'm moving forward too fast, but we are looking at IoT targets first of all, and those IoT targets, for instance the RISC-V platforms that we use, don't have any DRAM. They do have SRAM, which is usually four or eight megabytes in size, and I'll show you that we can get away with eight megabytes easily if we use kernel and user-space XIP, which would not be possible if we didn't. So that's why. Okay, XIP advantages. This is
just a summary slide. The obvious advantage is that we are saving on RAM, and as I've said, we can even get away without having DRAM for Linux, which is quite nice; you normally can't do that if you don't use XIP. There is also lower idle power consumption, and when I say lower, I mean that sometimes it can go almost all the way to zero, because if you don't have DRAM, you don't have to run self-refresh. So the values for idle power consumption are on par with very small systems running real-time operating systems, and that is very nice for a Linux system, right? Then there's shorter boot time: if we have NOR flash which is fast enough, we save on the fact that we're not copying much to RAM. We don't copy the code and we don't run decompression. And again, if we have QSPI flash that is fast enough, we can even gain faster execution compared to some RAM that is not so fast. I mean, it will be slower compared to, I don't know, running from DDR4, but once again, we're talking about IoT devices, primarily battery-powered, low-cost IoT devices, and there will be no DDR4 on low-cost IoT devices.

Okay, RISC-V and edge computing. Let's take a few steps back and go back to the table. I've mentioned it a little bit in passing, but since this is important, I would like to draw your attention to this table. Since we are looking into IoT, small devices, and low cost, those will be devices without an MMU, without a memory management unit, so it's mainly the two lower rows of this table that concern us. We can see that for RISC-V it says "in progress" for 32-bit, while on ARM it's merged. For 64-bit, it's ready but not merged on RISC-V; it will be merged eventually, I'm absolutely sure. And 64-bit MMU-less devices are not supported XIP-wise on ARM. So if we are to make a choice here between ARM and RISC-V for an IoT
device that is MMU-less and have it run Linux, which pretty much means using XIP, then we will have to go either for 32-bit ARM or for 64-bit RISC-V. And 64-bit RISC-V would be preferable for edge computing. We're not quite there yet, but, you know, 64 is better than 32, and for edge computing we do need some computational power. So as you can see, some bits and pieces of the puzzle are coming together already.

So, edge computing. When we talk about edge computing, we do mean IoT; it's just not the very traditional IoT as we think of it, where we have a bunch of absolutely dumb devices that just send data up to the cloud and get some responses, some instructions on what to do next. Edge computing extends the traditional IoT model by moving some of the computations, some of the intelligence, closer to the edge, the edge being those devices themselves. It reduces the volume of data to be moved between the devices and the cloud, and it allows for better utilization of the computational power of those small devices, the IoT devices that we have. Finally, and this is probably the most important thing, it enables AI and intelligent IoT where the connection is weak, intermittent, or unstable, so that we can't really transfer large amounts of data from the IoT devices to the cloud and back.

So why RISC-V when I talk about edge computing? RISC-V MCUs have relatively high computational power, and combine that with relatively low cost. And again, as we discussed earlier, the hardware design is open source, there are no royalties, there are no fees, so you can design whatever you want or use some existing designs and still not have to pay for that. On the flip side, RISC-V support in many real-time operating systems is still somewhat lacking, as opposed to Linux, where RISC-V support is still not exactly on par with ARM, but I
would define it as advanced. Not ideal, not super usable, but it's at an advanced stage. And edge computing applications are relatively complicated, that is, there's a bunch of code to be executed, and there are some computations, some intelligence needed. If we are to go for a real-time operating system for our edge computing design on RISC-V, we might be in trouble, because the support in some real-time operating systems is not exactly at production level, as I've said. It also gets harder to debug: all those intelligent things are harder to debug in the traditional single-app RTOS environment. So we might want to go for a real-time operating system on RISC-V, but it's better not to. It's better to go for Linux, if we can deploy Linux, and this is where XIP comes into play. Linux support for RISC-V MCUs, Linux support for RISC-V in general, is quite good, basically for both MMU-enabled and MMU-less designs, and we can trust it, we can rely on it. Even though, as I've said, RISC-V user space for no-MMU targets is still somewhat shaky, well, if you have the kernel loading, you can resolve user-space issues.

So the bottom line is, it's tempting to run Linux on RISC-V MCUs for IoT, because we get faster development, easier debugging, and shorter time to market. But if we just plainly do it with the mainline kernel, without using any tricks like XIP, we might end up with a design that is unnecessarily expensive compared to what we could have had, because traditional Linux requires a lot of RAM to run and is generally more power-hungry than a real-time operating system. So we might end up having more RAM and a bigger battery, which is not very nice. And if we do it like this, we might kill the main advantage of a RISC-V design over an ARM design, because the money that we have saved by not paying royalties will now just be spent on the battery and extra RAM. And once again, this is
where XIP comes into play, to fill in the last gap in the puzzle. As I've said, XIP allows for significant to drastic RAM reduction, and it allows for almost zero power consumption in idle mode. Finally, which is very nice, and this is where the time-to-market statement comes from, XIP technology is transparent for application development. That is, you can develop an application in an ordinary Linux environment and then almost transparently, almost painlessly, transfer it to the XIP environment, which you wouldn't be able to do if you were targeting a real-time operating system.

Okay, a real-life example. There was a test project in Sweden for cameras that do license plate recognition for people who break speed limits. I know this is not the most popular application out there; on the flip side, it's just a test project, so there are still no fines associated with this project specifically. But anyway: image capture and recognition. The Maixduino is a RISC-V board with an MCU which is 64-bit, and it has a floating-point unit, eight megabytes of SRAM, and eight megabytes of NOR flash. We used Linux kernel 5.15 plus our patches and a Buildroot-based user space. We made an attempt to run that without XIP. In that case the kernel, which was compressed, consumed 400K in flash and slightly more than one meg in RAM after having been uncompressed. The root file system, containing the very basics, consumed 500K in flash; in RAM, well, it depended on the usage, so it's really hard to estimate, but depending on the usage and the moment it was one to four megabytes. And then we had the application, with the recognition and everything, which consumed about 500K in flash and somewhere around two meg, a little bit less than that, in RAM. You can see that with those numbers we have trouble fitting all this stuff in the SRAM that we have on board, which is eight meg. So,
yeah, it doesn't fit. But if we deploy XIP, we are going to use more flash, because we cannot use compression for the kernel and we cannot use compression for the sections of executables on the root file system that are actually executable. But still, while we're going to use more flash, we're going to have a lot less RAM in use. For the kernel it will be 250-something K plus dynamic memory allocation, whatever. It's going to be under one meg for the root file system, and as for the application, it's going to be around two meg. Then we're well under the RAM size that we can use, and we don't have to use the RAM which is normally designated for AI, which is not contiguous with the rest and is sort of a problem. We actually do have six megs plus two which we can use, but it's kind of complicated. We're now under six, so we're just having a happy life out there with XIP.

So finally, I think it's fair to say that XIP goes very well with RISC-V designs, with what they have to offer, and it does allow us to reduce design costs and power consumption for RISC-V targets, especially when we talk about IoT designs. And finally, well, it's a bit of a bold statement, but still: XIP technology does enable Linux for low-cost edge computing designs, because otherwise in most cases it would just not be possible. Okay, thanks for your attention. That's it. Questions? Yeah, Tim has a question.

There's one question from the virtual crowd, if I could ask it. The question is: are you assuming network capability is available on the end device? What about networking capability in the lot device? And the last one is: any recommendations if the design model supports having the NIC on the broker?

I'm sorry, I'm not sure I get the whole question. Could you please repeat that?

Sure. How about we start with the first one: are you assuming network capability is available on the end device?

Yeah, for the situation when we use the cloud, it should be
available. The thing is that it might be unavailable at some points, so we don't assume it's available all the time.

Okay. What about networking capability in the lot device... IoT device, sorry, I'm not a technical person. I'm going to guess that the question is: is there a networking driver and stack included in your kernel size? It sounds like there is.

Yes, there is a networking stack, but it's a relatively small one, because in this example we don't incorporate 802.11, for instance. The Maixduino allows for an ESP to be plugged on top of it, which has all the Wi-Fi stacks and everything, so we don't have to bother about it. So there is a networking stack, but it's limited to that, more or less.

Okay, so my question is about the QSPI NOR. I thought that was slower than RAM, especially static RAM. So do you have any benchmarking? Is your performance going to suffer when doing XIP?

Static RAM is faster, absolutely, for sure. But there's no alternative of not using QSPI NOR flash and using static RAM instead, because we don't have that much static RAM anyway. So the alternative is either to use QSPI NOR for execution or to use DRAM for execution, and those are basically on par for these designs, because it's not high-speed DRAM that you would use on them. As I've said, if it were DDR4, we would have lost a lot, but you're not putting DDR4 on a light bulb that is somewhere out there, right? Or on a traffic camera. So it's basically on par when it comes to the types of dynamic RAM that you use on IoT devices. Hey, I think you were the first.

Yeah. First of all, I certainly like the XIP idea there,
but regarding your demo application, where you say it's better on 64-bit RISC-V than on 32-bit ARM: aren't you, as things stand now, a bit hampered by the fact that RISC-V doesn't have vector extensions, whereas you could use them on ARM?

I think that's a very valid point. I don't have the numbers at hand, but it's very application-specific.

So you did not evaluate the performance of the two platforms against one another?

Well, let me put it this way: you can't evaluate an abstract 64-bit RISC-V versus an abstract 32-bit ARM. It has to be a specific RISC-V 64 part, like the K210, this Kendryte thing, versus, I don't know, the Nordic nRF5340, for instance, right? Or for ARM it could be, you know, STM something. We did some measurements on Kendryte versus, what was that, the Nordic 5240, and Kendryte won by a small margin. There are STM32 Cortex-M7 designs, like the 723, the 730, all those new things. I didn't do real measurements on those. I think they will outperform Kendryte, but they're freaking expensive. I mean, the Cortex-M7 STM parts are really expensive, so they start to play in another field. So that's the point I'm trying to make.

I see, thank you.

And unfortunately this is the end of the session, but the speaker, I'm sure, can be made available, and you all can talk. I just want to let everyone know, if anyone wants to go to the break, there's a 30-minute break. Okay, thank you.