Okay. Hello everybody. Thanks a lot for coming today. It's good to see so many people here. My name is Richard Barry and I am a principal engineer with Amazon Web Services. I'm also the original author and founder of the FreeRTOS project, which I'm going to talk about today. I'm going to start with an introduction to FreeRTOS for people who aren't familiar with it. That will lead on, because we're here at FOSDEM, to a little bit about Amazon's work in open source and the projects they contribute to. FreeRTOS is under the stewardship of Amazon Web Services now, so that's where the relevance is. And then I'll get to the meat of it and show FreeRTOS running on RISC-V architectures, and how we've architected the kernel to enable the architecture extensions as well. So this is the brief introduction to FreeRTOS. I've got a lot to get through, so excuse me for talking so quickly; I hope I'm understandable. The clue is in the name: it's a free real-time operating system. Now, if you come from the Linux world, you'll probably not consider it an operating system, but a kernel or even a scheduler. I normally refer to it as a real-time kernel. And this follows on quite well from the previous talk, actually, because this is also for deeply embedded systems. I've been carrying this graph around for about 15 years. I'm not quite sure how to quantify it, but it's trying to show where FreeRTOS fits in. On the x-axis here we have processing power: the capabilities and resources of the processor. On the far left, a 4-bit processor; you're not even going to be running C, you might just be running assembly code. Up here, we're looking at low-end Cortex-A. Way off the scale, we've got the higher-end 64-bit processors as well. FreeRTOS runs there too, but the sweet spot is really in the middle. The higher up the y-axis you are, the more applicable the FreeRTOS kernel is.
So in this sweet spot in the middle, we are looking at microcontrollers: very small processors, typically tens of kilobytes of data memory and hundreds of kilobytes of program space. Just to get the scale here, we're not talking about Linux-class processors. So why would you run multi-threading on a processor that small? It all comes down, in my mind at least, to maintainability. At the top here I'm trying to show a diagram. I'm not a graphics designer, okay? I'm a software engineer, but I'm trying to depict a very, very typical way of writing software for small microcontrollers: three pieces of functionality, each implemented as a state machine, and a superloop calling one, then the other, then the other. A very good way of writing software if your software is quite small. But as your application gets more complex over time, particularly if you're adding connectivity, what you can do is use the FreeRTOS kernel to translate those state machines into separate threads of execution, which in FreeRTOS we call tasks. To avoid any confusion between processes and threads we use the term task, but think of it as a thread. Then you can implement your code as a kind of flow diagram, and the kernel, the scheduler, handles the prioritization for you. So as the functionality increases, you can manage the responsiveness and the maintainability much more easily. The thing here is that FreeRTOS is a library. It's C source code which you build into your application; it's statically linked. So again, very, very different to the Linux environment. MIT licensed, I should have said. The project's been around for 15 years.
And because of the application space, our users are making devices, real-world objects: just looking around this room, I guess that's a Wi-Fi thing hanging from the ceiling, or the smoke detectors, or some control system in the projector there. These are devices which get manufactured in their hundreds of thousands and put out into the world. If they're not connected, there's no way of updating the software on them. So people need a lot of reassurance, a lot of confidence, that the code is going to be robust, that there are no IP issues in the code, and that they can get good support. Over the 15 years we've built a distribution model which basically gives people all those assurances. There's even a commercially licensed version, if you want it. We have a strategic partner called Wittenstein who provides the code under a commercial license with commercial support, if that is what you want; we only deal with the open source software ourselves. This graph shows the growth from 2004 to 2018. You can see it's going up all the time, still on a positive trajectory. It's currently downloaded about once every three minutes, which I always find amazing, because I didn't realize there were that many engineers in the world. Use cases are many and varied; all the industry verticals you can think of are here. Fitness trackers, automotive. The thing I really like, because of my past in industrial computing, is some of the interesting things in Industry 4.0, with factory automation and cyber-physical systems. We are basically surrounded by devices that are running FreeRTOS as we go about our lives. A new use case, or I'll say new: actually the first job we ever had was in the Internet of Things, we just didn't really call it that at the time, SCADA systems. But the Internet of Things is the real growth area of focus and interest at the moment. And this is where Amazon come in.
Amazon Web Services have a whole plethora of cloud services, by the way; I'm showing a very tiny subset here which are just specific to IoT. In the right-hand two thirds here we have the cloud services which are specific to IoT, or a subset thereof. On the far right we have services like machine learning and analytics. That's where we actually extract the value from connecting our devices in the first place. But to get there we first have to have the gateway to get into the cloud, then the security, the authentication, the encryption, the ability to over-the-air update the things that are connecting, et cetera, et cetera. That's all the undifferentiated work that Amazon provide for you, to enable you to get to the value as quickly as you can. So that's in the cloud. Outside of the cloud, on the left, there's a product that runs on Linux called Greengrass, which is really interesting in that it allows you to take some of those cloud services, like machine learning inference and that kind of thing, and actually run them on a Linux box on your own premises. That's not what I'm talking about today, though I would encourage you to look at it. We're really looking at the edge of the edge. This is the proliferation of those billions and billions of microcontrollers which are wanting to connect, and there Amazon are adding security and connectivity. So you can think of it as very analogous to what's in the cloud: the ability to over-the-air update, the ability to connect securely, et cetera. The important thing to note here is that this is all MIT-licensed open source code using open standards. So although Amazon are providing this software, you can use it for any purpose. It's there for the good of the entire FreeRTOS community. You can even go and connect to someone else's cloud service if you like. Because this is FOSDEM, I'm going to talk very, very briefly, just a few slides, about some of the work that Amazon do in open source.
On the screen now I have a whole load of different projects. These are all projects that Amazon contributed code to in 2016. We'll go through the next couple of years and see how this is growing. By the way, the size of the font is proportional to the amount of contributions that have been made. In 2017, we've added in these orange ones; you can see there are a lot more. If we go on to 2018 here, we've added in the blue. I have to say this was done about halfway through the year, so it's a bit out of date; I'm sure there are a lot more there now as well. But you can see the growth in the number of projects that Amazon are actively contributing to. If we look really carefully in there, and I've actually highlighted it on the next slide, in yellow you'll see that FreeRTOS is in there. Okay, so let's do the interesting bit now and talk about running FreeRTOS on RISC-V. At the moment we are really looking at the microcontroller space, so it's machine mode only. This is the first official version, and the direction it goes in will very much depend on what users want. If we look at what the kernel actually does, most of it is generic C code; the same code runs on all 40-plus architectures that we've ported to. Then there is what we call the portable layer, which is the bit that actually has to touch the processor. When the processor is running, the stack pointer points at the stack of the currently running task, and the stack just contains whatever the compiler or the processor puts on there. But as we stop that task running, we have to save its context onto its stack. So these are the tasks that aren't running at the moment; you can see they've got all the registers on there. When they start running again, those registers are popped back into the processor registers, and the task doesn't know that anything happened. Now, on a fixed architecture like an ARM Cortex, the port layer is also fixed.
This just shows the directory structure of the source code. At the top there, we see the source files that are common to all architectures. And then here's the portable layer. It's in a hierarchy: first the compiler (there are lots of other compilers, you just can't see them), and then the files that are specific to that architecture, which do this register manipulation. Now, this is interesting when it comes to RISC-V, because the architecture is extensible, so we actually have to add another dimension in here. It's the same up to this point: we've got GCC, then RV32, the port layer code. And then we've got these new chip-specific extensions here, which I'll look at a bit more closely in a minute. Okay, demonstration. Let's hope. So I have two boards here. One is a MicroSemi board, and the other is this VEGAboard. Both of these are easily available: this is open-isa.org, where you can get these VEGAboards, and we also have the MicroSemi board, which is actually from Future Electronics. These are both low-cost hardware boards, and what I'm going to do is show the code running on them just to prove it's real. For the MicroSemi board, I'm actually going to run it in an emulator called Renode, which is from a company called Antmicro. If we look at what this code is doing, we'll see that it enters main, does some initialization, and then starts a whole load of test tasks here. So there are test tasks, and there are examples, et cetera. And at the bottom here, it is running this check task. What the check task does is go through all the other tests, which are self-monitoring. If the tests pass, it just prints out a period, a dot character; if the tests fail, then it prints out an error message. So, start that running. And in the Renode serial port here, we should see the periods getting printed out. And you can see everything is running.
Now, for time reasons, I'm not going to also show the VEGAboard being programmed; it was pre-programmed before. So if I run this terminal window, we should hopefully see that those periods are also being printed out at the top there. And we can. So we're running on two different boards. There's a difference between these boards. This board here has a SiFive core on it with just the base architecture. This board actually has two different Cortex cores and two different PULPino RISC-V cores on it; I'm running on the RI5CY architecture. In that implementation, we have six additional registers, two groups of three registers. The other thing this board has is a vectored interrupt controller, and it doesn't have the machine timer; whereas this board has the CLINT for the local interrupts and a separate external interrupt controller, and it does have the machine timer. So you can see that the two, although the base architecture is the same, have different functionality or features. So how do we manage that in the code? Well, I spoke about this architecture extensions file here. If we can just zoom in a little bit, we can see it in the project. Incidentally, these projects are just Eclipse projects I've opened. You can download the code from SVN; this is not actually released code yet, but it's publicly available in SVN. If you download the SVN repository, yes, I said SVN, it's rather old, okay? It's been there for 15 years. The Eclipse projects are all in there; you just open the project. This is, sorry, not Renode I'm using here but SoftConsole, which is MicroSemi's Eclipse-based tool. So when you open the project, you can see the same hierarchy that I showed before. Now, here we see the chip-specific extensions, and I'm selecting this header file from the directory that says there's a CLINT and no extensions.
If I look at that header file, we will see that there is a macro that says there are zero additional registers, and underneath, if I scroll down, there are two assembly macros, save and restore additional registers, and they're both empty. If I then switch to the PULPino, the different architecture that does have extensions, here we will see at the top, if my mouse works, that it says there are six additional registers, and then these macros, very, very simply, are copying the extended registers. There are six additional registers, and it's copying them into core local registers and then saving them to the stack. Likewise, we have a restore-context macro, and that's all we have to do to take care of those additional registers. But we also have to make sure that we actually pick out the correct header file, and that's done in the assembler properties here: in the assembly includes (these are not the compiler includes) we just pick out the path to the header file that manages those extensions; I was just going off the screen a bit there. The other thing I said was that the architectures have different interrupt controllers, so we also have to ensure that the kernel calls the correct interrupt handler. We do that by setting a macro, again in the assembly: we've got this handle-interrupt macro, and we just point it to whichever handler is provided by the board manufacturer to handle interrupts. In the PULPino case, all interrupts go to the vectored interrupt controller. In the SiFive case, the CLINT handles the local ones, and only the external interrupts get sent out to the external handler. So that's basically it as far as tailoring the kernel goes. There are a couple of other things that you have to do, like give it the base address of where the CLINT is, or, in the PULPino case where there isn't one, we just set the base address to zero.
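For reference, the empty, no-extensions variant of that chip-specific header looks roughly like this. The macro names are taken from the publicly available port layer, but treat this as a sketch and check the actual header in the repository; the PULPino variant sets the size to six and fills the two macros with the instructions that move the extra registers to and from the stack:

```asm
/* freertos_risc_v_chip_specific_extensions.h -- no-extensions variant.
   Included by the port's assembly file, hence the GNU as .macro syntax. */

#define portasmADDITIONAL_CONTEXT_SIZE 0   /* no extra registers to save */

/* Invoked by the port's context-save code; nothing to do here. */
.macro portasmSAVE_ADDITIONAL_REGISTERS
    .endm

/* Invoked by the port's context-restore code; nothing to do here. */
.macro portasmRESTORE_ADDITIONAL_REGISTERS
    .endm
```

Because the generic port code always expands these macros, a chip with no extensions pays nothing, and a chip with extensions only has to fill in the two bodies and the size constant.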
Okay, so finally, because I think I'm basically out of time: if we look at this diagram, there are a few RISC-V implementations that other people have done. What I'm demonstrating here is our official one, which is the one that we create as the maintainers, give all the tests that we give all the other kernel ports, and then support. One of the things we have done is introduce a separate interrupt stack; the other ports I've seen aren't using this. In each task, when you create a task, you have to allocate a stack, and the stack has to be dimensioned to make sure it can hold the entire call depth of whatever program it's running. And then, if an interrupt occurs right at the maximum call depth, it also has to have room for the whole interrupt stack as well. I'm trying to demonstrate that on the diagram here: the task stack and the interrupt stack. If every task has to have that, when you add up all the RAM which is going to task stacks, we're duplicating that interrupt stack. So, very basically, in the official version we've just separated out that IRQ stack, which means it only appears once, and we've saved some RAM there. I've only shown three tasks here, but imagine if there are 10 or 15 threads; that would make a big difference to your RAM consumption. Now, the interrupt stack you can define just by setting a #define, and you can get the kernel to actually allocate it for you as a static array. But again, because we want to save as much memory as possible on these small MCUs, the other thing you can do is optionally define a linker variable. So I've just taken the linker script that came with the code and added in this linker variable, being careful to make sure that its address matches the address of the stack that's already allocated in the linker. The reason for that is: you start the code, the C runtime starts, main is entered, and there's a stack for main.
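The linker-script addition being described might look something like the fragment below. The __freertos_irq_stack_top name is taken from the public port, which falls back to a linker symbol when no #define sizes the interrupt stack; __stack_top is a placeholder for whatever symbol your existing linker script uses for the startup stack:

```ld
/* Reuse main()'s startup stack as the kernel's interrupt stack.
   __stack_top must be the symbol the existing linker script already
   uses for the initial stack pointer -- match it exactly so the two
   addresses coincide and no extra RAM is reserved. */
__freertos_irq_stack_top = __stack_top;
```

The key constraint is the address match: if the symbol pointed anywhere else, the interrupt stack would either claim fresh RAM (defeating the purpose) or collide with memory that is still in use.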
You then start the scheduler, and after that, the stacks that are used are either the task stacks or the interrupt stack. The stack that was used by main kind of leaks away and is never used again. So by doing this, what we are able to do is reclaim that memory. And I know that was really, really quick, and I was talking really quickly, but I got through to the end, and I'll be around if people want to talk. So I'll say thank you for coming again. A few details here: some URLs and some handles where you can find out about Amazon's work in open source. There are some very distinguished people, very experienced in open source, in Amazon, and they provide this free book that you can go and download, so I'd encourage you to do that. I think I've got about 30 seconds, so if there are quick questions that's fine; otherwise I'm happy to talk. And everything worked. Why is there interest? There's interest because basically we want to do what our customers want us to do, in that all the roadmap is customer driven, and at the moment there's a lot of activity with FreeRTOS on RISC-V. There are four or five different ports, and that kind of fragmentation is just not good for our ability to support the code and that sort of thing. So we want to try and have a kind of unified version that we are able to support. And to answer your question, the reason is because people are doing it already, so there's demand for it. Any other questions in the 15 seconds left? Okay, so wrap up. Thanks.