Hello everyone, welcome to our session on insights from re-architecting to a single code base. My name is Erez. I've been working at Augury for the past four years. I'm a firmware embedded engineer, and I've been a firmware engineer at Augury for the last two years. So what is Augury? Augury provides a predictive maintenance solution for clients that have critical equipment that needs to be monitored. What we basically do is place sensor devices on the machines, with different kinds of sensors implemented in them. We sample and send the data over Bluetooth to a local gateway and on to our cloud platform. So basically we monitor, we diagnose, and we guide the owners of the machines on what to do; hopefully they act accordingly and repair, or take whatever action is necessary.

Today we're going to talk about a single code base, and we'll explain what it means, at least for us. Then we're going to talk about why Zephyr is a good fit for the single-code-base concept. Afterwards we're going to speak about migrating to Zephyr: if you have a different kind of code base or software stack and you have to migrate to Zephyr, we'll explain how it's done. And afterwards we'll talk about single-code-base design rules.

So this was our starting point. We had a legacy device, on the right side, that was developed a few years ago on a very old Nordic SDK. It was based on a Nordic chip, supported only up to BLE 4.2, and was un-updatable for us in that particular aspect. Later, we developed the new-generation device. We chose the Zephyr RTOS for it, and of course we used the latest NCS, which we keep updating. So we basically created a situation where we have two different devices that do more or less the same thing, but have two entirely different code bases. And we didn't like this situation.
We wanted to create one single code base for all our hardware devices, so that we could add more features and maintain these devices in a better way. And this is why we chose Zephyr: Zephyr emphasizes hardware abstraction and modularity in a very nice way, and these are important features for a single-code-base application that runs on multiple hardware variants, because you want the ability to compile for two different kinds of boards and use the same application for both. Modularity is also very important, because you want to include and exclude features depending on which board you're building for.

Here's an example: the application is at the top, and below the Zephyr abstraction layer we see two different pieces of hardware. The application needs an accelerometer, a magnetometer, and a thermometer, and it uses flash. On these two boards the sensors are different (a different accelerometer, and so on), and the flash is different too, but the application is exactly the same.

Another reason to choose Zephyr for a single-code-base application is that a lot of features are already built into the nRF Connect SDK and Zephyr. For example, we adopted MCUboot, which provides us with a secure boot mechanism. We also use a file system, LittleFS, and Zephyr offers more than one choice of file system. We use Memfault, which is a very powerful debugging tool (they're also presenting here in the exhibition), CMSIS-DSP, compression libraries, and many other features that are already built in.

So when you're designing or thinking about a single-code-base application, you need to attend to a lot of issues. For example: are the devices whose code base you want to change already deployed in the field? If yes, will you be able to update them over the air from one code base to the other? Will they be backwards compatible?
After the update, will they behave the same in your IoT stack? Will they perform well in terms of memory and performance? Perhaps there are older devices with lower capabilities, and performance will not be adequate for the new application running on them. And of course, how will the boot mechanism look in this case?

In order to proceed, we came up with a strategic plan. We decided to tackle the major risks first, make sure we could overcome them, and only then continue developing the single-code-base application and deploy it. The first risk, the first milestone we marked for ourselves, was the over-the-air update: would we be able to update the devices already deployed in the field from the old code base to the new one?

So let's talk about that a little. Our legacy device's software stack looked like this. Remember, it's a legacy device on a very old Nordic SDK. It boots from address zero, launches the bootloader, and the bootloader launches our application, the Augury app. In this case we used our own bootloader, not the SDK bootloader, but that doesn't matter too much. That was the old software stack. The new software stack is Zephyr-based: it launches into MCUboot, and then we have the MCUboot secondary and primary slots for the application.

So what we basically needed to do was shift from the old software stack to the new one. Is that possible? Well, it is. We know that our application has access to the entire internal flash, and we know that we are able to disable and enable the bootloader as we want. We're going to see in a moment how we use these two facts to accomplish the migration to Zephyr.

What we see here is the internal MCU flash with the software stack, and on the side the external flash. Let's follow along. First, we perform a regular firmware update of the Augury application, updating it to Augury application 2.0.
Once we do that, this new application has migration capabilities and can help us with the transition. We pass three artifacts from the platform, through the local gateway, to the MCU and into the external flash: a new version of the bootloader, MCUboot, and the Zephyr app. The last two are the artifacts we actually want to migrate to.

Once they are safely stored in the external flash, we disable the bootloader (remember, we can do that). The application can then erase it and copy the bootloader 2.0 we stored earlier to the same location. Bootloader 2.0 also has migration capabilities, and it will participate in this game in a minute. Once it's copied to internal flash, we re-enable it and reboot.

Now the bootloader 2.0 is running. Once it takes control, it erases the internal flash, except for the MBR, and copies in MCUboot and the Zephyr application. Once they are there, we disable the bootloader; we don't need it anymore at that point, because on boot we start from the MBR, and since the bootloader is disabled we jump to address 0x1000 and then to MCUboot, which takes control from that point on.

So what we did was successfully migrate from the old software stack to the new software stack over the air, using BLE only, without physically flashing anything. Having proven it was possible over the air, we could continue developing our project: a single code base for the entire installed base.

Let's continue from here to the second part of the session: building the application, with the same risk-first strategy. The first uncertainty we had was size. The reason was that our new devices have 256 kilobytes of RAM, and when we built our Zephyr-based application, it came to a 160-kilobyte RAM footprint.
Our legacy device has only 64 kilobytes of RAM, so there was a serious question: could we squeeze it in? To answer this question, we took our Zephyr-based application and put it on a very serious diet. We left only the communication protocol, built it for the legacy device, and it built. That was the first milestone, and the BLE communication also worked. From that point we gradually added more drivers and more modules to get back to a complete, full application.

While doing so, we used size-reduction techniques: we removed unused configs, optimized stack allocations, and shared one work queue instead of having several, where that was possible in our case. And we used a memory pool. Zephyr lets you have statically allocated memory that is managed like a heap, meaning you can allocate and free dynamically, while the application is running, from that statically allocated memory. We used that to save memory. We had a statically allocated pool, and at one stage of the application that needs buffers, like sampling and streaming to flash, we allocate from this pool; once that stage is over, we free them. The next stage is transmitting the data via BLE; again we need buffers, so we allocate them from the same pool and free them once the procedure is over. That way we avoided having all the buffers statically allocated side by side at the same time.

Keep in mind, when doing stack optimization, that stack overflows can happen, maybe not right when you do it, but later along the way. If that happens, look first at your stack optimizations; the problem may be there. Zephyr gives us the stack sentinel, a tool that helps check for stack overflow. The way it works, it puts a magic number at the base address of each thread's stack, and every time there is a context switch or some other operation on the thread, it checks this value. If there's a problem, some corruption, it raises an alert.
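The sentinel idea can be sketched in plain, standalone C. This is a simplified analogue, not Zephyr's actual CONFIG_STACK_SENTINEL implementation; all names here are illustrative:

```c
#include <stdint.h>
#include <string.h>

#define STACK_SENTINEL 0xF0F0F0F0u   /* the magic value guarding the stack */

/* A fake thread stack: the sentinel lives at the lowest address, so a
   downward-growing overflow clobbers it before anything outside the stack. */
static uint32_t thread_stack[64];

static void stack_init(void)
{
    thread_stack[0] = STACK_SENTINEL;
}

/* What the kernel does on a context switch: check the magic value.
   Returns 1 if the stack looks intact, 0 if the sentinel was clobbered. */
static int stack_sentinel_ok(void)
{
    return thread_stack[0] == STACK_SENTINEL;
}

/* Simulate an overflow that overwrites the bottom of the stack. */
static void simulate_overflow(void)
{
    memset(thread_stack, 0xAA, 8);
}
```

Note that a check like this only fires when it actually runs, so detection is inherently deferred to the moments when the kernel inspects the sentinel.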
The downside is that if the corrupting write jumps over this number, you will not get an alert. Also, you don't get the alert exactly when the overflow happens; you get it only when the value is checked, which means on some operation between the threads. There are also hardware stack protection and the MPU stack guard; we use the stack sentinel because it has a smaller footprint than the other two.

So at that point we had a full application that compiled. The next uncertainty was functionality. Our sensors sample at a high rate and stream all the samples to flash, so we need to write to flash fast enough. The reason this was an uncertainty: our new device has a NAND external flash, which writes two kilobytes per page-write operation, and it writes faster than the NOR flash on our legacy device, which writes only 256 bytes per page-write operation. That means eight times more operations requiring CPU intervention to stream the same amount of data. And again, that was a big question mark.

So we flashed the program, sent the command to the endpoint to sample, and crossed our fingers. And it didn't work. We didn't give up; we moved on to seeing what we could do to optimize the procedure. Our application already used PPI and DMA, and if your hardware allows it, you should use them; they save CPU. What we did next was profile with a PicoScope, toggling GPIOs in certain functions. What we see here: the blue line is the sensor interrupt, firing whenever a sample is ready; the yellowish line is the thread that manages the write-to-flash operation; and the red line is the high-priority thread. You can see it takes about four and a half milliseconds, which is quite a lot.
What that thread basically does: the driver collects samples, and when it has a certain chunk of samples, it calls a callback into the application. We run this driver in a high-priority thread because the accuracy of the sample rate is important to us, and in that same thread it calls the application callback. The callback's job is to take the samples and hand them to the buffer of the flash streamer, which puts them to flash. But what this callback did was call the normal Zephyr API, sensor_sample_fetch and sensor_channel_get, for each sample individually. So there were a lot of sample-fetch and channel-get calls.

What we chose to do, to avoid this, was change the callback API and the normal Zephyr API to give us a pointer to the buffer of chunks, instead of using the per-sample fetch/get operations. It's basically all there; I'm not going to go into details, and you can see it in the presentation later. This is the generic Zephyr sensor code that we know, and these are the changes we made to it: we created our own sensor-fetch module, sorry, header. By that, we were able to reduce the time spent in this thread from 4.5 milliseconds to 220 microseconds, and it enabled us to achieve the functionality that we wanted.

And we could actually see that it was not the flash writing too slowly that blocked the functionality; the way we used the API made the thread too long, so flash operations did not happen every time the flash was available. Okay, so the second uncertainty was removed, and at that point we were very happy. We were confident we would get this single code base all the way, and we moved from the red light to green and yellow.

Now let's talk a little about design, because this is not just something we hacked together to make it work.
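Before moving on to design, the shape of that batching change can be sketched in standalone C. The names here are illustrative, not the actual Zephyr or Augury APIs: instead of a callback that pulls each sample through per-sample calls, the driver hands the application a pointer to the whole ready chunk.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK 128

static int16_t driver_fifo[CHUNK];     /* samples the driver already collected */

/* Fill the fifo with a deterministic pattern (stands in for real sampling). */
static void fill_fifo(void)
{
    for (size_t i = 0; i < CHUNK; i++)
        driver_fifo[i] = (int16_t)i;
}

/* Per-sample style: one API round-trip per sample, like calling
   sensor_sample_fetch/sensor_channel_get for each sample individually. */
static int16_t fetch_one(size_t i) { return driver_fifo[i]; }

static void copy_per_sample(int16_t *dst)
{
    for (size_t i = 0; i < CHUNK; i++)
        dst[i] = fetch_one(i);         /* CHUNK separate calls */
}

/* Batched style: the extended callback hands out the chunk pointer directly,
   so the high-priority thread only queues it for the flash streamer. */
static void on_chunk_ready(const int16_t **buf, size_t *len)
{
    *buf = driver_fifo;                /* one call, zero per-sample work */
    *len = CHUNK;
}
```

The win on the real device came from eliminating the per-sample API calls inside the high-priority thread, which is what shrank it from 4.5 ms to 220 µs.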
When building an application that should support several hardware variants, you want each build to contain only what it needs. You want to avoid having modules that are related to other boards and are compiled only in part using ifdefs and the like. So as a general note: avoid filling the code with ifdefs if at all possible; prefer excluding modules via CMake, and module segregation and encapsulation. Also, use the devicetree for distinguishing data.

Here are some examples of using the devicetree to get generality in the application. We have two devicetrees. In them we see the driver nodes of the sensor implementations, sensor 3 and sensor 1, different sensors, but we use aliases: they are both vibration sensors, so in this case it was possible to alias them. In the application we use the DT_ALIAS macro with the vib-sensor alias, and the application doesn't care which device or which sensor it is; it knows how to bind to it.

We can also put information like, for example, a flash part's page size and block size into the devicetree. That way we don't have to configure it specifically for each board or target, or keep separate modules for different flash parts. We just draw this data from the devicetree, and we have a more generic application.

Now I'm going to show an example of how we sometimes need, or want, to expand an API to get more generality. This example is the BIST. BIST is a built-in self-test: a mechanism we have whereby, whenever our devices power up, they go over all the components on the board and check that they function as they should. We have a BIST manager module, and for each possible driver that exists on any of our boards, we have a module that uses the Zephyr API.
If it's a sensor, again: sample_fetch, channel_get, trigger_set and all those APIs. If it's a flash: write to flash, read it back, and check that the write really succeeded. So these BIST modules in the application use this API to talk with the drivers, which live in the driver area. The code basically looked something like this: a BIST directory with modules for all the possible drivers, again, every driver any of our boards has, and a lot of ifdefs in the code, both in the include section and in the functions themselves. And we didn't like it much.

So we thought: how can we make the application agnostic to the hardware? The way we thought to do it, in the BIST case, was to say: okay, maybe each driver will implement its own BIST. But then how will we call this BIST without going through the normal API? Okay, maybe we need to expand the API; we'll see in a minute how we did it. The other question is: how do we bind to the driver? Again, we want a way to bind to different drivers that are not all vibration sensors; they're different sensors and flash and so on. So we need to bind without knowing.

First things first, this is how Zephyr currently, without any changes of ours, implements drivers; it's just an API, just a reminder. We have the sensor.h header, with the sensor driver API struct and all its functions, and here is an example of how the sensor_channel_get function is implemented. Then, in the driver module, we fill in the sensor driver API struct and pass it to the device initializer. This is how it normally works.

What we did is create a wrapper, another module that we call my_device for this example, and we include it in the CMake build too. In this module we build our own API. We wrap the Zephyr APIs in a union: this union has the driver types that we know about, whether it's a sensor or flash or some other driver.
Alongside that union, the struct carries the bist_run function. Now, an important note: if you implement it like this and want to stay aligned with the Zephyr functionality of each driver, this union must be the first element in the struct, because the device's API pointer points to that element. When the application calls sensor_channel_get, for example, it passes a pointer to the device, and in the implementation we can see an explicit cast to the sensor API type; so the union must come first.

After that, here is my BIST implementation; this is again in the driver module. Instead of registering the sensor driver API, we register our my_device API, with the BIST function implemented, and pass it to the device initializer as the driver API. That was the first part.

The second part is how we bind without knowing. We decided to add a 'bist' property to the drivers that implement a BIST. It's a boolean: if it's present in the devicetree node, it's true. We add it to the nodes of the drivers that implement BIST. And macro magic is always fun. This macro lives in the application layer, in the BIST module, and what it does is go over all the nodes in the devicetree; each node that has status "okay" and the bist property set to true is inserted into an array of device pointers. In the application we can use this array, again without knowing anything about what the pointers point to.

So instead of all those modules, one per possible driver, which we don't need anymore, and instead of all the ifdefs, we have much cleaner code: just an array, and a loop going over the array calling our new my_bist_run API. And that's it.

I wrote a blog post about this topic, this single-code-base construction. There are some personal notes in it as well, and most of the material you heard here. You are welcome to read it and comment, and I will respond.
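The wrapper-and-array pattern described above can be sketched standalone. This is a simplified analogue: the real code wraps the actual Zephyr per-class API structs and generates the array from devicetree nodes carrying the bist property (DT_FOREACH-style macros), and all names here are illustrative.

```c
#include <stddef.h>

struct device;  /* stand-in for Zephyr's struct device */

/* Stand-ins for Zephyr's per-class driver API structs. */
struct sensor_api { int (*sample_get)(const struct device *dev); };
struct flash_api  { int (*page_write)(const struct device *dev); };

/* Our wrapper: the union of known driver classes MUST come first, so a
   pointer to my_device_api can still be cast to the class API that
   generic Zephyr code expects. */
struct my_device_api {
    union {
        struct sensor_api sensor;
        struct flash_api  flash;
    } zephyr;                              /* first member: cast-compatible */
    int (*bist_run)(const struct device *dev);
};

struct device { const void *api; };        /* opaque API pointer, as in Zephyr */

/* Two hypothetical drivers, each implementing its own BIST; 0 means pass. */
static int accel_bist(const struct device *dev) { (void)dev; return 0; }
static int nand_bist(const struct device *dev)  { (void)dev; return 0; }

static const struct my_device_api accel_api = { .bist_run = accel_bist };
static const struct my_device_api nand_api  = { .bist_run = nand_bist };

static const struct device accel = { .api = &accel_api };
static const struct device nand  = { .api = &nand_api };

/* In the real application this array is generated from the devicetree nodes
   that set the bist property; here it is written out by hand. */
static const struct device *bist_devices[] = { &accel, &nand };

/* The whole BIST manager shrinks to one loop; returns the failure count. */
int bist_run_all(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof(bist_devices) / sizeof(bist_devices[0]); i++) {
        const struct my_device_api *api = bist_devices[i]->api;
        if (api->bist_run(bist_devices[i]) != 0)
            failures++;
    }
    return failures;
}
```

Because the array comes from the devicetree in the real code, adding a new board never touches the BIST manager module; only the board's devicetree and its driver change.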
I want to thank all the image and photo contributors; there are some credits in the presentation. Thank you for your interest and for listening, and we are open for questions.

[In response to an audience question:] We use, well, I don't know if it's really large, it's a matter of perspective, but I think it's 16 megabytes; that's the size of the flash. No, not after we did the improvement. Okay, if there are no other questions, thank you.