My name is Ulf and I'm going to talk about continuous deployment of IoT sensors. First we're going to have a look at what continuous deployment really is, but also why it's useful for embedded devices. And we're going to look at the different parts of such a system and the things you need in order to build it. I'm a software engineer at Red Hat. I work in the Drogue IoT team, and that's a project which is really about creating open source components for building an IoT solution. We have a lot of open source projects around building applications, but you don't really have this end-to-end story from the device to the cloud to the applications consuming the data. So it's a collection of components. One of them is a connectivity layer so that you can ingest events using several standard protocols. There's a device management registry so that you can manage the configuration of your devices, do provisioning and store credentials. There's also a firmware update service, which we'll talk more about in this talk. And we have a toolkit for writing firmware for a lot of popular microcontrollers. With this you get an end-to-end solution, and hopefully you can use it to build your system and not reinvent the wheel. So with that, what is continuous deployment? I tend to think about it in three parts. You have continuous integration, which is the shortest cycle of testing that you have centrally in your team, where you build and run maybe some simple, not-too-long tests, but it depends really. Then you merge your changes, and you continue to do this so that you know that your system is working. Then you have another step, which is the delivery step. What that means is that you add the ability to create a releasable artifact from your build. You can do this for every build, but maybe it's not that necessary or efficient to do that. With that releasable artifact, you can deploy to a production system.
And that's what continuous deployment is, at least in my world: you take that artifact and deliver it to the production system. So all in all you have these three different stages. And with continuous deployment you get this full end-to-end lifecycle, where you write your application and ultimately it ends up on some device. So what sort of devices are we talking about? We're talking about really tiny microcontrollers in this case, with very little RAM and also very little storage. This means that you have very few facilities for updating your applications. You basically need to build this into your application, and we'll see how to do that later on. So why would you want to do continuous deployment of IoT sensors? After all, not all IoT devices need to be connected, but those that are connected might have bugs. You might want to fix those, and the idea is that by rolling out firmware updates you fix issues in your software. There might also be non-critical fixes where there aren't any bugs at all; it's simply that the world is evolving around you. Maybe a root certificate needs to be updated, and rolling out a firmware update might be a way to do that without over-complicating the deployment pipeline, since you have these immutable firmware images that you deploy. There are a lot of factors to consider when thinking about continuous deployment of sensors. One is that it's a big risk, of course. You need to make sure that your firmware will work, because if it doesn't, the device can actually stop functioning and you might have to send someone on site to fix it. That's not a great solution if you have millions of devices delivered across the world. So that's one aspect. Another aspect of the risk is that a lot of microcontroller software is written in C, which has a lot of pitfalls. So you really need to have a lot of testing.
You also need to think about resource usage for these types of devices, because they're quite small, with little RAM and little flash. You need to consider that when you think about the size of your firmware, but also that you might need to keep two copies of it in order to swap them, which we'll talk about later. So size and resource usage are important. Most IoT devices are connected in different ways, so you have different bandwidth restrictions depending on which wireless technology you use. Your device might not have the capacity to retrieve the firmware update in one big swoop, so you might need to spread it out. And finally there's the tooling for building a continuous deployment pipeline. There's nothing that's tailored to embedded, but we might be able to reuse some of the existing open source tooling that's out there in order to avoid reinventing the whole thing. So how do you reduce risk? You need to do automated testing on your device. Ideally you also need to be able to roll back to a previous version of the firmware, because if you can do that, then you can always try again later. Rather than preventing errors from happening, that allows you to plan for them to happen, and this might be a more robust approach in the longer term. And as I mentioned, the programming language might have something to say as well. You can use a language like Rust, which is what we like to use, because it eliminates a big chunk of the possible bugs that you can have in your firmware. The programming language also matters for resource usage, because you can't have a big runtime for your language if you're going to run this on a microcontroller. For connectivity there are a lot of different options. You have Wi-Fi or Ethernet, which are both a lot more power hungry than the alternatives. And you have these small network protocols like Thread or Bluetooth, which require an additional gateway in order to connect to the firmware update services.
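To make the two-copies point concrete, here is a back-of-the-envelope sketch in Rust of how an A/B layout eats into flash. All sizes are hypothetical and not taken from any particular board:

```rust
/// Maximum application size in an A/B update layout: two equally
/// sized slots (active + staged update) share whatever flash is left
/// after the bootloader and a small persisted-state region.
fn max_app_size(flash: u32, bootloader: u32, state: u32) -> u32 {
    (flash - bootloader - state) / 2
}

fn main() {
    // e.g. 512 KiB flash, 32 KiB bootloader, one 4 KiB state page
    let slot = max_app_size(512 * 1024, 32 * 1024, 4 * 1024);
    println!("each application slot: {} bytes", slot);
}
```

With these example numbers, half a megabyte of flash leaves each slot well under 256 KiB, which is why firmware size effectively matters twice.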
Or you might have some long-range system that might already have gateways deployed, like LoRaWAN or via your telco provider. They all have different trade-offs, but for instance with LoRaWAN there's a limit on how much airtime you get for your device, so you won't be able to consume firmware that quickly. Let's say you have 64 kilobytes of firmware: if you're using a publicly available The Things Network gateway and account, then you will spend four days doing that firmware update. Which might be fine, but it's something to think about. Finally, we have all this tooling for building regular applications. We can run them on Kubernetes, we can build the applications using Tekton pipelines, we can store the artifacts in container registries. That's a lot of things that are just there for regular applications, and it would be nice to be able to use that for embedded as well. So how are we going to do that? Let's start with the sensor, the lowest level in the stack. The sensor firmware needs to be able to roll back to previous versions, which means we need some component, a bootloader, to load the application. It loads the primary active application, but you reserve some space on your device so that you can store an updated application. Then it's the bootloader's responsibility to switch between them and to persist some kind of state to know if the firmware is good or not. If it tries to run the new firmware and it doesn't work, it needs to make sure that it can load the previous one before allowing it to be overwritten. The way we've done this in our system is that we have a very minimal bootloader that is only able to swap these images. It doesn't support any networking, so the whole process of retrieving the firmware is done by the application itself, using a library that we provide.
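The four-day figure follows from simple ceiling division once you know the daily downlink budget. The roughly 16 KiB/day budget below is an assumption chosen to match the number in the talk, not The Things Network's actual fair-use policy:

```rust
/// Days needed to deliver a firmware image when the network caps how
/// many bytes a device may receive per day. Ceiling division: the
/// last partial day still counts as a day.
fn delivery_days(firmware_bytes: u32, bytes_per_day: u32) -> u32 {
    (firmware_bytes + bytes_per_day - 1) / bytes_per_day
}

fn main() {
    // 64 KiB of firmware under an assumed ~16 KiB/day budget
    println!("{} days", delivery_days(64 * 1024, 16 * 1024));
}
```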
And the advantage of this is that your application can continue to function while the firmware update is being downloaded, which might be important if you're taking four days to download it. So that's the bottom layer. Some devices need a gateway as well, which we won't cover here; we're just assuming that there is some gateway deployed. And finally, on the cloud side, the events from the device come to the cloud via the Drogue Cloud connectivity layer. You can use its protocol endpoints for HTTP, MQTT or CoAP to send events to Drogue Cloud, and you can report the current firmware status from the devices. When the device reports its firmware status, it gets sent to the protocol endpoints, which check the authentication against the device registry, and then these telemetry events are stored in Kafka. The consumer applications consume these events using Kafka directly, or via some integrations that the connectivity layer provides, like MQTT or WebSockets. The nice thing about this is that the applications don't have to care about how the data gets from the device; they can just consume it. And the devices don't really need to care about which application is consuming the data. The applications can be anything from a serverless function to a digital twin system to a firmware update service, which is what we'll demonstrate here. You can use Drogue Cloud to manage devices and gateways. It has a REST API and also a console. With this you can group your devices into applications, and you can also have a gateway that is registered to serve a given device, so ultimately you can do provisioning with this as well. For firmware updates we have made a delivery and build system called Drogue Ajour. This system is just like any other application, but it uses Drogue Cloud as the connectivity layer. The advantage of that is that it automatically gets support for all the different protocols that the connectivity layer supports for the devices.
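The bootloader behavior described earlier, swap, trial boot, and rollback, can be sketched as a tiny state machine. The names and states here are hypothetical and heavily simplified; the real bootloader persists more detail than a single flag:

```rust
/// Hypothetical persisted bootloader state for an A/B scheme.
#[derive(Clone, Copy, PartialEq, Debug)]
enum SwapState {
    Boot,   // nothing pending: boot the active slot
    Swap,   // an update was staged and trial-booted
    Revert, // trial image was bad: swap back to the old slot
}

/// Decision on the reset *after* a trial boot: if the new firmware
/// never marked itself good, roll back; if it did, make it permanent.
fn on_reset(state: SwapState, trial_confirmed: bool) -> SwapState {
    match state {
        SwapState::Swap if trial_confirmed => SwapState::Boot,
        SwapState::Swap => SwapState::Revert,
        other => other,
    }
}

fn main() {
    println!("{:?}", on_reset(SwapState::Swap, false));
}
```

The key property is that the old image is never overwritten until the new one has confirmed a successful boot, which is what makes "try again later" safe.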
Ultimately it means it can deliver firmware updates over MQTT, HTTP, CoAP or LoRaWAN, which is really nice. It uses a special, let's say small, type of message using the CBOR encoding, which allows the device to minimize the bandwidth it needs in order to exchange these updates. The device sends updates to the firmware delivery service about its current firmware, and the delivery service sends back a message saying you're up to date, or it sends the next firmware blob. The nice thing with this is that the update service doesn't really need to care where in the process each device is; it's up to the device what it wants to fetch next. And the delivery service can fetch the firmware from different sources. One is Eclipse hawkBit, which is another open source project that specializes in managing firmware and scheduling updates and so on. But we've also been playing around with the idea of storing firmware in container images, so that we can reuse all the mechanisms that exist for other cloud-native applications, using the build service and combining it with Tekton pipelines. So that's really what the update service is. It has a console so you can get an overview of your application's devices and your current builds. You can also see the progress of each device, and the information about the progress is stored in the Drogue Cloud device management system, so you can also get that using the REST APIs. If you're familiar with Kubernetes resources, the data follows a similar schema. You can also trigger builds from this console: if you have a build section specified in your firmware configuration, which we'll see later, you can trigger the builds from here. And with that, let's do a demo of the entire system. What we're going to do is have a micro:bit connected to a gateway, which can be anything from a Raspberry Pi Pico to a more powerful machine. The gateway forwards these events to Drogue IoT, which runs in the cloud.
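The status/command exchange above can be sketched as follows. The enum and field names are made up for illustration, and the real messages are CBOR-encoded rather than Rust values, but the decision the service makes is this simple:

```rust
/// Illustrative reply from the delivery service to a status report.
#[derive(Debug, PartialEq)]
enum Command {
    Synced,                          // device already runs the latest version
    Write { offset: u32, len: u32 }, // here is the next firmware block
    Swap,                            // transfer complete: reboot into it
}

/// The device reports its version and how far it has written; the
/// service only compares that against the latest build it knows.
fn next_command(reported: &str, latest: &str, offset: u32, total: u32, block: u32) -> Command {
    if reported == latest {
        Command::Synced
    } else if offset >= total {
        Command::Swap
    } else {
        // The last block may be shorter than the device's block size.
        Command::Write { offset, len: block.min(total - offset) }
    }
}

fn main() {
    println!("{:?}", next_command("v0", "v1", 96, 100, 16));
}
```

Because the device reports its own offset, the service stays stateless with respect to transfer progress, which is exactly why it "doesn't need to care where in the process each device is."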
It uses a container registry for storing the firmware updates, and we also use Tekton pipelines to build the firmware. So with that, let's go to the demo. In this demo we're going to use the Drogue IoT sandbox. This is a public service which you can use to build your IoT system. It basically contains the connectivity layer, so that you can send data from devices with HTTP, MQTT, CoAP or even LoRaWAN, and you can run applications on your own servers that connect to this to consume the events. So it's very much like a broker or an Apache Kafka instance, but with some additional, let's say, services on top: device management, authentication, and also the ability to integrate with other services like The Things Network. It runs on an OpenShift cluster and uses a managed Apache Kafka instance from Red Hat. You're free to use this. You can also run Drogue IoT on your own, of course; it's an open source project. But it might be easier to try it out this way for the first time. So we can log in to the console, and what we're going to do is create a device in the registry for our micro:bit here. Then we're also going to specify how to build the firmware for it and trigger a build, and eventually deliver that firmware to the micro:bit while it's operating, reporting temperature and so on as normal. Let's start by creating the device. We'll use the command line for it. We could also use the browser, but there's some functionality that you need the command line for, so we're going to start with that here and then move to the terminal. And there we have the ability to create the device. Let's have a look at what the device looks like in the browser. We can find our device here, and we have a special section in the device configuration which is about how we deliver firmware for the device. In this case we're going to store firmware in container images, just like regular applications.
And then we're also going to build it using Tekton pipelines, just like regular applications. Whenever the pipeline is finished running, it will publish the firmware to the internal registry on the OpenShift cluster, and that is used to deliver firmware to the devices. Now that we have the device ready, we can also go and trigger the build, which we'll do. We have the firmware console, which you can use to do that. In the firmware console we have an overview of the devices. We can see that our micro:bit is not yet in a known state, because we haven't really started it up. What we have is a new version of the firmware already ready to be built, so we're just going to build that. And then I'm going to edit the firmware and run the build locally, to make sure that we have something where we can see the change. So for our initial version here we're going to flash it, which means programming the bootloader, the Bluetooth driver and our application, so that the micro:bit will blink the letter A. And our firmware build will be built from the GitHub repository, which will cause it to blink B once it's updated. But in order to do that, we need the gateway in order to deliver the firmware to the device. We'll look at the browser again and see that our build is running. We also have the device reporting its state eventually, and we can also look at the temperature data from the device. It might take some time before the system reports the status. So there we have the temperature being reported. It seems like it's very hot here, but that's because of the unit: you need to divide it by 10 in order to get the actual temperature. This is in order to save space within a single byte. The temperature is being reported on a special channel named Foo. Channels are a way to multiplex data streams on top of the topics in Kafka, and we can use this to have a special channel for the firmware updates.
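The divide-by-10 trick amounts to encoding tenths of a degree in a single byte. Assuming an unsigned byte (the talk doesn't say which), that covers 0.0 to 25.5 °C; a real device would pick an offset or a wider type to extend the range:

```rust
/// Pack a temperature into one byte as tenths of a degree.
/// With u8 this only represents 0.0..=25.5 °C - an assumed encoding
/// for illustration, not necessarily what the demo firmware does.
fn encode_temp(celsius: f32) -> u8 {
    (celsius * 10.0).round() as u8
}

/// Reverse the encoding: divide by 10, as mentioned in the talk.
fn decode_temp(raw: u8) -> f32 {
    raw as f32 / 10.0
}

fn main() {
    let raw = encode_temp(21.5);
    println!("wire value {} decodes to {} °C", raw, decode_temp(raw));
}
```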
So now you can see it's reporting the current firmware status of the device. We can also see that the build has succeeded now, so we can actually see the device receiving the firmware updates from the system eventually. We can see it updated live here, and the target version is basically the Git revision. This is basically sending updates over the connectivity layer as commands to the device, which go via the gateway and back to the device using GATT services. And while this is happening, you can still see that the temperature is being reported from the device. So it's doing normal operation while performing the update over the DFU channel. If you remember, I mentioned that when the device is fully updated, it will blink the letter B instead. The reason for that is that in the source code we have committed that it's going to display B, but locally we've just flashed it to display the letter A, just so we can see and notice that there is a difference. Depending on your device bandwidth, this is going to be much faster or much slower. The device is actually the one determining the block size of each update. If you have a device that can receive larger batches of firmware in one go, your device can report that when it's reporting its firmware status: it also says which block size it supports, so that you can sort of let the device decide the speed of the update. Ultimately there is some delay between receiving the blocks, and it also takes into account that devices might need a lot of time to update. So it's not about getting the delivery as fast as possible; it's about getting it as reliable as possible. So now it's updated. You can see it stopped blinking, because it's now actually swapping the firmware, and it's rebooted and is now blinking the letter B instead. The update progress says 100 because it hasn't reported a successful boot yet. That's going to happen soon, hopefully. We can see if it starts reporting temperature.
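Letting the device pick the block size directly determines the number of round trips, and together with the per-block delay, the overall duration. A rough sketch with made-up timings:

```rust
/// Round trips needed to pull a firmware image block by block.
fn blocks_needed(firmware_bytes: u32, block_size: u32) -> u32 {
    (firmware_bytes + block_size - 1) / block_size
}

/// Very rough total duration: each block costs one round trip plus
/// whatever pause the device inserts between requests.
fn transfer_secs(firmware_bytes: u32, block_size: u32, secs_per_block: u32) -> u32 {
    blocks_needed(firmware_bytes, block_size) * secs_per_block
}

fn main() {
    // A constrained device asking for 256-byte blocks makes far more
    // round trips than one that can take 4 KiB at a time.
    println!("{} vs {} requests",
        blocks_needed(64 * 1024, 256),
        blocks_needed(64 * 1024, 4096));
}
```

Since the device drives the pull, a slow device simply takes longer without the service having to throttle anything.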
So now the device is functioning as expected, reporting temperature to the service. And we can also see it reported something on the DFU channel, which means it got synced. So now our device is in a synced state and we have delivered our first firmware update to it. Now we can make new changes and trigger the pipeline using the console if we want to, but also using the REST API, so we can build automation into this and build a full continuous delivery pipeline. So with that, let's take a look at what we just saw. We had our device, which in our case was talking Bluetooth via a gateway, sending temperature readings but also firmware status updates to the Drogue IoT connectivity layer. The connectivity layer stores that in Kafka. The firmware delivery service receives those events by consuming from Kafka; essentially it looks at the status of the firmware, compares it with what's stored in the firmware storage, and then sends back either OK, you are up to date, or the next firmware blob that the device should write, or it tells the device that your update is complete, you may proceed to reset and switch the firmwares. We've seen that in this case we used a container registry to store the firmware, but you can also use other registries if you wish. This is very similar to how non-embedded applications running in containers are built and deployed as well, so it's very much that flow. We also saw how the build service, which we triggered using the REST API, basically creates Tekton pipeline runs that build the firmware using a builder image and then publish it to the internal registry used for firmware storage. And once the new firmware was available, the device started updating via the gateway, and at the end the delivery service instructed the device to switch, and it came up reporting the new version, which was in sync with what is stored in the firmware storage.
So that's the whole end-to-end flow, and keep in mind that all of this was happening while the application was also reporting temperature and functioning as normal. You can also use this with direct Wi-Fi devices, or using LoRaWAN or any other technology that allows you to talk to the Drogue Cloud connectivity endpoints, because the update protocol itself is layered on top of those different transports. That's a nice feature we get for free by running this on top of the connectivity layer. It can be slow; if you fetched the firmware directly it would of course have been a lot faster, but this way the devices themselves determine how fast they consume the update. And that's really nice when you have a big fleet of devices that might have different capabilities, different connectivity and so on. So it's really important to be able to roll out the firmware gradually and independently for each device. To recap this presentation: we've seen how we've reduced risk by introducing the ability to roll back in the device firmware. We're using a bootloader for that, with an active and a passive partition. It's very similar to the A/B deployments that you get with regular applications, but it's not running both firmwares at the same time, of course. The device was connected using Bluetooth in our demo, but with Drogue Cloud you can connect any device. You can also easily integrate with different third-party services: we have an integration with LoRaWAN via The Things Network, and you can easily build your own applications that integrate with a legacy system or any third-party system that you might have. We've been using open-source tooling, of course. You've seen Kubernetes and OpenShift, and you've seen Apache Kafka, Postgres, Keycloak and Tekton. So it's really about gluing this together and providing the right abstractions for IoT. You can try this demo yourself with the repository you saw in the demo.
And you can also use the public sandbox, as I mentioned; it's freely available. In order to play with firmware updates, we've had to restrict this on a per-user basis, because otherwise you could potentially use a custom Bitcoin-mining builder image, and we don't want that. So if you want to play with firmware updates, get in touch with us and we can help you out with that. And yeah, you can go to these URLs and just log in with your GitHub IDs, and you should have examples there to get started. If you go to our website, there's a lot of documentation on how to use our APIs and how to deploy Drogue IoT yourself, but also how to use the command line utilities, even for firmware updates and things like that. For the device side, we are working with the Rust embedded community to create an ecosystem with drivers and examples that work. But you should be able to use Drogue Cloud with any type of device, as long as it talks the open standard protocols. And that's it. Thank you for watching.