Yes, so I'm going to give a talk about building a product with OP-TEE: actually using OP-TEE to build a product which you can deploy to production and be reasonably sure that nobody is going to leak the secrets you deploy into the product in the end.
A bit about me: my name is Rouven Czerwinski. I work for Pengutronix e.K. We do a lot of embedded software consulting, so things like board support packages, and we often work on embedded ARM platforms. You can find me on GitHub under the name Emantor, and you can also contact me by email at this email address. At Pengutronix, I do a lot of work with OP-TEE. I do a bit of system integration, so integrating OP-TEE into the board support package which will eventually be used in the field. I also do a bit of testing, because I wrote a testing framework about three years ago which we still use today to test platforms when we eventually want to do updates on them.
Let's get a short overview. I'm going to start with a short introduction to TrustZone and OP-TEE. Then I'm going to present the motivation for this talk. Then I'm going to show you the problems you run into when you're trying to use OP-TEE in a product. I'm going to present the solutions so far for the platform I'm working on, which should also be applicable to the platforms you may be working with. I'm going to come to a conclusion, and then there's going to be the happy outlook where I wish for things which are not implemented yet.
So let's start with TrustZone. TrustZone on 32-bit ARM divides our processor state into a normal world and a secure world, and then there's a secure monitor which allows us to switch between the two worlds. The usual CPUs come up in the secure world, then the bootloader starts up in the secure world. I deploy my trusted operating system, and the trusted operating system ensures that when it switches over to the normal world, my bootloader can continue to run.
And every time I want to access some kind of TrustZone data, or I want to call into my trusted operating system, I do an SMC, a secure monitor call, which goes into the secure monitor. The secure monitor then does all the sanitization between the normal and the secure world, ensuring no register contents are leaked, and then goes into the secure world. The secure world does some kind of computation. My trusted operating system runs my trusted application, which runs in the field, which may compute some secrets, may request some PIN entry from a secure device, whatever, and then returns the data back to the normal world.
The trusted operating system we are employing in the field is called OP-TEE, the Open Portable Trusted Execution Environment. It's an open source implementation of the GlobalPlatform TEE specifications, which use TrustZone. GlobalPlatform is a standards body which also standardized smart card interfaces. The idea is that you have a standard interface for your trusted applications, and then it doesn't matter which trusted operating system you're going to use in the end. OP-TEE also has support for various ARM platforms: there's some support for STM32, TI platforms are supported, Layerscape platforms from NXP are supported, and Broadcom platforms. For simple development use cases, for example, you can also use Raspberry Pis: there's support for the Raspberry Pi 3 to develop on OP-TEE there. My focus is especially on i.MX6 32-bit platforms, or specifically on i.MX6UL platforms.
So the motivation for the talk is that we want to secure OP-TEE and the trusted applications running within OP-TEE for production use. We want to be reasonably sure that we are not leaking our data in production anywhere, and we want to ensure that upstream OP-TEE can be used securely on i.MX6. All the changes I need to make to OP-TEE are not going to live downstream somewhere; they're going to go upstream into the project, so we don't have to maintain them alone and can do that in conjunction with the upstream OP-TEE maintainers. I also want to provide some guidance on which parts may be missing for other platforms. Because I'm specifically looking at i.MX6 platforms, I don't know how to implement certain parts for TI or STM platforms and how to solve certain problems there, but I can show you the problems and maybe you can come up with a solution.
So the problems are: which components do I need to secure OP-TEE? Which are the parts that need to be implemented by the platform, rather than by the OP-TEE core, to secure OP-TEE in production, and which part of this configuration is already done upstream? What's the part that I don't need to worry about, because it has already been done by the upstream maintainers, or maybe by the SoC manufacturer who upstreamed his changes to OP-TEE? And the next question is: which parts of this need to be managed by the system integrator, the person who in the end is going to assemble the whole system, including the kernel, the bootloader, and the trusted operating system?
Securing upstream OP-TEE, from my point of view, consists of the following five or optionally six points. You need to employ some kind of RAM protection, or you need to employ the OP-TEE pager. You need a hardware unique key for your platform. You need to seed the OP-TEE pseudo random number generator with a sufficiently random seed.
You need some kind of peripheral access configuration, depending on your platform. And you have to ensure that you have a trusted boot-up of your OP-TEE operating system, so that only the OP-TEE binaries you compiled can boot on the platform, because otherwise somebody could run another OP-TEE and thus leak your secrets on the platform. And you optionally want to employ some kind of storage rollback protection to ensure that nobody can fool you with old data from a previous installation.
So let's start with the RAM protection. One thing you can do for RAM protection is employ some kind of DDR firewall, which protects part of my RAM from access by the normal world. An example of this is the TrustZone controller TZC-380 hardware, which supports the configuration of multiple regions. So you can say, I'd like 32 megabytes at the very end of RAM to be allocated to the secure world, and the normal world can no longer access the memory there. Every time the normal world tries to access this memory, it will generate an interrupt to notify you that somebody tried to access the secure memory from the normal world, and the normal world will also read only zeros. For i.MX6 platforms, this TrustZone address space controller from ARM is used inside the platform. Other SoCs may use different DDR firewalls. There are also custom implementations from different vendors. I know that, for example, the HiKey platforms use a custom DDR firewall which is not very well documented as far as I know, or the documentation is hard to get to. But for these ARM standard implementations, you can just download the datasheet from ARM. There's already an upstream driver for these TrustZone controllers inside of OP-TEE. Previously you had to describe the regions and all the rest yourself, but this is no longer necessary because, depending on your platform configuration, for i.MX6 platforms at least, there's auto configuration.
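As a rough illustration of the region setup described above (a secure carve-out at the very end of RAM), here is a minimal Python sketch of the arithmetic involved. It is not the OP-TEE driver. The function names, the DRAM base of `0x80000000`, and the 512 MiB RAM size are assumptions made for the example; the size encoding follows the TZC-380 convention as I understand it, where a field value `n` means a region of `2**(n + 1)` bytes and 32 KiB is the smallest region.

```python
def tzc380_size_field(size_bytes: int) -> int:
    """Encode a region size as a TZC-380 style size field (size = 2**(field + 1))."""
    is_power_of_two = size_bytes > 0 and size_bytes & (size_bytes - 1) == 0
    if size_bytes < 32 * 1024 or not is_power_of_two:
        raise ValueError("region size must be a power of two, at least 32 KiB")
    return size_bytes.bit_length() - 2


def secure_region(dram_base: int, dram_size: int, secure_size: int) -> tuple[int, int]:
    """Place the secure region at the very end of RAM.

    Returns (region base address, encoded size field).
    """
    base = dram_base + dram_size - secure_size
    if base % secure_size:
        # Regions of this kind must be aligned to their own size.
        raise ValueError("region base must be aligned to the region size")
    return base, tzc380_size_field(secure_size)


# Hypothetical board: 512 MiB of RAM starting at 0x80000000, 32 MiB for the secure world.
base, size_field = secure_region(0x8000_0000, 512 << 20, 32 << 20)
print(hex(base), hex(size_field))
```

A real configuration would presumably also need a background region that keeps the rest of RAM accessible to the normal world; that part is left out of the sketch.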
Within OP-TEE, we already know the total memory size. We already know how much memory we want in the secure world. So we can just calculate what the region configuration inside this TrustZone controller should look like, then apply that configuration and be happy that nobody can access the memory. It's also important to remember that there may be some kind of bypass for this TrustZone controller. For i.MX6 platforms, as an example, there's a single bit in the IOMUX which you need to set to disable the bypass of the TrustZone controller. The bypass exists because there's a small reduction of bandwidth when going through the DDR firewall. You have to disable it, obviously, because otherwise you can just bypass the firewall and everybody can read your secure world memory, which is rather bad.
Next up, an alternative to using this RAM protection, or configuring the DDR firewall, is employing the OP-TEE pager, where you run a small part of OP-TEE inside the SRAM on your CPU and then encrypt all the other memory pages you want to store in normal memory. You still want to ensure that nobody can overwrite the memory in the end. But even if they can read the memory, it is still going to be encrypted and authenticated, so this does not necessarily require the DDR firewall. There are some constraints with this. Your device needs to have a sufficient amount of SRAM, so 128 kilobytes to 256 kilobytes. For us, the chosen i.MX6UL does not necessarily have enough SRAM, depending on the version of the processor you select. And for bigger variants which may have enough SRAM, there are constraints from other devices that may require the SRAM. Even on this i.MX6UL, there's a pixel co-processor you can use to do simple frame buffer work or simple 2D graphics on an external display. And this requires 128 times 32 bits of SRAM to store the frame buffers for your pixel pipeline.
So if you connect a display, your available SRAM is going to be smaller. And for bigger i.MX6 variants, there's the image processing unit or the GPU, which also may require SRAM to store data because DDR memory may be too slow for this. This is why we didn't choose the OP-TEE pager as the way to go and instead use the DDR firewall.
Next up is the hardware unique key. OP-TEE requires some kind of hardware unique key which is used to derive all the other keys on boot-up. This should be unique per device, which means that every device should have its own key, and subsequently once a device stores data it can only be decrypted by the very same processor. And obviously it should not be accessible from the normal world, because otherwise people can just derive the keys in the normal world and decrypt the data. For i.MX6 we actually use a trick with the Cryptographic Acceleration and Assurance Module, the CAAM. There's something called the master key verification blob, where you can say, I want to do a hash over the master key, which is unique per device, and then we use this hash as the hardware unique key. Additionally, in the CAAM there's a bit where you can lock out this generation, or rather where you can increment a counter, and the next one who does this master key verification blob is going to get a different hash. You are never going to get the same hash again unless you reboot the platform, and then the bit is unset again. This is still not in upstream OP-TEE. The pull request is closed at the moment because it needs to be rebased on the i.MX6/7 CAAM driver. NXP actually contributed a driver for the CAAM. The CAAM does more than this master key verification blob. You can also do AES encryption and decryption in hardware, so a lot of crypto acceleration is possible there. It also does hash algorithms. I'm currently in the process of rebasing this on the CAAM driver, and then we should have hardware unique key derivation for i.MX6 platforms.
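The idea behind the hardware unique key can be modeled in a few lines of Python. This is only a toy model under stated assumptions, not the CAAM interface: `MasterKeyBlobModel`, `verification_hash`, `lock`, and `derive_subkey` are hypothetical names, the SHA-256 over key plus counter stands in for whatever the hardware actually computes, and HMAC-SHA256 is chosen here purely for illustration of deriving further keys from one root.

```python
import hashlib
import hmac


class MasterKeyBlobModel:
    """Toy stand-in for a hardware block holding a per-device master key."""

    def __init__(self, device_master_key: bytes) -> None:
        self._key = device_master_key  # never leaves the "hardware"
        self._counter = 0              # reset only by a reboot (a new instance)

    def verification_hash(self) -> bytes:
        # Hash over the master key; the counter changes the result after lock().
        return hashlib.sha256(self._key + self._counter.to_bytes(4, "big")).digest()

    def lock(self) -> None:
        # After this, later callers can no longer reproduce the earlier hash.
        self._counter += 1


def derive_subkey(huk: bytes, usage: bytes) -> bytes:
    """Derive a purpose-specific key from the hardware unique key."""
    return hmac.new(huk, usage, hashlib.sha256).digest()


device = MasterKeyBlobModel(b"per-device secret burned at the factory")
huk = device.verification_hash()   # read once, early in secure-world boot
device.lock()                      # the normal world now gets a different hash
storage_key = derive_subkey(huk, b"secure-storage")
```

The point of the lock step is the ordering: the secure world reads its value first, and anything that asks later, including normal-world code, sees a value that is useless for decrypting the stored data.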
So this will be done soon.
Then the next one is RNG seeding. OP-TEE internally uses a software RNG. You can also use a hardware RNG, but due to constraints on our platforms we don't want to do that. With the software RNG you still require some kind of seed, and the default seed for a development environment is always zero. So it's rather predictable, which is not what you want for a product. For i.MX6 we have a true random number generator, again inside this CAAM block, and we can read out a seed for our pseudo random number generator at the very beginning of boot and then just seed the PRNG with it. This is not implemented yet. It's on my schedule, I'll get to it at some point, and then we should have good RNG seeding at the start of OP-TEE. Currently, if you start OP-TEE on an i.MX6 platform, it's going to complain with a loud warning about seeding the RNG with zeroes. Eventually I'd like to get rid of this and do real RNG seeding.
Then there's the problem of peripheral access configuration. SoCs usually have DMA masters besides the CPU, and the DDR firewall only protects you from accesses which are marked as normal world. If they are marked as a secure world access, your DDR firewall is just going to let the access through. And depending on how your platform works, those masters may actually be secure by default. So in the end I configure my DDR firewall correctly, I configure my bypass correctly, so my CPU's normal world process can't access my secure memory, which is great. But then I ask the GPU, hey, give me a DMA transfer from my secure memory, and the GPU happily says, here's your DMA transfer, what else do you want? So you need to ensure that all masters are configured as non-secure, and this is highly dependent on your platform. Read your reference manual and ensure that these access policies are configured correctly. For i.MX6 there's a unit inside the processor which is called the Central Security Unit, the CSU.
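The GPU problem just described can be made concrete with a small model. This is a hedged sketch only: the `Master` type, the master names, and the address range are made up for illustration, and the real firewall and CSU are of course hardware registers, not Python objects. What the sketch does capture is the rule from the talk: the firewall only blocks accesses that carry the non-secure attribute.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Master:
    name: str
    secure: bool  # attribute the master attaches to its bus transactions


# Hypothetical secure carve-out, 32 MiB at the end of RAM.
SECURE_START, SECURE_END = 0x9E00_0000, 0xA000_0000


def firewall_allows(master: Master, addr: int) -> bool:
    """Secure-marked accesses always pass; non-secure ones only outside the carve-out."""
    in_secure_region = SECURE_START <= addr < SECURE_END
    return master.secure or not in_secure_region


def leaking_masters(masters: list[Master], trusted: tuple[str, ...] = ("cpu",)) -> list[str]:
    """Names of masters that are marked secure although they should not be."""
    return [m.name for m in masters if m.secure and m.name not in trusted]


# Reset state on an imagined SoC: every master defaults to secure.
masters = [Master("cpu", True), Master("gpu", True), Master("sdma", True)]
print(leaking_masters(masters))  # masters that can read straight through the firewall

# The fix: mark everything except the CPU as non-secure.
fixed = [m if m.name == "cpu" else replace(m, secure=False) for m in masters]
print(leaking_masters(fixed))
```

A check like `leaking_masters` is the kind of audit you would do by hand against the reference manual: list every bus master on the SoC and confirm none of them is left in the secure state.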
There's a register inside of it where you can just say whether a DMA master is secure or non-secure, and for this we just configure everything as non-secure, except for the CPU obviously. And then everything is safe. If you want to do some kind of peripheral access from OP-TEE, however, it gets more interesting. If you want to do some kind of DMA while you are in the secure world, you may have to come up with some kind of API which is able to handle this case. We just configure everything as non-secure by default inside the i.MX6UL. And it's rather trivial to configure this register, because for your platform you just have to look inside the security reference manual, find out which peripherals are actually on your specific SoC, and then submit this. It should be really, really easy.
Next is trusted boot-up. You have to use your platform's version of verified or secure boot, because otherwise anybody could start other OP-TEE versions on your board and then just leak your secrets. Verified boot verifies the OP-TEE version to prevent replacements. For i.MX6 platforms there's High Assurance Boot, implemented by the SoC vendor inside the boot ROM. The boot ROM verifies our bootloader. The bootloader in turn contains a binary version of OP-TEE, so the bootloader is verified, OP-TEE is verified, and the very first thing the bootloader starts on startup is OP-TEE. We try to minimize the attack surface in this case. This is not something you can implement in upstream OP-TEE. This is something you need to ensure on the integration level, so it is something the system integrator has to do when he's designing the platform. Not only does he need to compile the kernel, compile the bootloader, compile OP-TEE and bundle it all into a fine bundle. He also needs to enable High Assurance Boot on the SoC and sign the bootloader and OP-TEE correctly.
And then he may also want to sign his user space and the Linux kernel to ensure that there's no untrusted software running on the SoC. If you're at it, why not do that as well?
And then there's, optionally, storage rollback protection. You can use the eMMC feature of replay protected memory blocks, which is an area where your writes are replay protected by a counter. OP-TEE implements this as a simple FAT file system. Many eMMCs have a replay protected memory block size of around four megabytes, which is enough to store some kind of data, or to store an encryption key for data as an example. This is already supported upstream. You have to enable this replay protected memory block file system, and then you have to deploy it one time during production to write a key, or rather to exchange a key between your processor and your eMMC. This is a one-time operation. You can't write this key again; it is then effectively fused into your eMMC. And then you run your normal OP-TEE, which does not do the key exchange afterwards. You also have to make sure to disable the emulation in the user space program which facilitates the communication to user space from the kernel side, because otherwise it provides an emulation layer meant for testing this in a development environment. For production devices, obviously you want to disable this.
So in conclusion, there's no platform which is totally secure at the moment in upstream OP-TEE. I'm slowly getting i.MX6 to a state where I can say, if you enable the correct switches and ensure that you use High Assurance Boot, your OP-TEE should be secure enough. Vendor implementations may include the necessary bits. Because OP-TEE is BSD 2-clause licensed, downstream vendors do not necessarily have to open source their implementations. But sometimes you can bug them, or sometimes they release them as well, and then you can look at their implementations.
But you still have to review all the code they implemented, get the reference manual, and then cross-reference between the implementation and what the reference manual says. And even then you sometimes have to test whether their implementation works correctly. In the end you have to get a real hardware platform, do all the different steps, and then actually test whether your memory accesses are now protected, or whether your DMA masters really can no longer access the secure world memory. So it requires a lot of validation and testing on your side.
And then I'll get to the outlook, or wish list. Currently we have a certain problem in that clock access and coordination between OP-TEE and Linux is kind of hard. I told you before that this Cryptographic Acceleration and Assurance Module is now supported within OP-TEE. But the kernel may disable the clocks for your cryptographic module, because the kernel is managing the clocks on the system. And if you try to access the crypto module in OP-TEE, because your TA wants to encrypt some kind of data, your platform is just going to get stuck, because there's a transaction on the bus but nobody is answering it. So you effectively have a denial of service attack there. I would like some kind of fix or some kind of framework for this. There's some work ongoing, but I think it targets more the 64-bit ARM platforms; I'll have to look into this again.
I also would like deeper device tree integration for OP-TEE. Currently there are a lot of platform device definitions within OP-TEE and there's not much parsing of device tree information going on. The device tree is basically just used to insert some properties for Linux to parse, so Linux knows that OP-TEE is available. But things like memory sizes or available peripherals can all be parsed from the device tree, and OP-TEE already includes the possibility to embed a device tree within OP-TEE itself.
So in theory you could do all the probing of devices from the device tree, as it's done within Linux at the moment.
And I would also like some more CI infrastructure to test each commit to OP-TEE master for i.MX6, because on the way here I tried to test the latest release, which was about a week ago, but unfortunately there was a last-minute fix going in and my platform doesn't boot at the moment on the latest release. So obviously we need some kind of CI there, and I need to allocate some time to get platforms up and running to ensure that the platforms don't get broken again.
Yeah, so that's the talk, and now I'd like to answer your questions. Yes please.
The firewalls always spoke about the firewall for OP-TEE, but is it also provisional?
Yes, so for i.MX6 platforms there's only this DDR firewall. STM platforms have a more advanced method of assigning peripherals to the secure or normal world or the small Cortex co-processor, and I'm not entirely sure, but I think there's a framework being talked about on how to coordinate this. Yes please.
Figured by the people building the product and I was wondering
Yes, this is so. All the configuration needs to be done at build time, and in the case of this replay protected memory blocks thing you also need two OP-TEE binaries. You need one OP-TEE binary which is basically configured the same as the other one but includes this write key provisioning, so it actually exchanges the key between the CPU and the eMMC, for manufacturing use. During manufacturing you would write the key and then afterwards deploy your normal OP-TEE, which no longer does the key exchange, so the key doesn't leak. Yes, Ahmed.
Yes, and you know there's some implementation discussion currently going on for OP-TEE, but I have not looked deeply into it yet. Other questions, yes.
No, I did not yet. I just got this merged like two days ago or three days ago when I did the talk, and I still have to verify that these DMA accesses are now forbidden.
I will probably just hack the GPU driver to try and issue a DMA request, but you could also probably do it using the URDMA capability, so the SDMA firmware should be able to do DMA transfers there. So I will test this, and then I will submit more pull requests to fix this if it doesn't work.
No, I unfortunately don't have open PCIe ports on the i.MX6UL, it's too small for that, but that would be a possibility on bigger platforms like the i.MX6Quad, which has a PCIe port. Yeah. Other questions? Yeah.
It's a part of the SoC in the system itself. It lives on the bus of the CPU.