Welcome to this presentation about software security based on isolation. Following security part 3 on STM32 security features, I'd like to go into more detail about ways to create isolated environments on embedded targets, and especially about TrustZone. First, I'd like to introduce the concept of isolation and what it is good for. Then I'll go through the means of isolation that we have today on STM32 and introduce TrustZone, a new feature from ARM that is integrated on STM32L5. At the end we'll go through the development flow and the C extension for TrustZone applications. Isolation concept. There is a color convention throughout this video: the secure project is color-coded green, the non-secure pink. The purpose of creating an isolation barrier between the secure and user projects is to protect the key secrets and assets. If a hacker gains access to the user application, they will still be able to cause mischief in those regions, but they will not be able to access any resource behind the isolation barrier. They cannot compromise the key and they cannot bypass the authentication checks, for example. Achieving this level of isolation requires support at the hardware level; there is no way to do it purely in software. In a typical application scenario, when the microcontroller resets, it boots into the secure part of memory and executes the secure boot. This is an immutable piece of firmware that checks the integrity and authenticity of the user application. If everything is okay, execution is passed to the user project. A second thing that is also very desirable is to export some of the secure functionality to the user project. This is the so-called legal API. Examples are crypto operations or secure storage. It is these interactions between the secure and user applications that are especially difficult, and this is where TrustZone brings a real benefit.
One way to create these isolated environments is with the MPU together with the privileged and unprivileged execution levels of the Cortex core. The MPU is a core peripheral that can create regions anywhere in the memory map. These regions can cover flash, SRAM, or even peripherals. Access to these regions can be allowed only to the core running with privileges, which means the non-secure application must run without these privileges all the time, which might cause issues in some cases. A second possible problem is that the MPU filters transactions only close to the core and does not cover other bus masters, such as the DMA. To work around this, the secure application must restrict access to the DMA configuration registers themselves, which in turn means the non-secure application cannot benefit from DMA streams. Another way to create isolated environments is the firewall. This is a proprietary ST peripheral that snoops the bus transactions close to the flash and close to the SRAM, so it sits below the bus matrix. Access to the protected regions is only possible once the firewall is open. If the firewall is closed and the non-secure application tries to execute or read the protected region, the firewall detects it and generates a reset. The configuration of the region is done by the secure application, and once it is finished, it cannot be changed until the next reset. The big advantage of the firewall compared to the MPU is that it also covers other bus masters, such as the DMA. There is a very specific procedure to open the firewall. The non-secure application needs to jump to a very specific address, and it is this well-defined entry point that makes the firewall secure. It is not possible for the non-secure application to jump randomly inside the secure region. It has to go through this very well-defined entry point. The firewall also has some constraints.
First of all, the interrupts must be disabled by the non-secure application before calling the legal API, that is, before opening the firewall. The second constraint is that it is not possible to protect or isolate peripherals. The firewall only creates regions in flash or SRAM. Another mechanism for isolation is secure flash. This mechanism is in fact very simple. It is implemented at the level of the flash interface. In the typical use case, the microcontroller boots after reset into the secure part of flash, where the security parameters are configured and the secure boot is executed. This is an immutable piece of code that checks the integrity and authenticity of the non-secure application. If the check passes, the flow of execution is handed to the non-secure application. At this point, the secure part of flash memory disappears from the memory map until the next reset. The obvious constraint is that it is very difficult to have common interactions between the secure and non-secure applications using only the secure flash mechanism. All these interactions have to go through reset. This might be feasible, for example, for secure firmware upgrade, a secure service that is called rather rarely and takes quite a long time on its own. But it is rather difficult to offer services that are meant to be called regularly and frequently. This table summarizes the isolation features with respect to different STM32 families. The secure memory, also called HDP, is present on L5, H7, G0 and G4. The MPU is on every family except F0. The firewall is present on L0 and L4. And the new microcontroller STM32L5 has secure memory, the MPU and also TrustZone, which is by far the best solution to isolate the secure and non-secure applications. TrustZone introduction on STM32L5. Let's now have a look at Cortex-M33 and TrustZone. Cortex-M33 is part of the ARMv8-M architecture, which adds an extra security state of execution. When the core is running in secure state, it has access to all the resources in the microcontroller.
On the other hand, when the core is running in non-secure state, it has limited visibility of resources, and these resources can be restricted with fine granularity. It is of course the job of the secure application running in secure state to define the split between the secure and non-secure worlds. It is possible to restrict access to multiple regions in flash and RAM. It is possible to restrict access to individual peripherals, and even to individual GPIO pins. Some of the core registers and some of the core peripherals are banked, which means they exist in two instances: one for the secure state, the other for the non-secure state. So there are two SysTick timers, two vector tables, two MPUs and two stacks with two separate stack pointers. The state switch is driven by hardware, which brings the benefit of real-time execution, meaning there is low interrupt latency, low switching overhead, and the state switch is deterministic. On the left we see the ARMv7-M architecture that we currently have on all other STM32 except L5. The core can run at two privilege levels. Handler mode basically means the core is running inside an interrupt service routine or handling an exception. Thread mode is the ARM term for running the background task. Exceptions and interrupts always run with elevated privileges. Thread mode can be either privileged or unprivileged depending on the software design. In ARMv8-M we add an extra security state, which is orthogonal, so we can have all four combinations of privilege level and security state. TrustZone is based on transaction filtering on the internal bus. There are two levels of filters. The top one is close to the Cortex-M33 core. These are the attribution units, the SAU and the IDAU. The second level of filters is on the slave side, so close to the targets, which in this case are flash, RAM, external memories and also peripherals.
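
To make the first filter level a little more concrete, here is a minimal host-side sketch in C of how an address could be attributed by the SAU and the IDAU together. It is a model for illustration only, not register-level code: the type and function names, the region table and the addresses are all made up, and the IDAU rule (address bit 28 set means secure non-secure-callable) is a placeholder, since the real IDAU mapping is fixed by the chip designer. The two rules it shows come from the ARMv8-M architecture: an address that matches no enabled SAU region is treated as secure, and the final attribute is the more secure of the SAU and IDAU results.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Ordered from least to most secure so that "more secure wins" is a max(). */
typedef enum {
    ATTR_NONSECURE  = 0,
    ATTR_SECURE_NSC = 1,  /* secure, non-secure callable */
    ATTR_SECURE     = 2
} sec_attr_t;

/* Hypothetical software model of one SAU region. */
typedef struct {
    uint32_t base;     /* first address of the region */
    uint32_t limit;    /* last address of the region (inclusive) */
    bool     nsc;      /* true: region is secure and non-secure callable */
    bool     enabled;
} sau_region_t;

/* SAU rule: inside an enabled region the address is non-secure (or NSC);
 * outside every region it is secure. */
static sec_attr_t sau_lookup(const sau_region_t *regions, size_t count, uint32_t addr)
{
    for (size_t i = 0; i < count; ++i) {
        if (regions[i].enabled && addr >= regions[i].base && addr <= regions[i].limit) {
            return regions[i].nsc ? ATTR_SECURE_NSC : ATTR_NONSECURE;
        }
    }
    return ATTR_SECURE;
}

/* Placeholder IDAU rule, assumed for this sketch only. */
static sec_attr_t idau_lookup(uint32_t addr)
{
    return (addr & (1u << 28)) ? ATTR_SECURE_NSC : ATTR_NONSECURE;
}

/* The final attribute is the more secure of the two lookups. */
static sec_attr_t final_attr(const sau_region_t *regions, size_t count, uint32_t addr)
{
    sec_attr_t s = sau_lookup(regions, count, addr);
    sec_attr_t i = idau_lookup(addr);
    return (s > i) ? s : i;
}

/* Made-up region table used to exercise the model. */
static const sau_region_t demo_regions[] = {
    { 0x20000000u, 0x2000FFFFu, false, true },  /* non-secure RAM            */
    { 0x30000000u, 0x300000FFu, true,  true },  /* non-secure callable stubs */
    { 0x30002000u, 0x30002FFFu, false, true },  /* SAU says NS, IDAU says NSC */
};

static sec_attr_t demo_attr(uint32_t addr)
{
    return final_attr(demo_regions, sizeof demo_regions / sizeof demo_regions[0], addr);
}
```

With this table, `demo_attr(0x30001000u)` comes out secure because no SAU region covers it, and `demo_attr(0x30002800u)` comes out secure non-secure-callable because the placeholder IDAU rule is stricter than the SAU entry.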
The access rules of TrustZone are rather complex, and you can find more details about them in the security MOOC part 3. Here I'd just like to highlight that TrustZone is not only about a new core; it is also about the new specification of the internal bus, AHB5, which adds a sideband signal that propagates the security state of the processor to the bus. This allows the protection controllers to filter transactions based on the state of the core. The protection controllers are distributed around the chip. There is one inside the flash memory interface. There is also the GTZC, the global TrustZone controller, which filters transactions to the RAM, external memories and also some of the peripherals. Some of the more advanced peripherals are able to process the security sideband signal on their own, and these are called TrustZone-aware. The global TrustZone security controller is an ST proprietary peripheral that can be found on STM32L5. It configures the secure areas inside SRAM and external memories, and it also configures the security of most of the peripherals. It also aggregates illegal accesses to the restricted regions. So if a non-secure application running in non-secure state tries to access a secure region or a secure peripheral, the GTZC gathers the illegal access signal and generates an interrupt towards the Cortex-M33 core, and then it is up to the secure application to handle this illegal access. The TrustZone architecture allows easy integration of other bus masters such as the DMA. The DMA is able to generate secure or non-secure bus transactions based on the security configuration, and this can be done at the level of individual DMA channels and also at the level of the source and destination address for one particular channel. The security configuration is performed through the DMA slave port, and this is typically done by the secure application during the initialization phase.
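
As a rough sketch of the two ideas just described, the following host-side C model combines a per-channel DMA security configuration with an illegal-access monitor that plays the role of the GTZC. Everything here is hypothetical: the struct fields, the function names and the demo objects are invented for illustration and do not correspond to STM32 register names. The model only flags the one case named above, a non-secure transaction hitting a secure target; how a secure transaction to a non-secure target is treated is left out of scope and simply allowed here, which is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-channel security configuration, set up by secure software. */
typedef struct {
    bool channel_secure;  /* channel belongs to the secure world           */
    bool src_secure;      /* source-side transfers are issued as secure     */
    bool dst_secure;      /* destination-side transfers are issued as secure */
} dma_channel_cfg_t;

/* Hypothetical stand-in for the illegal-access aggregation done by the GTZC. */
typedef struct {
    uint32_t illegal_count;  /* number of illegal accesses seen so far      */
    bool     irq_pending;    /* would raise an interrupt to the secure world */
} illegal_access_monitor_t;

/* A non-secure channel can only ever issue non-secure transactions. */
static bool dma_src_is_secure(const dma_channel_cfg_t *cfg)
{
    return cfg->channel_secure && cfg->src_secure;
}

static bool dma_dst_is_secure(const dma_channel_cfg_t *cfg)
{
    return cfg->channel_secure && cfg->dst_secure;
}

/* Slave-side filter: block and record a non-secure access to a secure target.
 * Assumption of this sketch: every other combination is let through. */
static bool filter_access(illegal_access_monitor_t *mon,
                          bool transaction_secure, bool target_secure)
{
    if (!transaction_secure && target_secure) {
        mon->illegal_count++;
        mon->irq_pending = true;
        return false;
    }
    return true;
}

/* One transfer is a read from the source followed by a write to the destination. */
static bool dma_transfer(illegal_access_monitor_t *mon, const dma_channel_cfg_t *cfg,
                         bool src_target_secure, bool dst_target_secure)
{
    bool src_ok = filter_access(mon, dma_src_is_secure(cfg), src_target_secure);
    bool dst_ok = filter_access(mon, dma_dst_is_secure(cfg), dst_target_secure);
    return src_ok && dst_ok;
}

/* Demo objects: a fully non-secure channel, and a secure channel that reads
 * from a secure source and writes to a non-secure destination. */
static illegal_access_monitor_t demo_monitor;
static const dma_channel_cfg_t demo_ns_channel    = { false, false, false };
static const dma_channel_cfg_t demo_mixed_channel = { true,  true,  false };
```

In this model, `dma_transfer(&demo_monitor, &demo_mixed_channel, true, false)` goes through, while the same transfer on `demo_ns_channel` is blocked and leaves `demo_monitor.irq_pending` set for the secure application to handle.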
On Cortex-M33 there are two separate vector tables, and each individual interrupt service routine can be assigned to either the secure or the non-secure world. The assignment and also the priority management are configured by the secure application after boot. In the example we have on the right, the interrupt service routines for the UART and SPI are assigned to the non-secure world, which means that when the peripheral generates an interrupt request, the core will execute this interrupt service routine in non-secure state. There are really no constraints on the security state switch. It can happen at any time, either when the core is running in thread mode, meaning the background tasks, or even when the core is running inside an interrupt service routine. The interrupt latency is slightly increased when switching from secure to non-secure. The reason is that some cleanup of the core registers is necessary in order not to leak information to the non-secure application. So apart from the hardware-driven context save of the core registers to the stack, there is also register clearing, which adds a couple of clock cycles to the standard 12. The same is true for function calls from secure to non-secure. Here too the latency is slightly increased because the registers need to be cleared. In this particular case it is not hardware-driven; instead, the compiler adds extra instructions that clear the core registers to zero. TrustZone, development flow. The development flow with TrustZone is also something new. The secure and non-secure projects are built separately. They are often developed by different teams, possibly also by different companies. The secure project can optionally export some legal API to the non-secure project. In this case, the output of the secure project build is a secure gateway library and the associated library headers. The non-secure project is then linked against this library.
So obviously the build order is secure project first, non-secure project second. The secure project does require support for TrustZone. It is necessary to use an extension to the C language to tell the compiler and linker to use the special instructions that are specific to the state switch. On the other hand, the non-secure project is unaware of security states, and linking against the secure gateway library is in fact the same as linking against any other type of library. TrustZone, interactions between the secure and non-secure application. Let's now have a look at the details of the security state switch from non-secure to secure. Let's imagine the secure application exports a legal API. It exports a function to decrypt some chunk of data that is provided by the non-secure application. By doing this, neither the decryption algorithm nor the key is exposed to the non-secure application. The non-secure application starts by branching into the non-secure callable region, which is already part of the secure memory. It has to branch to a very specific instruction called the secure gateway, SG. This is the well-defined entry point into the secure world, and this is what makes it secure. If the non-secure application tries to jump anywhere else than to an SG inside the non-secure callable region, there will be a security fault, and the secure application can handle this case. After the SG, the execution flow branches to the actual body of the decrypt function, and when the function is finished, the BXNS instruction passes the execution back to non-secure. So again, there is another state switch. You may ask why there are two regions, the secure and the non-secure callable. This adds an extra level of protection, because the non-secure callable region is the only place where the SG instruction can act as an entry point. It cannot do so inside the secure region.
So we have this very small, well-controlled non-secure callable region, and this is the only valid entry point into the secure world. So how do you actually do this from the software point of view? Let's say we have a function inside the secure application and we want to export it to the non-secure application. We simply add this intrinsic in front of the function definition, CMSE_NS_ENTRY, and this tells the compiler and the linker to use these special instructions, and it tells the linker to create the veneer inside the non-secure callable region. Let's now look at the opposite direction. Let's imagine the secure application is calling a function inside the non-secure application. In fact, this happens at least once in every application: after the reset and the initial configuration of security parameters, and after the secure boot, the flow of execution is passed to the non-secure application. In this very specific use case, we would not expect to ever come back to the secure application. But there are other use cases where this might be necessary. Again, there are new instructions: BXNS and BLXNS, the latter being branch with link to non-secure. When the execution of function B is finished, the execution is passed back to the secure world, and a small detail is that even the return address is not exposed to the non-secure application. The link register contains a magic value, and the actual return address is hidden and not visible to the non-secure application. And again, how can we do this inside the secure application? How can we call a non-secure function? Well, we first need to define the function pointer with this special type, the function pointer _ns type. Then we take the function pointer to the non-secure function and feed it as an input argument to the cmse_ns... macro. Then we type-cast again to this special typedef, and then we can call the function.
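
To illustrate both directions, here is a minimal sketch in C. It is not the code from the slides. The attributes `cmse_nonsecure_entry` and `cmse_nonsecure_call` and the macro `cmse_nsfptr_create` come from the Arm CMSE language extension (header `arm_cmse.h`, enabled with the compiler option `-mcmse`). The wrapper macros `SECURE_ENTRY`, `NS_CALL` and `MAKE_NS_PTR`, the type `ns_callback_fn` and all function names are hypothetical and chosen for this example only. When the file is compiled without CMSE support, the wrappers expand to nothing, so the same logic can be tried on a host machine; in that case no state switch takes place, of course.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CMSE) && (__ARM_FEATURE_CMSE == 3U)
/* Secure build for an ARMv8-M target (compiled with -mcmse). */
#include <arm_cmse.h>
#define SECURE_ENTRY   __attribute__((cmse_nonsecure_entry))
#define NS_CALL        __attribute__((cmse_nonsecure_call))
#define MAKE_NS_PTR(p) cmse_nsfptr_create(p)
#else
/* Host build: the attributes vanish, only the call structure remains. */
#define SECURE_ENTRY
#define NS_CALL
#define MAKE_NS_PTR(p) (p)
#endif

/* Special function type: calls through it leave the secure state. */
typedef void NS_CALL ns_callback_fn(uint32_t value);

/* Lives in secure memory; filled in at run time by the non-secure side. */
static ns_callback_fn *registered_cb = NULL;

/* Exported to the non-secure world: lets it tell the secure world where
 * its callback is placed, so the secure image needs no fixed address. */
SECURE_ENTRY void secure_register_callback(ns_callback_fn *cb)
{
    registered_cb = MAKE_NS_PTR(cb);
}

/* Exported to the non-secure world: does some work in the secure state,
 * then notifies the non-secure callback if one was registered. */
SECURE_ENTRY uint32_t secure_add_and_notify(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    if (registered_cb != NULL) {
        registered_cb(sum);  /* secure to non-secure call */
    }
    return sum;
}

/* In a real project this part belongs to the non-secure image; it is kept
 * in the same file here only so the sketch can run on a host. */
static uint32_t last_notified;

static void ns_on_result(uint32_t value)
{
    last_notified = value;
}
```

On the non-secure side the usage would simply be `secure_register_callback(ns_on_result);` followed by `secure_add_and_notify(2, 3);`, exactly like calling any other library function. A production entry function would also validate pointers received from the non-secure side before using them.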
So this is just the special syntax of this language extension for ARMv8-M. The interesting fact is that the function pointers are often initialized at run time, and the reason is the dependency. We want to get rid of any placement dependency of the non-secure project on the secure one. The secure application already knows that it will call some non-secure functions. However, it does not know their addresses; it does not know where these functions are placed. That is why the secure project exports a legal API, which is called by the non-secure application. This is the way the non-secure application can share the information about the actual placement of these functions. Conclusion. We have introduced all the existing isolation means on STM32, including the MPU, firewall, secure memory and TrustZone. We have also gone into quite a lot of detail about the integration of TrustZone into STM32L5. At the end, we introduced the development flow and also the C language extension necessary for TrustZone applications. Before we finish, I would like to draw your attention to some useful resources from ARM and also ST. On this landing page, you can find an introduction to TrustZone, a functional overview and more software-oriented application notes. If you are looking for specific details, I would suggest searching in the reference specifications and user manuals. I can also recommend AN5347, which details the TrustZone features on STM32L5, AN5156, which introduces security on STM32 microcontrollers, and also the reference manual for L5. I hope you enjoyed this video, and thank you for your attention.