So, let's get started with the low-power introduction on STM32L5. During this session, we want to speak about ultra-low power, so as a starting point, let's try to define the profile of a typical low-power application. When power supply is applied, the MCU starts to execute the code. The first part of the application is always related to the configuration of resources such as clocks, GPIOs, and peripherals. We can call this phase startup initialization. This phase typically takes place only once. Then we have a part of the application which is repetitive. It starts from an inactive phase. The MCU doesn't need to perform any task, so the goal is to reduce the consumption to the minimum. In consequence, the MCU stays in a deep low-power mode. Then the MCU receives an interrupt or event. It might come from the external world, for example from a GPIO or some communication interface. It might come as well from an internal source, for example a periodic wake-up timer. The interrupt is a trigger to wake up the MCU from low-power mode. That is why this phase is called wake-up. After waking up, the MCU performs some tasks. For example, it reads and processes data from a sensor. In consequence, this phase is an active phase. Once the activity is finished, the MCU goes back to low-power mode and the whole process repeats again. In each of these application phases, the MCU consumes current. So after taking into account all the phases, it's possible to calculate or measure the average current consumption of the application, which in turn translates to the lifetime of the battery. Now that we are familiar with the profile of a typical low-power application, let's try to identify a list of requirements for an ultra-low-power MCU and let's speak about the solutions on STM32L5 which address them. The first requirement for the MCU is computational performance. This feature is needed because it allows tasks in the active phase of the application to execute faster, which reduces the time spent in that phase.
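Going back for a moment to the average current mentioned above: the calculation is a time-weighted average over the phases of one cycle. Here is a minimal sketch in C, assuming each phase can be described by a duration and a constant current. The names phase_t, average_current_ua and battery_life_hours are made up for this illustration, and the battery estimate is idealized (it ignores self-discharge and capacity derating).

```c
#include <assert.h>
#include <stddef.h>

/* One phase of the repetitive cycle (hypothetical type for illustration). */
typedef struct {
    double duration_s;  /* time spent in this phase per cycle, in seconds */
    double current_ua;  /* current drawn during this phase, in microamps */
} phase_t;

/* Time-weighted average current over one cycle, in microamps. */
double average_current_ua(const phase_t *phases, size_t count)
{
    double charge_uas = 0.0;  /* accumulated charge, microamp-seconds */
    double period_s = 0.0;    /* total cycle time, seconds */

    for (size_t i = 0; i < count; i++) {
        charge_uas += phases[i].duration_s * phases[i].current_ua;
        period_s += phases[i].duration_s;
    }
    assert(period_s > 0.0);
    return charge_uas / period_s;
}

/* Idealized battery life in hours for a capacity given in mAh. */
double battery_life_hours(double capacity_mah, double avg_current_ua)
{
    assert(avg_current_ua > 0.0);
    return capacity_mah * 1000.0 / avg_current_ua;  /* mAh -> uAh */
}
```

With illustrative numbers (not datasheet values) of 0.9 s inactive at 10 µA and 0.1 s active at 1000 µA, the average is 109 µA, which is about 2000 hours from a 218 mAh cell. The short active phase dominates the result, which is why the first requirement, computational performance, matters.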
In order to address this requirement, STM32L5 offers a performant Cortex-M33 core, a system frequency of up to 110 MHz and the so-called ART Accelerator, a cache memory that speeds up the execution. The second requirement for the MCU is power efficiency. This feature is desired because it helps to reduce the current consumption in the active phase of the application. STM32L5 is equipped with an internal SMPS, which can significantly reduce consumption. In addition, scaling of the voltage and gating of the clocks that supply and feed the resources of the STM32L5 can further reduce the consumption in the active phase. The third requirement for the MCU is a large variety of low-power modes. They help in achieving low current consumption in the inactive phase of the application. Here STM32L5 offers multiple low-power modes, starting from Sleep mode up to very deep low-power modes like Standby or Shutdown. We will take a deeper look at these modes later on. Another MCU requirement is wake-up time. A short wake-up time helps to reduce the transition period between the inactive and active phases of the application. Depending on the low-power mode, STM32L5 can offer a wake-up time down to 14 clock cycles. Finally, the last requirement for the MCU is a set of smart peripherals. Typically, they can help to reduce the current consumption in the inactive phase. Here STM32L5 offers a wide range of smart low-power peripherals. Amongst them we can find the analog watchdog on the ADC, the Low-Power UART, the Low-Power Timer or the DMA. As a summary of this slide, we can say that STM32L5 offers effective solutions to address all the requirements of an ultra-low-power MCU and to optimize each phase of a low-power application. In this section, we will have a close look at the performance and efficiency of computation on STM32L5. One of the key factors that affect the performance but also the power efficiency of a microcontroller is the system-level architecture. Most applications run their code from either internal flash or an external one, and they sometimes place the very critical code into SRAM.
But it's often the flash that limits the performance at higher frequencies. However, this was largely mitigated on STM32L5 thanks to the low number of flash wait states in the first place and also due to the cache that resides between the Cortex core and the memories. So what you see in the picture are the four bus masters on the chip, which are the Cortex core and the three DMA controllers. On the right you see the slaves, which are the internal flash and SRAMs, all the peripherals and also the external memories. The cache resides between the core and the bus matrix, and it caches instructions and data from internal memories and, when remapped, even from external memories. When the instructions and data are inside the cache, they can be accessed at zero wait states even at the maximum frequency of 110 MHz. This not only affects the performance but also the power efficiency, and that's for two reasons. Firstly, the code executes faster, so the application can spend more time in one of the various low-power modes. Secondly, the access to the cache is in fact more power efficient than accessing the original memory. There are many ways to measure the computational performance of an MCU. One way to do this is to simply state the core frequency or the number of instructions that are executed each second. The Dhrystone benchmark was one of the first that tried to tie performance to real code. But today the industry standard is CoreMark from EEMBC. Its source code is freely available and can be ported to virtually any architecture. The code uses algorithms and data manipulations that can be found in common real applications, such as list processing, matrix manipulation or state machines that are based on if or switch statements. The outcome of CoreMark is a single number that represents the number of CoreMark iterations executed each second. You can also encounter the CoreMark score per MHz, which defines the computational efficiency.
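The two numbers just described are simple ratios. A minimal sketch in C follows, assuming the iteration count and the elapsed time have already been measured on the target. The function names are made up for this illustration and are not part of the CoreMark sources.

```c
#include <assert.h>

/* CoreMark score: iterations completed per second of run time. */
double coremark_score(unsigned long iterations, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return (double)iterations / elapsed_s;
}

/* Computational efficiency: score normalized by core frequency in MHz. */
double coremark_per_mhz(double score, double core_hz)
{
    assert(core_hz > 0.0);
    return score / (core_hz / 1e6);
}
```

For example, 4000 iterations in 10 seconds give a score of 400, and at a 100 MHz core clock that is 4.0 CoreMark/MHz. These are illustrative numbers, not measured STM32L5 results.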
The CoreMark score is influenced by the core itself. In the case of STM32L5 this is the Cortex-M33. The score increases with frequency, but this relationship is not necessarily linear, especially when the code is executed from flash. On L5, however, the relationship is linear, and that's thanks to the low number of wait states for the flash and the cache that accelerates code execution. It's also worth mentioning that every CoreMark measurement should be accompanied by the exact version of the compiler that was used and also the optimization flags. The CoreMark code is designed in a way that prevents any elimination of code at compile time, so all the computation is actually performed at runtime. In the graph you see the dependency of the CoreMark score on the frequency of the core for STM32L5 and for one of our competitors that is also based on the Cortex-M33. In both cases the code is executed from internal flash. At low frequency the performance is almost the same, but at higher frequencies the flash wait states and the lack of cache start to limit our competitor. L5 reaches at 80 MHz a similar performance to what the other device reaches at 150 MHz. And this has significant implications for the consumption, because the same amount of computational work can be done at a lower frequency and therefore at much lower consumption. CoreMark can also be executed from SRAM, which is always at zero wait states at all frequencies, so you might encounter very different results for code executed from flash and from SRAM, and it's something to be aware of. So how can we evaluate the power efficiency of computation? What's often used for this purpose is microamps per MHz, in other words how much current is consumed at a given core frequency. And there are two issues with this. First of all, megahertz is a poor metric for performance, because performance depends on many other factors apart from frequency, such as the flash and cache design.
Second of all, as you'll see later, STM32L5 includes an internal SMPS, so a unit of energy provides a better and more accurate description than just the current at a certain VDD voltage. The better way to compare two products is to measure the energy that is required to compute a certain task, for example one CoreMark iteration. In fact, EEMBC provides a standardized method to measure the consumption while running CoreMark. So the ULPMark-CM is the inverse of the energy required per one CoreMark iteration. The inverse is taken just because we intuitively expect that a higher value expresses better performance. To compare two products, we should always make sure that we compare at the same level of performance and not just at the same frequency. The STM32L5 integrates an internal SMPS, which is an optional feature available on specific part numbers. From the bill of materials point of view, it requires only a few passive components: one inductor and two capacitors. The SMPS greatly improves the power efficiency, especially at high VDD voltage. The SMPS is configurable at runtime. When the MCU enters one of the deeper low-power modes, in fact Stop 1 or deeper, the SMPS is automatically switched off, and it's re-enabled once the MCU wakes up. The SMPS supplies the internal LDO, which in turn supplies the whole digital domain: the CPU, digital peripherals and memories. So this is the end of the theory part, so let's now move to the practical demonstration.
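As a recap of the energy-based comparison described in the theory part, here is a minimal sketch in C of the energy per CoreMark iteration and its inverse. It assumes the supply voltage, the average supply current and the CoreMark score are already known. The function names are made up for this illustration, and the result is only the plain inverse in iterations per millijoule; the official ULPMark-CM measurement procedure and score scaling are defined by EEMBC and are not reproduced here.

```c
#include <assert.h>

/* Average power in milliwatts from supply voltage (V) and current (mA). */
double power_mw(double vdd_v, double current_ma)
{
    return vdd_v * current_ma;
}

/* Energy for one CoreMark iteration, in microjoules.
 * score is in iterations per second, so mW / (iter/s) = mJ per iteration. */
double energy_per_iteration_uj(double vdd_v, double current_ma, double score)
{
    assert(score > 0.0);
    return power_mw(vdd_v, current_ma) / score * 1000.0;
}

/* Inverse metric: iterations per millijoule, where higher is better. */
double iterations_per_mj(double vdd_v, double current_ma, double score)
{
    double p_mw = power_mw(vdd_v, current_ma);
    assert(p_mw > 0.0);
    return score / p_mw;
}
```

With illustrative numbers (not STM32L5 measurements) of 3.0 V, 6.0 mA and a score of 360, one iteration costs 50 µJ, or 20 iterations per millijoule. Because the voltage is part of the calculation, two devices drawing the same current at different supply voltages no longer look identical.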