Hello, and welcome to this presentation of the STM32WB flash memory. All STM32WB flash features will be presented. The STM32WB embeds up to one megabyte of single-bank flash memory. The flash memory interface manages all memory accesses (read, program, and erase), as well as memory protection, security, and option bytes.

Applications using this flash memory interface benefit from its high performance together with low-power access. It has a small erase granularity and a short programming time. The STM32WB flash memory provides various security and protection mechanisms for code and data, covering read and write access.

The STM32WB's flash memory has several key features. It has up to one megabyte of single-bank flash memory. The erase granularity, which corresponds to the page size, is only 4 kilobytes. A page, bank, or mass erase operation requires only 22 milliseconds, and the programming time is only 82 microseconds for a double word. The adaptive real-time memory accelerator, with an instruction cache, a data cache, and a prefetch buffer, allows a linear performance in relation to frequency. The flash memory supports error correction code, or ECC, which is 8 bits long for each 64-bit double word. A single error is detected and corrected. A double error is detected but not corrected.

The flash memory contains 256 pages of 4 kilobytes each. Each page is made of eight rows of 512 bytes. Next to the main memory block, there is an information block which contains three parts. The first part is the system memory, which is reserved for the STMicroelectronics bootloader. When selected, the device boots in system memory to execute the bootloader. The second part is a one-kilobyte, one-time programmable (OTP) area. The OTP area cannot be erased, and a double word can be written only once. If any bit of a double word is at zero, the entire double word can no longer be written, except with the value all zeros. In other words, programming a previously programmed double word is only allowed when programming all zeros.
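
The OTP write rule just described can be captured in a few lines of C. This is only a sketch of the rule as stated in this presentation, not of the hardware itself: the function name `otp_write_allowed` is hypothetical, and it assumes that an erased (never written) double word reads as all ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an erased 64-bit flash double word reads as all ones. */
#define DOUBLE_WORD_ERASED 0xFFFFFFFFFFFFFFFFull

/*
 * Return true if writing new_value over the current content of an OTP
 * double word follows the rule from the presentation:
 *  - a double word that is still erased can be written once with any value;
 *  - a double word with at least one bit at zero accepts only all zeros.
 */
bool otp_write_allowed(uint64_t current, uint64_t new_value)
{
    if (current == DOUBLE_WORD_ERASED) {
        return true;            /* first and only free write */
    }
    return new_value == 0;      /* already programmed: all zeros only */
}
```

For example, a double word that already holds some data can still be overwritten with zeros, which is a way to invalidate it, but it cannot be given any other value.
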
The last part contains the option bytes for configuring user options.

This slide shows the flash memory map. There are 256 pages for the main memory, starting from page zero. The page number is used in the software procedure to erase a page.

The flash memory embeds an error correction code function to ensure robust memory integrity and safety. The ECC is 8 bits long for a 64-bit double word. In the case of a single error, it is corrected: the ECCC bit is set in the flash ECC register and an interrupt is generated if it is enabled. In the case of a double error, it is detected but not corrected: the ECCD bit is set in the flash ECC register and a non-maskable interrupt is generated. When an ECC error is detected, the failure address and associated bank are saved in the flash ECC register.

The programming granularity is 64 bits. In fact, it's 72 bits with the 8-bit ECC. There are two programming modes: standard mode for the main memory and OTP, and fast mode for the main memory only. In standard mode, the flash memory checks that the double word is erased before launching the programming. In fast mode, 64 double words are programmed without verifying the flash location.

The flash memory programming time is only 82 microseconds for a 64-bit double word. To program one page of 4 kilobytes, 41.8 milliseconds are needed in standard mode and 30.4 milliseconds in fast mode. Programming the complete flash memory requires 8 seconds in fast mode. The page erase time is 22 milliseconds. It also requires only 22 milliseconds to erase the complete flash memory. The short programming and erase times, plus the small page size, make it convenient for data EEPROM emulation.

A fast programming mode allows programming of 64 double words faster than in standard programming mode. Only the main memory can be programmed in fast programming mode. The flash memory address location contents are not verified by hardware before programming in fast mode.
The 64 double words must be written successively. The high voltage is kept on the flash memory for all programming. The maximum time between two double word write requests is the programming time, which is approximately 20 microseconds. Consequently, interrupts should be disabled to ensure that the 20 microseconds between two double word write requests is not exceeded. The clock frequency must be at least 8 megahertz in fast programming mode.

This slide compares standard and fast programming modes. Standard mode can be used to program the main memory and OTP areas, while fast mode cannot be used for OTP programming. Standard mode programs one 64-bit double word (8 bytes) at a time, whereas fast mode only programs 64 double words (512 bytes) at a time. In fast mode, the address location content is not checked before programming, the flash clock frequency must be at least 8 megahertz, and CPU interrupts are prohibited. It takes 5.2 milliseconds to program 512 bytes in standard mode and 3.8 milliseconds in fast mode.

The flash memory is guaranteed for a minimum of 10,000 cycles up to 105 degrees Celsius. Data retention is 30 years after 10,000 cycles at 55 degrees Celsius, 15 years after 10,000 cycles at 85 degrees Celsius, and 10 years after 10,000 cycles at 105 degrees Celsius. It is 30 years after 1,000 cycles at 85 degrees Celsius, 15 years after 1,000 cycles at 105 degrees Celsius, and 7 years after 1,000 cycles at 125 degrees Celsius.

In order to read the flash memory, the number of wait states to be inserted in a read access must be configured, depending on the clock frequency. The number of wait states also depends on the voltage scaling range. In range 1, the flash memory can be accessed up to 64 MHz with 3 wait states, and with 0 wait states up to 18 MHz. In range 2, it can be accessed up to 16 MHz with 2 wait states.
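
As a rough illustration of the wait-state rule above, the following C sketch picks a wait-state count from the voltage scaling range and the clock frequency. Only the end points come from this presentation (range 1: 0 wait states up to 18 MHz and 3 wait states up to 64 MHz; range 2: 2 wait states up to 16 MHz). The intermediate thresholds, and the names `vos_range_t` and `flash_wait_states`, are assumptions made for the example and should be checked against the reference manual before use.

```c
#include <assert.h>
#include <stdint.h>

/* Voltage scaling ranges named in the presentation. */
typedef enum { VOS_RANGE_1 = 1, VOS_RANGE_2 = 2 } vos_range_t;

/*
 * Return the number of flash wait states for a clock frequency in MHz,
 * or -1 if the frequency is too high for the selected range.
 *
 * From the presentation: range 1 -> 0 WS up to 18 MHz, 3 WS up to 64 MHz;
 *                        range 2 -> 2 WS up to 16 MHz.
 * Assumed (not stated in the presentation): the intermediate limits
 * 36 and 54 MHz in range 1, and 6 and 12 MHz in range 2.
 */
int flash_wait_states(vos_range_t range, uint32_t hclk_mhz)
{
    /* Highest frequency (MHz) allowed with 0, 1, 2, ... wait states. */
    static const uint32_t range1_max[] = { 18u, 36u, 54u, 64u };
    static const uint32_t range2_max[] = { 6u, 12u, 16u };

    const uint32_t *limits;
    int count;

    if (range == VOS_RANGE_1) {
        limits = range1_max;
        count = 4;
    } else {
        assert(range == VOS_RANGE_2);
        limits = range2_max;
        count = 3;
    }

    /* Pick the smallest wait-state count whose limit covers the clock. */
    for (int ws = 0; ws < count; ws++) {
        if (hclk_mhz <= limits[ws]) {
            return ws;
        }
    }
    return -1;
}
```

With these tables, `flash_wait_states(VOS_RANGE_1, 64)` returns 3 and `flash_wait_states(VOS_RANGE_2, 16)` returns 2, matching the figures quoted above.
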
Thanks to the adaptive real-time memory accelerator, the ART accelerator, the program can be executed with 0 wait states independent of the clock frequency. This provides an almost linear performance in relation to frequency and allows you to reach 80 Dhrystone MIPS at 64 MHz. The ART accelerator brings outstanding performance and reduces dynamic power consumption.

For the Cortex-M4, it consists of a 1-kilobyte instruction cache, a 256-byte data cache, and a prefetch buffer. For the Cortex-M0+, it consists of a 32-byte instruction cache, a 32-byte data cache, and a prefetch buffer. The Cortex-M4 instruction cache contains 32 lines of 4 double words and the data cache has 8 lines of 4 double words. Once all the instruction cache lines have been filled, the LRU, or least recently used, policy is used to determine the line to replace in the instruction cache. This feature is particularly useful when code contains loops. This architecture is chosen to provide the best trade-off between cache size, power consumption, and performance. After each miss, the cache is updated with only the requested double word in order to limit flash accesses and save power. As a result, the 4 double words of a line may not all be valid.

In case of a miss, the CPU takes the instruction directly from the flash memory. In parallel, the 64-bit line is copied into the current buffer, and into the instruction cache if it is enabled. So, the next sequential access is taken directly from the current buffer. If prefetch is enabled, another 64-bit flash memory access is performed to fill the prefetch buffer with sequential data.

When the data is present in the current buffer, the CPU reads the current buffer. The next sequential read is performed in the prefetch buffer, which is copied into the current buffer so that it is free to be filled with the next sequential data. If the data is not present in the current buffer, it is read from the prefetch buffer if it is present there.
If not, it is read from the instruction cache if there is a cache hit. Otherwise, a flash access is performed. Flash access arbitration between Cortex-M4 I-Code instructions, Cortex-M4 D-Code data, and Cortex-M0+ S-Bus instructions and data uses round robin.

The instruction cache behaves differently depending on whether or not the prefetch buffer is enabled. If the prefetch buffer is enabled, the ART instruction cache behaves like a branch cache. The cache is modified each time a branch or a jump occurs in the execution flow. Sequential accesses are served by the current instruction buffer and the prefetch buffer. Each time the prefetch buffer is hit, its contents are transferred to the current instruction buffer and a new flash access is performed to fill the prefetch buffer. In this case, the cache content is not altered. If the prefetch buffer is disabled, the ART instruction cache behaves like a normal cache. Since no prefetch buffer is available, even a sequential access will modify the cache content.

The power and performance trade-off must be evaluated for each application to know whether it is better to enable or disable the prefetch buffer. For most applications, enabling the prefetch buffer slightly increases performance, but at the cost of higher consumption. Most of the time, the best energy efficiency is provided with the caches enabled and the prefetch buffer disabled, as this often reduces the number of flash memory accesses.

This slide shows the number of cycles needed to execute sequential 16-bit instructions without prefetch when two wait states are needed to access the flash memory. Every flash access provides 64 bits, or four instructions. Two wait states are therefore inserted every four instructions, at every flash access.

This slide shows the number of cycles needed to execute sequential 16-bit instructions with prefetch enabled when two wait states are needed to access the flash memory.
After each flash access, another flash access is performed to fill the prefetch buffer. So, after all instructions are fetched from the current buffer, the next sequential instruction is read from the prefetch buffer and no wait state is inserted as long as the instruction flow is sequential.

Several flash memory options can be configured using the option bytes. The readout protection is configured using the RDP option byte. The readout protection prohibits any access to the flash memory, the SRAM2, and the backup registers by the debug interface, when booting from SRAM1, or when the bootloader is selected. The proprietary code readout protection is configured using the PCROP option bytes. These options protect specific code areas from any read or write access. The code can only be executed. The protected areas can be defined with a two-kilobyte granularity and two areas can be defined. The write protection is configured using the WRP option bytes. These options protect specific code areas from unwanted write and erase accesses. The write-protected areas can be defined with a four-kilobyte granularity. The Cortex-M0+ security is configured using the FSD option bit. This option secures a specific flash memory area for exclusive Cortex-M0+ access. The Cortex-M0+ security area can be defined with a four-kilobyte granularity. Please refer to the specific trainings about system protections and Cortex-M0+ security for more details about these protection options.

Cortex-M0+ security is enabled by clearing the flash security disable option. The secure flash starts from the address in the secure flash start address option. In addition to the flash memory, the SRAM2A and SRAM2B security can also be enabled by the secure backup RAM disable, secure backup RAM start address, secure non-backup RAM disable, and secure non-backup RAM start address options.
Security peripherals like the advanced encryption standard accelerator, the public key accelerator, and the true random number generator may be secured by register bits in the system configuration controller.

Several user option bytes are available in the flash memory to configure certain specific features of the device. The user option bytes are loaded in two cases: either after a power or brownout reset, including when exiting from standby or shutdown modes, or when the OBL_LAUNCH bit is set in the flash control register. Three option bits are used to configure the brownout reset threshold. Three options are available to prohibit or allow the stop, standby, and shutdown low-power modes. Four options configure whether the watchdogs are enabled by hardware or after a software configuration, and whether the independent watchdog is frozen or not in stop and standby modes. Three options are used together with the BOOT0 pin to configure the memory used for booting. Two options are used to configure whether the SRAM2 is erased on system reset and to enable the SRAM2 parity check. One option is used to define the common memory area in SRAM2 for inter-processor communication data buffers.

Several option bytes are used for memory protection options: RDP for readout protection, PCROP for the start and end addresses of two areas, and WRP for the start and end addresses of each of the two areas. The PCROP_RDP bit is used to preserve or erase the PCROP area when the readout protection is lowered from level one to level zero.

The Cortex-M0+ secure memory areas are defined by SFSA in flash memory, by SBRSA in SRAM2A, and by SNBRSA in SRAM2B. The Cortex-M0+ reset vector is defined by SBRV and C2OPT. Debugging of the Cortex-M0+ is disabled by DDS. The Cortex-M0+ secure memory area, boot, and debug disable options have exclusive Cortex-M0+ write access. They can be read by the Cortex-M4 to provide information on the secure memory areas.
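
To illustrate how the Cortex-M4 might use these readable options, here is a minimal C sketch that decides whether a flash address falls in the Cortex-M0+ secure area. It rests on several assumptions that are not spelled out in this presentation: the SFSA value is taken to count 4-kilobyte pages from the start of the main memory, the secure area is taken to run from that page to the end of the flash, and the main memory is taken to be mapped at 0x08000000. The function and macro names are hypothetical, and the option values are passed in as plain parameters rather than read from hardware registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_BASE_ADDR  0x08000000u  /* assumed main memory base address */
#define FLASH_PAGE_SIZE  0x1000u      /* 4-kilobyte pages (from the talk) */
#define FLASH_PAGE_COUNT 256u         /* 256 pages, 1 megabyte in total   */

/*
 * Return true if addr lies in the Cortex-M0+ secure flash area.
 *  sfsa_page: secure flash start address option, assumed to be a page index
 *  fsd:       flash security disable option; security is enabled when it
 *             is cleared (false), as stated in the presentation
 */
bool flash_addr_is_secure(uint32_t addr, uint32_t sfsa_page, bool fsd)
{
    assert(sfsa_page <= FLASH_PAGE_COUNT);

    const uint32_t flash_end =
        FLASH_BASE_ADDR + FLASH_PAGE_COUNT * FLASH_PAGE_SIZE;

    if (fsd) {
        return false;                     /* security disabled */
    }
    if (addr < FLASH_BASE_ADDR || addr >= flash_end) {
        return false;                     /* not a main flash address */
    }
    /* Assumed: the secure area runs from the SFSA page to the flash end. */
    return addr >= FLASH_BASE_ADDR + sfsa_page * FLASH_PAGE_SIZE;
}
```

With an SFSA value of 0xF0 and security enabled, the address 0x080F0000 is reported as secure while 0x080EFFFF is not.
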
Four interrupts can be generated by the flash memory. The end of operation interrupt is triggered when one or more flash program or erase operations complete successfully. The operation error interrupt is triggered when a flash memory program or erase operation fails. The read error interrupt is triggered when an address read through the core data bus belongs to an area of the flash protected by the PCROP option. The ECC interrupt is triggered when a single ECC error is detected and corrected. When a double ECC error is detected, a non-maskable interrupt is generated.

The flash memory's consumption can be reduced when the code is not executed from the flash. The flash clock can be gated off in run and low-power run modes. It can also be configured to be gated off in sleep and low-power sleep modes. The flash clock is configured in the reset and clock controller. It is enabled by default. The flash memory can be configured in power-down mode during the sleep and low-power sleep modes. It can also be configured in power-down mode during run and low-power run modes when the code is executed from SRAM. Gating the clock and putting the flash memory in power-down mode significantly reduces power consumption.

In run and low-power run modes, the flash memory is active. Its clock can be disabled if code is executed from SRAM and the flash memory is in power-down mode. In sleep and low-power sleep modes, the flash clock can be disabled and the flash memory configured in power-down mode. In stop zero, stop one, and stop two modes, the flash clock is off. The content of the flash interface registers is retained. In standby and shutdown modes, the content of the flash interface registers is lost and must be reinitialized after exiting the mode.

The performance of the flash memory is almost linear with the frequency using the ART accelerator.
The CoreMark score is 212.5 at 64 MHz, which corresponds to 3.32 CoreMark per MHz with the instruction and data caches enabled and the prefetch buffer disabled. In range two at 16 MHz, the performance is 2.48 CoreMark per MHz with the instruction and data caches enabled and the prefetch buffer disabled. The energy efficiency at 64 MHz with the switched-mode power supply off is 26.6 CoreMark per milliamp, and in range two, energy efficiency only lowers to 21.4 CoreMark per milliamp at 16 MHz.

The flash memory is shared between the Cortex-M4 and the Cortex-M0+. Both CPUs use the flash memory to execute instructions. Thanks to the ART accelerator, sharing the flash memory has minimal impact on performance. During simultaneous code execution, the Cortex-M4 CoreMark per MHz score is 3.28 at 64 MHz with the Cortex-M0+ at 32 MHz, the instruction and data caches enabled, and the prefetch buffer disabled. This translates into a Cortex-M4 performance loss of only 1.3%. The Cortex-M4 CoreMark per MHz score is 2.48 at 16 MHz with the Cortex-M0+ at 16 MHz. This translates into a Cortex-M4 performance loss of only 0.04%.

Flash memory program and erase operations are only possible in power range 1. In range two and in low-power modes, flash memory program and erase operations are prohibited. Due to the single-bank flash memory architecture, program and erase operations will block execution for both CPUs. To prevent flash memory operations from impacting real-time CPU performance, they can be suspended. As long as the suspend is active, no new operations will be started, guaranteeing that execution can continue. If a flash operation was already started before the suspend, it will be completed. Each CPU can request a flash operation suspend using its own suspend register bit.

This is a list of peripherals related to the flash memory. Please refer to these peripheral trainings for more information if needed.
For more details, please refer to application note AN2606 about the STM32 microcontroller system memory boot mode.