Hello and welcome to this presentation of the STM32L4 flash memory, which covers all of its features. Please note that this presentation was written for STM32L47x/48x devices. Unless otherwise specified, key differences with other devices are indicated at the end of the presentation.

The STM32L4 embeds up to 1 megabyte of dual bank flash memory. The flash memory interface manages all memory accesses (read, programming and erasing) as well as memory protection and option bytes. Applications using this flash interface benefit from its high performance together with low power access. It supports read while write, has a small erase granularity and a short programming time, and allows dual bank booting.

The STM32L4's flash memory has several key features. It has up to 1 megabyte of dual bank flash memory with a read while write capability: one bank can be programmed or erased while code executes from the other bank. The erase granularity, which corresponds to the page size, is only 2 kilobytes. A page, bank or mass erase operation requires only 22 milliseconds, and the programming time is only 82 microseconds for a double word. The adaptive real-time memory accelerator (ART accelerator), with an instruction cache, a data cache and a prefetch buffer, allows a linear performance in relation to frequency. The flash memory supports an error correction code, or ECC, which is 8 bits long for each 64-bit double word. A single error is detected and corrected. A double error is detected but not corrected.

The flash memory is divided into two banks, each having a main memory block containing 256 pages of 2 kilobytes each. Each page is made of 8 rows of 256 bytes. Each main memory block has an information block which contains three parts. The first part is the system memory, which is reserved for use by STMicroelectronics and contains the bootloader. When selected, the device boots in system memory to execute the bootloader. The second part is a 1 kilobyte one-time programmable (OTP) area.
This area is located in bank 1 only. The OTP area cannot be erased and can be written to only once. If one bit is at zero, the entire double word can no longer be written, even with the value zero. The last part contains the option bytes for configuring user options.

This slide shows the flash memory map. There are 256 pages in bank 1, starting from page 0, and 256 pages in bank 2, starting from page 256. The MSB of the page number corresponds to the bank number. The page number is used in the software procedure to erase a page.

The flash is a dual bank memory with read while write and dual bank boot capability, so the device can boot from either bank 1 or bank 2. The BFB2 option in the user option bytes selects the dual bank boot mode. When the BFB2 option is set, the device boots from either bank 1 or bank 2, depending on which bank is valid. When the BFB2 option is cleared, the device always boots from bank 1. The DUALBANK option selects either a single bank or a dual bank configuration for the 256 kilobyte and 512 kilobyte device part numbers. For instance, when dual bank is selected for 512 kilobyte devices, 128 pages are in bank 1 and 128 pages are in bank 2. The first page number in bank 2 is always page 256, regardless of the device's memory size, as the page number MSB refers to the bank number.

With a dual bank memory, it is possible to read from one bank while programming or erasing the other bank. Code execution is not stopped while the flash memory is being programmed. When programming or erasing data in the same bank, the AHB is stalled as long as the flash memory controller is busy. Using the FB_MODE bit in the system configuration memory remap register, the two flash bank addresses can be swapped. When this bit is cleared, bank 1 is mapped at address 0x0800 0000 and aliased at address 0. When this bit is set, bank 2 is mapped at address 0x0800 0000 and aliased at address 0, which allows the device to boot into bank 2.
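The page numbering described above can be checked with a small host-side model. This is only a sketch: the function name `locate` is hypothetical, and it assumes that the two banks are equal in size and contiguous in the address map, so that bank 2 starts halfway through the main flash memory. It is not device code and does not touch any register.

```python
FLASH_BASE = 0x08000000   # main flash base address (bank 1 when FB_MODE is cleared)
PAGE_SIZE = 2 * 1024      # 2-kilobyte pages, as stated in the presentation
BANK2_FIRST_PAGE = 256    # first page number of bank 2, as stated in the presentation


def locate(address, flash_size=1024 * 1024):
    """Return (bank, page_number) for an address in a dual bank main flash.

    Assumption: both banks have the same size and are contiguous, so
    bank 2 starts at FLASH_BASE + flash_size // 2.
    """
    offset = address - FLASH_BASE
    if not 0 <= offset < flash_size:
        raise ValueError("address is outside the main flash memory")
    bank_size = flash_size // 2
    bank = 1 if offset < bank_size else 2
    page_in_bank = (offset % bank_size) // PAGE_SIZE
    # Bit 8 of the page number (its MSB) carries the bank number,
    # so bank 2 pages always start at 256 whatever the flash size.
    page_number = page_in_bank + (BANK2_FIRST_PAGE if bank == 2 else 0)
    return bank, page_number
```

For example, under these assumptions `locate(0x08040000, flash_size=512 * 1024)` lands on the first page of bank 2 of a 512 kilobyte device, which is still numbered 256 even though bank 1 only holds pages 0 to 127.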
The dual bank boot allows a safe firmware upgrade, as the previous firmware version is still present in the other memory bank. The dual bank boot is managed by the bootloader. The device boots in bank 2 using the BFB2 option bit programmed in the flash option bytes. The boot pin and boot option are configured for booting from flash memory. If the BFB2 option bit is cleared, the device boots in flash bank 1. If the BFB2 option bit is set, the device boots in the system flash memory. The bootloader checks the first address of the bank, as it must read the stack pointer at that location. If the value at the first address of bank 2 is a valid SRAM address, the bootloader swaps the banks to remap bank 2 at address 0 and jumps into bank 2. If it is not valid, the bootloader swaps the banks to remap bank 1 at address 0 and jumps into bank 1. Note that the bootloader uses resources in SRAM1 from address 0x2000000 to address 0x2000100, so this SRAM area must not be used by the application when the BFB2 option bit is set.

The flash memory embeds an error correction code (ECC) function to ensure robust memory integrity and safety. The ECC is 8 bits long for a 64-bit double word. A single error is corrected: the ECCC bit is set in the flash ECC register and an interrupt is generated if it is enabled. A double error is detected but not corrected: the ECCD bit is set in the flash ECC register and a non-maskable interrupt is generated. When an ECC error is detected, the failure address and associated bank are saved in the flash ECC register.

The programming granularity is 64 bits. In fact, it is 72 bits with the 8-bit ECC. There are two programming modes: standard mode for the main memory and OTP, and fast mode for the main memory only. In standard mode, the flash memory checks that the double word is erased before launching the programming. In fast mode, 32 double words are programmed without verifying the flash location.
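Returning to the dual bank boot sequence, the bank selection rule can be sketched as a small decision function. The names `is_valid_stack_pointer` and `select_boot_bank` are hypothetical, and the SRAM1 base address and size used for the validity check are assumptions made for illustration. The real bootloader's exact validity test is not given in this presentation.

```python
SRAM1_BASE = 0x20000000   # assumption: SRAM1 base address
SRAM1_SIZE = 96 * 1024    # assumption: SRAM1 size, used only for this sketch


def is_valid_stack_pointer(value):
    """Assumed validity test: the initial stack pointer lies in SRAM1.

    The upper bound is inclusive because an initial stack pointer
    commonly points just past the last SRAM byte.
    """
    return SRAM1_BASE < value <= SRAM1_BASE + SRAM1_SIZE


def select_boot_bank(bfb2, bank2_first_word):
    """Return the bank (1 or 2) the device ends up running from."""
    if not bfb2:
        # BFB2 cleared: the device always boots from bank 1.
        return 1
    # BFB2 set: the bootloader inspects the first word of bank 2.
    return 2 if is_valid_stack_pointer(bank2_first_word) else 1
```

With this model, an erased bank 2 (first word reading `0xFFFFFFFF`) falls back to bank 1, which is the behavior that makes the firmware upgrade safe.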
The flash programming time is only 82 microseconds for a 64-bit double word. To program one page (2 kilobytes), 20.9 milliseconds are needed in standard mode and 15.3 milliseconds in fast mode. Programming a complete bank requires 3.9 seconds in fast mode. The page erase time is 22 milliseconds. Erasing one or both banks also requires only 22 milliseconds, as both banks can be erased simultaneously. The short programming and erase times, plus the small page size, make the flash memory convenient for data EEPROM emulation.

The fast programming mode programs 32 double words faster than the standard programming mode. Only the main memory can be programmed in fast programming mode. The contents of the flash address location are not verified by hardware before programming in fast mode. The 32 double words must be written successively. The high voltage is kept on the flash memory for all programming. The maximum time between two double word write requests is the programming time, which is approximately 20 microseconds. Consequently, interrupts should be disabled to ensure that the 20 microseconds between two double word write requests is not exceeded. The clock frequency must be at least 8 megahertz in fast programming mode.

This slide compares standard and fast programming modes. Standard mode can be used to program the main memory and OTP areas, while fast mode cannot be used for OTP programming. Standard mode programs one 64-bit double word (8 bytes) at a time, whereas fast mode only programs 32 double words (256 bytes) at a time. In fast mode, the address location content is not checked before programming, the CPU clock frequency must be at least 8 megahertz, and interrupts are prohibited. It takes 2.6 milliseconds to program 256 bytes in standard mode and 1.9 milliseconds in fast mode.

The flash memory is guaranteed for a minimum of 10,000 cycles up to 105 degrees Celsius.
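The programming times quoted above follow from simple arithmetic, which the sketch below reproduces. The function names are hypothetical, and the inputs are the rounded figures from the presentation (82 microseconds per double word, 15.3 milliseconds per page in fast mode), so the results are approximations rather than datasheet values.

```python
DOUBLE_WORD_BYTES = 8      # 64-bit programming granularity
PAGE_BYTES = 2 * 1024      # 2-kilobyte page
PAGES_PER_BANK = 256
T_PROG_US = 82             # rounded double word programming time from the text
T_PAGE_FAST_MS = 15.3      # fast mode page programming time from the text


def standard_time_ms(num_bytes):
    """Standard mode: one double word is programmed at a time."""
    double_words = num_bytes // DOUBLE_WORD_BYTES
    return double_words * T_PROG_US / 1000


def fast_bank_time_s():
    """Fast mode: time to program every page of one bank."""
    return PAGES_PER_BANK * T_PAGE_FAST_MS / 1000


print(f"256 bytes, standard mode: {standard_time_ms(256):.1f} ms")
print(f"one page, standard mode:  {standard_time_ms(PAGE_BYTES):.1f} ms")
print(f"one bank, fast mode:      {fast_bank_time_s():.1f} s")
```

The 256-byte and full-bank results round to the quoted 2.6 milliseconds and 3.9 seconds. The page result comes out marginally above the quoted 20.9 milliseconds, presumably because 82 microseconds is itself a rounded value.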
Data retention is 30 years after 10,000 cycles at 55 degrees Celsius, 15 years after 10,000 cycles at 85 degrees Celsius, and 10 years after 10,000 cycles at 105 degrees Celsius. It is 30 years after 1,000 cycles at 85 degrees Celsius, 15 years after 1,000 cycles at 105 degrees Celsius, and 7 years after 1,000 cycles at 125 degrees Celsius.

To read the flash memory, the number of wait states inserted in a read access must be configured according to the clock frequency. The number of wait states also depends on the voltage scaling range. In range 1, the flash memory can be accessed at up to 80 MHz with 4 wait states, and with 0 wait states up to 16 MHz. In range 2, it can be accessed at up to 26 MHz with 3 wait states. Thanks to the adaptive real-time memory accelerator, the ART accelerator, the program can be executed with 0 wait states independent of the clock frequency. This provides an almost linear performance in relation to frequency and allows the device to reach 100 Dhrystone MIPS at 80 MHz.

The ART accelerator brings outstanding performance and reduces dynamic power consumption. It consists of a 1 kilobyte instruction cache, a 256-byte data cache, and a prefetch buffer. The instruction cache contains 32 lines of four double words, and the data cache has 8 lines of four double words. Once all the instruction cache lines have been filled, the LRU (least recently used) policy determines which line to replace in the instruction cache. This feature is particularly useful when code contains loops. This architecture was chosen to provide the best trade-off between cache size, power consumption, and performance. After each miss, the cache is updated with only the requested double word in order to limit flash accesses and save power. As a result, the four double words in a line may not all be valid. In case of a miss, the Cortex-M4 core takes the instruction directly from the flash memory.
In parallel, the 64-bit line is copied into the current buffer, and into the instruction cache if it is enabled, so the next sequential access is taken directly from the current buffer. If prefetch is enabled, another 64-bit flash access is performed to fill the prefetch buffer with sequential data. When the data is present in the current buffer, the CPU reads the current buffer. The next sequential read is served from the prefetch buffer, whose content is copied into the current buffer so that the prefetch buffer is free to be filled with the next sequential data. If the data is not present in the current buffer, it is read from the prefetch buffer if it is present there. If not, it is read from the instruction cache if there is a cache hit. Otherwise, a flash access is performed.

The instruction cache behaves differently depending on whether or not the prefetch buffer is enabled. If the prefetch buffer is enabled, the ART instruction cache behaves like a branch cache. The cache is modified each time a branch or a jump occurs in the execution flow. Sequential accesses are served by the current instruction buffer and the prefetch buffer. Each time the prefetch buffer is hit, its contents are transferred to the current instruction buffer and a new flash access is performed to fill the prefetch buffer. In this case, the cache content is not altered. If the prefetch buffer is disabled, the ART instruction cache behaves like a normal cache. Since no prefetch buffer is available, even a sequential access modifies the cache content.

The power and performance trade-off must be evaluated for each application to know whether it is better to enable or disable the prefetch buffer. For most applications, enabling the prefetch buffer slightly increases performance, but at the cost of higher consumption. Most of the time, the best energy efficiency is obtained with the caches enabled and the prefetch buffer disabled, as this often reduces the number of flash accesses.
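The lookup order described above (current buffer, then prefetch buffer, then instruction cache, then flash) can be illustrated with a small model. It is a simplification under stated assumptions: the function names are hypothetical, each buffer is represented only by the address of the 64-bit line it holds, and the instruction cache is reduced to a set of valid line addresses. It does not model buffer refills, the LRU policy or timing.

```python
LINE_BYTES = 8  # one flash access returns a 64-bit line


def line_address(address):
    """Align an address down to the 64-bit line that contains it."""
    return address & ~(LINE_BYTES - 1)


def fetch_source(address, current_line, prefetch_line, cached_lines):
    """Return which stage serves an instruction fetch at `address`.

    current_line / prefetch_line: line address held by each buffer,
    or None if the buffer is empty or disabled.
    cached_lines: set of line addresses valid in the instruction cache.
    """
    line = line_address(address)
    if current_line is not None and line == current_line:
        return "current buffer"
    if prefetch_line is not None and line == prefetch_line:
        return "prefetch buffer"
    if line in cached_lines:
        return "instruction cache"
    return "flash"
```

Passing `None` for `prefetch_line` models the prefetch buffer being disabled, in which case every fetch outside the current buffer falls through to the cache or the flash memory.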
This slide shows the number of cycles needed to execute sequential 16-bit instructions without prefetch when three wait states are needed to access the flash memory. Every flash access provides 64 bits, or four instructions. Three wait states are therefore inserted at every flash access, that is, every four instructions.

This slide shows the number of cycles needed to execute sequential 16-bit instructions with prefetch enabled when three wait states are needed to access the flash memory. After each flash access, another flash access is performed to fill the prefetch buffer. So, after all instructions have been fetched from the current buffer, the next sequential instruction is read from the prefetch buffer and no wait state is inserted as long as the instruction flow is sequential.

Several flash memory protection options can be configured using the option bytes. The readout protection is configured using the RDP option byte. The readout protection prohibits any access to the flash memory, the SRAM2 and the backup registers by the debug interface, when booting from SRAM1, or when the bootloader is selected. The proprietary code protection is configured using the PCROP option byte. This option protects a specific code area from any read or write access. The code can only be executed. The protected area can be defined with 64-bit granularity and one area can be defined in each bank. The write protection is configured using the WRP option byte. This option protects specific code areas from unwanted write access. The write protected area can be defined with 2 kilobyte granularity. Please refer to the specific training about system protections for more details about these protection options.

Several option bytes are available in the flash memory to configure certain specific features of the device. The user option bytes are loaded in two cases.
Either after a power reset, that is, a brownout reset or an exit from standby or shutdown modes, or when the OBL_LAUNCH bit is set in the flash control register. Three option bits are used to configure the brownout reset threshold. Three options are available to prohibit or allow the stop, standby and shutdown low power modes. Four options configure whether the watchdogs are enabled by hardware or after a software configuration, and whether the independent watchdog is frozen or not in the stop and standby modes. Two options are used to enable dual bank boot and to configure the 512 kilobyte or 256 kilobyte devices in dual bank configuration. The nBOOT1 option is used together with the BOOT0 pin to configure the memory used for booting. Two options are used to configure whether the SRAM2 is erased on a system reset and to enable the SRAM2 parity check.

Several option bytes are used for memory protection options: RDP for the readout protection, PCROP for the start and end addresses of the protected area in each bank, and WRP for the start and end addresses of each of the two areas of each bank. The PCROP_RDP bit is used to preserve or erase the PCROP area when the readout protection is changed from level 1 to level 0.

Four interrupts can be generated by the flash memory. The end of operation interrupt is triggered when one or more flash program or erase operations have completed successfully. The operation error interrupt is triggered when a flash memory program or erase operation has failed. The read error interrupt is triggered when an address read through the core data bus belongs to an area of the flash protected by the PCROP option. The ECC interrupt is triggered when one ECC error is detected and corrected. When two ECC errors are detected, a non-maskable interrupt is generated.

The flash memory's consumption can be reduced when the code is not executed from the flash. The flash clock can be gated off in run and low power run modes.
It can also be configured to be gated off in sleep and low power sleep modes. The flash clock is configured in the reset and clock controller. It is enabled by default. The flash memory can be configured in power down mode during the sleep and low power sleep modes. It can also be configured in power down mode during run and low power run modes when the code is executed from SRAM. Gating the clock and putting the flash memory in power down mode significantly reduces power consumption.

In run and low power run modes, the flash memory is active. Its clock can be disabled if code is executed from SRAM and the flash memory is in power down mode. In sleep and low power sleep modes, the flash clock can be disabled and the flash memory configured in power down mode. In stop 0, stop 1 and stop 2 modes, the flash clock is off. The content of the flash interface registers is retained. In standby and shutdown modes, the content of the flash interface registers is lost and must be reinitialized after exiting the mode.

The performance of the flash memory is almost linear with the frequency using the ART accelerator. The CoreMark score is 268 at 80 MHz, which corresponds to 3.35 CoreMark per MHz when the instruction cache and data cache are enabled and the prefetch buffer is disabled. In range 1 at 80 MHz, the performance is 3.32 CoreMark per MHz when the instruction and data caches are enabled but the prefetch buffer is disabled. When the ART accelerator is disabled, the performance is only 1.55 CoreMark per MHz. In terms of energy efficiency, enabling the caches is very beneficial: the results are 24.4 CoreMark per milliamp when the ART accelerator is enabled and 13.2 CoreMark per milliamp when it is disabled. In range 2, energy efficiency rises to 28.6 CoreMark per milliamp at 26 MHz.

This is a list of peripherals related to the flash memory. Please refer to these peripheral trainings for more information if needed.
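As a quick cross-check of the CoreMark figures quoted above, the ratios can be recomputed from the numbers in the presentation. The helper names are hypothetical and the inputs are the rounded values from the text, so this is only a sanity check, not a measurement.

```python
def coremark_per_mhz(score, frequency_mhz):
    """Normalize a CoreMark score by the clock frequency in MHz."""
    return score / frequency_mhz


def gain(with_art, without_art):
    """Ratio between a figure with the ART accelerator and without it."""
    return with_art / without_art


print(f"268 CoreMark at 80 MHz: {coremark_per_mhz(268, 80):.2f} CoreMark/MHz")
print(f"performance gain from ART:       x{gain(3.32, 1.55):.2f}")
print(f"energy efficiency gain from ART: x{gain(24.4, 13.2):.2f}")
```

On these figures the ART accelerator roughly doubles the performance per MHz and improves the CoreMark per milliamp by a somewhat smaller factor.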
For more details, please refer to application note AN2606 about the STM32 microcontroller system memory boot mode.

This slide presents the key differences between baseline STM32L47x/48x devices and other devices. The maximum size of the flash memory and the number of banks differ for each device.