Hello and welcome to this presentation of the ARM Cortex-M4 core, which is embedded in all products of the STM32G4 microcontroller family. STM32G4 microcontrollers integrate an ARM Cortex-M4 core in order to benefit from the performance of its 32-bit processor architecture and its particularly high level of deterministic processing.

All Cortex-M CPUs have a 32-bit architecture. The Cortex-M3 was the first Cortex-M CPU released by ARM. Then ARM decided to distinguish two product lines, high performance and low power, while maintaining compatibility between them. The Cortex-M4 belongs to the high-performance product line.

The processor core implements a Harvard architecture: it supports concurrent instruction fetches and data load and store transactions. The instruction pipeline has three stages: fetch, decode and execute. Conditional branch execution is accelerated by fetching the target instruction early.

SIMD instructions operate on packed data. For instance, two 12-bit samples acquired with the ADC can be stored in the two half-words of the same 32-bit register. In the example described on this slide, two pairs of samples are multiplied and then accumulated into a destination register. Since digital signal processing is based on sums of products, SIMD instructions increase performance compared with regular scalar fixed-point instructions.

The Cortex-M4 present in the STM32G4 implements the optional single-precision floating-point unit, which is compatible with the IEEE 754 standard. Add, subtract and multiply instructions take one clock to execute. Multiply-and-accumulate instructions take three clocks, and divide and square root instructions take 14 clocks.

The Cortex-M4 has neither a cache nor internal RAM. Consequently, every instruction fetch and data access is steered to the internal bus matrix.
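To make the SIMD example described earlier more concrete, here is a minimal sketch in portable C of a dual 16-bit multiply-accumulate on packed half-words. It is only a model for illustration, not the instruction itself: the function names `pack_halfwords`, `dual_mac_model` and `dot_product_packed` are hypothetical, and the wrap-on-overflow behavior is an assumption of the sketch. On a real target, one would more likely use a compiler intrinsic such as the CMSIS `__SMLAD`; that mapping is also an assumption here and is not stated in the presentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack two signed 16-bit samples into one 32-bit word:
   'lo' occupies bits [15:0], 'hi' occupies bits [31:16]. */
static uint32_t pack_halfwords(int16_t lo, int16_t hi)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Portable model of a dual 16-bit multiply-accumulate:
   returns acc + lo(x) * lo(y) + hi(x) * hi(y).
   The sum is formed in 64 bits and truncated to 32 bits, so this
   model wraps on overflow (an assumption of the sketch). */
static int32_t dual_mac_model(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t x_lo = (int16_t)(uint16_t)(x & 0xFFFFu);
    int16_t x_hi = (int16_t)(uint16_t)(x >> 16);
    int16_t y_lo = (int16_t)(uint16_t)(y & 0xFFFFu);
    int16_t y_hi = (int16_t)(uint16_t)(y >> 16);
    int64_t sum = (int64_t)acc + (int32_t)x_lo * y_lo + (int32_t)x_hi * y_hi;
    return (int32_t)(uint32_t)sum;
}

/* Sum of products over two sample arrays, two samples per step.
   'n' is assumed to be even; a trailing odd sample is ignored. */
static int32_t dot_product_packed(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        uint32_t x = pack_halfwords(a[i], a[i + 1]);
        uint32_t y = pack_halfwords(b[i], b[i + 1]);
        acc = dual_mac_model(x, y, acc);
    }
    return acc;
}
```

With samples (100, -200) packed in one word, coefficients (3, 4) in another, and an accumulator of 10, the model returns 10 + 300 - 800 = -490. Each loop step of the dot product handles two sample pairs, which is where the gain over scalar fixed-point code comes from.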
The Cortex-M4's internal bus matrix selects the output AHB-Lite master port according to the address and the access type, instruction or data. Three AHB transactions can be in progress at a time: for instance, an instruction access from flash memory using the ICode master port, a constant data access from flash memory using the DCode master port, and an SRAM access using the System master port. This internal bus matrix is connected to the STM32G4 MCU's AHB bus matrix, enabling the CPU to access memories and peripherals. Since transactions are pipelined on AHB-Lite, the best throughput is 32 bits of data or instructions per clock, with a minimum latency of two clocks.

One of the outputs of the Cortex-M4's bus matrix is the private peripheral bus, or PPB, which is internal to the CPU. It is used to access the memory-mapped registers present in the NVIC, MPU and debug units.

In the Cortex-M4 core, the memory protection unit, or MPU, is used to protect address ranges according to the configured access permissions. When enabled, it intercepts any access initiated by the processor core. The MPU in STM32G4 microcontrollers supports eight independent memory regions, each with independently configurable permissions: an access permission, which allows or denies reads and writes in privileged or unprivileged mode, and an execution permission, which marks the region as executable or prohibits instruction fetches from it.

The MPU is also in charge of assigning attributes to regions, called normal, device and strongly ordered. Normal is used to map memories. Device and strongly ordered are used to map peripherals. The difference between them is the capability to buffer data: the device memory attribute enables write posting, while a store to a strongly ordered region stalls the pipeline until the response is received from the targeted peripheral.

The NVIC and debug units are described in separate presentations. For more details, please refer to these application notes and the Cortex-M4 programming manual, available on the www.st.com website.
Also visit the ARM website where you will find more information about the Cortex-M4 core.
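As a small illustration of the MPU region settings covered earlier in this presentation (access permission, execution permission and the normal, device and strongly ordered attributes), the following sketch builds the attribute-and-size word for one region. The bit positions and encodings are assumed from the ARMv7-M MPU register layout (region attribute and size register) and are not given in the presentation, so check them against the Cortex-M4 programming manual before relying on them. The names `mem_type_t`, `mpu_size_field` and `mpu_region_attr` are hypothetical, and the sketch only computes a value; it does not touch any hardware register.

```c
#include <stdint.h>

/* Memory attributes named in the presentation. */
typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL
} mem_type_t;

/* Access-permission (AP) encodings, assumed from the ARMv7-M layout. */
#define AP_NO_ACCESS        0u  /* no access in any mode                 */
#define AP_PRIV_RW          1u  /* privileged read/write, no user access */
#define AP_PRIV_RW_USER_RO  2u  /* privileged read/write, user read-only */
#define AP_FULL_ACCESS      3u  /* read/write in both modes              */

/* SIZE field for a power-of-two region of at least 32 bytes:
   log2(size_bytes) - 1. Returns 0 if the size is not valid. */
static uint32_t mpu_size_field(uint32_t size_bytes)
{
    if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0u) {
        return 0u;
    }
    uint32_t exponent = 5u;
    while ((1u << exponent) != size_bytes) {
        exponent++;
    }
    return exponent - 1u;
}

/* Build the attribute word for one enabled region.
   Returns 0 (region disabled) when the size is not valid. */
static uint32_t mpu_region_attr(uint32_t size_bytes, uint32_t ap,
                                mem_type_t type, int execute_never)
{
    uint32_t size = mpu_size_field(size_bytes);
    if (size == 0u) {
        return 0u;
    }
    uint32_t value = (size << 1) | 1u;      /* SIZE in [5:1], ENABLE in [0] */
    value |= (ap & 7u) << 24;               /* AP in [26:24]                */
    if (execute_never) {
        value |= 1u << 28;                  /* XN: no instruction fetch     */
    }
    switch (type) {
    case MEM_STRONGLY_ORDERED:              /* TEX=0, C=0, B=0              */
        break;
    case MEM_DEVICE:                        /* B=1: writes may be buffered  */
        value |= 1u << 16;
        break;
    case MEM_NORMAL:                        /* TEX=1: normal, non-cacheable */
        value |= 1u << 19;
        break;
    }
    return value;
}
```

For example, a 32 KB normal region with full access and instruction fetch prohibited gives 0x1308001D under these assumptions, and a 1 MB device region with privileged-only access gives 0x11010027. On a target, such a value would typically be written to the MPU's region attribute and size register after selecting the region number and base address.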