Hello, and welcome to this presentation of the bus matrices interconnecting masters and slaves in the STM32U5. The bus matrix provides access from a master to a slave, enabling concurrent access and efficient operation, even when several high-speed peripherals are working simultaneously. The STM32U5 Arm Cortex-M33 core is optimized for execution thanks to an instruction cache with direct access to the flash memory through its fast master port.
The main 32-bit AHB multi-layer bus matrix, in the center of the figure, interconnects 11 masters and 10 slaves. The 128-bit AHB5 instruction cache refill bus matrix is made up of a 128-bit interface and two 32-bit interfaces. The 128-bit interface connects the instruction cache to the flash memory interface, or FLITF, allowing fast execution from internal flash. The 32-bit AHB3 smart run domain, or SRD, bus matrix has two slave interfaces (main matrix and LPDMA) and two master interfaces (AHB3/APB3 peripherals and SRAM4).
These bus matrices feature a fast bus multiplexer used to connect each master to a given slave without latency. For the same master, other slaves undergo a latency of at least one cycle at each new access. You can see that a unique fast bus multiplexer is present in any particular column. It selects the default slave for the related master, which is accessed without latency. For instance, the ICACHE slow port has reduced latency to the FSMC on STM32U575/U585 devices, and to OCTOSPI1 on STM32U59x/U5Ax devices.
Accesses to internal SRAM memories initiated by the Cortex-M33 are performed through the S-AHB port. The demultiplexer connected to the S-AHB port selects the slave port in the main bus matrix according to the address: S1 to access SRAM1; S2 to access SRAM2, SRAM5, SRAM6, SRAM4, and the backup SRAM; and S3 to access SRAM3. For SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6, latency is zero when no other master is currently accessing the SRAM.
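As a rough illustration of the S-AHB demultiplexer described above, the port selection can be written as a lookup table. This is only a sketch: the hardware decodes the access address, but the presentation does not give the address ranges, so memory names stand in for them here. The names `SAHB_SLAVE_PORT` and `sahb_slave_port` are hypothetical and exist only for this example.

```python
# Slave port of the main bus matrix selected by the S-AHB demultiplexer.
# Assumption: memory names replace the address ranges decoded by the hardware,
# because the ranges are not stated in the presentation.
SAHB_SLAVE_PORT = {
    "SRAM1": "S1",
    "SRAM2": "S2",
    "SRAM5": "S2",
    "SRAM6": "S2",
    "SRAM4": "S2",
    "BKPSRAM": "S2",
    "SRAM3": "S3",
}


def sahb_slave_port(memory: str) -> str:
    """Return the slave port (S1, S2 or S3) used to reach `memory`."""
    try:
        return SAHB_SLAVE_PORT[memory]
    except KeyError:
        raise ValueError(
            f"{memory!r} is not an internal SRAM reached through the S-AHB port"
        ) from None


for name in ("SRAM1", "SRAM3", "BKPSRAM"):
    print(name, "->", sahb_slave_port(name))
```

Grouping several memories behind S2 means that accesses to those memories share one slave port, which is worth keeping in mind when placing buffers.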
SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6 are accessible on the S-AHB bus with a continuous mapping. Accesses to external memories connected to the FSMC, OCTOSPIs, HSPI1, and GFXMMU controllers are done through the DCACHE, even when the requests are marked as non-cacheable. These accesses can be data requests as well as instruction requests mapped in the external data region of the memory map. Note that fetching instructions through the S-AHB and DCACHE is less efficient than fetching through the C-AHB bus and the ICACHE slow port. This is the reason the ICACHE supports the address remapping capability.
The GPDMA has a dual bidirectional master port, port 0 and port 1, to support concurrent transfers over these ports. These buses connect the two AHB master interfaces of the GPDMA to the bus matrix and target the internal flash memory; the internal SRAMs (SRAM1, SRAM2, SRAM3, SRAM4, SRAM5, SRAM6, and BKPSRAM); the AHB1 peripherals, including the APB1 and APB2 peripherals; the AHB2 peripherals; the SRD peripherals; and the external memories through the FSMC, HSPI1, or OCTOSPIs. The default slaves of these buses are the AHB1 peripherals for port 0 and SRAM1 for port 1. The GPDMA port 0 also has direct access to the APB1 and APB2 peripherals with reduced latency.
The OTG_HS bus connects the OTG_HS master interface to the bus matrix. It is used by the OTG_HS to load data from and store data to memory. It targets the data memories: internal flash memory, internal SRAMs (SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6), and external memories through the FSMC, HSPI1, or OCTOSPIs. Its default slave is SRAM3.
The LTDC bus connects the LTDC master interface to the bus matrix. It is used only to load data from memory. It targets the GFXMMU in addition to the data memories: internal flash memory, internal SRAMs (SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6), and external memories through the FSMC, HSPI1, or OCTOSPIs.
The GPU2D buses connect the two GPU2D master interfaces to the bus matrix.
They are used by the GPU2D to load data from and store data to memory. These buses target the GFXMMU in addition to the data memories: internal flash memory, internal SRAMs (SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6), and external memories through the FSMC, HSPI1, or OCTOSPIs. A 16-Kbyte data cache, DCACHE2, is present on the GPU2D M0 port to improve performance when fetching data from external memories. The default slaves of these buses are SRAM1 for DCACHE2 and the GFXMMU for the M1 port.
The GFXMMU has a master bus, which connects its master interface to the bus matrix, and a slave bus connection, through which it is accessed by the graphics peripheral master buses. The master bus is used to load data from and store data to the memories: internal flash memory, internal SRAMs (SRAM1, SRAM2, SRAM3, SRAM5, and SRAM6), and external memories through the FSMC, HSPI1, or OCTOSPIs. Its default slave is SRAM6.
The SDMMC controllers are master modules. They can access any memory, internal or external, because data written to the SDMMC is read from buffers in RAM and data read from the SDMMC is stored in buffers in RAM. The default slave is SRAM1 for both controllers.
Some peripherals support autonomous mode. They remain active while the microcontroller is in low-power Stop mode. These peripherals generate a kernel clock request and an AHB/APB bus clock request when needed, in order to operate and update their status register, including in Stop mode. If the autonomous peripheral is configured with DMA requests enabled, a data transfer is performed thanks to the AHB/APB clock.
The autonomous peripherals mapped on AHB1, AHB2, APB1, and APB2 belong to the CPU domain, also called CD, and are autonomous in Stop 0 and Stop 1 only, with the GPDMA and SRAM1, SRAM2, SRAM3, SRAM4, SRAM5, or SRAM6. The main matrix belongs to the CD. The autonomous peripherals mapped on AHB3 or APB3 belong to the smart run domain, also called SRD, and are autonomous in Stop 0, Stop 1, and Stop 2, with the LPDMA and SRAM4.
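The domain rules for autonomous peripherals can be summarized as a small table. The sketch below only restates what the presentation says about the CD and SRD domains; the names `AUTONOMOUS_SUPPORT`, `domain_of`, and `is_autonomous` are hypothetical, and the code does not touch any hardware.

```python
# Autonomous-mode rules per power domain, as stated in the presentation.
AUTONOMOUS_SUPPORT = {
    "CD": {
        "buses": {"AHB1", "AHB2", "APB1", "APB2"},
        "stop_modes": {"Stop 0", "Stop 1"},
        "dma": "GPDMA",
        "srams": {"SRAM1", "SRAM2", "SRAM3", "SRAM4", "SRAM5", "SRAM6"},
    },
    "SRD": {
        "buses": {"AHB3", "APB3"},
        "stop_modes": {"Stop 0", "Stop 1", "Stop 2"},
        "dma": "LPDMA",
        "srams": {"SRAM4"},
    },
}


def domain_of(bus: str) -> str:
    """Return the domain (CD or SRD) of a peripheral mapped on `bus`."""
    for domain, rules in AUTONOMOUS_SUPPORT.items():
        if bus in rules["buses"]:
            return domain
    raise ValueError(f"unknown bus {bus!r}")


def is_autonomous(bus: str, stop_mode: str) -> bool:
    """True if an autonomous peripheral on `bus` stays functional in `stop_mode`."""
    return stop_mode in AUTONOMOUS_SUPPORT[domain_of(bus)]["stop_modes"]


print(domain_of("APB3"), is_autonomous("APB3", "Stop 2"))
print(domain_of("APB1"), is_autonomous("APB1", "Stop 2"))
```

Read this way, a peripheral that must keep transferring data in Stop 2 needs to be mapped on AHB3 or APB3 and paired with the LPDMA and SRAM4.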
The LPDMA has access to SRAM4 and the AHB3/APB3 peripherals only.
The bus matrix manages the access arbitration between the masters. The arbitration uses a round-robin algorithm, which starts with the lowest-numbered requester. Arbitration is done at allowed arbitration points to ensure that bursts cannot be preempted. The bus matrix features a fast bus multiplexer used to connect each master to a given slave without latency. For the same master, other slaves undergo a latency of at least one cycle at each new access.
In addition to this presentation, you can refer to the following presentations: instruction cache (ICACHE), data cache (DCACHE), power management (PWR), and reset and clock controller (RCC).