I'm presenting OpenStack for computational workloads. I'm Jöntor Kristessón. I've been building clouds focused on high-performance computing for the last decade or so; I started using CloudStack and then moved on to OpenStack.

I'm going to highlight a few applicable use cases where these kinds of computational workloads exist. One is the automotive and aerospace sector, where HPC is used for simulation workloads: computational fluid dynamics is used to simulate how a vehicle might perform in, or be affected by, the real world. Automotive has also been an adopter of AI and ML for autonomous driving and various other workloads. The energy sector optimises the placement and operation of wind turbines and solar panels, and optimises the delivery of electricity through its network of power lines. Exploration and production workloads are used for the discovery and recovery of natural resources. In the healthcare sector, HPC and AI/ML are used for anything from the detection of cancer cells to the simulation of blood flow through vital organs; they are used to understand our major organs. In pharmaceuticals and genomics, HPC is used to discover the function of genes and to understand complex biological processes such as protein folding. Research institutions and universities are using HPC to reach new discoveries, to better understand or improve the world we live in today. The media and entertainment industry uses large rendering clusters for video rendering and generating visual effects. The financial sector uses HPC for risk and anomaly detection, high-frequency trading and portfolio management.

To better support these workloads, I will provide some configuration suggestions that could be applicable, with the idea of reducing latency and/or overhead via direct I/O and direct usage of CPU features, leading to improved performance. First, I will start by highlighting some hypervisor configurations that might be of use. Consider EPA (Enhanced Platform Awareness) features.
Avoiding emulation and indirection can significantly improve performance, and there are many features available that can improve performance in a variety of ways. For example, huge pages allow the usage of memory pages larger than the standard size, meaning fewer address translations requiring fewer cycles; this improves overall memory access speed. Do consider NUMA awareness, and ensure the CPUs executing processes and the memory used by those processes are on the same NUMA node, ensuring local memory access, avoiding the usage of limited cross-node memory bandwidth, and avoiding unnecessary memory access latency. Do consider CPU pinning, which avoids rescheduling, the moving of guest virtual CPUs to other host physical CPU cores. It improves overall performance and makes CPU scheduling deterministic, which is critical for the scheduling of computational workloads such as those in HPC. Do consider host CPU feature requests for workloads that can take direct advantage of CPU instruction sets such as Advanced Vector Extensions (AVX) or crypto offloading, or any other CPU features that might be applicable to your workload. Host passthrough gives the best performance but can limit live-migration options in mixed CPU environments; then again, it's not advisable to run workloads across mixed CPU environments, so that should not generally be a concern. Do consider PCI passthrough for direct access to computational PCI devices, such as GPUs and FPGAs. Mediated device (mdev) and vGPU features can also be used to take advantage of some of these PCI devices, often giving access to more possibilities in terms of device segmentation. An instance type for HPC could effectively consume the entire hypervisor.
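As a concrete sketch of the suggestions above, CPU pinning, huge pages, NUMA topology and PCI passthrough can all be expressed as Nova flavor extra specs. The flavor name, sizes and the "a100" PCI alias below are made-up examples; the alias has to match the PCI device configuration on your compute nodes.

```
# Illustrative flavor: 32 dedicated cores, 256 GiB RAM
openstack flavor create hpc.xlarge --vcpus 32 --ram 262144 --disk 200

# Pin guest vCPUs to host cores, back memory with 1 GiB huge pages,
# and keep CPUs and memory on a single NUMA node
openstack flavor set hpc.xlarge \
  --property hw:cpu_policy=dedicated \
  --property hw:mem_page_size=1GB \
  --property hw:numa_nodes=1

# Pass through one GPU via a PCI alias ("a100" is a hypothetical alias
# that must be defined in nova.conf on the relevant compute nodes)
openstack flavor set hpc.xlarge --property "pci_passthrough:alias"="a100:1"
```

Instances booted from such a flavor get deterministic CPU scheduling and local memory access without any changes inside the guest.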
If the workload can take advantage of all available resources, such a flavor makes sense. Alternatively, a compute flavor for HPC instances could expose fewer than the total available CPU cores, providing a cost-efficient offering for users of commercial HPC workloads that are memory-bound and therefore limited in their ability to scale by memory bandwidth rather than by the availability of CPU cores. Consider not overcommitting memory and CPU cores for a compute-focused flavor under these circumstances. Also consider dynamic frequency scaling: workloads can make greater use of frequency boosting if not all cores are in use, for example when memory bandwidth is the limit. Do consider hypervisor tuning; reference configurations exist from all major CPU manufacturers. Give them a try and validate them. They often provide guidance on BIOS settings and other settings to maximise the performance of the underlying hardware.

Now I'll highlight some of the options available for networking. Do consider SR-IOV for networking if the workload is sensitive to latency and bandwidth; most computational workloads that scale and communicate across a cluster of nodes are. Possibly consider hardware offload using OVN and DPUs. Do consider DPDK for routing if, for example, computational storage or object storage traffic goes across a router. Overall, it's best to avoid routing for computational traffic if possible; the cloud should preferably be designed in such a way that computational and storage traffic does not have to cross networks.

Next I'll go over some storage considerations. Do consider direct and local I/O for computational storage: the further the workload has to go to access its storage, the more latency is added, increasing overall computation time. Do consider workload requirements for computational storage. HPC and machine learning have different I/O patterns: HPC workloads generally have more writes than reads, and machine learning often the opposite; there are cases where it can also be mixed.
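To sketch the SR-IOV suggestion above: with Neutron's SR-IOV mechanism driver configured on the relevant network, a direct-mode port backed by a virtual function can be attached to an instance at boot. The network, port, image and server names here are made-up examples.

```
# Create a port backed by an SR-IOV virtual function
openstack port create --network hpc-fabric --vnic-type direct sriov-port0

# Boot the instance with the VF attached, bypassing the software vswitch
openstack server create --flavor hpc.xlarge --image rocky-9 \
  --port sriov-port0 hpc-node-01
```

The guest then talks to the NIC hardware directly, which is what buys the latency and bandwidth improvement for cross-node communication.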
Performance considerations also differ, with HPC often needing high bandwidth for larger I/Os and machine learning issuing smaller I/Os. HPC workloads use IOPS for metadata, while machine learning uses IOPS for both data and metadata. But overall it's always about observing, measuring and monitoring. Consider using a comprehensive monitoring tool to observe, measure and validate any assumptions made about performance and limitations. It helps with identifying bottlenecks such as CPU, memory bandwidth and I/O limitations. Observe, measure and test assumptions. Plenty of tools are available; make incremental changes and test diligently. Known tools are, for example, High-Performance Linpack (HPL), which is useful for finding both CPU and network limitations. For memory bandwidth, STREAM Triad is a popular tool; for storage, fio; for networking there are various MPI tests and of course iperf.

Lastly, I will highlight a use case. We have a reference configuration available here, which you can access by scanning the QR code. The use case is Firmus, a private cloud in Tasmania. They are offering AI/ML and HPC with computation-focused clusters in data centres with extremely low PUE. Feel free to take a look at the use case. Thank you. If you have any questions, feel free to ask them now or find us at booth B11 to have a discussion.
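As an illustration of measuring the two I/O patterns discussed above, a fio job file can contrast an HPC-style large sequential write with an ML-style small random read. The directory, block sizes and depths below are illustrative choices, not recommendations; adjust them to match your actual workload.

```
[global]
directory=/mnt/scratch
direct=1            ; bypass the page cache, as with direct I/O in the guest
ioengine=libaio
runtime=60
time_based

[hpc-seq-write]     ; HPC-like: large sequential writes, bandwidth-bound
rw=write
bs=1M
iodepth=16
size=10G

[ml-rand-read]      ; ML-like: small random reads, IOPS-bound
rw=randread
bs=64k
iodepth=32
size=10G
```

Running each job separately (`fio jobfile.fio --section=hpc-seq-write`) gives a bandwidth figure for the first pattern and an IOPS figure for the second, which is exactly the kind of measurement to validate assumptions against.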