Thank you very much, Kenneth. Good afternoon, everyone, from Luxembourg. We are very pleased to have been invited to share our experience and our plans at this sixth EasyBuild User Meeting. So please allow me to introduce myself. I am Valentin Plugaru, Chief Technology Officer of LuxProvide, Luxembourg's HPC center. Joining me today, and presenting part of this talk, is Robert Mjaković, HPC Systems Software Architect at LuxProvide. Hi, guys.

To give a bit of a hint of who we are and our context: LuxProvide has been created recently to take care of HPC, HPDA, and AI services in Luxembourg. We are the hosting entity for the EuroHPC Joint Undertaking, and we're building up the Meluxina supercomputer as we speak, actually. We're also part of the EuroCC project, which joins up the National Competence Centres for HPC, HPDA, and AI across Europe.

To give a bit of a hint of the Meluxina supercomputer: the building blocks for Meluxina have always been to serve a wide variety of user workloads, with a design that corresponds to the needs we saw on the horizon. Well, at the time they were on the horizon; now they're pretty much confirmed, and all the centers are doing things to take care of this big convergence of different workloads, some of them truly data-driven, but some still pure HPC, heavy-duty computing in nature. And to build up a system that is resilient, secure, very capable, of course, and highly efficient.

You may have seen this in some different presentations before: the architecture of Meluxina is a modular system. The modular approach joins together different types of computing systems. In our case, there's a cluster module, which contains, I would say, traditional HPC compute nodes with quite a good amount of memory on top of them, so they can be considered fat nodes by traditional standards. An accelerator module has GPGPU accelerators, AI accelerators, but also a smaller section with FPGA accelerators for highly specialized workloads. There's also a large memory module, with nodes that are fat in terms of memory and allow in-memory computing, for the kinds of data sets that can help accelerate some types of workloads. Underlying all of this is, of course, a storage module with different tiers, with different performance and sizing characteristics. Most of the storage is actually on a Lustre file system, with different bandwidth and IOPS characteristics depending on the tier. Joining all the modules together is an InfiniBand HDR200 fabric, which is now well established in the HPC community, and on Meluxina it is actually in a Dragonfly+ topology.

There is one module I didn't speak about: our cloud module, which is meant to host persistent services, data portals, different kinds of orchestration mechanisms, APIs, things that can drive computation on the heavy-duty compute or storage modules. And all of this is well connected, both internally and to the outside world.

A few numbers on Meluxina. In total, we're supposed to be at a bit above 18 petaflops of aggregated performance across all the modules. Of course, the heavy-duty ones are the cluster module, which is supposed to deliver above 2 petaflops of measured HPL performance, and the accelerator GPU part with 10 petaflops. The data tiers are both disk-based, spinning rust, but also all-flash for very IO-intensive operations.
Different kinds of data sets will be hosted there, for machine learning, for deep learning, and other kinds of workloads. There are also data tiers for redundancy, for data backup, and finally for long-term storage with a tape archive solution. Again, we are linking everything internally with very high bandwidth on the HDR200 interconnect, but there is also a high-performance Ethernet solution deployed internally. To connect to the outside, we'll have very good links to GÉANT through Restena, the NREN in Luxembourg. This will link us, of course, to all the EuroHPC sites, but also, within Luxembourg, to public sector research and academia.

A few words about the technologies. I mentioned the CPU nodes and GPU nodes; just to give you some hints of what they are. The heavy-duty compute systems, the CPU nodes and GPU nodes, are delivered in DLC packaging, so liquid-cooled, and all of this is delivered by Atos in our case. We are counting on high-performance AMD EPYC CPUs, the ones at 280 watts. The CPU nodes have 128 cores, with around 4 gigabytes of RAM per core. On the GPU nodes, the amount of RAM is the same; they have a bit fewer cores, but this is coupled with NVIDIA Ampere-generation GPUs: four GPUs per node, connected with NVLink, each with 40 gigabytes of high-bandwidth memory. On the GPU nodes we also have local storage, which is a differentiating factor from the other ones, and they are linked with dual rail on the fabric. For the large memory nodes, the differentiating factor is that they have 4 terabytes of RAM each, so there are 80 terabytes across this module for in-memory computing. The FPGA nodes are similar to the CPU nodes, except that they have two Intel Stratix accelerators. And then there are also some cloud nodes, which are reserved for the persistent services.
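To make those memory figures concrete, here is a quick back-of-the-envelope check in Python; the per-node totals and the implied large-memory node count are derived from the numbers quoted above, not stated in the talk itself.

```python
# Back-of-the-envelope check of the memory figures quoted above.
# Per-node totals and the node count are inferred, not quoted from the talk.

cores_per_cpu_node = 128
ram_per_core_gb = 4
print(cores_per_cpu_node * ram_per_core_gb, "GB RAM per CPU (and GPU) node")   # ~512 GB

gpus_per_node = 4
hbm_per_gpu_gb = 40
print(gpus_per_node * hbm_per_gpu_gb, "GB HBM per GPU node")                   # 160 GB

large_memory_module_tb = 80
ram_per_large_memory_node_tb = 4
print(large_memory_module_tb // ram_per_large_memory_node_tb,
      "large-memory nodes implied by the stated totals")                       # 20
```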
But today's talk is supposed to be about the software stack, so let's get to it. To talk about the software stack, we have to talk a bit about the workloads that we're going to see. Luxembourg's key sectors are, as you're seeing, for example, health tech, logistics, space, and financial services, and those will translate into the use cases on Meluxina. So it's, again, a coupling of HPC modeling and simulation, but also data-driven workloads, data analytics, and predictive AI workloads. What we are doing to handle that infrastructure and those use cases is what we're calling, in a gentle way, MUSE: the Meluxina user software environment.

Our goals have always been to enable the kind of applications people would expect on an HPC platform, and also to be prepared for the future. So a rich environment is supposed to be provisioned on the system, with free and open source tools, community applications, but also commercial packages for simulation, modeling, and analytics. Of course, we plan to keep this close to recent versions; otherwise, people will complain. And to maintain it: this is supposed to be a production system, so the software sets also need to be maintained for a good amount of time, and we can talk about what "good" means. The idea is that all of those goals come with some challenges. It's a rich environment, so it's complex: we already have different kinds of architectures, from the CPUs to the Ampere GPUs and the Stratix FPGAs. If you plan to update the software quite a lot, then you also have to validate the functionality and the performance. And if you maintain a production set for some time, then you have interaction, of course, with operating system packages that need to be upgraded, and other things. So now, to talk a bit more in depth about the software environment, I will give the token to my colleague.

Thank you, Valentin. So let's continue with the Meluxina user software environment. Deploying software is done with EasyBuild, but also with Spack, and we keep a close eye on the EESSI project and want to learn lessons from its experience. For the module system, we have selected the Lmod module system with the hierarchical naming scheme from EasyBuild. We take what is already available from the community and plan to adapt it, upgrade it, and contribute back to the community as much as possible; that basically goes into the area of easyconfigs, easyblocks, and Spack packages. The first production environment is already available on our test cluster, Goupi, with foss- and AOCC-based toolchains. The initial set of common HPC compilers, libraries, frameworks, and real applications is already available, and extensions are going to be based on the users' needs.

To follow the vision of LuxProvide, we plan to optimize our software stack as much as possible for all the Meluxina segments available: to select the compilers that support the instruction set of the Rome microarchitecture, and compilers that allow us to run optimally on all segments of our systems, including our accelerators. Therefore we use the newest AOCC compiler and GCC, which both support the Zen 2 microarchitecture, together with the CUDA toolkit and the Intel FPGA OpenCL SDK.

To always provide our users with the newest software stack, we plan to have stable production releases. As you can see at the bottom of the slide, there is a release timeline. Before each of the releases, we have a staging release to make the software stack as stable as possible; this staging stack is used for our internal testing and is also provided to our users as a preview release, so our most adventurous users can give it a try. The software stack is released, sorry, can you go back just one slide? The software stack is released at least once per year. It is not upgraded, but maintained, for two years. After two years, the stack is still available, but we no longer plan to actively maintain it, and we consider it deprecated, so users who haven't migrated yet to the new software stack are advised to do so; we can, of course, as a center, help them do so, as our additional value. In the fourth year, the software stack will be hidden from the users, and they are not supposed to use it any longer, except in extreme cases.

Then, about the available software groups on Meluxina: here I only mention a few groups of the most widely used software. From the compilers' perspective, those start with AOCC, which, as I mentioned, is the most optimized for our microarchitectures; then GCC; then Intel, which might be a good solution to some extent, but is known not to favor optimizing for architectures it doesn't understand; then the NVIDIA HPC SDK, which includes PGI. There are several MPI suites, such as Open MPI, Intel MPI, and, from our partner, ParaStation MPI. There is a wide range of numerical and data libraries, and frameworks such as PyTorch, TensorFlow, Horovod, Keras, Apache Spark, et cetera. There are also groups of software packages for visualization, build tools, performance and debugging tools, and, of course, applications.
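To give a flavour of how such packages are described for EasyBuild, here is a minimal, purely illustrative easyconfig sketch (easyconfigs are plain Python files). The application name, versions, and dependency are hypothetical; the point is only how a package is pinned to a toolchain, which the hierarchical Lmod naming scheme then uses to place the resulting module in the hierarchy.

```python
# Hypothetical easyconfig sketch; names and versions are illustrative only.
easyblock = 'ConfigureMake'

name = 'ExampleApp'                      # made-up application name
version = '1.0.0'

homepage = 'https://example.org'
description = "Illustrative application built on top of the foss toolchain"

toolchain = {'name': 'foss', 'version': '2020b'}   # assumed toolchain generation
toolchainopts = {'pic': True, 'openmp': True}

source_urls = ['https://example.org/downloads']
sources = ['%(namelower)s-%(version)s.tar.gz']

dependencies = [
    ('HDF5', '1.10.7'),                  # example dependency resolved through the module hierarchy
]

moduleclass = 'tools'
```

Built with eb and its --robot dependency resolution, a file like this would end up as a module under the corresponding compiler and MPI level of the hierarchical scheme.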
We plan to have approximately 100 application packages, not counting their dependencies, with different specifications for each of them to satisfy the needs of the different Meluxina segments. At the moment, there are more than 50 packages on our test cluster. And as I mentioned, all these packages are delivered with both EasyBuild and Spack.

As we know, there are certain software stack challenges, and we try to be ready for them, but we are going to face some additional ones. The software stack of every data center is big, complex, and very sensitive to changes: sometimes even an innocent change can introduce big trouble for software packages that are installed on the system. Those changes can be, for instance, system configuration changes or OS-level changes, for instance to drivers and libraries. Therefore, testing will be essential for us to keep most things working as expected. Of course, it's not possible to catch every corner case that can go wrong, so we will try to test our software stack in a consistent, maintained, and automated way. The testing will be periodic and on demand, depending on the changes in the software stack, and we will monitor not only functionality but also performance; for that, we are going to use ReFrame. We will do regression tests on new versions of software installations. To better understand the needs of our customers, we also plan to monitor software usage, and knowing that, we can adapt our effort much more quickly and easily to rapidly evolving user needs.

One alternative to the software stack that we provide will be the possibility for users to bring their own software stack in a container. In such a way, users can control the complete software stack, which is also beneficial for reproducibility. To achieve this, of course, we are going to face some challenges that need to be solved, such as integration with the MPI suites, integration with our software stack and the host device drivers for optimal performance, and, of course, security. For the moment, we plan to use Singularity as our containerization technology. Thank you very much for your attention, and please tell us what you are interested in, and we will answer it.
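A minimal sketch of what one of the ReFrame regression checks mentioned above could look like, assuming ReFrame 3.x syntax; the source file, system, and programming environment names are placeholders rather than Meluxina's actual configuration.

```python
# Minimal ReFrame regression check sketch (assuming ReFrame 3.x); all names are placeholders.
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloOmpCheck(rfm.RegressionTest):
    def __init__(self):
        self.descr = 'Build-and-run smoke test for a small OpenMP program'
        self.valid_systems = ['*']           # would list the real partitions (CPU, GPU, ...)
        self.valid_prog_environs = ['*']     # e.g. a foss- or AOCC-based environment
        self.sourcepath = 'hello_omp.c'      # assumed tiny OpenMP source shipped with the test
        self.build_system = 'SingleSource'
        self.build_system.cflags = ['-fopenmp']
        self.num_cpus_per_task = 4
        self.variables = {'OMP_NUM_THREADS': str(self.num_cpus_per_task)}
        # Functional check; performance references would be added alongside via perf_patterns.
        self.sanity_patterns = sn.assert_found(r'Hello', self.stdout)
```

Checks of this kind can then be run periodically or on demand, covering both functionality and, with reference values, performance.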
Okay, thank you very much, Robert and Valentin.

Thank you, Kenneth.

If there are any questions, please raise your hand in Zoom and we will let you unmute your microphone, or ask the question in Slack and we will pass it through. I don't see any questions right now, so let me maybe ask a question that you may be expecting. You're using both EasyBuild and Spack, you mentioned. I haven't seen many sites doing that; they usually go with one or the other. Can you say a little bit more about why you're using both? Is it for different use cases, for different applications?

So here I can give, thank you for the question, a few hints and insights on that, as follows. Some users are used to EasyBuild, and some users are used to Spack. We plan to devote most of our effort to EasyBuild. However, there are things we can offer, at least as a transition step, to users of Spack, to help them bootstrap: for example, the MPI suite, which will be highly tuned for Meluxina, or some of the compilers that are highly tuned. So if somebody wants to build with Spack, we will definitely help them, by allowing and enabling them to use the EasyBuild modules that are already there. And that means that some of the effort in optimization for Meluxina they will not need to redo. We do hope, of course, to see many users on Meluxina with EasyBuild, but we will not say at this point that people with Spack will not be helped. Yes, so. Robert has a lot of experience with Spack as well; I will let him say a few words.

Yes, so, I mean, both tools have their advantages and disadvantages, of course. And I mean, it's not bad to have both tools and to support packages which are maybe, to some extent, better supported in one of the two. And yeah, as Valentin mentioned, the main distribution tool would be EasyBuild.

Yeah, so EasyBuild would mainly be used for the centrally installed stuff, and you would help users, if they prefer Spack or if they need the flexibility that Spack gives them, to reuse what was installed with EasyBuild and then install the missing parts with Spack in whatever way they want to. Is that sort of it?

That's exactly right, Kenneth. Correct.

Okay.

That's the plan.

Yeah, that sounds very reasonable, yeah. But as I also mentioned on Monday, and it's pretty clear in the surveys, if you look at both tools, Spack is more geared towards developers who need lots of flexibility, and EasyBuild is maybe more geared towards people who maintain the system and the central software stack. And there's definitely a case for using one on top of the other, or combining the tools.

This makes a lot of sense. There's an interplay between the two sets.

Yeah. Okay, let's see if we have any more questions. I don't see any other questions popping up.