Yes, thank you. I hope you had the opportunity to have a quick lunch. So I'm going to talk about the EESSI project. My name is Bob Dröge and I'm from the HPC team of the University of Groningen, where I work for the Center for Information Technology, which is the central IT institute. I'm mostly doing HPC user support and training, which also involves quite a bit of software installation work, a little bit of system administration, and work on some larger projects. At the moment that includes the Euclid space mission from the European Space Agency, for which I'm doing quite a lot of HPC and infrastructure related work at our local data center. And I'm also doing quite a lot of work for EESSI, which I will tell you about today. So I will tell you all the details about the project: who is involved, why we started it, what we are trying to solve, what the current status of the project is, and the things we're working on at the moment and in the near future. And of course I want to show a live demo, so show you what we currently have and how that works on a bunch of different systems. I hope to finish the slides well before my time runs out, so that I have some time left for the demo.

First some details about EESSI. In a nutshell, it stands for the European Environment for Scientific Software Installations, which we abbreviate to EESSI and pronounce as "easy". It's a collaborative project with partners from both academia and industry, and our common goal is to build a large scientific software stack for HPC systems, but also for different kinds of systems, for instance laptops, workstations, cloud infrastructures, et cetera. So most of us are from HPC, but our scope is a bit broader than that. Initially the project was started by a bunch of Dutch universities, together with Dell and the University of Cambridge, after a meeting that we had in Cambridge organized by Dell. During that meeting we were mostly talking about common challenges and issues that we all ran into in our HPC work, and afterwards we decided that it would be nice to collaborate on at least one of those shared challenges. Almost immediately, installing software came up as a common struggle that we all had. So we started working on that as a project, and in one of the first meetings we also invited Kenneth, because most of us were already using EasyBuild or planning on using it in the near future. Since then lots of other partners have also joined the project. On this slide you see a whole bunch of them, but that doesn't include every single partner and every single person from all the universities. So we already have quite a lot of people who are at least interested in the project or contributing in some way, including some larger commercial companies like Microsoft and Amazon.

So the main motivation for starting the project: as I mentioned, we were all struggling with installing software, and that is taking up more and more of our time for different reasons, mostly because lots of things have been increasing over the last couple of years. Most of us have way more users than before, also from lots of different and newer backgrounds, not only the exact sciences anymore. There's an explosion in the amount of software that is available, especially in bioinformatics.
There's lots of new hardware available: different kinds of CPUs are popping up, different kinds of GPUs. AMD is now back with new GPUs, Intel wants to bring out GPUs, and you get more and more specialized accelerators, mostly targeted at, for instance, machine and deep learning. Then there are also completely new kinds of infrastructures like clouds, which are getting very popular and where you often just get VMs and don't really know what kind of hardware you will get. But on the other hand, the available manpower often doesn't keep up with all those increases. So it's becoming a challenge to support all these different systems and to make all the software available on those systems in a good way.

Then two important remarks, which are probably familiar to most of you, since this is an EasyBuild user meeting. The first issue with scientific software is the installation itself, of course, which is often tricky and not trivial, which is more or less literally illustrated on the slides. All these different comics, and also Kenneth's talks from a few years ago, are about this same topic, so that already shows that there's an issue here. Now there is EasyBuild, of course, which partly solves this by making it easier to install particular applications on specific hardware. But it doesn't solve all the issues that I mentioned on the previous slide. And another thing is that we are all basically redoing each other's work by installing all these different applications again and again for our different clusters.

The other remark: since this is mostly about HPC, it's important to look at the performance of the application and properly optimize it for the hardware that you're going to use it on. This slide shows an example of what can happen if you don't properly optimize GROMACS, which is a popular molecular dynamics tool. Running it on the same hardware, but with different kinds of optimizations, you can get quite a lot of speedup if you do it right. Sometimes this is easy: if you just have one CPU architecture to support, you can simply build on the machines that you're going to use. But nowadays we see lots of HPC clusters with a mix of CPU architectures and different kinds of accelerators, which makes it a lot harder to install the software in an optimized way for all the different hardware that you have to support within that cluster or within a bunch of clusters.

So with EESSI, we hope to solve lots of these issues and take the work that EasyBuild can already do a few steps further. What we want to do is basically make one large shared repository with lots of scientific software already installed in it, and do that in a collaborative way, so that we are not all doing the same work over and over again for all our different clusters. That will also bring other advantages, of course, because the more partners get involved in the project and start using the repository, the easier it will become for users to switch between clusters, since they will basically have the same software stack on all those different clusters. Also, we want to make it work on different kinds of systems. So as I mentioned before, not only on HPC clusters and not only on specific Linux distributions, but in principle we want to support any kind of system.
So whether it's an HPC cluster or a cloud instance or a laptop, it shouldn't really matter, which also makes it easier for a user to first try something on his or her laptop or workstation and then use the same tool in the same way on the HPC cluster. As I mentioned, we want to support basically any Linux distribution, and even macOS, and also Windows by using the Windows Subsystem for Linux. And in terms of hardware: lots of different CPU architectures, of course, the special interconnects that we need on these HPC clusters, like InfiniBand, and different accelerators, so of course GPUs should be supported.

The focus of our project is on a few different aspects. Of course performance: it's really important to optimize properly for all the different hardware that we want to support. But also a strong focus on automation, because we don't want to do all these builds on all those different architectures manually; we basically want to automate that and make it easy to add new software to our stack. We also want to focus on testing, so that we are completely sure that everything included in our repository has been verified to work correctly. And finally on collaborating, so that we don't do each other's work all over again. That sounds quite ambitious, but as I will show later on, we already have a kind of pilot version at the moment which does a lot of these things. So we are quite confident that we can solve many of those issues with our project.

The idea of the project is not completely new: it's largely inspired by the Compute Canada project. They already have such a software stack for all the different clusters in Canada, also for some smaller ones. When we started the project we reached out to the people from Compute Canada and we got lots of input from them, as was already mentioned after the last talk as well. So we can learn from what they did, and we even discussed at the beginning whether we should do it together. But since we have slightly different scopes, and we want to take it a bit broader and support more architectures and more sites, we both decided at some point that it would probably be better to start a new project next to theirs. Most of you are probably familiar with their work, since their project has been presented at several conferences, including the previous user meeting, but on the slide you can find links to papers and talks about their software stack.

So besides the Compute Canada stack, let's take a look at whether there are similar projects, or projects that have some goals in common. There are not a lot of projects with the exact same set of goals, of course, but one thing that we get lots of questions about in recent talks is: what about containers? Don't they solve lots of those issues as well? Well, they do partly solve some issues, but in terms of HPC, they also leave a lot unsolved. First the disclaimer: it's not that I don't like containers. I actually like them a lot and I use them a lot myself, but they really don't solve all those issues that I mentioned on one of the previous slides. What you often hear about containers is that they offer native performance and this so-called mobility of compute, so you can basically run your program wherever you want, which is partly true. But if you translate that, native performance basically means that the container runtime itself doesn't have a lot of overhead.
So it doesn't really matter much whether you use a container or run on the real, physical hardware. And mobility of compute often means: well, here's just a very large container image, which includes the entire operating system, for instance. The image can become very large, but it does include everything, so in that way it's very reproducible and you're quite sure that it will work on different systems. But it can be a challenge to even copy that image around to the different systems where you want to use it.

So in terms of HPC, there are still some issues with containers. Maybe the major issue is how to get optimized containers for different kinds of architectures. That's what I meant with the native performance: the software itself is often not optimized for specific hardware. You can ship a container with very generic binaries around, and it will indeed work on different systems, but as I showed you on the previous slide, you can lose quite a bit of performance if you don't use the right optimizations. And that's not easy to solve with containers; you would have to provide containers for all the different architectures, which is a challenge to maintain. That brings me to the second thing: someone has to build and maintain all these different containers, because you don't want to include all the software in one container, or you really get a huge container. Which also ties into the third question: how to make these containers easily and quickly available to users. You don't want to ship around containers of hundreds of gigabytes which include the entire software stack. And finally, you want to be sure that users can just run those things out of the box on HPC systems, so supporting different kinds of schedulers, MPI implementations, accelerators, et cetera. That works nowadays with Singularity, but not always; you still have to be careful with, for instance, MPI jobs in containers using the right MPI versions, et cetera. So that's not a completely solved issue yet.

Then there's one other project with quite similar goals, and also a quite similar name to ours, which is E4S, or the Extreme-scale Scientific Software Stack. If you look at their website, it sounds like they also want to build one large software stack for HPC systems. But looking closer at the project, they don't really provide that, at least not yet. It's basically a set of Spack environment files that they offer, a bit similar to the easystack files that Kenneth showed yesterday in his talk: basically a collection of recipes for applications that are known to work well together. But it isn't really offered as a takeaway stack that you can just use out of the box. They do offer that in some way by providing, for instance, pre-built containers and also a Spack build cache with pre-built packages. But then again, that software is not optimized for specific hardware, so you get quite generically compiled applications, which is of course not what you want. And their container images show one of the issues with shipping these things as containers: their latest GPU image is already more than 25 gigabytes, even though it doesn't include that much software. So that will grow to large sizes quite quickly as more software is added to it.

So let's zoom in a little bit on our project and see how we hope to solve these issues. Our structure is quite similar to what Compute Canada does.
We basically have three different layers, or four if you include the host operating system, which can in principle be any kind of Linux distribution. We basically only depend on the host for the drivers and for tools like Slurm, so drivers for the GPUs and the network file systems, et cetera. What we add on top of that is first our file system layer, which distributes our stack to the client systems. Then we add a compatibility layer that basically levels the ground and makes our stack independent from the host operating system. And finally the software layer, which contains the actual scientific software installations and their dependencies. I'll zoom in on those a little bit later.

First, some of the tools that we are using. EESSI itself is a free and open source software project, and it also depends on quite a lot of other free and open source projects. This slide lists quite a lot of the ones we heavily rely on, where the colors match with the previous slide. For instance, for the software layer, these three tools are being used: EasyBuild, archspec and Lmod. I'll not go into details; I guess most of you are familiar with those. Maybe archspec should be mentioned briefly, because it is important in our project for determining what kind of client system we're running on. It's a Python library that basically detects your micro-architecture, and we use that to make the right software stack available, the one which is optimized for that architecture. Then the compatibility layer, which is based on Gentoo Prefix. I will not tell you anything about that since Fabian has just done so, but that's what we use to provide the operating-system-level layer which has all the dependencies for our software. And finally the file system layer, which heavily relies on CernVM-FS, which was covered by the talk that Jacob gave yesterday.

Then there are some additional tools on the side that we use for other purposes, not for those particular layers. For instance, there's ReFrame. We're not using it yet, but we want to start using it soon to do lots of testing with our software stack, to make sure that everything is okay and performs well, and to do scalability tests with the software included in the repository. A few more: we use Ansible for automation, for instance to automatically deploy lots of the CVMFS infrastructure and to install the compatibility layer, so basically to install a Gentoo Prefix and add our customizations on top. Then Terraform and Cluster-in-the-Cloud, which we want to start using to easily spin up cloud instances or even clusters that we can use for building and testing both the compatibility layer and the software itself. And finally Singularity, which we currently use for our build machines, so that we have a very clean and controlled environment in which we can build the software. We also use it to offer clients an easy way to get access to our software stack without having to install CVMFS. I'll show that later on as well.

So let's zoom in a little bit on those layers. First the file system layer, again based on CernVM-FS; that's our software distribution layer. Jacob already gave all the details about this yesterday, so I'm not going to repeat that, and we're also giving a CVMFS workshop this week, so if you're interested in how this works, you can have a look at the materials from that workshop. But the key message here is basically that CVMFS is a very reliable and scalable software distribution service.
And you can very easily scale it up by adding more replicas and/or more caches. So if you have a large cluster and you need more performance, you can easily add more caches very close to your HPC cluster, which will cache the parts of the software stack that you use often. It's specifically intended for distributing software, so it's very good at that. And it does that via HTTP, which means it's also very firewall friendly. But for us, the most important thing is that it allows us to make our software stack available on any client system in the world.

Then the next layer, the compatibility layer. As mentioned, we use Gentoo Prefix here to do a Linux installation in a non-standard location, the prefix. In our case that prefix looks something like this. So this is our CVMFS pilot repository, which has a version directory, and then the compatibility layer. We have several of them, at the moment only for Linux, but we also want to start adding one for macOS. And then we have one installation per architecture family, basically: one compatibility layer for aarch64 (64-bit Arm), one for ppc64le (POWER), and one for x86_64. Nothing in the compatibility layer really needs to be heavily optimized, so we can get away with just one per family. And this makes sure that we can easily add our software on top of that layer, and then we don't depend on the client anymore.

That brings us to the software layer, which is the final layer and which provides the actual applications. These are heavily optimized towards specific micro-architectures. So we have different trees in our repository, for instance one for Intel Haswell, one for AMD Rome, and ones for specific Arm or POWER processors like POWER9. And to make sure that they work on any kind of client, these installations may only depend on libraries or applications from either the software layer itself or from the compatibility layer. So nothing should link against any host libraries, otherwise things will break.

To install all these software applications, we basically use three main tools. First EasyBuild, of course, for the actual installations, which we run on different kinds of systems with different micro-architectures, so that we have optimized versions for basically every micro-architecture that we want to support. EasyBuild also generates the module files that we need for our environment modules tool, Lmod, which we also use to offer multiple versions of the applications to the users. And finally, as I mentioned before, we use archspec to detect at runtime what kind of micro-architecture the client system is based on, and then we look in our CVMFS repository for a directory that matches that micro-architecture. Of course, we can't support every possible micro-architecture, but that's also where archspec is really useful, because it's also capable of telling you: there's no exact match, but this is the best possible match for this micro-architecture among the ones in the repository. That means we can always fall back to a slightly older or more generic micro-architecture for which we can be sure that the software will also run fine on that particular client. So that's also something that archspec can do for us.
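To make that a bit more concrete, here is a minimal sketch of what such detection-and-fallback logic can look like with archspec. The list of provided targets, the repository path, and the exact layout are assumptions for illustration only, not the literal contents of our init script.

```bash
# Minimal sketch of micro-architecture detection with a fallback to the best
# available tree; the target list and paths below are hypothetical.
EESSI_PREFIX=/cvmfs/pilot.eessi-hpc.org/2020.12

BEST_TARGET=$(python3 - <<'EOF'
import archspec.cpu

# Hypothetical set of micro-architectures for which the repository has a tree
provided = {"haswell", "skylake_avx512", "zen2", "graviton2", "thunderx2", "power9"}

host = archspec.cpu.host()
# Walk from the detected micro-architecture through its ancestors
# (assumed to be ordered from most specific to most generic)
for uarch in [host] + host.ancestors:
    if uarch.name in provided:
        print(uarch.name)
        break
else:
    print("generic")  # no compatible optimized tree, fall back to a generic build
EOF
)
echo "using software tree for: ${BEST_TARGET}"
# The module path would then point somewhere under, for example,
#   ${EESSI_PREFIX}/software/x86_64/intel/haswell/
```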
So that all sounds like a bit of magic, but we actually already provide quite a lot of things to make this happen. On our GitHub page, you can find everything we have at the moment. We have lots of Ansible playbooks, for instance, to deploy the different layers of our project, mostly for the file system layer and the compatibility layer. There are lots of scripts that we use to build software. There's already a bit of documentation; it's not that extensive yet, but we're working on that. We also have an initial setup for the CVMFS infrastructure. Since we're still in the pilot phase, we don't have, and don't need, a lot of servers yet, since it's just testing at the moment, but we have some servers in Groningen and Oslo that allow people to try our software stack. Our current compatibility layer supports both x86_64 and aarch64, and the slide is already outdated, because yesterday I added a POWER compatibility layer as well, so that's also in place now. And we support basically any kind of Linux client, which also includes Windows if you use WSL; I'll show later on how that works.

In terms of software, we don't provide a lot yet, because we're still working on, for instance, the overall structure, on automation, and on solving issues that we have encountered with our current pilot version. So we're first tackling all those issues, and at some point, when we are quite sure that most things are okay, we will scale up in terms of software and start adding a lot more. That should be rather straightforward by just using EasyBuild. For the same reason, we also don't support a lot of micro-architectures yet. We just chose a few that most of us are using at the moment or can use on their systems. But again, once the overall structure is working fine, we can also scale this up to more micro-architectures, which should probably be easy for at least the x86_64 ones. We will probably run into some issues if we start supporting more specialized micro-architectures, but we'll see about that then.

If you go to this link over here, you can find the current documentation for our pilot version, which includes all the instructions that you need to start using this yourself. There are several ways to do that, and this slide basically shows that whole process. First, if you want to use it, you of course need access to the repository, and that basically just requires a CVMFS client on whatever machine you want to use. Jacob already mentioned this a little bit yesterday as well. You can either install the client natively, for which you of course need permissions, because you need to install some packages and do some setup. This is the recommended method, especially for production systems; there it's probably best to have the native client installed. But if you don't want to do that, or don't want to do it yet, for instance because you just want to play around with our pilot and see how it works, there's a very easy alternative: using a Singularity container. If you have a system which already has Singularity, you can use this option, and then you don't need any root permissions; Singularity will take care of doing the mount for you inside the container. So this is especially useful if you just want to play around for a bit.
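To give an idea of what those two access routes look like in practice, here is a rough sketch of both. The repository name, the EESSI configuration package, and the container image tag are assumptions based on our pilot setup, so check the documentation for the exact names.

```bash
# Route 1: native CVMFS client (CentOS/RHEL-style example; adapt for your distro)
sudo yum install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
sudo yum install -y cvmfs
sudo yum install -y <EESSI-cvmfs-config-package>.rpm   # hypothetical package with our repository configuration

# One machine-specific file: local cache size and proxy settings
sudo tee /etc/cvmfs/default.local > /dev/null <<'EOF'
CVMFS_QUOTA_LIMIT=10000    # local cache size in MB
CVMFS_HTTP_PROXY=DIRECT    # or point this at a local Squid proxy
EOF

sudo cvmfs_config setup    # after this the repository appears under /cvmfs
ls /cvmfs/pilot.eessi-hpc.org
```

```bash
# Route 2: Singularity container, no root permissions needed;
# CVMFS is FUSE-mounted inside the container while it starts up
mkdir -p /tmp/$USER/var-lib-cvmfs /tmp/$USER/var-run-cvmfs
export SINGULARITY_BIND="/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs"
export EESSI_REPO="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
singularity shell --fuse-mount "$EESSI_REPO" docker://eessi/client-pilot:centos7-x86_64
```

The second route needs a sufficiently recent Singularity that supports the --fuse-mount flag, and the cache directories can be placed wherever you have some scratch space.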
So once you have access to the repository, the next thing that you have to do is source an init script that we provide. It's just running one command, as I will show on one of the next slides. What it will do is basically use archspec to check your micro-architecture and set up the paths that point to the location in our repository that contains the software installations for your particular micro-architecture. And once you've done that, you can finally start using the modules: you can just run the module commands that you're familiar with, module avail, module load, et cetera, and start using those modules.

Now a bit more detail about those steps. This slide shows how to install a native CVMFS client on your machine, just an example for CentOS 8, but it doesn't really matter which distro you have; they provide packages for most of the popular Linux distributions. It basically means that you have to add their repository, install the CVMFS package, and then install one of the packages that we provide, which contains the configuration files for our repository, so you don't have to set that up manually. And with CVMFS you always need one machine-specific file, which for instance says how large the cache on your local machine can be. That's where all the cached files of the repository will be stored: when you start using applications from our repository, it will pull in those files on demand over HTTP and store them in your local cache. Once you've done that, you have to run one single command provided by the CVMFS client, which will set up the repository on your machine and mount it under /cvmfs. After that, you can run an ls command on the repository, for instance, and start using it.

The alternative that I mentioned is Singularity. We provide a Singularity container, or basically a Docker image, stored on Docker Hub; that's where you can find it. You can pull that in with Singularity and use some special flags, the --fuse-mount flag, which Jacob showed yesterday as well, which tells Singularity to use FUSE to mount CVMFS while your container is being initialized. You just need to provide some configuration for this, which we put in environment variables here, and you have to provide some local directory where the CVMFS cache will be stored; those are the directories that we use here and bind into the container. In principle, you can just copy-paste these commands, maybe change the path where you want to store your cache, and then run them like this. That will pull in the container and start a shell inside it, which will have CVMFS mounted for you, and then you can just start using it. Note that this gives you a quite powerful, but very small container. Coming back to the container issues that I mentioned before: this container is actually very, very small, just 167 megabytes approximately, so you can pull it in quite quickly. But that comes with some loss of reproducibility, because not all the software is actually included in that image; if it were, it would very likely be very large again. All the actual software will be pulled in on demand if you use this container, so you do rely on us somehow to keep the software available.
Although we could think about things there as well: if you really want to store some particular set of applications in a container, we could probably provide some scripts that would, for instance, dump our compatibility layer and the applications that you're interested in into an image. But then of course you get more reproducibility again at the cost of a larger image. So unfortunately, you cannot have it all.

Then sourcing the init script. That's just running this command: source, and then the init bash file that we have in our version directories. So you can pick the version that you want to use; at the moment, this is the latest version. If you source that file, it's going to detect what kind of system you have. Over here you see that this one is recognized as Intel Haswell, or maybe something which is slightly newer, but then it falls back to Intel Haswell if we don't have an exact match. It will set up Lmod and it will set the module path for Lmod; for instance, here you see that the module path points to the modules in the Intel Haswell software directory, so it uses the right optimized versions of that software. And then the last step: now you can just start using the modules. You can look for modules, load them and start running.

So I have about 15 more minutes, I think, so I can do a demo of how this looks on different kinds of systems. We have different applications that I could show you, and I can run this on quite a lot of different systems, so we'll just pick a bunch for now and see how much time I have. What I'm going to do is use some files from a Git repository where we store some demo scripts, which allow me to easily run, for instance, one of those tools. For now, I will stick to GROMACS. GROMACS was initially developed in Groningen, and I'm from Groningen, so that's probably a good match.

So I will switch to my terminal screen. I have a bunch of tabs open here on different systems. Again, I don't know how much I can show in the 15 minutes that I still have left, but let's start with this machine. As you can see, and I hope you can all read this font size, this is an ARM machine. It's actually a VM in AWS, an Amazon cloud instance, where the VM itself is still pretty clean. I only installed Git this morning because I wanted to clone our repository, but other than that it's still a completely empty VM. I haven't done the actual clone yet, so I'm going to do that now. I'm going to use some files in our demo repository. The first one is this script that I prepared to quickly install CVMFS; that's basically just the steps that you have seen on the slide that I already showed. I'm going to run this, which will install CVMFS for ARM on this CentOS 7 system; there are some specific package names in here to install the right version for my ARM machine. That should not take too long, I think. There we go. It also takes care of setting up the configuration files that I need, the ones in /etc/cvmfs. So it already created the default.local, for instance, with an appropriate cache size for this machine. And now I should be able to access the repository. As you can see, we have two versions at the moment; we sometimes delete the older pilot versions that we had. This is the latest one, 2020.12. We try to make a new version every one or two months.
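For reference, the init-script and module steps I just described boil down to only a few commands; the paths below are the ones from the pilot version shown in the demo and may change in later versions.

```bash
# Source the init script for the pilot version you want to use; this detects
# your micro-architecture and sets up Lmod and the matching MODULEPATH
source /cvmfs/pilot.eessi-hpc.org/2020.12/init/bash

# Then use the familiar module commands
module avail
module load GROMACS
gmx --version   # hypothetical quick check that the optimized binary is picked up
```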
And just to quickly show you the structure of the repository: there's the compatibility layer, at the moment only Linux compatibility layers, and three different ones, one for ARM, one for POWER, and one for x86_64. The contents of all three are more or less the same; you basically just see a kind of Linux installation, which is the Gentoo Prefix installation. Then there's also the software folder, where you see a similar kind of structure, starting with the architecture. Let's go into this one, for instance, x86_64. You see a generic version, which is just generic installations without optimizations for a specific micro-architecture. And in the Intel directory we have more folders: one for Haswell, one for Skylake. I can quickly show the other ones that we have: AMD, where we have a Zen 2 version, and in ARM we at the moment have both generically compiled software and Graviton, which is the one for AWS instances; the machine I'm currently running on is a Graviton 2 instance. And there's ThunderX2, which is another kind of 64-bit ARM processor. Similarly, just to quickly show that one as well: for POWER, we currently have some initial POWER9 support. It's not complete yet, so it's still missing some installations that the other architectures have, but GROMACS, for instance, is already there. That's basically most of the contents of the repository. And then there's one more folder, which provides some bash scripts that you can source to set up the environment, and some additional scripts that help to detect, for instance, your micro-architecture.

So I'm going to source this one now. It says it will find the pilot repository, and it says what kind of micro-architecture it has detected, so it's going to use that directory for the software installations, which is the right one, and then do the setup. Now I can do module avail, and you can see what kind of software we already have in our pilot repository. There's, for instance, GROMACS, there's also TensorFlow. What else do we have? There should be OpenFOAM: if I scroll down a bit, two versions, actually, of OpenFOAM. And there's also OpenMPI, of course, which should already support quite a lot of different interconnects. Even stuff like ParaView is in our installation. So already some applications, including some harder ones like TensorFlow, are in here. So I can start running those now, and I will do so by again using a preconfigured script. I'm going to run this, which will load GROMACS, download a GROMACS benchmark from the PRACE website, and just run GROMACS on that benchmark. It will download the benchmark and then start running GROMACS with two threads. And well, now it's running, and that might take a bit.

So in the meantime, I can show the same thing on a different system. Maybe first show you the Singularity container... ah, my connection broke. So I'm going to log in to the University of Groningen Peregrine cluster, our HPC cluster, where I have a test node where I can use our repository, and here I will show you how it works if you want to use Singularity. So there's also a script here that uses Singularity to mount the repository for me. Again, this is more or less what you saw on one of the slides that I already showed. I'm just going to run this and it will pull in the container from Docker Hub.
It doesn't actually have to pull it, because I did this before, so the container itself is still in my cache. You might see some ugly warnings, for instance about the number of open files; that's not critical. And also this one, that's a known issue, but you can ignore it for now. As you can see, it does just work, so I do have access to the repository. And again, I can do the same thing as I just did: I'm just going to source the init script. In this case, you see that it's taking the software installations from the Intel Haswell tree, which is more or less correct. The actual CPU in this node is, if I'm not mistaken, a Broadwell one, but since we don't have a Broadwell tree, it falls back to using Intel Haswell. Let me go into the same directory and also run the GROMACS benchmark here. You can see that it does the exact same thing as on the ARM64 machine, except that this one uses 28 threads, so it should complete much quicker. The one on Amazon is already halfway; this one is starting now.

Then, to make it more interesting, this one is on my Windows machine. As you can see here, barely visible behind the Zoom toolbar, it says WSL Ubuntu, so this is basically Ubuntu installed under Windows 10, just running on the processor in my laptop. And even with WSL, you can just install CVMFS as if it were a Linux installation. I already did that, otherwise it would take too much time. So also on Windows I can just access the repository, I can source the init bash script, and here it will also use the Haswell tree. I think I also have the demo folder over here, so I can even run that here at the same time. It's now running on my Windows machine as well.

Let me go to this tab. This is another one, on POWER. I think I have a few more minutes, so I can quickly do this as well. So also on a POWER9 machine I can access the repository. This one is a little bit slow; I think it's oversubscribed quite a bit, so it's not that fast, but at least I can show you how it works. Also on POWER it will say: okay, I'm detecting this machine as ppc64le, and then a POWER9 CPU. So also here I can do a module avail. You might even hear some noise from my laptop now, because it's running GROMACS, so the fan is spinning up quite a bit. But you can see GROMACS in here already. Not everything is here yet, so no OpenFOAM, for instance. But for now I will also just start running GROMACS. So now GROMACS is running on x86_64 Linux, on my Windows machine, on POWER, and on ARM64. The ARM one is already done, and the x86_64 one on our cluster is also done. My laptop is even done, so now it's only still running on POWER.

And one more minute, so quickly, I'll show you this one as well to make it even more interesting. This is on a Raspberry Pi system. We have a small Raspberry Pi cluster at our university; this is just one of the machines. I'm quickly going to launch the container that we also have over here, the same kind of client container for our stack. And then again I have access to the repository, and I can source the init script. As you can see, we don't have an optimized version for the Raspberry Pi CPU, so it falls back to the generic version for ARM64. But, well, it's just a Raspberry Pi, so that should be okay. So I can even show you how it works if I also run the benchmark here.
But I'm not going to wait until it's complete, because this one is very, very slow, so it might take a long while before it's done. But as you can see, it is running now, using four threads on the four cores of my Raspberry Pi. Even the one on POWER is already done, so I will not wait for this one. I will switch back to my slides. But I hope that this at least showed you how easy it is now to basically run the same tool on whatever kind of system you have, whether it's x86_64, ARM64 or POWER, and even a Raspberry Pi will work.

So to conclude, some future work that we are currently working on, or will start working on in the near future. We of course want to further improve the repository that we have, and we do that by making monthly, or sometimes bi-monthly, revisions. We use those to basically test things, identify issues, and then solve those issues in later revisions of the repository. For now, we're mostly focused on adding more automation and more testing. For the automation part, we ideally want to make it possible that someone can just open a pull request on GitHub with, for instance, an easyconfig file, saying: I want to add this software to the repository. We can then review it and at some point approve it, and that will automatically fire up a virtual machine somewhere in the cloud, or some other build machine that we have, start building the software, and then automatically ingest it into the repository once it has been verified to work correctly (a rough sketch of what that could look like is included at the end of this section). We also want to do more testing, so that we are sure that everything about the installations is okay. We're also currently working on adding more support for, for instance, POWER; that already works a little bit, but we still have to solve some issues with OpenFOAM, for instance. We also want to start supporting GPUs soon; that's not there yet, but it's one of the next things we want to do, since those are very popular, of course. At some point we will start adding more and more software; it's still limited for now. We also hope that the developers of software packages may want to start helping us out at some point, by verifying the installation of their software in our repository, so that both they and we can be sure that users are using a correctly installed version. And we want to make things a bit more official by getting more manpower and more funding, to make the project more sustainable; at the moment we're also working on setting up a consortium to make this more official. Also, since we got lots of questions about the European part of our name: we want to change that. We don't want to limit this to Europe only, and we aren't even doing that anymore at the moment, so it doesn't really make sense to have European in the name, so we will probably change that E to something else. And finally, we of course want to work towards a production setup, so that we can all start using this, but we don't have a planned date yet for when that should be ready.

If you want to find out more, you can find information on our website. There you can also find a form to join the project; you will then basically get an invite to our mailing list and our Slack channel. There's already some documentation that you can find over here, and a GitHub organization with lots of our source code and documentation. We also have a Twitter account, and we have monthly online meetings, every first Thursday of the month, which you can also join if you're interested.
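To make the automation idea mentioned above a bit more concrete, here is a purely hypothetical outline of what such a build-and-ingest job could do once a pull request with an easyconfig has been approved; none of the names, paths, or steps are final.

```bash
# 1. Build the requested easyconfig in a clean build container,
#    once per supported target micro-architecture (the easyconfig name is just an example)
eb Bismark-0.23.0-foss-2020b.eb --robot --prefix=/tmp/easybuild

# 2. Run tests / sanity checks on the resulting installation (e.g. with ReFrame)

# 3. On the CVMFS Stratum 0, ingest the verified installation into the repository
cvmfs_server transaction pilot.eessi-hpc.org
rsync -a /tmp/easybuild/software/ /cvmfs/pilot.eessi-hpc.org/<version>/software/<target>/
cvmfs_server publish pilot.eessi-hpc.org
```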
So that's all I wanted to tell you about EESSI. Thank you very much for attending the talk. I think we still have a bit of time to answer questions, so feel free to ask now, or otherwise you can always reach out to us on Slack.

Thank you for the talk, Bob. We have a first question.

Yes, you said that the software installations are not allowed to use libraries and such from the OS level. How do you then handle InfiniBand and those parts? Because Open MPI, for instance, needs to be built against the OFED stack to use InfiniBand.

So we basically make a very broad installation of Open MPI, with support for different kinds of interconnects. I think that question was also raised during one of the EasyBuild tech talks about MPI, where they also said that it's possible to have such an installation which supports InfiniBand and different kinds of interconnects. So that's what we do.

Yeah, but let's go to the lower level. Open MPI depends on UCX, and that UCX needs to be built against the correct version of the MOFED stack that is actually in use on that host.

Yeah, so we include packages in our compatibility layer that support different kinds of things. For instance, there's the rdma-core package, which basically provides all the user-level libraries for InfiniBand. But indeed, the host system of course should have the drivers for the interconnects; those we cannot provide. So that's what we rely on. I don't know all the details about that specific part, but I haven't seen any issues with it on all the different systems we are testing this on. At runtime, that's often just automatically found and used by default.

Yeah, but there are API incompatibilities between different versions of the MOFED stack. So if you have built it against MOFED 4 in this compatibility layer and the system is using MOFED 5, everything may not work.

Well, we're currently just relying on what libibverbs or rdma-core does, and we're not really... I think when you're building against a specific version of MOFED, it's more about optimizations, so making sure that Open MPI is properly optimized and properly configured for that particular version of MOFED. The Open MPI installation we have currently doesn't do that, and one thing we definitely need to look at is how much of an impact that has on performance. But it is properly using InfiniBand; we've tested that, and for example in Jülich, Alan has tested that as well. He was getting pretty close to a bare-metal MPI installation in terms of latency and bandwidth. We probably need to test it a bit more and see what kind of impact there could be from not using an installation that is fully optimized for the version of MOFED on that host. But we've been in touch, like Bob mentioned, with the Open MPI developers about that during the tech talk, and we specifically asked the question: is there a way to build one Open MPI that works everywhere, just a very fat installation? And they said: yeah, you can do it, and it's actually well supported. So, yeah, I think we need to really evaluate what we currently have in there. It's probably not perfect and we may be cutting corners slightly in terms of performance, but we haven't taken a very close look at that yet.

My problem here is mainly compatibility: that it actually works if the host that you're running on has, for instance, a very old MOFED stack that doesn't have the rdma-core libraries.

Okay. Yeah. So, yeah, we haven't run into those issues yet, or maybe we haven't tested enough yet.
So when we build Open MPI, we do it in a container, like Bob mentioned, where rdma-core is installed. But if you're then trying to run on a system that has a much older rdma-core, then yeah, there may be issues. We haven't seen those yet. We should definitely start testing this more and more; that's also one of the points that I mentioned. But so far we've mostly been testing on single machines. Only Alan, I think, has been doing quite a lot of work with multi-node runs, and I think in Norway they also did some multi-node tests that seem to be working well. Yeah, we need to test more and get a bit more organized, and maybe also start reporting performance somewhere for different sites so we can compare things.

Yeah, we need to test enough of these stacks with real workloads, probably. Yeah.

I see two more raised hands, and I don't know who was first; I think Bart.

Yeah, I can just add to the answer about MOFED and whatever the compatibility layer gets from the Linux kernel. We have some experience with that: the rdma-core libraries that you can just get from GitHub, which we install in our Gentoo Prefix layer, work fine with the MOFED kernel modules or the stock kernel modules. The question is mostly what kind of optimizations Mellanox has done to their libraries, to their libibverbs libraries and code, that would make them specific to the kernel drivers or not. So how intrinsically linked are they? I don't know the exact answer to that; I just know that they're compatible enough, and actually over the past years Mellanox has upstreamed some of their modifications to the ibverbs libraries, so they've actually become closer than they used to be in the past. But if you absolutely need vendor-provided libraries, you can always put them in a directory and add that directory to the library path. It's just that in several years of use on clusters with and without MOFED, it was just never necessary. A similar issue actually happens with GPUs and the libcuda libraries: those are quite intrinsically linked to the CUDA drivers, so in that case we have to opt to use host-installed CUDA libraries, which has some issues involved that I don't have time to explain. It's basically in the same boat as the ibverbs libraries. That's basically my contribution. Thank you.

Thanks, Bart. We'll definitely keep in touch about that.

Yeah, my remark was in the same direction. It's also about more than InfiniBand. There are also the Cray systems, for instance, with the Slingshot interconnect, which I believe supports libfabric. But especially if you look at the largest systems, optimization may be a bit more critical, because there you have those specialized networks that may have acceleration for collectives. Also on Mellanox: you've got the base Mellanox support, but then you've also got those extra daemons that you can run to have acceleration for collectives on big machines, for applications like GROMACS and so on. That may actually matter a lot; that's where you can see the difference, not on a four-node run.

And I think there the only thing we can do is indeed rely on libraries like libfabric and UCX. So we compile Open MPI against those, and then that's sort of our compatibility layer, in some sense, for MPI. And then indeed, you may need to do some site-specific things in terms of making sure that a more optimized or more capable UCX or libfabric is actually being used at runtime. So yeah, especially for things like the Cray Slingshot, you may need to do extra things.
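For context on what such a "fat" Open MPI installation can look like: roughly, you point the configure script at UCX and libfabric from the compatibility and software layers and enable the rest of the plumbing, so that the best transport is selected at runtime. This is only an illustrative sketch under those assumptions, not our actual easyconfig, and the exact options depend on the Open MPI version.

```bash
# Illustrative configure invocation for a broadly-capable ("fat") Open MPI build.
# UCX_PREFIX, OFI_PREFIX and PMIX_PREFIX are assumed to point at the UCX,
# libfabric and PMIx installations provided by the software/compatibility layers.
./configure --prefix="$HOME/openmpi-fat" \
            --with-ucx="$UCX_PREFIX" \
            --with-ofi="$OFI_PREFIX" \
            --with-verbs \
            --with-pmix="$PMIX_PREFIX" \
            --with-slurm
make -j 8 && make install
# At runtime Open MPI then picks the best available transport on the host
# (UCX on InfiniBand/RoCE, libfabric on e.g. Omni-Path), which is what makes a
# single installation usable across different interconnects.
```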
And I think in the EESSI project we're already very ambitious: x86 and Arm and GPUs and proper interconnect support and all of that. I think we're doing what we can and keeping an eye on performance, but we're never going to get the last percent out of every possible system. So we will pay for things somehow, but we hope to keep that as minimal as possible. And I think recent developments like UCX, for example, and also PMIx and all these things, actually enable us to limit the impact of making those choices, and to provide a central software stack in a good, performant way.

Yeah, and on top of that, we're also thinking a lot, especially recently, about ways that sites can easily add their own things on top of what we provide. So if you need specific versions, or slightly modified versions, of whatever we provide, you can easily make a local installation available on top of the EESSI stack. Currently what we're doing is linking with RPATH, to make sure that everything is picked up from the compatibility layer and nothing is picked up from the host, and things like that. And I'm starting to think we may need to switch to RUNPATH, where you at least have the option to use LD_LIBRARY_PATH to overrule certain things, so that you can easily inject a custom UCX library that is compatible with the one we compiled against, but which is more capable in some way or better optimized for your system. Currently, with RPATH, that's a lot more difficult: you have to pull some pretty nasty tricks to be able to inject stuff that overrules what we provide. If you use RUNPATH, then you can just go through LD_LIBRARY_PATH. You do have to be a bit more careful then, but at least it will be possible.

Not sure if there are any questions in Slack. Alex is making the remark that we may need to provide compatibility layers not only per architecture, but also per network stack. Hopefully not. I don't know if Alex can clarify, if he's in the call and willing to speak up. Yeah, Alex.

Yeah. Can you hear me? Yes. So for instance, what we see is that every time we update libibverbs or the MOFED stack on the host system, we have to rebuild all the Open MPI modules. So how can that be handled at the EESSI level?

Well, assuming those libraries, or at least the user-level libraries, are part of the compatibility layer: if we need to do that, we could always do a full new release of our entire stack. It will take a bit of time, but in this case that would be a new pilot version, and in the future a new production version. That's also something we're still thinking about: how often we want to do that and should do that. Should we install the entire thing from scratch every year, for instance, with a new compatibility layer, or not? That's not really clear yet, I would say. And if it's indeed a breaking change, then we shouldn't suddenly just start doing that, or at least not make it the default. I don't know if that answers the question.

Yeah, sorry, I was on mute. Yeah, absolutely. So yeah, that might be something that we will discover in the future. That's why I was suggesting that we maybe need compatibility layers not only per architecture, but at some point also per version of the stack of libraries needed to handle the network.

Jor has a question as well.

Yes, coming from the bioinformatics area, and at least here it is lunchtime.
Are you providing, or are you planning to provide, software from the bioinformatics area, like Stake, SEM and Bismark? They really have these funny names.

Well, in principle, everything that is already supported by EasyBuild is something that we can easily add to the repository, and the same basically goes for any other kind of tool that's not part of EasyBuild yet: if there's no easyconfig in the EasyBuild project for it yet, that should probably be written first and merged into the EasyBuild repositories, and then we can easily add it. So assuming it's something that can be done easily, I don't really see any roadblocks. I think the main reason we're now sticking to what we have in the software layer is just so we can focus on other things. There are lots of other aspects, especially automation, where we want to build up a mechanism so people can open a pull request to the EESSI project and say: I want to add Bismark to the software layer. And then after some review, if somebody approves the request, it gets installed fully automatically for all of the different architectures that we support, either through AWS or through build hosts that are provided by partners in the project, or something like that. We want to avoid that humans have to start all those builds and keep an eye on them. So that's why we're currently a little bit reluctant to add more software, even though it's not a lot of work; everything is scripted, so the script will just run a little bit longer, especially if it's already supported in EasyBuild. It's not a big issue. So if it would be helpful for many people, especially in terms of testing, we can certainly consider adding more software. Maybe jump into the EESSI Slack or open an issue in our software layer repository, and then we can definitely look into it. We've already added things like ReFrame and the OSU benchmarks for testing MPI. So if it's connected to people testing things, then yeah, there's no good reason why we shouldn't add more software. But that's not what we want to focus on right now.

Cool, thanks.

We're 10 minutes away from the ReFrame tutorial, which is actually a separate Zoom session. So if people have registered for that, keep in mind that you'll need to jump to another Zoom session. We can have a couple more questions in here, since we're not blocking the tutorial in any way. So Guilherme has a question.

Hi guys, thanks. Can you hear me? Yes. Yes, okay. Thanks for the presentation. So do you have a kind of roadmap of what is needed or missing to make EESSI ready for production, at least as a beta version?

Not a very official roadmap yet. Right now we're organizing everything on GitHub, so if you go to the different GitHub repositories that we have, there are lots of issues open, either real issues with our repository or things that should be improved or documented. Most of the work we're doing at the moment is solving those issues, or at least the most important ones, and adding new features that we think are important; for instance, the GPU support that I already mentioned is something that we want to start on soon. But other than that, we don't really have people working on this full time yet. I hope that will change soon. So it's hard to give a clear roadmap for when this will be production ready, but I definitely hope that we will have something which is at least approaching production-level quality this year.
There are also the monthly meetings that we have, which are open Zoom calls, so anyone can join and listen in on how we're doing. We also share the presentations: we work together on a shared slide deck for those meetings, and all of those are available in the GitHub repo in PDF format. That's not ideal in GitHub, but still, it's useful. So you can keep an eye on things there, or join the meetings to see what we're currently working on, what has changed, and what we're working towards. It's currently, I would say, a bit disorganized: we have issues in GitHub, but there's not a lot of prioritization, and it's not always clear who's working on what. That's definitely something we will improve. We'll start one of these project dashboards, or actually multiple ones, under the EESSI organization on GitHub to get a bit more organized, and I think that's going to be the closest thing we have to a roadmap. We're not going to pin ourselves down to a specific roadmap or a specific date, because currently, like Bob mentioned, it's done a bit on the side by many people. We're making very good progress, but nobody can really commit a lot of time to it at the moment. That may change with the German HPC consortium that got involved in the project fairly recently; they actually have dedicated manpower for working on something like this, so at least there will be some man-hours being spent on it in the coming weeks and months. And another very important thing is that we're looking into setting up a proper consortium, where you can sign an SLA in which we say: okay, we provide this, and we give you these guarantees, and in turn, if you want to adopt the EESSI software stack, you have to make sure that you install, for example, a Stratum 1 on your site, so you're protected against a network disconnect or anything like that. All those things, which are a little bit less on the technical side and more on the organizational side, are things we are working on and thinking about, but they are not in place yet.

So are you thinking of approaching it both ways: providing EESSI as a service, where you consume the software stack that you build yourselves, and at the same time as a product, where someone could use the EESSI project for building their own software inside their own facility? Is that correct, the way I'm understanding the EESSI goal?

In principle, we want to make this open to anyone who wants to use it. And of course, if you want to, for instance, connect your HPC cluster to our repository, then it's important that you add some local caches as well, because otherwise it's going to generate lots of network traffic to our servers, and you will get bad performance anyway. So yeah, that's the goal. And on the second question about adding stuff on top: we indeed want to make it possible that, if you need, for instance, commercial software on top of what we already provide, you can easily have a local set of modules with those commercial packages, or whatever other kind of customized packages you want to provide to your users. So yeah, both of the things that you said are correct.

Okay, so just one more question. What do you guys...

Sorry, hold on, just let me interrupt you. We have to stop the live stream here so we can reuse it for the ReFrame tutorial, so let me do that first. So we're wrapping up the stream and the recording here. We can still continue the discussion in this...