Thank you, George, for inviting me here today. I added the word "new" to the title because it matters to me: I am a new user of EasyBuild. I'm currently working elsewhere, but for the past two years I was at Imperial College, building this microsimulation. My background is motor racing; I spent most of the past two decades racing cars around the world, and that gives me a different mindset from academia. I need to think about global results, not local optimisation. Optimising local details is not always feasible for us, because we need to race: there is always the next race, the time between races is very short, and we never race the same car. Development speed matters more, and when you develop at speed you don't have time to do a lot of optimisation, concentrate on local problems, and make sure every last cycle is squeezed out of the CPU. So I'm going to walk you through my experience of using EasyBuild for the first time to get something actually deployed on the HPC here at Imperial.

The outline of my talk: first an introduction to what Health-GPS is. Then two parallel stories: doing the deployment process by hand, which is usually how you start, and then my transition to EasyBuild and how that has gone. The final part asks whether we have actually accomplished it: are we done with that transition?

First, Health-GPS, the Global Health Policy Simulation tool. It is a microsimulation tool developed by the Centre for Health Economics at Imperial College as part of the STOP project, a Europe-wide project to tackle childhood obesity. The simulation models a country-wide population, or a fraction of a country's population, and we are interested in how lifestyle, behaviour change, and metabolic risk factors lead to chronic disease later in life.
We simulate several scenarios and test the effectiveness of high-level government policies. That includes things like nutrition labels: you go to the supermarket and see, in red, how much sugar and how much fat is in a product. That is one of the policies we need to evaluate, to see how it affects the population. Another one, which affects you when you go to the shop, is a sugar tax: anything containing sugar costs more. So this simulation is done for high-level policymaking.

In terms of computation, the biggest limitation is how many people we can simulate. It is an individual, actor-based simulation, so we need to simulate very large populations that interact over time, and by time we mean 30 years or even 50, because you want to see people go to school, grow old, and see what their health outcomes turn out to be. That is what you weigh when deciding whether an intervention is worth doing at a given stage.

Now to the software itself. It is open source, cross-platform, and written in C++. Another constraint is that we are not focused specifically on high-performance computing: the software needs to run on a desktop and in the cloud. HPC is one specific use case that we need to support so that we can run larger populations and scale. The build chain is CMake; I've been using CMake for many years and have no problems with it. One of the pain points for everyone developing C++ is dependencies: resolution, compatibility, and being able to actually assemble your dependencies and build everything. In recent years we've used package managers for that. There are a number of them; I chose vcpkg, which is developed by Microsoft, but there is also Conan, which is more Python-based.
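To make the individual, actor-based setup concrete, here is a minimal toy sketch in Python. This is not the actual C++ Health-GPS model; the risk scores, yearly increments, and disease threshold are all made-up illustrative numbers. Each simulated person accumulates a metabolic risk score over 30 years, and a policy intervention such as a sugar tax lowers the yearly increment.

```python
import random

def simulate(population_size, years, intervention=False, seed=42):
    """Toy individual-level microsimulation (illustrative numbers only):
    each person carries a metabolic risk score; an intervention such as
    a sugar tax lowers the yearly risk increment."""
    rng = random.Random(seed)
    # each person starts with a small accumulated risk score
    risks = [rng.uniform(0.0, 0.1) for _ in range(population_size)]
    increment = 0.004 if intervention else 0.006
    for _ in range(years):
        # everyone ages one year; risk drifts up with some individual noise
        risks = [r + increment + rng.gauss(0.0, 0.001) for r in risks]
    # count people whose accumulated risk crosses a chronic-disease threshold
    return sum(r > 0.25 for r in risks)

baseline = simulate(1000, 30)
policy = simulate(1000, 30, intervention=True)
print(baseline, policy)
```

With the intervention enabled, fewer individuals cross the disease threshold; comparing those counts across scenarios is, in spirit, the kind of population-level effect the real model quantifies, at vastly larger population sizes.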
The project is hosted on GitHub, and we use GitHub for pretty much everything: integration tests and so on. Here's the link if you want more information; the source code and documentation are all available, and it is well documented, so get involved if it's something you'd like to work on.

So, how I started working on the Imperial HPC. That was before EasyBuild was introduced; it came later for me. The first thing you do is go to the HPC and see what's available there, and to my surprise the compilers were very old. So I had to start by opening a help-desk ticket, and that ticket took a considerable amount of time to come back. While that is happening, you go DIY and build your local environment on the login nodes, completely against all the guidelines. And here is the important bit: when we eventually manage to build the software, we tend to store it somewhere in the file system, let people know where it is, and run from there. Incredibly, this tends to be how most people actually work, because to install software you need to go through the chain of tickets; you get a new version, you need to install it again. So as a research group we end up sharing the binaries and the data, completely independent of the HPC deployment system, then running the experiments, analysing the results, and so on, and hopefully the paper is accepted for publication. Trying to reproduce that environment is almost impossible. I've done it myself, and even I probably couldn't do it again, because over time you do one step here and another step there. That is the problem EasyBuild is supposed to solve, so that we can actually be reproducible.
And that is a major, major contribution: a solution for academia especially, but not only for academia, because industry also needs reproducibility. Industry today is very keen on being able to reproduce results; it's not unique to academia. It is a global requirement that we can repeat our own work and that other people can check the results.

So, the move to EasyBuild. Thanks to George, who introduced me to EasyBuild and was looking for volunteer projects; I volunteered myself, and that is how the story began. I started by setting up EasyBuild on my machine as a developer. The first step, the integration with GitHub, is a long list of steps; make sure you don't miss any, and be patient, you'll get there. The easyconfig was another journey. At first encounter it looks similar to, say, a CI script for GitHub, but what actually made it difficult for me was the dependencies: having to find an easyconfig for everything I was using. If you don't find them, you need to get your hands dirty and write them yourself, and get every easyconfig you need in place before your own easyconfig can be accepted and published. The good news is that after a lot of trying it worked, and I must say I wouldn't have managed by myself if it weren't for the support I received. George and his team started by creating a script which was the first thing that caught my attention: all right, let's remove the package manager from the CMake toolchain. Wow, now this is going to get interesting. Then we started the journey of building up all the dependencies, and I personally found that quite intimidating, because most of the dependencies I was using weren't there; we had to do quite a lot ourselves.
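For illustration, here is a hedged sketch of what an easyconfig for a CMake-based C++ application like this might look like. The `CMakeMake` easyblock and the field names are real EasyBuild conventions, but the versions, URLs, binary name, and dependency list below are assumptions for the example, not the actual accepted Health-GPS easyconfig.

```python
# Hypothetical easyconfig sketch for a CMake-based C++ application.
# All versions, URLs, and dependencies are illustrative placeholders.
easyblock = 'CMakeMake'

name = 'Health-GPS'
version = '1.2.2'

homepage = 'https://github.com/imperialCHEPI/healthgps'
description = "Global Health Policy Simulation model (microsimulation)"

toolchain = {'name': 'GCC', 'version': '12.3.0'}

source_urls = ['https://github.com/imperialCHEPI/healthgps/archive/']
sources = ['v%(version)s.tar.gz']

builddependencies = [('CMake', '3.26.3')]

# dependencies that vcpkg used to resolve now become explicit easyconfigs,
# each of which must already exist (or be written) before this one builds
dependencies = [
    ('nlohmann_json', '3.11.2'),
]

sanity_check_paths = {
    'files': ['bin/HealthGPS'],
    'dirs': [],
}

moduleclass = 'tools'
```

Every entry in `dependencies` must resolve to an existing easyconfig, which is exactly the chain of prerequisite work described above.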
The good news is that after all that work is done, we can now install the application in my local space, and there are three different stacks here at Imperial: local, development, and production. While your easyconfigs are on a PR, you're not allowed to go to production; there is a link between the easyconfig being accepted and being allowed into production, and until then it stays local or on the development stack. And now I don't need to keep sending emails: everyone can go to the command line and check what the latest version is. That whole layer has been removed. The same process of running experiments and sharing results continues, but this time there is a big difference: you can reproduce the result, the installation, and so on. That, for me, is the biggest contribution of EasyBuild, and I thank you all for it. I know it's open source, and hard work has gone into it over many years, but in the end that is what actually makes the difference. If I had to do similar work again today, I would start with EasyBuild as my starting point; it's not perfect, but it's much, much better than the previous option we had.

Now to the question of whether we are there yet. Unfortunately, we're not, and presumably you know we're not, because there are always things that can be improved, including our own lives. My first problem with EasyBuild is that it doesn't integrate into development. The easyconfig is in one place, my code is in another, under two different version-control repositories, and it's not part of my continuous integration: I don't test the easyconfig as I develop my tool.
So at some point, when I actually cut a release of my tool, I jump over, create the easyconfig, get things approved, and hopefully it goes through. I don't know how easy this would be to do, but if we could integrate EasyBuild into CI builds, it could be tested as part of the development cycle rather than separately from it.

Another issue I found, which might be unique to Imperial, is that EasyBuild optimises for the HPC hardware, but the question is: which hardware? I noticed the flags being added to my GCC invocations. And Imperial doesn't have a single kind of hardware; it has several. If I optimise too much for one specific architecture and try to run on a different one, I may be in for a surprise. There is a way around it: I can say I only want to use this type of hardware for my experiment, but that might lead to a very inefficient environment, because if everyone starts choosing the same CPUs, the latest ones, you end up with longer wait times for your jobs and a lot of idle nodes that no one is trying to use. That's not a solution. If you look at Imperial, these are the queues, with the CPU types of each one. Choosing the queue is already a big grid to navigate: memory size, number of nodes, and duration of your jobs. And once you hit a queue, several types of hardware are available, so if I optimise for one, I might land on the other nodes. In the case of Health-GPS, we don't try to optimise down to the level of the hardware, because we must be able to run on a desktop and on a cloud computer, and when you go to the cloud you have no idea what hardware you'll run on.
You just have the account, you say you want to run this, and it goes to some data centre that chooses a VM for you, unless you pay to run on a specific VM. So I'm more interested in generality: a piece of code that can run everywhere without getting too close to the hardware, because that gives me much more flexibility than being completely attached to a specific machine. And as you know, HPCs get upgraded. You might spend a lot of time tuning, tuning, and tuning for a specific architecture, and next year someone comes along and says, right, we're upgrading the HPC, new CPUs are coming in, and now your code is no longer good for the next generation of CPU. I might be going against the tide here, but from my point of view I prefer software that is generic and scales across different environments over software that is too focused on a specific architecture and CPU.

The next thing we love to do in open source is try different compilers and compiler versions. My code should compile with GCC as old as I can go, but also with the next, newest version. It's quite common practice to create a pipeline that tries to compile your code with a new compiler: as soon as a new version of GCC comes along, we check whether the code still compiles with it. And sometimes compilers improve to such a level that just moving the same piece of code from one compiler version to the next already gains you performance. EasyBuild doesn't help me much with this transition to a new compiler: I need to go back and redo all my dependencies with the new compiler. The initial cycle I just went through in my first deployment has to be repeated all over again for each new or different compiler.
So the bottom-line message I have for the EasyBuild developers is that there is still a very steep learning curve to get started with EasyBuild. My suggestion, as a new user, is that we should try to lower that entry point, because that is very important for any software to live a long time. And I do hope EasyBuild stays around for a very long time, because it solves a very important problem. That's probably the biggest message I would like you to take away: for EasyBuild to be here in 10 or 20 years' time, you need new users. Without new users we can't go forward; no open-source project will survive without new users. That's all from me. Any questions?

[Audience] Compiling for a different architecture is just a matter of changing your easyconfig, so that's very easy to do. We do that on LUMI all the time: we compile for different targets, and you can use `-march` to pick the target, so I don't really understand your remark. It is true if you depend on dependencies that come from central EasyBuild recipes, but with our own stack of EasyBuild recipes, which I rebuild for different compilers every few months, I build for four different compilers, and typically it's just a matter of changing a single line in the toolchain setting to regenerate everything for the new compiler.

[Speaker] Can you repeat the question first? So the question is why I say I'm not able to build for another compiler or architecture with EasyBuild, when that is possible to do. My point is that I don't want to do that: I want to run my software without worrying which node I end up on. I shouldn't have to go that far.
[Audience] You only have to cross-compile when you are on one node type but want to build for the minimal node type: you point `-march` at a generic baseline and you get a binary that runs everywhere. [Speaker] That's what I'm saying: I want it to be generic. [Audience] Then that's really very easy to do; there is support for this and it's in the documentation as well. EasyBuild has a way to build generically so the result works anywhere.

[Audience] There is a caveat there, and I think it's an important one. Today that's feasible because you have mostly Intel and AMD CPUs. But ARM is coming, and ARM is going to be very hard to ignore; say two years from now, with the NVIDIA-based CPUs arriving, people will not be able to ignore ARM and say we won't use it because it's not x86. It's a similar story to the GPUs. We, at least, actively said we'd wait on GPUs, because they were going to create more demand for software and more support requests, and we didn't have the time for it, so we actively waited. But at some point we couldn't ignore them: we were not going to be relevant, people were going to run away, if we didn't have GPUs. At some point the same will be true for ARM, and then it's a very different story, because you cannot take one binary and run it everywhere anymore. You need at least two: an ARM one and an x86 one. And it can get worse.

[Speaker] I think that's true, and GPUs had that problem at the beginning, right? Moving a piece of code from one GPU to another usually required quite major rewriting. [Audience] It's easy to build for both architectures. And you're the counter-example, right? On LUMI, the software that works on NVIDIA GPUs... [Audience] No, the gap between GPU vendors is the same as between ARM and x86.
[Audience] But within a single GPU family you can do a fat binary; within the AMD family it's actually easy to build a single binary and make sure you're near-optimal for all the different GPUs you may encounter.

[Audience] Bart is saying there is some standardisation in x86-64 where they define different microarchitecture levels, v1, v2, v3, v4, where v4 roughly means AVX-512 and v3 means AVX2. If EasyBuild could be aware of those levels, it would be a lot easier to build generic binaries that work on, let's say, recent but not bleeding-edge hardware.

[Audience] A comment on what we are doing at Imperial: we said, okay, Sandy Bridge and Ivy Bridge are basically the same instruction set, and Haswell and Broadwell are basically the same instruction set, so we only build two variants. The Sandy Bridge/Ivy Bridge pairing works like a charm, until you find a piece of software on the Haswell/Broadwell line that uses something one of them supports and the other doesn't. And of course, if you build it on the one where it is supported and try to run it on the other, you get a segfault. So yes, we were thinking along the same lines as the v1, v2 levels, but unfortunately at one point it didn't work.

[Audience] I think most support teams and sysadmins are doing this, because they know how important it is for performance. But from a user's perspective it's very different; we see this with our users too, in the tickets we're getting. People don't actually know why it matters. For instance, we have an issue where our login nodes are now AMD Rome, which doesn't support AVX-512.
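The v1–v4 level scheme and the incompatibility being described can be sketched as feature sets. The lists below are abbreviated and illustrative rather than the complete psABI definitions, but the compatibility rule is the real one: a binary built for a level only runs on a CPU that implements every feature of that level.

```python
# Abbreviated sketch of the x86-64 microarchitecture levels (v1-v4);
# the real psABI feature lists are longer than shown here.
X86_64_LEVELS = {
    'x86-64-v1': {'sse', 'sse2'},
    'x86-64-v2': {'sse', 'sse2', 'ssse3', 'sse4_1', 'sse4_2', 'popcnt'},
    'x86-64-v3': {'sse', 'sse2', 'ssse3', 'sse4_1', 'sse4_2', 'popcnt',
                  'avx', 'avx2', 'fma'},
    'x86-64-v4': {'sse', 'sse2', 'ssse3', 'sse4_1', 'sse4_2', 'popcnt',
                  'avx', 'avx2', 'fma',
                  'avx512f', 'avx512bw', 'avx512cd', 'avx512dq', 'avx512vl'},
}

def can_run(binary_level, cpu_features):
    """A binary built for a level runs only if the CPU has every feature."""
    return X86_64_LEVELS[binary_level] <= set(cpu_features)

# AMD Rome supports AVX2 (v3) but not AVX-512, so a binary built for an
# AVX-512-capable Skylake server node dies with 'illegal instruction'.
rome = X86_64_LEVELS['x86-64-v3']
print(can_run('x86-64-v3', rome))  # runs fine
print(can_run('x86-64-v4', rome))  # illegal instruction at runtime
```

Building for the lowest level present in a queue (or a generic baseline) trades some peak performance for binaries that survive heterogeneous hardware.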
And people run into illegal-instruction errors because they're running software that was built for our Skylake nodes. It's very easy to run into that by accident, and all you get is "illegal instruction". They think they did something illegal, and they open a ticket. So they don't really understand, first of all, what's going on, why it happened, and why it is like this. It's not us trying to be annoying; it's us trying to help them and give them faster software. And it's not an age thing: AMD Rome is newer than Skylake, the architectures are just incompatible. If you go to ARM, it's even worse, because nothing will work. And with the latest instruction sets there are extensions that CPUs can adopt independently of each other, so that mess gets way bigger; in some cases nothing works at all.

[Moderator] Is there a question or comment in the back as well? [Audience] As for trying different compiler versions, there is a `--try-toolchain-version` option. EasyBuild does have some support for trying different toolchain versions quite easily with `--try-toolchain-version`, which sometimes works, not always, but it does give you a push in the right direction. You would need all your dependencies already built for the other toolchain; if you enable `--robot`, EasyBuild will do it recursively. [Speaker] Okay, I haven't used that feature. [Audience] It mostly works, not always, and it's maybe not very well documented, which could be a reason you're not aware of it.

[Moderator] Anything online, Simon, before we wrap it up? [Simon] I can take a quick look. It looks like it's mostly comments, not really questions.

[Speaker] I was just going to give one example about GPUs from one of my previous jobs. We had NVIDIA GPUs and had developed quite a lot of software for NVIDIA, but then AMD came in as a sponsor.
So the first thing you do is get rid of all the NVIDIA-specific code, because now you have to use AMD GPUs, and that was a very heavy rewrite. I think that's a bit of the damage, let's say, that comes from ten years of Intel-only; that is changing very rapidly, and GPUs are only going to make it worse. I won't say there's an easy answer there. Okay. Thank you. [Moderator] Thank you very much.