Welcome back, everybody. Our final talk for today is a site presentation from Jonathan Jackson of AstraZeneca. I'll pass over to John now.

Thank you very much, and thank you for the opportunity to present what we're doing at this conference. Let me just say that all opinions are my own and do not necessarily reflect those of my employer. But with that, let's begin. AstraZeneca, as you may know, is a global pharmaceutical company. We have research and development centers in the USA, UK, and Sweden, and personally I'm located in Boston in the United States. Starting with the introduction, I will cover the platform that I'm involved with, the scientific computing platform at AstraZeneca. I'll cover how we implemented EasyBuild, spend some time on the ecosystem we built around EasyBuild and the context in which we're using it, and then talk about our learning experience: what we've enjoyed, the successes, and also the challenges and open questions, the things we'll be working on this year. Hopefully that will leave some topics to discuss following the talk.

A little bit about me and my background, to put the talk in context. I'm the lead engineer for applications on our scientific computing platform. I've been with the company for two years, but I have four years of experience using EasyBuild, so I've been involved in two deployments of EasyBuild, and ten years in total working in high-performance computing. Before that, like others in this community, prior to working on the IT side I was working in academia. Looking back, a few loosely coupled SPARC systems were what we considered high-performance computing in those days, and things have moved on a long way from there. But that's how I got started.

The scientific computing platform at AstraZeneca supports research and development activities in a preclinical context, so it's basic science functions that we're supporting. There are many different types of science taking place on the platform; I've highlighted a few at the top under purpose. Drug discovery is a big one: lots of chemistry, structural chemistry, computational chemistry, those kinds of activities. There are a lot of simulations around the process of drug manufacture and ensuring a quality product, and those activities take place on our platform too. Next-generation sequencing is a big computational demand. And machine learning and AI are taking off in a big way; we're seeing machine learning touch all different aspects of science, so there is a big demand for the tooling and the computational platforms around machine learning. I'm sure that's familiar for many HPC centers, but that's where our focus is.

In terms of scale, we have at least 800 EasyBuild applications when you sum up all the libraries and the multiple versions of each one. We have a number of web-based applications that are linked to the platform as well. Within the platform we've got two Linux distributions, so from a design standpoint we are looking to support multiple Linux distributions, and about 1,600 users. That's the scale of the applications world we are operating in. In terms of capacity: at least 7,000 cores, at least 300 GPUs. We have a virtual desktop service, so people do run graphical applications on top of our platform as well, and 3.4 petabytes of storage, although I'm sure it's grown since these numbers were put together.
We have a private cloud for hosting our web apps, and in the future we hope to be doing hybrid cloud as well. So that's the context.

When we approached the deployment of EasyBuild (some of this work was done before I joined the team), we had a few design criteria in mind. Perhaps like other centers, we were coming from a world of having a module system on a legacy high-performance computing platform, so we were already familiar with the modules framework and with hand-compiling applications. We built a new cluster around 2017, and that is when EasyBuild was adopted.

The design criteria we were looking for were as follows. First off, a reproducible pipeline: we were looking for a tool with which we could reliably, predictably deploy applications. We wanted the outcome of the packaging process to be a deployable artifact, an entity that we can take and deploy to multiple platforms. In our context, the primary target for deploying these applications is the high-performance computing cluster, but there are a number of Linux workstations around the organization that we also support, and those same applications are needed on the workstations. We also had a view that we would want to deploy the same applications in the cloud. So to have an object, an entity, that we can deploy at the end of the packaging pipeline was highly desirable. The obvious artifact formats were RPM packages, Deb packages, and the Singularity Image Format. That was what we were hoping for in terms of a design.

A further requirement was the ability to rebuild packages offline if necessary. We wanted to be able to archive the sources, the input installation files that go into building an application, so that if in the future we depend on an application and need to patch it or otherwise rebuild it in an offline scenario, we can still do that: we would be able to draw the input sources from the archive and run through the packaging process. So that's a simplified version of our design.

A couple of other things we were looking for were an isolated testing environment, the ability to test applications without deploying them to the full production cluster, and the ability to optimize our compiled applications for the target CPU architectures. We wanted to be able to do regression testing so that we can ensure the ongoing quality of our applications across operating system upgrades. Another key design criterion is to keep it as simple as possible, but not simpler than that. The reasoning here is that somewhere around 15 to 20% of our users have no prior high-performance computing experience, and looking back, getting up to speed with modules and compilers and toolchains can be a little daunting, a little overwhelming. So as much as possible, we wanted to create an applications ecosystem that is as simple as possible.

So that's what we were thinking in terms of design. We adopted EasyBuild, and I think it was a good choice; other speakers have mentioned many of these criteria and requirements. Dependency management: EasyBuild checks that box. Deployable artifacts: it has support for all three output formats we were looking for. There's a great community around EasyBuild. The ability to retain the build logs is important to us, and EasyBuild can do that.
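To make the offline-rebuild requirement concrete before going on, here is a minimal sketch of the two-stage approach using EasyBuild's --fetch and --sourcepath options; the easyconfig name and archive path are hypothetical examples, not our actual setup:

```python
"""Two-stage EasyBuild run: fetch sources online, then build offline.

A minimal sketch of the archive-then-rebuild idea described above;
the easyconfig name and archive path are hypothetical.
"""
import subprocess

EASYCONFIG = "zlib-1.2.13.eb"            # hypothetical application
SOURCE_ARCHIVE = "/archive/eb-sources"   # archived input sources

# Stage 1 (online): download the sources for the application and all
# of its dependencies into the archive, without building anything.
subprocess.run(
    ["eb", EASYCONFIG, "--robot", "--fetch",
     f"--sourcepath={SOURCE_ARCHIVE}"],
    check=True,
)

# Stage 2 (offline, e.g. inside a network-isolated container or job):
# build strictly from the archived sources.
subprocess.run(
    ["eb", EASYCONFIG, "--robot",
     f"--sourcepath={SOURCE_ARCHIVE}"],
    check=True,
)
```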
Checksumming and version pinning, so that you know that if you redeploy, you'll get the same input and the same output. Reproducible builds, and minimizing OS-level dependencies. So those are a lot of check boxes for EasyBuild in terms of meeting our criteria.

Moving on now to our implementation, how we built an applications ecosystem around EasyBuild. EasyBuild very much simplifies the process. I certainly was involved in building applications prior to having a framework like EasyBuild; it's a complicated process, and it does require a specialized skill set. EasyBuild has simplified that a whole lot by providing the dependency management and the templates ready to go. But it's still a specialized skill set, especially if you're writing custom EasyBuild templates. So we start out with application specialists, skilled engineers in this area who can do the development work and get the applications built and tested. We do that with EasyBuild in Singularity containers, so we have a reproducible environment; development happens in your personal space and is committed to version control.

Once we have something that's working, we will run it through a Jenkins pipeline, which meets the reproducible pipeline requirement. Jenkins will pull from Git. We're not doing full continuous integration at this point, although that is something we would like to do, but we will trigger a build running through Jenkins, and the outcome is an RPM, or a collection of RPMs if there are multiple dependencies, which we can store in an artifact repository. From there, the Jenkins process will deploy it to a testing environment, which we isolate by making it accessible only through a Singularity container. So at this point it is not accessible to all users to be incorporated into production pipelines; it's in a testing area. We'll iterate in this build, test, deploy cycle, usually involving the scientist who made the request, to make sure we meet all the requirements. Once it has been peer reviewed internally and has passed user acceptance, we will move on to deploying it to the main cluster.

I should note that we do use the compute cluster itself for building these applications. Some applications need a GPU as part of the build environment; some need a lot of memory. So we farm the actual build out to the compute cluster in a Singularity container. And then, once we've been through the testing process, we have a deployment pipeline in Jenkins that will push it to the compute cluster and can also push it to our workstations. So that's our day-to-day workflow; we spend a lot of time in it.

To give you some context of how EasyBuild fits into everything that we do: it sits right there in the middle, with the modules, the libraries, and the toolchains. Supporting EasyBuild, we've got Jenkins for the automation, which you've seen, and we have module usage logging: just simple Lmod usage logging, which allows us to know who is using which modules, so that we can think about things like the application lifecycle, or communicate out to a specific group of users regarding changes, those kinds of things. It's been pretty helpful to log which modules are loaded.
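To illustrate what that logging gives us, here is a minimal sketch of summarizing module loads per user, assuming a site-defined Lmod hook that writes one syslog-style line per load; the log path and line format are hypothetical:

```python
"""Summarise Lmod module-load logs per user and module.

A minimal sketch; it assumes a site-defined Lmod hook that writes
syslog-style lines such as:
    2024-01-15T10:02:11 user=alice module=R/4.2.1-foss-2022a
The log path and line format here are hypothetical.
"""
import re
from collections import Counter

LOG_FILE = "/var/log/lmod_loads.log"  # hypothetical path
LINE_RE = re.compile(r"user=(?P<user>\S+)\s+module=(?P<module>\S+)")

module_users = {}      # module -> set of users, for lifecycle decisions
load_counts = Counter()

with open(LOG_FILE) as fh:
    for line in fh:
        m = LINE_RE.search(line)
        if not m:
            continue
        user, module = m.group("user"), m.group("module")
        load_counts[module] += 1
        module_users.setdefault(module, set()).add(user)

# e.g. who to contact before retiring a module, and the busiest modules:
print(sorted(module_users.get("R/4.2.1-foss-2022a", set())))
print(load_counts.most_common(10))
```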
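And going back to farming builds out to the compute cluster: a sketch of the kind of submission wrapper a pipeline might use. The resource defaults, container image name, and easyconfig are hypothetical; EasyBuild itself also offers a --job option for submitting builds as jobs.

```python
"""Farm an EasyBuild build out to the compute cluster via Slurm.

A minimal sketch; resource defaults, the container image name, and
the easyconfig name are all hypothetical.
"""
import subprocess

def submit_build(easyconfig: str, cpus: int = 8, mem: str = "32G",
                 gpus: int = 0) -> None:
    """Submit an 'eb' run as a batch job with per-build resources."""
    cmd = ["sbatch", f"--cpus-per-task={cpus}", f"--mem={mem}"]
    if gpus:
        # e.g. CUDA applications that need a GPU in the build environment
        cmd.append(f"--gres=gpu:{gpus}")
    # Run the build inside a Singularity container for reproducibility.
    cmd += ["--wrap",
            f"singularity exec build.sif eb {easyconfig} --robot"]
    subprocess.run(cmd, check=True)

# A GPU-enabled build with extra memory:
submit_build("TensorFlow-2.11.0-foss-2022a-CUDA-11.7.0.eb",
             cpus=16, mem="64G", gpus=1)
```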
We started using ReFrame about six months ago. I know this community is very supportive of ReFrame, and we've had a very good experience with it so far. Singularity is part of that supporting ecosystem as well.

Built on top of this, we have an applications catalog: a web-based, Django-based catalog where you can search for the applications that we have on the platform. The great thing about it is that it can pull the URLs, the descriptions, and the dependencies of each application straight from the easyconfig files, and we can publish these in an online format for finding our applications. We have a number of applications that are deployed to the virtual desktops; graphical applications, whether commercial or open source, we build with EasyBuild, and people consume them via virtual desktops or on workstations. And obviously we've got lots of the traditional high-performance computing apps, but I also wanted to mention that browser-based applications are part of this whole ecosystem, and EasyBuild is supporting that. With web-based tools like RStudio or JupyterHub, there's a web framework, but there are also Python libraries or R libraries, multiple versions of which people want to use within that framework. So we provision R and Python and all the supporting software ecosystems through EasyBuild, and we consume them in these web-based apps, and that's been quite successful. Looking at it overall, the EasyBuild framework really contributes across a lot of what we do.

That's a summary of how EasyBuild fits in, and I'll move on now to successes and challenges, the things that we're thinking about. First, successes. The adoption of EasyBuild has really helped us standardize. Coming from a world where each application was hand deployed, you have to make a lot of decisions about how to deploy, what optional features to use, what bundles to make. Adopting the EasyBuild community standards, especially around, say, R and Python, really helps simplify things, saves a lot of time, and improves the overall quality: we don't need to make so many decisions about how to package and how to bundle all these tools. So that's a big success, adopting the EasyBuild community standards.

And yet, within that framework, we can still make custom variations. For example, with R, the scientists come with specific requirements: they want a specific set of R libraries within that build, and we can do that within EasyBuild with a custom suffix. We still deploy the EasyBuild community-standard R build, which we can use as a dependency for other applications, but we also deploy the custom variant that our scientists want under a custom version suffix, and that's working pretty well. So there's flexibility, but the standardization is also really helpful.

We have the deployable artifacts that were part of the original design: we're outputting RPMs from EasyBuild in Jenkins and deploying those. We encountered a couple of things along the way that we needed to work around; there can be limits on the RPM file size or file count, depending on which version of RPM you're using. But overall that's been a success, and we're hoping to follow up with Debian packages and Singularity images in the future. The reproducible pipeline has also been a pretty good success.
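Going back to the applications catalog for a moment: a minimal sketch of pulling metadata out of easyconfig files. This uses simple regexes; easyconfigs are Python files, so EasyBuild's own framework can parse them more robustly, and the paths here are hypothetical.

```python
"""Extract catalogue metadata (name, version, homepage, description)
from easyconfig files, in the spirit of the catalog described above.

A rough sketch; real easyconfigs vary, and EasyBuild's framework is
the robust way to parse them.
"""
import re
from pathlib import Path

FIELD_RE = re.compile(
    r"^(name|version|homepage)\s*=\s*['\"](.+?)['\"]", re.M)
DESC_RE = re.compile(r'^description\s*=\s*"""(.*?)"""', re.M | re.S)

def catalogue_entry(easyconfig: Path) -> dict:
    text = easyconfig.read_text()
    entry = {key: value for key, value in FIELD_RE.findall(text)}
    m = DESC_RE.search(text)
    entry["description"] = m.group(1).strip() if m else ""
    return entry

for path in Path("/apps/easyconfigs").rglob("*.eb"):  # hypothetical tree
    print(catalogue_entry(path))
```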
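And for the custom R variants described above, a skeleton of what a suffixed easyconfig can look like. The suffix, toolchain, and extension list are illustrative only, not our actual configuration:

```python
# Skeleton of a site-specific R variant that installs alongside the
# community-standard R module via a version suffix. The suffix,
# toolchain, and package list below are illustrative only.
easyblock = 'Bundle'

name = 'R'
version = '4.2.1'
versionsuffix = '-AZ'                     # hypothetical site suffix

homepage = 'https://www.r-project.org'
description = "R with a slimmed-down, site-certified package set"

toolchain = {'name': 'foss', 'version': '2022a'}

# builds on top of the community-standard R module
dependencies = [('R', version)]

exts_defaultclass = 'RPackage'
exts_default_options = {
    'source_urls': ['https://cran.r-project.org/src/contrib/'],
    'source_tmpl': '%(name)s_%(version)s.tar.gz',
}
exts_list = [
    ('ggplot2', '3.4.0'),                 # illustrative versions
    ('data.table', '1.14.6'),
]

moduleclass = 'lang'
```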
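As for the RPM output, an EasyBuild packaging run looks roughly like the following sketch; the easyconfig name is an example, and the fpm packaging tool is assumed to be available on the build host:

```python
"""Build and package an application as an RPM with EasyBuild.

A sketch of the --package route; the easyconfig name is an example,
and fpm must be available on the build host.
"""
import subprocess

subprocess.run(
    ["eb", "zlib-1.2.13.eb", "--robot",
     "--package",              # package the installation after building
     "--package-type=rpm",     # rpm is the default; deb is also supported
     "--package-release=1"],   # release tag for the generated package
    check=True,
)
```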
The basic pipeline is standardized, but its parameters let you choose which easyconfig file you want to build and which Git branch to build it from. It also lets you select the EasyBuild version to build with; sometimes we might want to go back and build an older module with an older EasyBuild version. And we can set the build environment: does this application need to be built on a GPU-enabled machine, and what CPU and memory should be allocated to the build? Since it's going through Slurm, the compute scheduler, that kind of thing is pretty easy to do.

In terms of the offline rebuild, that was one of our criteria, and EasyBuild is great because it gives you the download step separately from the build step. We perform the download step first, and then we execute the configure and compile steps in an environment without internet access, so we can be pretty confident that, should we need to rebuild in the future, we would be able to do so. And finally, ReFrame for regression testing has also been pretty good.

On the challenges side, keeping it simple is a challenge. We've decided to adopt a cadence of about two years for refreshing our toolchain. We wanted to keep the number of toolchains to a minimum, because our scientists like to load all the applications they want to use at the same time. Educating our user base about toolchains and compatibility between different modules is definitely required, but we wanted to minimize that; by adopting fewer toolchains, it's more likely that they can just load multiple things at the same time. That in turn requires that we backport: we might need to take the latest easyconfig from upstream and backport it to an older toolchain, which is a little extra legwork, but it's feasible. So far we're on a roughly two-year cadence of toolchain upgrades, but there is still definitely complexity around how we bring our users up to speed and help them get the best use out of the applications landscape.

Application lifecycle management is another challenge: how do we clean up? We're looking to implement something where, when we build something, we have a better understanding of how long the artifact is needed for, and an automated process to clean up after three years, or five years. At the moment, if there's no expiry date assigned to an application, you can quickly end up with a huge collection, and we'd like to avoid that and keep it simple for our users.

Another thing we're thinking about at the moment is how to integrate containers into the modules ecosystem, and there have been lots of great talks in this forum about that; I'll certainly be going back and re-listening to some of those. We want to start generating containers with EasyBuild, and we also want to start leveraging containers sourced from external repositories. How best to balance the different methods of deploying applications, and ideally have them all accessed through a common format, a common interface, is something we'll be working on this year, and we're more than interested in discussing it with the community.

Another challenge we're looking at at the moment is how to manage Conda. The easyblock for Conda is fantastic.
It's really easy to deploy an application with Conda, but at the same time those applications can end up conflicting with R or Python or CUDA or any number of other libraries, because Conda can pretty much provide anything; it's got its own toolchain in there. So we're doing a little bit of thinking about how to approach Conda, and obviously there have been licensing changes around Conda as well, which we're looking to figure out. But those are some of the challenges we're facing at the moment.

One other challenge: one of our design goals was to have all our applications compiled for the target architecture, to get the maximum performance out of our CPUs. We're not there yet. If these symbols represent different CPU architectures, where the simplest CPU architecture, the one with the fewest features, is a triangle, and a more advanced CPU with more features is this star over here: if you compile something on the minimum CPU architecture, you can deploy it to that architecture or to any of the more advanced architectures that also support the same features. So you can deploy the triangle to the triangle, or to the six-pointed star, or to the twelve-pointed star. What you can't necessarily do is compile on the six-pointed star and deploy on the triangle. So what we know works is compiling on our base architecture with the target architecture, the -march flag, set to the base-level architecture: we can build on the triangle and deploy everywhere. Whether it's Intel or AMD, we don't have any compatibility issues doing this, but we're not getting the best out of all of our CPUs. We'd like to move to a state where you compile on any architecture, set the target architecture to the one you want to deploy to, and then get the best out of all of our CPUs. We'll be working on this, but occasionally we encounter an application that ignores the -march flag, and then we can get illegal instruction errors if we build on a new CPU and deploy to an old one. I'm not sure what the solution is there; we will obviously employ testing to be able to pick this kind of thing up, and it's something we're going to be working on soon.

And finally, the future. A key goal for this year is to increase our contributions to the easybuild-easyconfigs repository. We had some legwork to do in getting ourselves into a position to do that, so we're looking forward to being able to contribute more this year. We want to become proficient in easyblocks; we've not really written many in-house easyblocks, but we want to get up to speed with doing that, because it's a really powerful addition to the toolkit. We do ReFrame, but I'd like to move that forward and be able to display the regression testing in dashboard form; there have been good presentations about how to do that, and we'd like to do the same thing. Continuous integration would be a nice addition, moving on from manually triggering a pipeline to having it integrated with GitHub. We talked about containers and compiling applications. And we would like to go from having simply an applications catalog to having self-service: really building end-to-end automation, from the user discovering an application, to requesting it, to kicking off the build via Jenkins, is where we'd like to get to.
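Returning to the -march issue for a moment: one mitigation is a pre-deployment check that the target CPU supports every feature the build host could have emitted instructions for. A sketch follows; recording build-host flags to a file next to the install is a hypothetical site convention, not an EasyBuild feature (EasyBuild's own knob for portable builds is the --optarch option).

```python
"""Guard against illegal-instruction errors when deploying a build
to an older CPU, by comparing CPU feature flags.

A sketch; the recorded-flags file is a hypothetical convention.
"""

def cpu_flags(path: str = "/proc/cpuinfo") -> set[str]:
    """Feature flags (sse4_2, avx2, avx512f, ...) of this host's CPU."""
    with open(path) as fh:
        for line in fh:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

# Flags recorded on the build host at build time (hypothetical file).
build_flags = set(open("/apps/example/.build_cpu_flags").read().split())

# Safe only if the deploy CPU supports every feature the build host had.
missing = build_flags - cpu_flags()
if missing:
    raise SystemExit(f"risk of SIGILL, CPU lacks: {sorted(missing)}")
```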
We'd also like to empower our users to leverage EasyBuild to maintain their own collections of modules on top of our stack. We have a couple of people doing that, and I think we can simplify it to the point where that's broadly possible, by wrapping things in Singularity containers and giving them one-click or one-command build capability. So those are our goals. And finally, I would like to say thanks to the EasyBuild community for the fantastic work, and to the SCP team for everything we've done together. That concludes my talk; I would like to open for questions.

Okay, thank you, John, for the talk. The first question I'm reading from Slack. Victor is asking: how are you generating the RPMs? Are you using the --package function from EasyBuild?

We are using the --package function from EasyBuild, yes.

The follow-up: are you seeing any issues with that option? I think that's probably why he's asking, because I think they're doing something similar.

Yeah, I alluded to one of the issues, which was with the RPM file size. There are some pretty large applications out there, running into gigabytes in size, and we had to find a workaround to get those to work; something to do with a post-install script, I think. But in general, yes, we're using the EasyBuild capability to output the RPMs. Feel free to ping me on Slack afterwards if you want to discuss it more, but it is in general working for us, with those little caveats.

And Kenneth raised his hand, so I'll ask him for the next question.

Yeah, you mentioned you have about 800 installations with EasyBuild, but does that include different versions, and maybe also different builds for different CPU architectures, or...?

That's the total number of...

Modules, or...?

Modules, yeah, including all the different libraries, all the low-level libraries, everything like that.

Okay, so it's a pretty large collection. Well, in Ghent, at our site, we have closer to 10,000 installations for different generations of hardware, and pretty much anything we have in EasyBuild we have installed there. So I was just wondering how many, let's say, top-level applications you have, without different versions and without taking into account the dependencies that you have to install just to get to the application. You mentioned R and Python.

I think that number is around about 300. Within that, we might have different versions of each of those; the 800 is the total number of modules, not including the versions of them, and I'm pretty sure we calculated about 300 that are not the low-level library type installations. There's a mix of open source, things like GROMACS, and RELION for cryo-EM, and then more commercial things as well. So there's a big mix in there, focusing on the chemistry, the oncology, those kinds of areas.

And in terms of, say, the R libraries: you mentioned you have the same R installation as the centrally deployed one, for standardization purposes, and then you have another one. Is that an entirely separate R installation with all the packages that your researchers need, or is it something that sits on top of the standard one?

It's in parallel; we use a version suffix to differentiate. So the module name is R, but the version number has a suffix in it. The reason for doing that is that there's a requirement for more, how to say it, testing and acceptance for modules that get used for clinical purposes.
Although we're not a clinical platform, we're trying to deploy the same R environment as our clinical platforms. The thousand or so packages in the community-standard R build are fantastic, but we can't possibly certify all of those, so this is a slimmed-down collection of R packages. And on top of that, they're working on a framework to allow themselves to deploy little bundles on top and then certify those bundles. I believe there is an R working group which we are a member of, maybe it's a cross-pharma R working group, maybe it's a cross-bio working group for R, but there's a working group where they're coming up with some standards, and we're certainly participating there. But this particular standard is driven by the AstraZeneca scientists and what they need in terms of the ability to certify and test R for clinical purposes.

Okay, makes sense. I see Vasileios has a question as well, so I'll pass the virtual mic to him.

Hi, thanks, the presentation was very nice. I'm from the ReFrame team, so thanks for using it, first of all, and for the good words. How many tests have you written so far, and are you also using performance tests or just sanity checks? How complex are they?

That's a good question, and thanks for your fantastic ReFrame tool. The number of tests we have written is not huge; it might be in the 20 to 40 range. We started out looking to do benchmarking, so the initial tests that we wrote were of a benchmarking type, where we wanted to run the same test with different numbers of CPUs in order to understand scaling and things like that. Then we switched our focus to more sanity testing and regression testing, the kind of thing that you might want to run across a platform upgrade. Our tests vary from the very simple to the quite complex. For RELION, we've built a relatively complicated test, because RELION is a cryo-EM pipeline with multiple stages, from alignment to feature detection, and we test the whole set of stages. Others sit somewhere in the middle. We're in pretty early days with ReFrame; we haven't looked at contributing back in that space yet. But that's the sort of thing we're doing: we're getting started, and we really like it. We're hoping to have at least one test for each of our key applications by the end of the year.

Okay, thanks a lot. I muted myself. Yeah, so whenever you want, join the Slack channel; contributions back are really welcome. Thanks a lot.

Thanks a lot. Any other questions? Yes, Jörg, go ahead.

Yes, thanks, Jörg speaking here. I gave the talk about Singularity containers on Tuesday, and what you are doing is ringing more than one bell with what I'm doing at the hospital. NGS rings a bell, quite a loud one; pipelines ring a bell, and so on. I was wondering: do you think it would make any sense to publish these pipelines in such a way that other users can use them, for example to get an EasyBuild configuration file out of them? Is that a crazy idea, or does it actually make sense to have some standard so that other people can reproduce what you are doing and what I'm doing? Not me personally, but where I'm working. Is that a useful idea?

Yeah, it's an interesting question. We're either sourcing our easyconfig files from the community or hand-creating them.
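On the ReFrame discussion above: a minimal sanity check of the simple kind described might look like the following sketch; the module name and version are hypothetical.

```python
"""A simple ReFrame sanity test: load a module and check it runs.

A minimal sketch; the module name below is hypothetical.
"""
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class RSanityTest(rfm.RunOnlyRegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    modules = ['R/4.2.1-foss-2022a']       # hypothetical module
    executable = 'Rscript'
    executable_opts = ['-e', 'cat(R.version.string)']

    @sanity_function
    def found_version(self):
        # The run passes if R printed its version string.
        return sn.assert_found(r'R version', self.stdout)
```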
And we're using the Jenkins pipeline to ensure a predictable build process, to know that only the libraries within the build environment are available at build time. If I remember rightly, you're talking about being able to share that outside your organization, so that someone else can deploy your Singularity container with an identical build of your EasyBuild application inside it, which is a really great thought. I've not had time to think about how we could potentially share beyond what we would share through the easyconfig files, but it's definitely worth discussing.

I really like the idea of involving Singularity, so that there is more than just the easyconfig file. It's not only that; it's also the entire end-to-end process, and not just the recipe for building it.

Does anyone else have thoughts on that question? It's a great question. If somebody wants to react, just raise your hand so we can allow you to unmute. Or maybe this is good to follow up on in the Slack channel as well; I can imagine this could be a good back and forth. Just to mention as well, Jörg has a second part to his talk on Friday, the more hands-on Singularity workshop, so that's maybe also interesting for people interested in building containers and using EasyBuild to do that. Keep an eye out for that as well.

It certainly might be useful to share Singularity recipes that are pertinent to EasyBuild, because there is some thought that goes into what components you need in the Singularity image to be able to use EasyBuild out of the gate.

And we do have a feature in EasyBuild itself for some of that: EasyBuild has the capability of generating the recipes for containers itself. I have to admit it's marked as an experimental feature and it's not very actively maintained, because we don't see a lot of use of it, but that can change, of course. What is there should work, or is definitely a very good starting point; that's also what Jörg used as a starting point for his work.

Okay, if there are no more questions, I think we can wrap up here.
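For reference, the experimental container-recipe generation mentioned just above can be driven roughly like this; a sketch, with an example easyconfig and bootstrap configuration:

```python
"""Generate a Singularity container recipe with EasyBuild's
experimental container support, as discussed above.

A sketch; the easyconfig and bootstrap settings are examples.
"""
import subprocess

subprocess.run(
    ["eb", "zlib-1.2.13.eb",
     "--containerize",                        # a.k.a. -C
     "--container-config", "bootstrap=yum,osversion=7",
     "--experimental"],                       # the feature is experimental
    check=True,
)
# Adding --container-build-image would also build the image itself,
# which typically needs root (or fakeroot) for Singularity.
```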