Welcome back to the third day of the 6th EasyBuild User Meeting. This afternoon we're starting with a site presentation from the Swiss National Supercomputing Centre, given by Luca Marsella. Over to you, Luca.

Thank you, Simon. So, welcome everybody to the EasyBuild CSCS site update. I will go directly to the outline. I will talk briefly about the EasyBuild timeline at CSCS over the past years, with an overview of the CSCS HPC systems, in particular the MeteoSwiss system, the production system Piz Daint and the Alps system. Then I will mention how users can customize their builds with EasyBuild at CSCS. I will dedicate most of the presentation to the Jenkins interface that we integrate with EasyBuild, and in particular the Jenkins pipelines that we have adopted at CSCS, concluding with some final remarks on possible developments in the future.

Now, the EasyBuild timeline at CSCS began more than five years ago. It started with, let's say, experimental Cray support developed by Kenneth with Petar Forai and Guilherme Peretti-Pezzi, which ended in stable support for Cray systems with the 11th EasyBuild hackathon back at the end of 2015. In the meanwhile we also adopted EasyBuild on the first MeteoSwiss system, which had a software stack that was completely built with EasyBuild recipes. In 2016, EasyBuild was also integrated with Jenkins and GitHub, with the major upgrade of Piz Daint, the flagship system at CSCS, in December 2016. Following that, in 2018, thanks to our colleague Theofilos, Jenkins with pipelines was integrated in our software stack release process. So in production we now always use the pipelines, which are available on our GitHub production repository and can be checked quickly by all users.
Now, in the past two years we had some updates: the MeteoSwiss system was updated, still fully using EasyBuild, and Piz Daint, the flagship system, was updated to CLE 7.0 UP01 and then to CLE 7.0 UP02 at the end of 2020. I didn't have space here, but there is also this latest update of Piz Daint. In the meanwhile, we have a new MeteoSwiss system in production, and we now have a Cray EX supercomputing system available for early access, for selected users only, that also uses EasyBuild at the moment.

So I will now describe the main HPC systems where we employ EasyBuild to deploy the software stack. In this slide I show you the main features of the systems. As I already mentioned, Piz Daint is dedicated to the CSCS User Lab and is the flagship system. Most of its nodes have a single GPU per node, an NVIDIA Tesla P100, with an Intel Haswell processor, on the so-called Cray XC50 partition, and we also have a Cray XC40 partition with Intel Broadwell processors and no GPUs. The Alps system is a new system available only to early users, and it features nodes with CPUs only. It's a Cray EX supercomputing system featuring AMD Rome processors. Then there is the new production MeteoSwiss system, which is called Arolla and Tsa and features eight GPUs per node, NVIDIA Tesla V100, with Intel Skylake processors.

The MeteoSwiss system is, as I mentioned, composed of two partitions, one for development called Tsa and one for production called Arolla. Both feature the Intel Skylake processors and the Tesla V100 GPUs that I just mentioned, and the EasyBuild software stack here has been in production since January 20th. The machine was in pre-production at the time; now it's in full production.
For the modules here we employed a lowercase naming scheme, with a few exceptions that are listed below: the EasyBuild-custom module file that we use at CSCS to provide, let's say, a custom environment to build the recipes, and then two metamodules called PrgEnv-gnu and PrgEnv-pgi that emulate the modules you find in a Cray environment. They provide a modules environment in a hierarchical fashion, so only after loading one of these modules do you unlock the underlying modules. This is done as a sort of emulation of Lmod, because this system does not feature Lmod but Environment Modules.

The new system is called Alps, and the production partition of the system is named Eiger. This is an HPE Cray EX supercomputing system. As I said, at the moment it's open only to selected users for early testing. The compute nodes of this system feature two sockets with AMD EPYC processors, the Rome processor in the commercial naming, and the nodes are connected to each other through a high-speed HPE Slingshot interconnect. We also provided a software stack with EasyBuild here. We have created some custom toolchains to address the new Cray environment, which is now called CPE, Cray Programming Environment, instead of the naming convention on the current flagship system. We have called these new toolchains CPE CC for the Cray compiler environment and CPE GNU for the GNU compiler environment. The module versions in these toolchains are pinned to avoid the use of collections, because starting with this CPE environment Cray started using module collections instead of the standard module files for programming environments. The system features Lmod, which has been officially supported by Cray as of the Cray PE of October 2020, and the modules are based on Lua A31.
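The version pinning described above for the custom toolchains can be illustrated with a small sketch. This is not the CSCS toolchain definition: the component names and versions are hypothetical placeholders, and the helper functions are invented for illustration. The idea is only that every component must carry an explicit version, so that nothing falls back to whatever default the system currently provides.

```python
from typing import List, Tuple

# Hypothetical (name, version) pairs for a toolchain's component modules.
# These are placeholders, not real Cray PE module names or releases.
Component = Tuple[str, str]


def unpinned_modules(components: List[Component]) -> List[str]:
    """Return the names of components that have no explicit version."""
    return [name for name, version in components if not version.strip()]


def module_load_lines(components: List[Component]) -> List[str]:
    """Render one 'module load name/version' line per component.

    Raises ValueError if any component is unpinned, because an unpinned
    module would silently pick up the system default.
    """
    missing = unpinned_modules(components)
    if missing:
        raise ValueError("unpinned modules: " + ", ".join(missing))
    return ["module load %s/%s" % (name, version) for name, version in components]


if __name__ == "__main__":
    example = [("example-compiler", "1.2.3"), ("example-mpi", "4.5.6")]
    for line in module_load_lines(example):
        print(line)
```

A check like this could run before a toolchain recipe is accepted, so that a monthly programming environment release cannot change what an existing toolchain loads.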
We have some scientific applications that have been built successfully there and tested with EasyBuild, and I list a few of them here in the slide, including some popular codes like GROMACS, LAMMPS, NAMD and Quantum ESPRESSO.

Now, Piz Daint, the flagship system: as I mentioned, it is a Cray XC system with two partitions, XC50 and XC40, with Intel Haswell and Broadwell respectively, and the XC50 partition also holds a Pascal GPU per node. As I mentioned, the EasyBuild software stack has been in production on the system since 2016. It has gone through several operating system updates, and the most recent one was in November 2020, to the Cray Linux Environment 7.0 UP02. We now also have in place an automated update of easyconfig files, and I will come to that in a while. Here is a short list of some scientific applications that we built with EasyBuild on Piz Daint. On the left side you see the GPU-enabled ones, labelled with the version suffix -cuda; on the right side the ones that are available on the multi-core partition without GPUs. Here too we feature popular scientific codes like GROMACS, NAMD and VASP, for instance.

Now, about EasyBuild for CSCS users. Users are given EasyBuild recipes when they request software in general, except if the software is a community code that we already provide. This is less error-prone than manual steps, of course, and users can also keep some installations for their own groups without having the whole HPC centre maintain them. On the CSCS user portal at user.cscs.ch we have instructions to load the EasyBuild framework at CSCS with a custom module file, the EasyBuild-custom one, that creates a specific environment, and users can adopt this framework for their custom builds. They can clone the CSCS production project on GitHub at the address in the slide.
They can also customize their own recipes with the custom repository environment variable, and of course they can use the standard EasyBuild environment variables to set their own prefix and so on.

Now let's go to the EasyBuild integration with Jenkins at CSCS. We use Jenkins as a service for continuous integration, to deploy the software packages on the systems in production and to test new easyconfig files, which can be submitted by staff and also by users. We also regularly check for regressions of the easyconfigs listed in our production files, and we now also update the production recipes in view of system upgrades. The Jenkins projects that run with EasyBuild are the following. ProductionEB is the one that builds the easyconfig files once they are in our production repository. TestingEB is triggered when a new pull request appears on GitHub. Then we have the new RebuildEB, which rebuilds the software from scratch, also checking, for instance, the downloads for broken links and so on, to ensure reproducibility. And then we now have the UpdateEB project, which runs EasyBuild to update recipes in view of a Cray PE update on the system, so creating new toolchains and using the try-toolchain-version option in order to update the recipes. The Jenkins projects defined by pipelines provide a nice syntax and nice flexibility. They are also under version control, which is important, and they run in parallel on different systems, which optimizes the use of the available resources.

As I mentioned, we have a CSCS EasyBuild production repository on GitHub, and as a user you can also submit a pull request. You add the easyconfig files to a new branch in your fork, and your pull request must include the dependencies. For creating the pull request we have some policies: the title of the request must match one of the supported systems so that it can be tested.
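The title policy just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the logic of the CSCS pipeline: the title format used in the example is hypothetical, the function name is invented, and the list of systems is simply taken from the system names mentioned earlier in this talk.

```python
import re
from typing import List

# Assumption: system names as mentioned earlier in the talk; the real
# pipeline may use a different list or different spellings.
SUPPORTED_SYSTEMS = ("daint", "eiger", "tsa", "arolla")


def systems_in_title(title: str) -> List[str]:
    """Return the supported systems named in a pull request title.

    The title is split into lowercase alphanumeric words, so a system
    name only matches as a whole word, never as part of another word.
    """
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return [system for system in SUPPORTED_SYSTEMS if system in words]


if __name__ == "__main__":
    # Hypothetical title format, for illustration only.
    print(systems_in_title("[daint,eiger] Add GROMACS 2021"))
    # An empty list means no supported system was named, so there is
    # nothing the request could be tested on.
    print(systems_in_title("Add GROMACS 2021"))
```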
If your work is in progress, then you need to put the WIP text in the subject. You can also test multiple architectures or a single architecture on some systems, as explained here, gpu or mc for multi-core. Once you submit a pull request, the CSCS Jenkins project TestingEB will test the build of the new EasyBuild recipes, and the Jenkins pipeline script for this is available at the link in the slide.

Now, this is an example of the ProductionEB pipeline, where you see the multiple systems Daint, Dom and Eiger listed, and green, which sometimes happens; it means everything was fine. Before the ProductionEB pipeline runs, there is the TestingEB pipeline. On the left side you see a pull request submitted on GitHub; in the subject you see the names of the systems, in this case Dom, Daint, Eiger and Tsa. On the right side there is the Blue Ocean interface of Jenkins, showing the pipeline triggered on the different systems, which you see listed in vertical order, and green here as well, luckily.

I also mentioned the RebuildEB pipeline, which rebuilds the easyconfig files in parallel, downloading everything from scratch. We added a step with the Atlassian Jira integration of Jenkins in order to automatically create issues on Jira when there are failures. So you see here that the rebuild stage is red because something failed, and then there is the Jira stage that created the corresponding Jira issues. Last but not least, there is the UpdateEB pipeline, which actually creates new easyconfig files for the updates. The additional step here with respect to RebuildEB is the GitHub stage: after the successful recipes have been created and updated, they are submitted automatically to our GitHub production repository, and done.

Then the final remarks. As usual, moving an HPC software stack to EasyBuild takes some time. There is a learning curve for the users and resistance to change, but of course there are a lot of advantages: reusable, let's
say, recipes for users as well. Then our perspective on easyconfigs versus easyblocks: we prefer to use easyconfigs more because they are self-contained, although we also have some custom easyblocks, but not too many. Just to give you the highlights: we could deploy the new software stack on the new Cray EX system with some custom toolchains, and especially we can now use the automated easyconfig updates on our programming environment. So if you have a Cray installation, it's also worth going through our Jenkins pipeline scripts to see how we do the integration with Jira and the automatic submission of the GitHub PR to our repository. Work in progress is updating, hopefully, also the versions and dependencies, and replacing the current production with the updates, and in the future using ReFrame in order to check the recipes before going into production. I put here a slide with the useful links, and I thank you for your kind attention.

Thank you for the talk, Luca. If there are any questions and you're in the Zoom meeting, can you use the reactions to raise your hand and we'll unmute you; if you're watching the live stream, then ask in the EasyBuild Slack. Kenneth, do you have a question?

Yes, I do. I was wondering, Luca, if there are any things in EasyBuild that you're currently not happy with from a CSCS perspective, things that are not working as you would hope or that could be improved. As you showed on the timeline, EasyBuild has been used for over five years at CSCS now, mainly, or at least initially, on Piz Daint but also on other systems, and I think you're one of the main users of the Cray integration we have, so the integration with the Cray programming environment, leveraging all the modules they provide and building stuff on top of that. Are there things missing there? Are there things that are not working that we could improve?
I think we are overall satisfied, in the sense that you've been interacting with us, as you said, for more than five years now, so we have managed to integrate EasyBuild very successfully with our software stack deployment, with minor corrections and minor integrations or, let's say, occasional contributions, as Victor and Eirini made in the recent past. We might have, let's say, new contributions. I mentioned this new system, this Cray EX supercomputing system. These Cray machine names are always a bit difficult to pronounce; we went from XC to EX now. Here, as I mentioned, things are still under test at the moment, but I created two custom EasyBuild toolchains because the current Cray toolchains use the programming environments as modules, and unfortunately Cray has now moved to module collections instead, which made things a bit more difficult. Also, in general, in the Cray environment, which is not always user-friendly, they provide a kind of metamodule to be able to switch between different programming environment releases, because they deploy a new release every month and you don't want to mix them up. Unfortunately, on this new system we don't have it yet. That's why I mentioned that in these new custom toolchains we have to pin the versions of the module files, because otherwise in general you rely on the defaults, and in this situation that would mix things up.

We are very happy, though, that we finally have Lmod officially supported by Cray, so that's really helpful. I'm not sure if the Lmod installation provided by Cray is error-free, in the sense that for the toolchains I had to revert to the Tcl syntax of modules because the local Lmod installation was complaining about the module swap, but I think this is a problem on the Cray side, and anyway nothing particularly relevant. For the moment, as I said, this is still in test, so this is probably something that, let's say, we need to contribute, maybe in the next months, to
the main easybuilders GitHub. But as usual, anyway, I put here in the links the fact that, although we have a custom set of easyconfigs and, well, not so many easyblocks, we always mirror our repository under the official easybuilders organization. You have the link at the bottom: under easybuilders there is a CSCS space where you can find all the recipes that we use, mainly on the Cray, although we also have a MeteoSwiss system where we use the standard foss toolchain.

Is that the Shasta stuff, where things have changed?

Yeah, it used to have the Shasta name; now they call it the EX supercomputing system. Anyway, yes, that's the one, with this interconnect and mainly AMD processors.

And do you know the motivation for why they switched to module collections?

No, they're not providing a reason for that. Actually, we asked to have as quickly as possible a kind of module file, let's say a metamodule like the programming environment one that we are used to, in order to manage the different programming environments but also the different releases. They're taking some time because the system is not really production-ready, so they replied that probably by March they might get back to us. As usual with Cray, it's always a matter of pushing, because of course they have many customers with many different needs from different sites.

Would it be helpful at all if EasyBuild had support for loading collections, as it now does for loading modules for toolchains or dependencies? Because I think that would be a fairly easy thing. What is it called, module restore?
Would that be helpful?

It would be helpful, probably, I think also in general, maybe in other cases. I'm not aware of other situations now, but it could be needed. The only issue with the module collections now is the pinning of the versions, because as long as the module collections are just loading module files, I think it's still not ideal to use them; instead we pin the versions, because otherwise you are mixing up the environment with newer compilers or libraries, and that can bring conflicts.

And you want full control, that makes sense. I see Victor raised his hand as well.

So I just wanted to complement what Luca said in reply. What we need from EasyBuild, Kenneth: we need the easystack to be implemented, with all these open issues related to adding more options, so that this entire framework that Luca described, testing and production, could be simplified. We have this production dataset script that could easily be replaced by the easystack, but we really need you guys to get through all those open issues for the easystack to move on. Another thing is that we would also like you guys to test the --package option, because it doesn't work very well if we set default. If you do eb --package and set default, you just get the modules but you don't get the versions.

So the feature we have for creating RPMs, you are also using that, or?
We are investigating this to use in the future, but it requires set default to be working, and you also need set default to be working with the easystack. So all these are open issues, but it would be nice if they could be addressed. This is related to your talk about having manpower and addressing bug fixes.

And I guess also maybe setting priorities a bit better. I wasn't actually aware that you were looking so closely at the easystack files, for example. We now see it as a nice-to-have. For the EESSI project it's also definitely on the top of our list, but if other sites are also interested in picking up on it and are waiting until the missing features get implemented, then maybe we should push a bit more.

Because everything gets really, really simplified.

Sounds good. We'll have to draw a halt there so we can prepare for the next talk. Thank you to Luca for the talk, and to Victor and others for the questions. We can jump into the breakout room if needed, so if anyone has questions, maybe join there.