Thank you very much, everybody, for joining us at the EasyBuild tutorial at ISC. I'm Kenneth Hoste, I work in the HPC team at Ghent University in Belgium, and Alan and Bart will help out with the tutorial today as well. So I'll let both introduce themselves, Alan first.
So my name is Alan O'Cais, I work at Jülich Supercomputing Centre, and I've been contributing to EasyBuild for quite a few years now.
Okay, and Bart?
Hello, my name is Bart Oldeman, and I work at McGill University in Canada, as part of the Compute Canada collaboration, so that's like PRACE in the EU and XSEDE in the US. We've also been using EasyBuild for quite a few years, since at least 2015, and I will present later how we're using EasyBuild.
So both Alan and Bart will be helping out mostly through the Slack channel, so if you have any questions feel free to post them there, or if you prefer not to create a Slack account you can post them in the Zoom chat as a backup. Any questions throughout the tutorial are welcome there, and I'll try to pause every now and then to take questions in the Zoom session as well, if time allows.
So let's go ahead and get started. This is our agenda for today. I'll give a brief overview of the practical information regarding the prepared environment. Then we'll do an introduction to EasyBuild, explain how to install and configure EasyBuild, and cover the basic usage of the tool. Then we'll go ahead and actually install some scientific software with EasyBuild and look at a broken installation and how to fix it. And then after the coffee break we'll look at module naming schemes, and more specifically hierarchical module naming schemes, how they are supported in EasyBuild and how they work, and how we can add support for new software versions or new software that is not supported in EasyBuild yet.
Alan and Bart will then explain how EasyBuild is used at large HPC sites, both at Jülich and in the Compute Canada consortium, and we'll wrap up with a little bit of information about the EasyBuild community and how you can contribute to EasyBuild. And at the end we will do a quick comparison with the main competitor tool, let's say, Spack, and then wrap up with some closing remarks.
So, practical information. As mentioned already, please visit the tutorial website. This has these slides and also, let's say, a written-out version of the tutorial in text, with lots of examples where you can easily copy-paste commands or other things that will be useful for the hands-on examples and exercises. I will try to show some of the examples and exercises myself hands-on during the tutorial, but I probably won't have time to cover everything. So some of the exercises in there will be left for you to get some hands-on experience with EasyBuild, and I strongly recommend going through them because they're very useful to get a feeling for how EasyBuild works.
If you have any questions, we'd prefer that you use the EasyBuild Slack, and more specifically the tutorial ISC21 channel in the EasyBuild Slack. Both Alan and Bart are in there to help out if there are any problems or questions. And we also recommend using the prepared environment for the exercises, because some things have been pre-installed in there that make the exercises a little bit easier. Otherwise, you may have to spend some time first installing lots of basic tools like compilers and libraries, which would take too long during this tutorial.
For the questions that you ask in the EasyBuild Slack, please use the threading feature of Slack. So ask a question, and then Alan or Bart or maybe somebody else will follow up on that question in a thread. That's just to make sure we keep things manageable in Slack if lots of people have questions at the same time.
You should also see some polls popping up in the Slack channel, which you can answer with emojis. So each answer has a corresponding emoji, and you can just select that emoji, or click the emoji if it's already there, to vote in the poll. These are very informal and just there to keep things a bit interesting throughout the tutorial.
The prepared environment. So again, the details are in the practical info section on the tutorial website. You see the link at the bottom of the slide here, and it was also pasted in the Zoom chat. You can request an account to get access to the prepared environment. You will need to pick an account name and a password, and make sure you remember them. Requests should be approved pretty quickly. And then you can get access either through SSH, using the password, or through a terminal environment directly in your browser. That's probably the easiest way to access it. And we will keep this up until the end of the conference.
So let's then move on and get started with an introduction to EasyBuild. If anyone has problems with accessing the prepared environment, please ask in Slack, and Alan can definitely help you out with that.
So let's start with taking a look at what EasyBuild is. We usually refer to EasyBuild as a framework for building and installing software, and more specifically scientific software. So it has a strong focus on scientific software on high-performance computing systems, so clusters, supercomputers. And the performance of those installations is very important as well, so that's one of the core aspects we pay attention to in EasyBuild. The EasyBuild code itself is open source, licensed under the GPL version 2, and the implementation is in Python. It's compatible with both Python 2.7 and a sufficiently recent Python 3. The tool was created at Ghent University, so in the team where I am, but before I was there. It was developed in-house for a couple of years before we released it publicly in 2011.
And the first stable release was done during Supercomputing 2011. And since then, a community has grown around it, which is not something we expected or wanted to happen. But it did happen, and we're very happy that EasyBuild is now being used all over the world. The slide on the right also shows some links to the website, the documentation, the GitHub repositories, the EasyBuild Slack, and we have a Twitter account as well.
So very briefly, and we'll gradually unpack this a bit, show it in action, and explain the details behind it: EasyBuild is a tool to install a stack of scientific software in a consistent way, and in a way that it performs well. That's quite important in an HPC context. You can see it as a uniform interface for installing scientific software, so no matter what software package you are installing, you're asking EasyBuild to install it and it does it for you. And you don't need to know what is going on behind the scenes if you're not interested or if there's no need to know. It automates the whole installation procedure, so you don't have to manually unpack tarballs, or apply patch files, or run shell commands and check their output or their exit code. It does all of that for you.
It's usually used by HPC support teams or HPC system administrators who manage a central software stack on the systems they maintain. But it can also be used by scientific researchers to manage their own software stack in their own HPC account, so it doesn't require any admin access or anything like this. Just write permissions to a directory is basically all we need. Over time it has grown to be a platform for collaboration between HPC sites around the world. So many people, a lot of them in Europe but also people in Asia and the US, the Americas, are using EasyBuild and helping to make it better. So over time it has become an expert system. Many people are adding their knowledge, let's say, into the tool, and that way helping each other out.
Some of the key features of EasyBuild: it supports fully autonomous installation of software in general, and more specifically scientific software. It handles the dependencies for you. It generates environment module files, and we'll show how that works and what that means in a bit. It doesn't require admin privileges, I already mentioned that. It's highly configurable, and you can easily extend it, so if it doesn't include certain things that you need you can easily add them as a plugin, in some sense. It supports hooks, so you can customize the behavior of EasyBuild if there's something that is missing or that you need for your specific setup. You can easily add that without it being part of EasyBuild itself.
It does detailed logging of the installation procedure it performs and it keeps track of that, so you can revisit it later to figure out what happened or what went wrong. And you can ask EasyBuild what it will do, so it's transparent about the installation procedure it executes, and you can get some more information on that if you want to or if there's a need for doing so. You can define the way the modules are laid out, so what we call a module naming scheme. It does something by default, you can make it do something else, or you can implement your own naming scheme if you want to do that. And we'll show in this tutorial what that means, and more specifically what a hierarchical module naming scheme is and why that's interesting.
It integrates well with other tools, like Lmod for example, the well-known modules tool, but also with the Singularity container runtime and with Slurm, so you can basically submit installations as jobs to Slurm and use the power of your cluster to install a large software stack. There are other tools as well which I won't go into, so it also talks to lots of other established tools out there to get certain things done. It's actively developed and supported by a worldwide community.
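To make the hooks feature a little more concrete, here is a minimal sketch of what a hooks file could look like. It assumes the naming convention described in the EasyBuild documentation: a function called pre_configure_hook that receives the easyblock instance, placed in a Python file that is passed to eb via the --hooks option. The software name 'example' and the option '--enable-foo' are made up for illustration, and the two small stand-in classes only exist so the sketch can be run without EasyBuild installed.

```python
def pre_configure_hook(self, *args, **kwargs):
    """Hook that EasyBuild would call right before the configure step."""
    # 'example' and '--enable-foo' are hypothetical; adapt them to your site.
    if self.name == 'example':
        # update() appends to the existing value of an easyconfig parameter
        self.cfg.update('configopts', '--enable-foo')


# --- Stand-ins for EasyBuild objects (assumptions, only for trying this out) ---
class FakeConfig:
    """Mimics the small part of an easyconfig object that the hook uses."""

    def __init__(self, **params):
        self.params = dict(params)

    def __getitem__(self, key):
        return self.params[key]

    def update(self, key, value):
        """Append a value to an existing string parameter."""
        self.params[key] = (self.params.get(key, '') + ' ' + value).strip()


class FakeEasyBlock:
    """Mimics an easyblock instance: it has a name and a cfg attribute."""

    def __init__(self, name):
        self.name = name
        self.cfg = FakeConfig(configopts='--prefix=/tmp/demo')


demo = FakeEasyBlock('example')
pre_configure_hook(demo)
print(demo.cfg['configopts'])
```

In a real setup only the hook function would live in the hooks file; the point of the design is that site-specific tweaks like this stay outside of EasyBuild itself.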
I have lost count, but I think there are about 20 EasyBuild maintainers who look into incoming contributions, test them, and make sure they are up to the standards that we would like to have for contributions. That goes from people in the US to people in Singapore and everything in between, and Bart and Alan, for example, are EasyBuild maintainers. Since the public release in 2011 we have done frequent stable releases. About every six to eight weeks there's a new EasyBuild release, which has additional features, some bug fixes, updates for software, and so on, so that has been going on at a pretty regular pace since 2011.
We're pretty serious about testing EasyBuild, so whenever we get contributions we make sure they are either covered by the unit tests or that we test them when they come in, and then we try to make sure that we don't break anything as we make further changes to EasyBuild. Before every release we also do regression testing, so we try to make sure that things that worked before didn't get broken, unless we know about it and that was a conscious decision.
There are various support channels for EasyBuild. We have a mailing list, we have a Slack channel. We do conference calls every other week, where we talk about recent developments in EasyBuild but also open up the floor for any questions related to EasyBuild, and the last couple of years we have done user meetings as well, which have been pretty well attended.
There are three main focus points in EasyBuild. One is performance. Since we're in an HPC context that's very important, so there's a strong preference in EasyBuild to compile the software from source code if we get the option to.
Sometimes we do deviate from this: if it's too difficult, or if there's a good reason to use existing binaries, it's possible that we do that. For commercial software, for example, where you only get the binaries, we definitely support that as well. But if we do get the chance, or if it's important enough, we do build from source. EasyBuild by default optimizes the software installations for the CPUs of the build host, so if you're building on, let's say, an Intel Skylake system, then EasyBuild will tell the compiler to also optimize for Intel Skylake. That's what it does out of the box. You can change that behavior if you want to, and essentially cross-compile for other architectures, or do a generic compilation of the software you're interested in installing. That's possible as well.
We pay close attention to reproducibility of installations, so whenever somebody adds support for, let's say, a new software version to EasyBuild, it's typically tested across different systems and different operating systems, and we try to make sure that things are done in such a way that others will be able to do the same installation as well, to basically reproduce the software installation. We're not very strict about this, so it's a pragmatic approach to reproducibility. We will certainly not guarantee that you get a bit-for-bit identical installation if you try it on another system, but in general that works out quite well, and whenever something is supported in EasyBuild, usually it just works for others as well.
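As a rough illustration of the difference between optimizing for the build host and doing a generic build, the sketch below maps an "optarch" setting to the kind of flags GCC would receive on an x86_64 system. This is a simplified stand-in written for this transcript, not EasyBuild's own code: the function name gcc_arch_flags is made up, and the flag strings are assumed to mirror what EasyBuild passes to GCC when its optarch configuration setting is left unset, set to GENERIC, or set to a custom value.

```python
def gcc_arch_flags(optarch=None):
    """Return the architecture flags GCC would get (simplified illustration)."""
    if optarch is None:
        # default behavior: optimize for the CPU of the build host
        return '-march=native'
    if optarch == 'GENERIC':
        # generic build that runs on any x86_64 CPU (assumed flag set)
        return '-march=x86-64 -mtune=generic'
    # custom value: passed through to the compiler, with a leading dash added
    return '-' + optarch


for setting in (None, 'GENERIC', 'march=skylake-avx512'):
    print(setting, '->', gcc_arch_flags(setting))
```

The trade-off is the usual one: a host-optimized build can use all the instructions of that CPU but may not run on older CPUs, while a generic build runs everywhere at the cost of some performance.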
Part of the way we do this is by fixing the version of the compiler, the libraries, and the dependencies of the software that you are installing. Those are all fixed in EasyBuild, and that's deliberate, especially for the compiler. So we don't use a compiler from your operating system. The first thing that EasyBuild typically does is build its own version of GCC, and then use that to build things like OpenMPI and other libraries and dependencies that you need for your software. So initially things may take a bit of time to get set up, because you usually start with installing a compiler from source, which can easily take over an hour, but once we have that in place the other installations just use that compiler and you're all set.
Certainly today, EasyBuild is a community effort, so the development, the changes we make in EasyBuild, the things we add, fix, and update, are highly driven by the community. We get lots of contributions. On a yearly basis we merge about 2500 pull requests, so it's a very active community. We have lots of active contributors, I think well over a hundred different contributors on a yearly basis, and we have some features in EasyBuild that make it a lot easier to make contributions to EasyBuild as well, and hopefully we'll have the time at the end to get to that.
Just to clarify what EasyBuild is not: it's not a replacement for tools like CMake or Make or pip. It basically wraps around these tools. If this is the way you're supposed to install a particular software package, EasyBuild just automates that whole procedure for you, so you don't have to run these tools manually or figure out how to use them. It also doesn't replace traditional package managers like yum or apt. Some tools you will still need to install, or are expected to install, through your OS package manager, like OpenSSL for example. For security reasons you do want to get security updates for this, so
you typically still install OpenSSL through your OS package manager. Also system tools like Slurm, or Lustre, so file systems, are typically still handled through the OS package manager and not through EasyBuild. And EasyBuild is not magic either, so it's possible, or actually likely, that you will still run into some problems when installing software. Especially when you're trying to use new compilers or new software versions that have not been tested before, you may run into surprises, unless somebody has already fixed it for you and it's already supported in EasyBuild. Then it should work out of the box. We will actually look at an example of a broken installation and how to fix it in this tutorial, because that's a very good exercise.
Then, to wrap up the introduction, we'll go over some terminology that we commonly use in EasyBuild. Some of these concepts are very specific to EasyBuild, like easyblocks and easyconfigs, which have a particular meaning in an EasyBuild setting, and we'll also clarify some other terms like modules, extensions, and toolchains, just to make sure we're on the same page in terms of what these things mean.
So first of all, the EasyBuild framework is the heart or the core of EasyBuild. It's a collection of Python modules, which are organized in Python packages, so basically collections of modules, and they implement all the common functionality that you need to compile and install software in general and scientific software in particular. So it does things like applying patch files, running shell commands, and generating module files. It has functionality to do all of that. It also provides the eb command, so that's the main, let's say, entry point to EasyBuild, but if you know a bit of Python you can also use EasyBuild as a library and leverage it like that. And it lives on GitHub in one of the EasyBuild repositories.
Then, an easyblock is a single Python module which implements a particular software installation procedure, so you can sort of see
this as a plug-in to the EasyBuild framework. It uses all the functionality that the framework provides, and it just fills in the gaps, basically, to implement a particular installation procedure. There are two main types of easyblocks: generic easyblocks and software-specific easyblocks. Generic easyblocks implement a standard installation procedure, like running cmake followed by make and then make install, so a configure, a build, and an installation step, or how to install a Python package with pip install or python setup.py install. There's an easyblock for this as well. And these generic easyblocks can be used for a whole range of different software packages. As long as they follow a particular standard installation procedure, you can use a generic easyblock.
Then there are also software-specific easyblocks, for more complex software or where things get tricky very fast, even if they use a standard installation procedure but there are lots of configure options, for example, or lots of things to check and verify during the installation. Then we usually have a custom or software-specific easyblock for this. Examples here are OpenFOAM, TensorFlow, and WRF, so these are complex enough that they have their own Python module that implements the installation procedure.
The procedures that are implemented in an easyblock can be controlled or steered via what we call easyconfig parameters, so you basically get knobs that you can turn, or values that you can fill in, which are picked up by the easyblock and used during the installation. We will see examples of that throughout the tutorial. I forgot the exact numbers, but I think about 80 percent of the software that is supported by EasyBuild actually uses one of the generic easyblocks, so the software-specific easyblocks are for the more complex exceptions, let's say. The easyblocks we have centrally live in their own GitHub repository, but you can also use your own easyblocks, or even use custom versions of an
existing easyblock, and basically add it to EasyBuild as a plug-in. So it doesn't have to be part of the EasyBuild installation itself. It can live on the side, and you can tell EasyBuild to use it whenever it needs to.
Then, easyconfig files. These are basically text files. They're written in Python syntax, but they're not really Python code, so you don't have for loops or if statements or functions in there. They basically define a set of variables, which we call easyconfig parameters, so these are the knobs that you can use to steer the behavior of an easyblock. They're basically key-value definitions, so you define values for variables, and this collection of variables then tells EasyBuild which software version to install, which compiler toolchain to use, and so on.
The file name of easyconfig files can be important, and we'll get to that a bit later in the tutorial as well. Typically you'll see that easyconfig files have this type of naming scheme, with the software name, dash, software version, dash, a toolchain, and maybe a label at the end, and it usually ends in .eb. So that's the typical name you'll see for an easyconfig file, but it doesn't have to be like this. Especially if you're creating your own easyconfig files, the name doesn't always matter, only in specific situations. EasyBuild comes with lots of easyconfig files, well over 10,000 in the latest version, which all live in their own repository on GitHub as well. And similar to easyblocks, the easyconfig files that you want to use with EasyBuild don't have to be part of the installation itself. They can live anywhere. You can have your own collection of easyconfig files, for whatever reason, and just use those, so you don't need to use the central ones if you don't want to.
Then, easystack files are a new feature, so I won't spend too much time on this here, because it's still an experimental feature. It's already supported in EasyBuild, but we don't consider it a stable
implementation yet, so we're still playing with this and we may change the behavior. But it's a way of specifying a set of easyconfig files, so basically describing a software stack that EasyBuild should install. So an easyconfig file is for one particular installation, and an easystack file is for a collection of software packages that you want to install in one go. So we're creating this way of expressing what should be installed in an easystack file. There's documentation on this as well if you want to have more details.
Then, extensions have a particular meaning in EasyBuild as well. These are additional software packages, typically a Python package, a Perl module, or an R library, so something you cannot use standalone, but where you need something else, a Python installation or an R installation, along with it. We call these things extensions, so that's a general term for these kinds of libraries that you have in a variety of runtime systems. And EasyBuild can install them in different ways: either as a standalone module, in a bundle, so a collection of extensions together, or as actual extensions in an existing, let's say, Python installation. So you can install Python together with numpy, scipy, and pandas as one whole thing and use it like this, or you could have a separate pandas installation on top of another Python installation, but keep them separate.
Then, dependencies. This is a well-known concept of course, but just to clarify here: a dependency is any software that you need to either build something else or use something else, so run something. And there's a distinction between build dependencies and runtime dependencies. Build dependencies are things like CMake that you need to install something else, or pip, or pkg-config, or lots of tools like this. And then runtime dependencies, of course, are the things that you need to actually use the software. So for a Python package you need a Python installation,
so Python is a runtime dependency of, for example, pandas. There's a distinction between runtime and link-time dependencies, strictly speaking, but currently we don't really make that distinction in EasyBuild, so that's a bit irrelevant.
Then, toolchains, or compiler toolchains, are a specific type of dependency in some sense. A toolchain in EasyBuild is at least a set of compilers, typically C, C++, and Fortran, again since we're in an HPC context and that's the typical combination of compilers you get. And then on top of that you can have additional libraries, to support MPI, BLAS and LAPACK, so linear algebra functionality, or FFT, and so on. A toolchain component is then a part of a toolchain, so either the compiler component or the MPI component. When we talk about a full toolchain, we mean both compilers and libraries for MPI, BLAS, FFT, and LAPACK, so the whole collection, let's say. But we also have subtoolchains, which are for example only compilers, so no support for MPI or any additional math libraries. And then it's kind of a hierarchy of things, and we will get back to this when we talk about hierarchical module naming schemes as well. So there's an order to subtoolchains, going from compiler only, to MPI, to math libraries on top.
There's also what we call the system toolchain. That basically means that if EasyBuild is using the system toolchain, it uses the compilers that are provided by the operating system. We try to limit this as much as possible, because then we don't really have control over the compiler version or how the compiler was built, which may affect the reproducibility of the installations that we're doing. So we try to minimize the use of the system compiler via the system toolchain. Most commonly this is used when building our own GCC version, so we use the system compiler to build our own GCC, but from that point on we're using our own compiler toolchain that we have full control over.
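To tie the easyconfig, dependency, and toolchain terminology together, here is a small sketch of what the key-value definitions in an easyconfig file could look like. Everything specific in it is an assumption for illustration: the software name 'example', its version, the homepage, and the chosen dependency versions are made up, and the toolchain name and version are just one plausible choice. The helper function at the end is not something you would put in an easyconfig; it only shows how the typical file name is composed from those parameters (it does not handle the special case of the system toolchain).

```python
# Easyconfig parameters: plain key-value definitions in Python syntax.
easyblock = 'CMakeMake'   # generic easyblock: cmake, then make, then make install

name = 'example'          # hypothetical software name
version = '1.2.3'         # hypothetical software version

homepage = 'https://example.org'
description = "Hypothetical package used to illustrate easyconfig parameters."

# compiler toolchain to use (name and version assumed for illustration)
toolchain = {'name': 'foss', 'version': '2021a'}

# %(name)s and %(version)s are templates that EasyBuild fills in itself
sources = ['%(name)s-%(version)s.tar.gz']

# only needed while building the software
builddependencies = [('CMake', '3.20.1')]
# needed to actually use the software
dependencies = [('zlib', '1.2.11')]


def easyconfig_filename(name, version, toolchain, versionsuffix=''):
    """Compose the typical <name>-<version>-<toolchain><label>.eb file name."""
    return '%s-%s-%s-%s%s.eb' % (name, version, toolchain['name'],
                                 toolchain['version'], versionsuffix)


print(easyconfig_filename(name, version, toolchain))
```

Note how there is no control flow in the parameter definitions themselves: the easyblock named at the top carries the installation logic, and the easyconfig only supplies the values.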
Then there's two types of toolchains that you will very often see pop up in the EasyBuild community: the foss and the intel toolchain. The intel one is pretty obvious: that has the Intel compilers, Intel MPI, and Intel MKL. And we have an open source equivalent to that, which is all open source software. This could change over time, but currently it's the GCC compilers together with OpenMPI, OpenBLAS, and FFTW, and very recently we've also started using the FlexiBLAS library, which provides an easy way to switch between different BLAS and LAPACK implementations. Throughout the tutorial we will be using the foss toolchain, because that's all open source and just easy to get and easy to install.
Then, modules. Modules are a very overloaded term: kernel modules, Python modules, there's all types of modules. In an HPC context and in EasyBuild, when we talk about a module we usually refer to an environment module file. Environment modules are a way to specify what should be changed in your environment to activate a software installation, and it's done in a shell-agnostic way. The main types of module files are either in Tcl or Lua syntax, and the very popular modules tool nowadays is Lmod. That's the Lua implementation, which supports both Tcl and Lua module files. That's also the default modules tool that EasyBuild uses. If we talk about other types of modules, like Python modules, we usually say Python modules. If we just say module, we typically refer to an environment module file. And EasyBuild automatically generates these module files. We'll see that in a bit.
So that's a whole bunch of terminology. If we put it all together, we get this: the EasyBuild framework leverages easyblocks, Python modules which implement installation procedures. These could involve installing additional extensions, like Python packages, into the installation. EasyBuild will use a particular compiler toolchain for compiling this software, and it will do so as
specified in an easyconfig file, which defines a set of easyconfig parameters that steer the installation. When doing the installation, EasyBuild ensures that all the dependencies and the necessary build dependencies are already installed, and when it completes the installation it will automatically generate a module file for each installation, so you get easy access to the software. And then an easystack file can be used to describe a collection of software, or the software stack, and let EasyBuild install that. So that's a whole bunch of terminology. Lots of these terms will come back throughout the tutorial, but hopefully this clarifies things a bit already.
Before we move on to installing and configuring EasyBuild, are there any questions that people want to raise here in this session? So again, if you do have questions, don't hesitate to post them in the Slack channel. If something isn't clear, please ask there, and Alan or Bart will happily help you out. If there are no questions in Zoom, I'll continue, and we'll gradually work our way to getting our hands dirty a bit and doing some hands-on work.
Before we install EasyBuild, let's make it clear what we need. We need Linux as an operating system. Which flavor doesn't really matter, EasyBuild should work on any of them. It also works on macOS to some extent, but the support we have for macOS is pretty basic, in the sense that we don't really test a lot of software installations on macOS. So if you try to build GCC, for example, you'll probably get stuck pretty soon. We don't focus too much on macOS since we're focusing on HPC systems, pretty much all of which use Linux as an operating system. You need Python, either Python 2.7 or a sufficiently recent Python 3 version, and obviously we strongly recommend using Python 3 since Python 2.7 is a dead end. In terms of functionality it doesn't matter which Python you use, so all the EasyBuild features will work on both, but to be future proof we
recommend using Python 3. And next to Python you also need an environment modules tool, so this needs to be available already on your system. We won't explain how to install that here. That's usually pretty straightforward: there's typically an OS package already available for things like Lmod or the traditional environment modules implementation.
Then, installing EasyBuild. There are different options to install EasyBuild. It's a standard Python tool, so you can just use the standard pip install method of installing the tool. That certainly works. If you're not used to using pip: usually pip install easybuild requires admin permissions to do a system-wide installation, but it gives you options as well to do an installation in your own user account or in a specific location, or you could use additional tools like virtualenv, and so on. We won't go through that here. Either you know how this works already, or there are lots of guides out there that tell you how to install Python tools.
Another way is installing EasyBuild as a module, so as an environment module, using EasyBuild itself. This is a bit of a chicken-and-egg situation, but we describe a three-step bootstrap procedure, by first doing an EasyBuild installation with pip and then using that one to do the final EasyBuild installation that you want to use, and that's also what we'll do here in the hands-on. You can also do a development setup, just cloning the Git repositories and updating your PATH and PYTHONPATH environment variables. If you want to start hacking on EasyBuild or making changes to it, that would be a good setup.
So the tutorial website describes all three in detail and has separate sections for all three installation methods, but here we'll focus on the three-step bootstrap procedure, so we get an EasyBuild module as a result to use. And that starts like this, so using pip to do a temporary installation of EasyBuild, which basically boils down to running these commands. Now let me switch my view to a
browser and go to the tutorial website, here, installation. So here we see there's three methods: method one is using pip, method two is installing EasyBuild with EasyBuild, which is what we're doing here, and then the development setup. We'll only look at the bootstrap method here, and we'll leave the others as an exercise if you're interested in those.
We'll try doing that here in the prepared environment, so the people who have requested an account should have a terminal open already. If you're having trouble with that, please ask for help in the Slack channel. And I'll do some hands-on demos, starting with installing EasyBuild and configuring EasyBuild. If you can't keep up with that, don't worry too much about this. Everything is well explained on the tutorial website itself as well, and this prepared environment will stay up until the end of next week, so the end of ISC week, if you want to play with things hands-on at your own pace.
So let me go back to the slides. The three-step procedure starts with installing with pip into a temporary directory. For this you can copy-paste the things here, and there's this handy copy button that you can use to copy this and paste it in the environment. So this will do a pip installation in the temporary directory, which we control, and this may take a while in this environment because the file system that's available in here is not really quick. And then the second part here, once the installation is done, is making changes to the environment so you can use this installation, so update PATH and PYTHONPATH, and then tell EasyBuild to use the python3 command to run. This part should be quick. And if that worked well, then we'll have an eb command in a temporary directory which uses Python 3, so we can run eb --version, for example, and it should tell us we have EasyBuild 4.4.0 installed. So that's the first step of the bootstrap procedure. We have an active EasyBuild installation now, but in a temporary directory. The second step is
then using this EasyBuild installation to install our final EasyBuild installation, and we will do this in $HOME/easybuild. So we can just run this command, eb --install-latest-eb-release. This will check on GitHub for the latest EasyBuild version that exists, and it will go ahead and install that. Now, I already did that in this environment here, because this takes a couple of minutes and I didn't want to lose too much time. If EasyBuild notices it's already installed, it just skips ahead and says you already have this. If you try this yourself, you may see that it takes a couple of minutes to complete the installation. So with step two done, we can jump to step three. Now we have an EasyBuild module in here. We can check what's installed in here, and we see we have a modules directory and a software directory. The software directory will have an EasyBuild installation, 4.4, and in the modules directory we have an all subdirectory which has EasyBuild in there as well. So this is the module that we can use to activate our EasyBuild installation. To activate it, we have to tell the modules tool where modules are installed; for this we use module use, and then we can load the EasyBuild module. Now, to show you that I'm actually picking up this final installation, I'll remove this temporary directory, so I will clean up the temporary EasyBuild installation that I did, and then which eb will fail on me. Then I can pick up the final EasyBuild module. So that's just one way to install EasyBuild and get you started if you don't have it yet. So we need to tell the modules tool that we have modules installed in this location, and then we have an EasyBuild module available that we can load. It's actually showing me another one as well, that's in the prepared software stack, which we'll use in a bit. But if I load an EasyBuild module and I do which eb, I can tell it's the one installed in my home directory, and it should be working as well. And module list will tell me that I have one
module loaded, the EasyBuild module. So that's, I would say, the recommended way of installing EasyBuild: installing it as a module, basically next to the other software that you will be installing with EasyBuild, because EasyBuild is also a software package that can be of use to the people who use the HPC system you are maintaining or installing software for. Okay, so that's the three-step procedure; hopefully, if people are trying that, it should be working. Once you have done that, you can verify the installation as well by running some basic commands like eb --version, or check the help output, which is pretty long. You can ask EasyBuild about its current configuration, which we'll do in a bit, or you can do eb --show-system-info, which gives you some basic information about the system you are on and what's relevant to EasyBuild. The OS here looks a bit weird, but I guess this is because it's in this Jupyter environment. It also tells you what kind of CPU it found: it can tell it's a Broadwell, and that's probably because of an extra Python package I have installed that it actually knows it's a Broadwell. We have two cores available, and it also tells you which glibc version and which Python version it is using. So that's some useful information. If you see EasyBuild producing this, it's basically working, so your installation has been done correctly. And then something we won't do here now, because we already have the latest EasyBuild: if you want to update EasyBuild, it depends a bit on how you installed it. If you installed it with pip, you can do pip install --upgrade to get the latest version and do an in-place update of your EasyBuild installation, which should be okay. Or, if you installed it as a module, you can just run eb --install-latest-eb-release again, and then you will get a new module with the latest version of EasyBuild. So the old one will stay there, and if you don't need that anymore you can clean it up, but this is not an in-place update: maybe you want to switch between
EasyBuild versions to make sure everything is still working as it was before. Once we have installed EasyBuild, the next important step is configuring EasyBuild. It will work by default as long as you have Lmod as a modules tool, which is indeed the case here: if we do module --version, it tells us Modules based on Lua, so it's using Lmod. If you have that kind of setup, EasyBuild will work out of the box, but it will install software into $HOME/.local/easybuild, which is probably not the best place, especially not on an HPC cluster, since your home directory might be quite small and quite slow. So it's strongly recommended that you properly configure EasyBuild before you start installing any software, and to do this there's a couple of things you should configure. Where will EasyBuild install software, and where will it generate module files? That's something you should think about. And maybe also where it will download any sources, so software sources that it needs to install software; maybe that's in a different place from where you install software, and that could be important. A very important one is also which file system it should use for the build directory. That's where EasyBuild unpacks a source tarball and runs lots of compilation commands, and then, once it has the binaries and the libraries that it needs to install, it will copy those out or install those to the final location and clean up the build directory. The location of the build directory is quite important because that could be quite I/O intensive, and using a shared file system like Lustre or GPFS is not a good idea for these build directories. This has very little to do with EasyBuild itself, but when running a make -j 16 in a Lustre directory, it's quite likely that things will go wrong. So this should ideally be a local directory, or maybe something in a RAM disk, /dev/shm or similar. So there are lots of configuration settings in EasyBuild; we will focus on the most important ones here and
especially the ones in bold; we strongly recommend tweaking those in one way or another, so as not to use the $HOME/.local/easybuild default. The modules tool and syntax could be important: if you're not using Lmod, you'll have to tell EasyBuild what it should use. So if you're using the Tcl-based implementation, you'll have to tell EasyBuild to use the Tcl module syntax and to use a different modules tool; there are more details in the EasyBuild documentation on that. Where EasyBuild should install software and modules, that's the installpath configuration setting, which we'll tweak in a bit. Where it should download source tarballs is another separate configuration setting, and the location of the build directories is also a separate one, which is quite important. The other ones we'll skip over for now. These are the ones that you typically see people tweak once they start playing with EasyBuild, and the ones in bold, at least for this basic tutorial, are the ones we will be changing. There's one useful, sort of catch-all configuration setting, which is called prefix, which controls everything that's marked with an asterisk here. So these four will all be changed at once if you specify the prefix configuration setting, which is what we will be doing in the tutorial. In total there are about 250 different configuration settings, so very little in EasyBuild is hard-coded, but for most things the default is probably okay. If you want to get more information about what you can tweak or what you can configure, you can check the output of eb --help. So to configure EasyBuild there are three levels, three ways of configuring EasyBuild: configuration files, environment variables, and command-line options to the eb command. You can configure every configuration setting on each of these three levels, so there are no exceptions, and there's a hierarchy to this. So whenever you specify something, let's say, both in a configuration file
and with an environment variable, the environment variable overrides what's in the configuration file, and the same thing goes for command-line options compared to the other two. We'll see this in action in a bit. EasyBuild configuration files, to start with, are standard INI-format configuration files, so key equals value, quite a simple text file. EasyBuild considers a couple of locations for configuration files, and you can tell it additional places it should look at. Typically a configuration file is used if you want to configure something for the lifetime of the EasyBuild installation that you will use, so basically do it once and then forget about it: if you're not happy with the default, you can change it there and then stop worrying about this. Then environment variables: every environment variable that starts with EASYBUILD_ will be picked up by EasyBuild, and it will try to match this to a configuration setting. If it can't, it will complain and tell you that you made a mistake. So for every configuration setting listed in eb --help there's a corresponding EASYBUILD_ environment variable, all in capital letters, with underscores instead of dashes of course in environment variables, and prefixed with EASYBUILD_. So if you want to specify the module-syntax configuration setting, you define the EASYBUILD_MODULE_SYNTAX environment variable. So this is a different way of configuring EasyBuild; in a sense it's more dynamic. It's also easy to set and then not have it visible all the time anymore, since it's set in your environment. And we see many people creating a small shell script that they then source to, let's say, more dynamically configure EasyBuild, so depending on what system they are on or which account they are in, the shell script may be doing different things and setting environment variables differently. This is also the main way that we will be configuring EasyBuild in the tutorial.
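As a rough illustration of the naming rule for environment variables described here, the following is a minimal Python sketch. It is not EasyBuild's own code: the helper names option_to_env_var and settings_from_environment are made up for this example, and the set of known settings is a small hypothetical subset.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of EasyBuild.
ENV_PREFIX = "EASYBUILD_"


def option_to_env_var(option):
    """Map a setting name such as 'module-syntax' to its environment variable name."""
    return ENV_PREFIX + option.lstrip("-").replace("-", "_").upper()


def settings_from_environment(environ, known_options):
    """Collect EASYBUILD_* variables; complain about ones matching no known setting."""
    settings = {}
    for name, value in environ.items():
        if not name.startswith(ENV_PREFIX):
            continue  # unrelated environment variables are ignored
        option = name[len(ENV_PREFIX):].lower().replace("_", "-")
        if option not in known_options:
            raise ValueError("unknown configuration setting for %s" % name)
        settings[option] = value
    return settings


known = {"prefix", "buildpath", "module-syntax"}  # small hypothetical subset
example_env = {"EASYBUILD_PREFIX": "/tmp/example", "HOME": "/home/example"}

print(option_to_env_var("module-syntax"))
print(settings_from_environment(example_env, known))
```

The point of the sketch is only the mapping itself: dashes become underscores, everything is upper case, and the EASYBUILD_ prefix marks the variable as a configuration setting.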
Doing it that way is quite convenient. And then of course you have command-line options as well: anything you set on the command line will always be honored, no matter how EasyBuild is configured through a configuration file or an environment variable; what you specify on the eb command line always wins. So there are lots of configuration options and different ways to configure things, and that could be a bit confusing, so we have an easy way to check the active configuration. With eb --show-config, EasyBuild will tell you how it's currently configured. So if we do that here, show-config: we haven't configured anything yet up to this point, so this should be showing the default configuration of using $HOME/.local, and I guess I already have environment variables defined here. Yeah, so let me unset those, EASYBUILD_BUILDPATH and EASYBUILD_PREFIX, to get a clean environment. So if you run eb --show-config without defining any environment variables, this is what you should be seeing. EasyBuild is telling us it will use $HOME/.local/easybuild for pretty much everything, with one exception, and that's robot-paths: this is where EasyBuild will look for easyconfig files, and by default it will use the ones in the EasyBuild installation we are using. So this is just a selection of configuration settings; EasyBuild thinks these are the most important ones that you should probably know about. If you want to get the full configuration you can use --show-full-config, and this will produce an overview of all 250-plus configuration settings and tell you how they are currently set. Okay, so this example of configuring EasyBuild is also on the tutorial website, in here. This creates a configuration file, a very small configuration file, which sets the prefix setting to /apps; it also sets an environment variable, and then we ask EasyBuild to show the active configuration, and we also tell it on the command line to use a different install path.
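The way the three configuration levels override each other can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not EasyBuild's implementation: the functions expand_prefix and resolve are hypothetical, the paths other than /apps are invented for illustration, and only three of the settings controlled by prefix are modelled. The one-letter labels follow the levels named in the talk: D for default, F for configuration file, E for environment variable, C for command line.

```python
import posixpath


def expand_prefix(settings):
    """Expand a 'prefix' entry into the path settings it controls (simplified model)."""
    expanded = {}
    prefix = settings.get("prefix")
    if prefix is not None:
        expanded["buildpath"] = posixpath.join(prefix, "build")
        expanded["installpath"] = prefix
        expanded["sourcepath"] = posixpath.join(prefix, "sources")
    # settings given explicitly on the same level win over values derived from prefix
    expanded.update(settings)
    return expanded


def resolve(defaults, config_file, environment, command_line):
    """Return {setting: (value, level)}; each later level overrides the earlier ones."""
    resolved = {}
    levels = (("D", defaults), ("F", config_file), ("E", environment), ("C", command_line))
    for letter, settings in levels:
        for name, value in expand_prefix(settings).items():
            resolved[name] = (value, letter)
    return resolved


# Hypothetical values, loosely following the example in the talk
active = resolve(
    defaults={"prefix": "/home/example/.local/easybuild"},
    config_file={"prefix": "/apps"},
    environment={"buildpath": "/tmp/example/build"},
    command_line={"installpath": "/tmp/example"},
)
for name in sorted(active):
    value, letter = active[name]
    print("%-12s (%s) = %s" % (name, letter, value))
```

In this model the source path still comes from the configuration file, because nothing on a higher level overrides it, while the build path and install path are taken from the environment and the command line respectively.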
The show-config output then looks a bit different: it's honoring all these configuration settings, and as a bonus it's also telling us on which configuration level each of these things was specified. So it's giving us E for buildpath because we set the EASYBUILD_BUILDPATH environment variable. So show-config is very useful, both to show how EasyBuild is configured and also how it got configured this way, so through which configuration level. If you're not sure, you run show-config and you know in a heartbeat what's going on and how EasyBuild is configured. For the sake of the rest of the tutorial, I'll remove the configuration file and leave the rest in place, and slightly reconfigure EasyBuild to make sure it's okay for the rest of the tutorial. So this is what we recommend for the rest of the tutorial, the minimal EasyBuild configuration. You set the build path to /tmp/$USER, which I already did in here, and you set prefix to $HOME/easybuild, so EASYBUILD_PREFIX equals $HOME/easybuild. This is also, I think, in the exercises here, so I'm kind of spoiling the answer, but it's important to have it properly configured. That's the basic configuration; just setting these two environment variables is enough. The prefix is telling EasyBuild: please install stuff in $HOME/easybuild, with one exception, for the build directories please use the temporary /tmp for that, because we don't want to use a shared file system for build directories. That's really enough. There's one more thing: we have installed a small software stack in this prepared environment, in /easybuild/modules/all. So /easybuild has both modules and software, and we need to tell the modules tool to be aware of these installations, so that's why we use this module use command as well. That way it knows, for example, that we have a GCC installation in /easybuild, so you don't have to build GCC from source, which would take a bit too long. So for the remainder of this tutorial, this is the recommended EasyBuild
configuration, on slide 39. Then, once we have configured EasyBuild and we're happy with that configuration, we can start using it. The eb command is the main way of driving EasyBuild, of using EasyBuild. You give it command-line options, and you usually give it the name of an easyconfig file, or multiple easyconfig files, or, if you don't mind playing with an experimental feature, an easystack file, and that way you tell EasyBuild what to install. Very often you will also need to enable the dependency resolution, so tell EasyBuild: if dependencies are missing, please install these first and then continue with the final installation that I specified. For this we have the --robot option, or just -r for short. So the typical workflow of using EasyBuild is first finding an easyconfig file that matches what you want to install. If it's not there, maybe create one, either starting from an existing easyconfig file or starting from scratch; we will cover that in the second part of the tutorial. You probably want to make sure that the easyconfig file looks okay, that it has all the dependencies, and that you're okay with how EasyBuild will do the installation, and we'll show how you can get some more information on that. Before actually doing the installation you want to make sure your EasyBuild configuration is correct, so check the output of eb --show-config. And then you go ahead and fire up EasyBuild, and you go for a coffee, or, if it's a big installation with lots of dependencies, maybe two or three coffees. So specifying what to install is usually done through easyconfig files. To give you a quick example in here: this, for example, is an easyconfig file that I know is included in EasyBuild. This is a very simple one, version 1.0.6 of the bzip2 tool. EasyBuild will go and look for this easyconfig file, read it, figure out what to do based on that, download sources if it has to, and then go ahead with doing the configuration, the building,
maybe the testing, and the installation. If it's happy with all of these steps, it will create a module file to complete the installation, and if we have things set up properly we should see the generated module as a result. So this one here, in our home directory, is what we just installed. So that's the typical usage: eb and the name of an easyconfig file, or the names of multiple easyconfig files, to get something installed. Sometimes you'll run into missing dependencies, and we have an example of that coming up. To see which easyconfig files EasyBuild knows about, you can use the --search option. So let's look at a recent version of TensorFlow: does EasyBuild have easyconfig files for installing TensorFlow? It looks like it has, and actually more than one. The output here is pretty long because it prints the full path to these files. If you use -S, so that's the short version of search, it will look a bit more readable. Here it also gives us patch files, actually, that are included in the EasyBuild installation, but it's telling us: I have found an easyconfig file for TensorFlow 2.4.1 with this particular toolchain, and also with another toolchain and a particular Python version, and so on. So you already have multiple options here to pick from, and you probably want to figure out what all of these provide. It's giving us other hits here for Horovod as well, because this is basically a partial match on the name of the easyconfig file, and it's matching on this here as well. You can tweak the search a bit: if you only want really TensorFlow, nothing else, you can use regular expressions to search, and this will not show the Horovod ones. If you don't want to see patch files, you can tell it that the file name should end in .eb, and then you will only see the easyconfig files. So depending on what you're after, you can fine-tune the output. Before you install anything, you can look at the contents of the easyconfig file.
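To make the effect of narrowing a search with regular expressions concrete, here is a small Python sketch using the standard re module. It only mimics the idea of a partial, case-insensitive match on file names; the search function and the file names in the list are hypothetical and chosen for illustration, they are not output of the eb command.

```python
import re

# Hypothetical file names, for illustration only
candidates = [
    "TensorFlow-2.4.1-foss-2020b.eb",
    "TensorFlow-2.4.1-fosscuda-2020b.eb",
    "TensorFlow-2.4.1_fix-example.patch",
    "Horovod-0.21.1-fosscuda-2020b-TensorFlow-2.4.1.eb",
]


def search(pattern, names):
    """Return the names that match the pattern anywhere (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if regex.search(name)]


# A plain partial match also hits the Horovod file and the patch file
print(search("tensorflow", candidates))
# Anchoring at the start of the name drops the Horovod hit
print(search(r"^tensorflow", candidates))
# Additionally requiring a .eb ending drops the patch file as well
print(search(r"^tensorflow.*\.eb$", candidates))
```

The anchors ^ and $ do the work here: the first ties the match to the start of the file name, the second to its end.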
If you want to get a better view on what this actually contains, --show-ec, or show easyconfig file, will show the contents of the easyconfig file. You don't have to give it the path; EasyBuild will search for the easyconfig file and print its contents. Now, this is producing a lot of output, of course; for TensorFlow there are a lot of things going on here. For example, we have a whole bunch of patch files to fix either problems in the installation procedure or bugs in this particular version of TensorFlow. So this has a lot of information, a lot of detail, but it may already give you some details on what EasyBuild will do, like for example the list of dependencies that EasyBuild will use for this TensorFlow installation. You can see all of these are specific versions, so there's really very little room for things to go wrong here, since we fully control the versions of the dependencies and also of the compiler toolchain, and that means the compiler, MPI, BLAS and LAPACK that's being used. Next to doing a visual inspection of the easyconfig file itself, you can also ask EasyBuild; we can copy-paste it from the tutorial website here, I guess, yeah. So with the dry run option, and let me use --dry-run like this, or the shorthand -D, EasyBuild will give an overview of the whole installation in terms of which dependencies are needed for this and which ones are already installed. This is already a pretty long list for SAMtools, because it includes the compiler toolchain and everything below that, but it will tell you here that most of the things are marked with an x, so they are already installed, except for SAMtools itself, which is not installed yet. So nothing but SAMtools is missing here. Now, this output is a bit confusing and a bit big, so we have a shorter version, which is --missing or -M, and this will only show what is still missing to be able to install this. So here it says there are 22 modules required but only one is missing.
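The difference between the full dry run overview and the shorter missing-only view can be sketched in Python as follows. This is only a toy model: the functions dry_run_overview and missing_modules are hypothetical, the module names are invented examples, and the output format merely imitates the idea of marking installed modules with an x.

```python
def dry_run_overview(required, available):
    """List every required module, marking the ones that are already available."""
    lines = []
    for module in required:
        marker = "x" if module in available else " "
        lines.append(" * [%s] %s" % (marker, module))
    return lines


def missing_modules(required, available):
    """Return only the required modules that are not available yet, in order."""
    return [module for module in required if module not in available]


# Hypothetical module names, for illustration only
required = ["GCCcore/10.2.0", "zlib/1.2.11-GCCcore-10.2.0", "SAMtools/1.11-GCC-10.2.0"]
available = {"GCCcore/10.2.0", "zlib/1.2.11-GCCcore-10.2.0"}

print("\n".join(dry_run_overview(required, available)))
missing = missing_modules(required, available)
print("%d out of %d required modules missing: %s" % (len(missing), len(required), missing))
```

The second function is simply a filtered view of the first, which is why the shorter output is easier to read when most dependencies are already in place.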
That's the SAMtools one itself, so this should be pretty quick to install. So that's the dry run option, and also the missing dependencies option, --missing or -M, like this. So, depending on this output, you have a good view of how much work it would be for EasyBuild to get this installed. You can also inspect the installation procedure without actually doing it, which is a pretty cool feature. So let's do this here for this Boost example. With -x, or, if you like typing, --extended-dry-run, you can ask EasyBuild how it would install this without it actually doing the installation. I'm going to pipe this to less, because this will produce a lot of output. If you hit enter, in a couple of seconds EasyBuild will give a detailed overview of how it would install Boost. It's checking a couple of things in the background now, but it comes back with a detailed answer, a bit of a wall of text which you can scroll through. So it's telling us: I want to install this module, I will need this source file for that, I haven't found it yet but I will try to download it from this location and I'll put it here for you; I also have some patch files listed that I will use. And then it explains: I will unpack the source tarball, apply the patch files, and set up the build environment by loading these modules and defining these environment variables. Then it will go ahead and run this configure command, and run this pretty huge build command, actually several build commands, so not just one, I guess to install different libraries or different versions of the Boost libraries. And then at the end it will also tell us: once things are done, I expect at least these things to be in place in the installation, and once it's happy with that, it will go ahead and generate a module file, which will probably look something like this. So all of this is not exact, but it's good enough: it gives you a detailed idea of how EasyBuild would do the installation and what you'll end up with in the module
file. So -x is a very useful option as well to figure out what EasyBuild would do, and the slides here basically zoom in on some of that. Okay, any questions until now? Feel free to try some of that in the prepared environment yourself. I do realize I may be going a bit quick, but we have a lot of material to cover, and everything is really nicely explained on the tutorial website here as well, together even with some exercises to take things a bit further beyond the examples. And if you click the green box here, it will tell you what the solution to the exercise is. That's a very good way of introducing yourself to EasyBuild. Hey Kenneth, just to mention, some of the nodes didn't have some basic packages like patch installed, and I fixed that, so if anyone had an issue installing bzip2, you should be able to do it successfully now. Okay, yeah, so apparently there were some OS tools missing that EasyBuild relies on, like the patch command, and Alan should have fixed that by now. So if you saw any problems related to that before, that should be okay now. Thanks. Okay, if there are no questions here in the Zoom session, I'll continue with a more detailed look at installing software with EasyBuild and also doing some troubleshooting in case things go wrong. So let's actually do some of these installations in the prepared environment, starting with SAMtools. This is an actual scientific software package. We checked before that there was only one thing missing to get this installed, SAMtools itself, so we can ask EasyBuild to do this installation using the dependencies that are already in place. Now, one detail here is that most of what is pre-installed is pre-installed in /easybuild, while we are installing SAMtools in our own home directory here. So we're basically using things that are installed in two different places. That's absolutely fine: as long as EasyBuild can see the modules that are required for the GCC toolchain or the dependencies that
SAMtools has, it's happy. It doesn't really care where those modules are installed: as long as they are visible through the modules tool, EasyBuild can pick them up and do this installation for you. So this will take, I think, a minute or two or three to complete. This is a pretty basic installation, and all the dependencies are already in place, so that's fairly easy, but there are cases where this is not the case. Okay, this took about a minute. If we don't have everything in place yet, like for example in this BCFtools example: we're again checking with -M, or --missing, what is still missing here, and it says two out of three. I cheated a bit: I already pre-installed this GSL one because this takes a couple of minutes, but there are still two things missing, a dependency and BCFtools itself. So if I just try to install this, EasyBuild won't be happy; it will tell me it's missing HTSlib. It will go ahead and download the sources, but then when it tries to set up the build environment it fails, and it's telling me: I don't have this module, so missing modules for dependencies. And it's telling me: maybe you forgot to use --robot to enable the dependency resolution. So if we do that, --robot or just -r for short, we're telling EasyBuild: if any dependencies are missing, please install these first and then get back to doing BCFtools. Now we see it installing HTSlib first, and then it will continue to BCFtools. If you try this yourself it will do GSL as well, and this will take, I think, a minute or three or four. We only have two cores in this prepared environment, so that's a bit limiting, but it should work for you in the end as well. So rather than letting this complete, let me cancel it, because we see it hanging in the build step for a while and we don't really know what's going on, and that's a bit annoying to me. So we can tell EasyBuild to be a bit more verbose and give a bit more information about what it's
working on by enabling trace mode. We can use --trace on the command line, or we can set the same thing through an environment variable, so EASYBUILD_TRACE, all caps, equals one, or equals any value, it doesn't matter. And then if we rerun with just -r again, it's going to give us a bit more information about what it's doing. So rather than staring at a line saying building, it's now actually showing us what it is doing: unpacking the source tarball, setting up the build environment, so loading a bunch of modules, and then here during the build step it's telling us: I'm running make -j 2. So EasyBuild knows that there are only two cores available; you don't have to tell it that, it figures it out by itself, and that's also why this is taking a little while, of course, to do the actual build. In the example here I'm showing the output for BCFtools, not HTSlib, which it's still working on here, so it looks a bit different. But yeah, if you're as annoyed as I am by things taking a while and not knowing what's going on, you can enable trace mode. So once it finishes these installations, and I'll let it go here, it's wrapping up on HTSlib and it will do BCFtools next. How do we start using this software once the installation is complete? Again, similar to how we installed EasyBuild, we have a module that can be used to activate the installation. We have to tell the modules tool where these modules are located, and once it knows about them we can load the corresponding modules to get access to the software. So if you're not sure where things are, you can start with looking at how EasyBuild is configured. You are interested in the install path, so that's the location where both software and modules are installed. In this directory you'll have a modules subdirectory, like this; in this modules subdirectory you have an all directory, and then separate ones as well. These other ones contain symbolic links to the module files in all; that's because you can change the
view of how modules look in the output of module avail. If you prefer having things in categories like this, you can do that as well, but for the sake of this tutorial we'll stick to the standard way of just showing everything at once, so easybuild/modules/all, as it says here. And yeah, we have modules both in our home directory and in /easybuild, so it's picking up things from two different locations. But if we check for BCFtools, and I think, yeah, Lmod is configured to be case-insensitive here, we notice we have a module for this now, which wasn't there before. We can load this module; I think this is the one we're loading here as well, yes. Load this module and check the output of module list: this will tell us BCFtools is loaded, but also the dependencies of BCFtools, GSL and HTSlib, and their dependencies, together with the toolchain and what that depends on. So this is already a small software stack that we have, and then you can start running bcftools --version and getting some science done. If you check where this is coming from, it's indeed coming from our home directory and this particular installation. And as mentioned before, stacking software is very easy with EasyBuild: if we look at where the GCC that is active here is installed, this is coming from /easybuild, so this is the pre-installed software stack, while our own BCFtools installation is living in our home directory rather than in /easybuild. So it's fine that things are in two different locations: as long as the modules are visible to the modules tool, EasyBuild is happy. There's a couple of exercises here as well; I won't go through these, I'll leave these as actual exercises for you if you have the time for it. In the slides here we do unpack things a bit more. So for the installations that EasyBuild does, if we go back to a non-trace output, yeah, like this, there's a lot going on in the background here, but the installation procedure is essentially split up into different steps, which
always starts with parsing the easyconfig files, so EasyBuild knows what to do. Then it just goes through the steps of fetching the sources and making sure everything is in place, unpacking the sources, applying patch files, preparing the build environment, and then doing the actual configuration of the build, for example running CMake or a configure script. Then it does the build step, which is very often something like make, or make -j with the number of cores. If that completes, it often runs a make test, if that is supported by the software. If the tests pass, it does the make install, or the installation step, into the final location. It may go over a list of extensions to install as well, additional Python packages or libraries or things like this. Before EasyBuild declares success, it does a sanity check, which means looking for a couple of files that should be there in the installation, and very often also running a basic command like --help or --version to make sure that the binaries that are provided are actually working. If it's happy with that, it will clean up, generate the environment module, make sure permissions are okay, and then complete the installation. So this whole procedure is defined in the EasyBuild framework, and easyblocks fill in the gaps: basically, configure, build and install are always specified in the easyblock that is used, and you can tweak each of these steps through easyconfig parameters, which are defined in easyconfig files. So that's how the whole thing fits together. Yeah, this we have already done: using software that was installed by loading the module, and we can stack software installations easily by using a central software stack and then installing stuff in our home directory, which works very well and is transparent. Okay, then a fun exercise to go through, and actually a very useful exercise to go through, is to troubleshoot something that goes wrong. So even though installations, especially with easyconfig files that
are included with EasyBuild, often work out of the box, and we try very hard to make sure they do, there's always a possibility for outside influence, or something on your system being different or perhaps being broken, that breaks the installation. So that means that if something goes wrong, it's very important to be able to troubleshoot and figure out what went wrong and how to fix it. Lots of stuff can go wrong: you can run out of memory or out of disk space, one of the shell commands that EasyBuild runs could fail, or there may be a dependency missing, either because it was accidentally removed or because EasyBuild doesn't know about it; that's certainly possible. Maybe your compiler crashes with a weird error; certainly if you're building C++ software and jumping to a new compiler, this will sound familiar. So one thing to keep in mind is that EasyBuild keeps a thorough log while it's doing an installation, so you can dive into this log file and figure out what it was doing, what went wrong, what kind of errors popped up, and so on. So let's go through the slides here first and then do a hands-on example. Whenever something goes wrong, EasyBuild will produce an error message. This is one aspect of EasyBuild that's certainly up for a bit of improvement: sometimes you get a bit of a wall of text that you have to go over to try to find the actual problem. Here it's pretty clear: the g++ compiler is telling us, this command-line option you used, I don't support this, so this cannot work. And then if you look at this a bit closer, you actually see it's using /usr/bin/gcc here, hard-coded, so this is the OS compiler rather than the one we provide through our toolchain. So we're essentially using a way older compiler here than we were actually intending, and this could be because this /usr/bin/gcc is hard-coded in a Makefile, and we may have to change that or tell it otherwise to use the proper compiler. This is an example of an error message that's useful because, after
staring at it for a bit at least, the problem is clear from the error message itself. But that's not always the case: sometimes EasyBuild just shouts at you and says it's broken, and the error itself is actually not included in the error message, and then you have to go ahead and open the log file. For every installation that it starts, EasyBuild keeps track of a log file. The first line it prints is always "temporary log file in case of crash", followed by its location, so you can always open this, either during the installation to see what's going on, or if an installation fails: EasyBuild will keep this file so you can open it, inspect it and see what went wrong. Sometimes you can enable debug mode, by enabling the debug configuration setting, to get a bit more information, which may be way too much information because EasyBuild is very verbose in its logging, but that may be helpful in some cases. And when an installation does complete successfully, EasyBuild copies the installation log file to the installation directory itself. So if we look at what we have installed up until now, let's say our BCFtools 1.11 installation, if we check the contents here we'll see a bin directory and a libexec directory; this is part of BCFtools itself. But there's also an easybuild subdirectory, which has a couple of things, including the log file of this installation and the easyconfig file that was used. And if we look into this log file, so let's open this up, this is very, very verbose of course, but we can look for example for make -j, and we can tell this command was run and this was the output. So all the individual compiler commands are in here; you can inspect those if you notice something wrong with the installation or you just want to double check how things were done. We can look at the output of the configure command and see what it was picking up there, what it was reporting, and so on. So very often that's very useful to figure out what was actually being
done during the installation. To navigate these log files: there's a lot of information in there, so you typically don't want to scroll through them but search through them, using less or vim or emacs or whatever is your favorite for opening text files. Notice that there's a well-defined structure to this: every log message that EasyBuild emits starts with a double equals sign and a space, followed by a date stamp, so you can use this to search easily. There are info log messages; if you enable debug mode there are also debug log messages, so you can look for those. It tells you which of the Python modules from EasyBuild produced the particular log message and on which line the code was running when it was doing that. And it also has these step markers: whenever it starts a new step it will emit a message like this, so "starting", then the name of the step, then "step". To go back to the log file here: if I look for "starting", let's say "build step", I should find that here. If I look for "starting sanity check", okay, test, install, ah, here: "sanitycheck", without a space. So here you can see the different steps of the installation popping up, and EasyBuild producing a bit of information while it's doing that. And if you're looking for actual compile errors or failing commands, these are a couple of patterns that you often see popping up. It's not always trivial to find the actual problem that caused an error, so it involves a bit of scrolling up and down and looking for particular patterns, but these are very common reasons why a make command, for example, fails. Next to checking the log file we can also check the build directory: if an installation fails, EasyBuild will leave the build directory in place, so you can dive in and look at the files that are there, or maybe additional log files, like the config.log from a configure command, and inspect those; CMake, for example, has very verbose logging as well. So a fun exercise here, and I'm doing
quite well on time, so let me go through the whole thing. You should definitely try this yourself as well, because it's a very good exercise in learning to use EasyBuild hands-on and actually running into problems, which is bound to happen when you do your installations yourself as well. This is explained well here in the troubleshooting section. There's an exercise here that gives you the contents of an easyconfig file, so let me go ahead and copy this, and let's call it subread.eb. The name here doesn't really matter in this situation, because we're going to give it straight to the eb command, so it doesn't need to automatically detect or find this file; we're basically telling it where it is. At first glance this looks pretty okay. We'll get back to easyconfig files later, so it's a bit backwards to do troubleshooting first, but it's more to give you the experience, not to really explain what an easyconfig file is or how it works internally. But even if you give this to an experienced EasyBuilder, they would say this looks pretty okay and it may work. If you try it you'll learn that it doesn't work out of the box, and you'll run into some problems that you can fix. Let's give this a quick try and see what happens. We just give the name of the easyconfig file to the eb command, we hit enter, EasyBuild parses the easyconfig file and tries to install this software called Subread; this is an actual scientific software package. Oh, and I still have some modules loaded, that's very smart. So here it's yelling at me, because it says you've loaded a bunch of modules and that's usually not a good idea. And that's indeed true: I have BCFtools still loaded. So let me purge my module environment and reload only the EasyBuild module; that should be a bit better. EasyBuild is happy that the EasyBuild module is loaded, but if it sees something else loaded that is installed with EasyBuild, it will yell at you and tell you not to do that.
Okay, so starting that again: it's trying to download the source file for Subread, so this source file, and it's producing quite a long and perhaps ugly error message, but it comes down to that it couldn't find this source file anywhere. That means it's not already downloaded in my source archive; here, if I look for Subread, I'll just find an empty directory. And that's not surprising, because we're not really telling EasyBuild where it can find the source file. The easyconfig file does have a comment here that says we should download this from this location. That's good for humans but not really for EasyBuild, since it just ignores comments of course. But we can use this: we can add an additional easyconfig parameter, source_urls, and use this location in here. Not like that; let me copy-paste it from here, that will be better. So we define this additional source_urls line, which tells EasyBuild: okay, if you find yourself having to download this file, this is the place you should download it from. The URL doesn't include the source file name itself, so EasyBuild glues it to the end of it. So let's save this and try it again, and now EasyBuild should be able to download it by itself, since it knows where to grab it from. And if it did the download, it will verify that it's the correct one by verifying the checksum. It does that before the unpack, so since it's doing the unpack, it seems to be happy with the checksum. Now notice that I'm getting the extra output here because I still have trace mode enabled; if you're not seeing this, you can define this environment variable or you can use eb --trace to get this additional information. So it's happy with the source file: it found it, it unpacked it, and then it barfed again. Then it said no module found for toolchain GCC 8.5. So let's see what we do have for GCC: we do have GCC 10.2, so let's just switch to that toolchain instead. So we change the version 8.5 to 10.2, we
save the file, we try again; hopefully it gets a bit further now. Yeah, don't forget to do this preparation step here as well, if you haven't done that yet, to make sure EasyBuild finds the dependencies. Okay, so now it got a bit further: it did the unpack, it did the prepare step, it's happy with that, it tried the build, and then it failed horribly. And even though we're seeing compiler commands here, we're not really seeing any real error messages, so it's not clear what is going wrong. But we do have the log file; EasyBuild reminds us here where the temporary log file is, so we can take this location and open it. There's actually an easier way to get the location of the last log file: EasyBuild has an option for this, --last-log, which will look into the location where EasyBuild puts log files and print the location of the last one. And then with a bit of command line magic we can give this straight to an editor: so vim, and then eb --last-log in a subcommand. That way we don't have to keep copy-pasting the location of the log file. So this is the log file of our last failed installation. Typically you want to jump to the end of the log file, because that's where things went wrong, so I do capital G here, and we're back to our compiler command. We have some partial output here which is not very useful, but if we scroll up a bit we do see an actual error message here from the GCC compiler. So we tried using -fast as a compiler option, which is not supported, and the compiler itself is making a helpful suggestion: maybe you wanted to use -Ofast. Okay, thank you, compiler, but we'll actually do something else. So let's look again in our easyconfig file: do we see -fast here somewhere? Yes. Through this buildopts easyconfig parameter we were changing the make command, we were adding stuff to the make command, and we were hard-coding -fast here, which is clearly a mistake. We could change this to -Ofast and that should probably work, but the
better solution here is to use the environment variable that EasyBuild sets. EasyBuild sets not only things like $CC in the build environment, but also $CFLAGS, to a useful value, so we can reuse that value and add stuff to it. We have to add -fcommon because we're using GCC 10 here. And if we retry that, I'm sorry, if we retry that, we'll see that it's using the EasyBuild compiler variables that are prepared in the environment. And remember, to inspect the build environment that EasyBuild would set up, we can use eb -x, so do the extended dry run and get a detailed report on what EasyBuild will do. If we do that and look for the build environment it sets up, here it reminds us what kind of variables it's defining in the build environment. $CFLAGS is one of those, and this looks a bit more useful than -Ofast; in particular, -march=native means we're going to produce binaries that are optimized for our current host, whatever that is. So that's a bit better than -Ofast, and it will probably result in a faster installation. So what we're doing with the change I made here is just reusing this prepared environment variable and adding -fcommon to it, because of the GCC 10 compiler we're using. That should hopefully fix that compiler error; let's see how far it gets now. And again, this whole exercise is also on the tutorial website together with the solutions, so I definitely recommend retrying this yourself and getting the full experience. This will take a little while since we only have two cores. The trace output is telling us it's retrying the make command; it looks like it's taking a bit longer, which seems to be good news. That completed with exit code zero, so no compiler errors this time, but things go wrong during the sanity check that EasyBuild does. It's looking for a bunch of binaries, that looks okay; looking for the non-empty directory, that looks okay as well; it tries to run a command with --version, and that
fails miserably. And we can actually see the problem here: the featureCounts command says, I don't know --version as an option, you'll have to use something I know about. If we open the log file, let me use my --last-log trick again, though you can just copy-paste the location of the log file as well, and jump to the end, then we actually get the full output, if you scroll up a bit, of the featureCounts --version test, and here it's telling us that -v is the correct option to print the version. Okay, I think that tells us what we should fix. We look back into the easyconfig file, we see the sanity check command listed here, and rather than using --version we can use just -v, and that should work. So we save this, and we can redo the installation starting from scratch, which is a bit silly, because we know it compiles, and actually it also created the installation directory already and copied the files in there; it's only this sanity check command that was wrong. So does that mean EasyBuild has to redo the full installation? No. We can ask it: look, I know the installation is already in place, it's correct, I fixed the mistake in the sanity check, so just check the installation again, and if you're happy with it then generate the module file. That's a special option to the eb command, --module-only, which means: skip everything, but do the sanity check, and if you're happy with that, generate the module file. That should be a lot quicker than redoing the whole compilation from scratch. You see it's going through all the steps again, but lots of them are being skipped: it's not unpacking sources, it's not doing the build commands, it's only doing the sanity check, and now it's running the fixed command, which passed. Okay, it's happy with that, and then it's generating a module file and wrapping things up for us. So that's a lot quicker than redoing the installation from scratch. And once it wraps this up, so again, this is I think a side effect of the slow file system, why it's
taking so long to wrap up here, but once it comes back we can check our available modules. We should find one for Subread, which was just installed. We can load it; well, just without the version is fine, there's only one. And if we do this we should have the featureCounts command, and we can run -v ourselves, just like the sanity check did. So we're pretty sure that this will work, since it was verified in the sanity check. But that's a very good exercise: going through this troubleshooting step, having a couple of things go wrong and then trying to fix them. Okay, we're doing pretty well on time; we're almost exactly following the agenda. Any questions before we break for coffee or tea or whatever your favorite beverage is on a Friday afternoon? If there are no questions in the Zoom session we'll do the coffee break now. I will be back in half an hour, but do feel free, if you have any questions, to raise them either in Slack or in the Zoom session. We'll start again in 30 minutes. Okay, we'll get started again in a couple of minutes, exactly at quarter past, but maybe somebody has a question already they want to raise in the Zoom session. We should have time at the end as well for questions; if people have a burning question, or sometimes something that's maybe a bit out of scope for the agenda, we'll definitely take those questions at the end as well. For the people who are playing in the prepared environment: that will be available until the end of next week, so the end of the ISC week. One small detail to keep in mind is that if you start a session it's basically active for five hours, so if you're spending more than five hours in there you'll have to restart it. So take that into account: it's basically a bash session in a Slurm job you're playing around in. I'm assuming my screen share is still up; Zoom isn't showing me the border anymore, but unless somebody shouts at me for not showing my screen I'll continue. We're doing quite well
in terms of time; let's see if we can keep that up and stick to the schedule. I hope we can get through all the content we have in mind here. The next bit is about module naming schemes and hierarchical module naming schemes, so we'll go over that now. What you've seen up until now, when we install easyconfig files and we get a module file for that installation, is that it matches very closely with the name of the easyconfig file: we get name/version-toolchain and maybe a label at the end. That's what we call a flat module naming scheme. There's another type of module naming scheme, a hierarchical module naming scheme, that's very different, so let's try and explain what that is. The default one, the EasyBuildMNS module naming scheme, is a flat one that basically mimics the easyconfig file names: it just uses a slash after the software name, rather than the dash in the easyconfig file name, and it strips off the .eb extension. This is a flat module naming scheme because, first of all, there's a clear mapping between easyconfig file name and module file, but also, if you do module avail, you'll see all modules at once. If we do this in the prepared environment here, assuming I do the module use of the centrally installed modules, you'll see this is a long list, and some of these names are quite long. All of them mention which toolchain was used to install them, except the ones installed with the system toolchain, which is just OS tools, so not a proper EasyBuild toolchain. So that's a bit daunting, maybe ugly even, to have names that long. And also, having everything visible and available for loading at once is a bit, first of all, yeah, getting an overview of what's installed is a bit difficult. But if you have stuff installed with different compilers, which is not the case here, but it's very easy to have things installed with different compilers, it can be very hard to combine those and not run into trouble. So even though
all these modules are available straight away, it may not be a good idea to load them together in the same session. So it's a bit unfortunate that there are no hints in terms of what is compatible with each other. For these two main problems, long module names and having everything available for loading at once even though it may not be compatible with each other, there's a solution, which is called a module hierarchy or a hierarchical module naming scheme. There are different ways of doing a module hierarchy, but the typical one has a three-level hierarchy. There's a core level, and in the core level things like compilers, like GCC or Intel or whatever compilers you're playing with, are installed. These are things that are typically installed with the system toolchain, so not with a toolchain controlled by EasyBuild, but something that is using the compiler from the OS to build this particular software; a compiler installed with EasyBuild is a typical example. Then on the compiler level you get modules for libraries or tools that are installed with a compiler-only toolchain controlled by EasyBuild. Here in this example, OpenMPI and MPICH, two MPI libraries, were installed with this particular version of GCC. And then there's another level, the MPI level, where software that is installed with both a compiler and a particular MPI library is located. So the ones at the bottom here, FFTW, ScaLAPACK and HDF5, were all three installed with OpenMPI and GCC, so with these two particular modules. These GCC and OpenMPI modules in this case are what we refer to as gateway modules: they provide access to additional software installations. This is very different from a flat naming scheme in a number of ways. The entries you see here in the circles, well, not circles, but close enough, are the actual module names, so you just get software name, slash, version, and nothing behind it. So the module names are quite
short and clean, and what they were installed with in terms of toolchain is implied by the location that they are in, so that helps a lot. There are pros and cons to both, actually. With a flat naming scheme all the modules are visible, so you can easily get hundreds or even thousands of modules that are visible, and especially people who don't really know what they need and are just looking around, doing a module avail without any additional arguments, get to see all these modules, and this can be quite overwhelming. So that's a downside. The upside is: if you do module avail and something is not there, you know it's not there. There's no obvious way that it's hidden or unavailable, at least not without taking additional features of the modules into account. In the flat naming scheme module names are unique, so, at least if the module naming scheme is designed well, there's a direct mapping from easyconfig file to module name and the other way around. The module names are long and maybe even ugly, which can be confusing: you get these GCC or foss or gompi tags in there, names of toolchains, which can be very confusing and get in the way more than they are useful, but it's a necessary evil with a flat naming scheme. And again, loading modules together may cause problems if you're not careful: loading things built with different compilers, you may run into surprises and broken things because you were not careful. In the module hierarchy you get short module names, so it looks a lot cleaner, and the toolchains are sort of hidden: they're encoded in the location where modules are installed, so that helps with keeping it clean. But the downside is that if you do module avail you don't see everything that's available, and we'll see this in action in a bit when I do the hands-on demo. Lmod is a modules tool that was actually built from the start for supporting this hierarchical module naming
scheme type of approach, and next to the module avail command it also has a module spider command to look through the hierarchy and see what's available where and how it can be accessed. So module spider is mostly useful in a module hierarchy, and you actually need to use it to be able to find modules. Lmod is also very smart if you have the same package installed with a different compiler and MPI combination. Let's say you have HDF5 installed with GCC 9 and OpenMPI 4, and another HDF5 installed with GCC 10 and OpenMPI 3, or whatever makes sense. Lmod is smart enough that if you load a different compiler it will actually swap things around for you, so you get access to the corresponding HDF5 that was built with that compiler; same thing with MPI. That's very useful if that's what you expect, but if you don't expect this it can be very confusing, so that's a bit of both a positive and a negative point. The very good point, next to the short module names, is that if you can load modules together, that means they are compatible with each other. There's no way that you can accidentally load modules that were built with different compilers, because they're located in different places, and unless you try very hard you will not be able to load them together. And the gateway modules, like the GCC and MPI ones, and this will become more clear in the demo as well, are also a bit of a necessary evil, let's say, in a hierarchical module naming scheme: end users are forced to load a compiler and an MPI first before they get access to the actual software they care about. This may be a bit confusing or may have little meaning, depending on how experienced your users are. There are multiple module naming schemes supported out of the box in EasyBuild, but you can also create your own custom one. You can write a small Python script, or Python class actually, that derives from this ModuleNamingScheme abstract class; you can define a couple of methods that explain to EasyBuild how modules
should be named, based on software name, software version, toolchain and whatever else is available from the easyconfig file, and you basically make something that spits out a string and tells EasyBuild: this should be the module name I want to use. We won't actually go into implementing a custom module naming scheme here, even though it's covered on the tutorial website; there's a separate small section on how to implement your own custom module naming scheme that does everything lowercase and something silly like replacing all dashes with underscores, something along those lines. So as long as you know a bit of Python, you can make EasyBuild do anything in terms of how modules are named. What I'll do here, because this gives a good idea of how a module hierarchy is different, is look into installing HDF5 using a hierarchical module naming scheme, and that way it will hopefully become more clear how things are organized and how that's different from a regular flat naming scheme. We do have to be a bit careful when doing this, because one thing you can definitely not do is mix modules that are installed in a flat module naming scheme with other modules that are installed in a hierarchical one. You cannot easily combine both, because some modules, like the gateway modules, for example GCC, will have the same name in both the flat and the hierarchical module naming scheme, but the contents of the module files are different; they do different things. So if you accidentally pick up one rather than the other, things will just fall apart and will not work properly anymore. So you have to avoid mixing modules, that's one thing. An additional thing is that you have to reconfigure EasyBuild a bit, first of all to use the different module naming scheme; that's this environment variable that's defined here, so we'll use the HierarchicalMNS that's included with EasyBuild. And in this particular case we just want to play with the modules themselves; we don't
want to rebuild all this software. So what we want EasyBuild to do, we're going to tell it: look, the software that we want to use is installed in /easybuild/software, which is already there, everything is pre-installed, but we want to generate the modules for this in a separate location, in our home directory, $HOME/hmns. And what we will do is tell EasyBuild to only generate the modules for these installations, so not to rebuild all the software. We're talking about 41 modules by the time we have HDF5 and all dependencies, and the compiler toolchain and MPI, installed, so we don't want to do this from scratch; this would take a couple of hours. But EasyBuild is smart enough to reuse the installations that are in here and only generate a different view on those installations by generating different modules, and that's what we're going to make it do in this particular demo. Okay, let's try and get that to work. This is the example part here, and I think it explains quite well how to set things up. To start with, you have to make sure that your environment is clean: there should be no modules loaded, and the modules tool should not know about any modules at all, so if you type module avail it should come back empty. You can do that by running module purge and module unuse $MODULEPATH; this is the environment variable that the modules tool uses to keep track of where modules are located. If we do this, module list should come back empty, and module avail should come back empty as well; that's exactly what we want here. Now we do have a small problem: we had EasyBuild installed as a module, which is now no longer available for loading, so how will we be able to run EasyBuild? If you're really serious about using a module hierarchy, you would first start with installing EasyBuild in that hierarchy, load that module, and then continue from there. I won't do that here, even though it's quite easy; I'll take a shortcut in some sense and just use a pip-installed EasyBuild in my
home directory. So I'll copy-paste this, which essentially just does a pip install of EasyBuild in my home directory, so with this --user it will go to .local in my home directory, and I'll use this as the EasyBuild installation I want to use. Here I'm just making sure the eb command is available, and this one tells EasyBuild to use Python 3 to run EasyBuild itself. Hopefully that shouldn't take too long and I can continue. After we get an active EasyBuild again that we can use outside of a module system, we also have to be careful about how we configure EasyBuild. We're going to tell EasyBuild to use $HOME/easybuild as before as the prefix, with the build path set to /tmp; even though we're not actually going to build anything in this demo, we make sure it's properly configured. We tell EasyBuild to use the hierarchical module naming scheme, that's important of course, and we tell it to reuse the pre-installed software that's in /easybuild/software, so EasyBuild will look there and expect to find things there already pre-installed. And then the modules that we're going to generate in this hierarchical structure will be located in $HOME/hmns. So we're going to copy this, and once this installation finishes we'll configure EasyBuild that way. If we've done that correctly and we run eb --show-config, that should return an answer like this. This has a little bit more output than before, because we're defining more configuration settings: we have a separate location for modules and software, so that's new, and the module naming scheme is also new; that was not there before because we were using the default. So just setting these environment variables and then running --show-config should pretty much produce exactly the same thing we have here, except for the username part of course. So that looks pretty good. Software comes from the central location, which should be here, so there's a whole bunch of stuff pre-installed here, but we don't have any
modules for those yet, and we'll make EasyBuild generate those for us. What we want to do is generate modules for this easyconfig file for HDF5. We want to make sure EasyBuild does all the dependencies as well; remember, there's nothing there, not even a module for the compilers or the toolchain. I will tell EasyBuild: only do the modules, trust me, the installations are there. And EasyBuild will actually check whether the installations are there and whether they work, because when using --module-only it will still do the sanity check, so it will actually verify that things are in a functional state rather than just generating the modules. If you do that in the prepared environment this will generate 41 modules. Each of these takes a matter of seconds, because it's not actually installing anything, but it is doing the sanity check, generating the module file, doing a test load of that module and checking a couple of things. So it does take a couple of seconds, and times 41 that means a couple of minutes. So I cheated a little bit here and I've already pre-installed some of these modules in this hmns directory, but if I ask EasyBuild what is still missing, so -M or --missing, it will tell me I think one or two or three are still missing, so even if I run this command it will still do something useful. Yeah, so it's telling me I don't have the module for HDF5 yet. And look how the output is pretty different than before: this is the name of the actual module file, let's say the user-facing name of the module file, and this is the location that EasyBuild will put it in. So here we can already see this structure: this is the MPI level, because HDF5 needs both a compiler and an MPI, and this Szip 2.1.1 dependency will be installed in the compiler level, because this doesn't depend on MPI, so it goes up, let's say, to the middle of the hierarchy, the compiler-only level. So that's just me cheating a bit to avoid that EasyBuild has to generate all these modules. So what
I'll do: I'll use --robot, so it does both the missing Szip dependency and HDF5 itself, and I'll use --module-only to only generate the modules. This should be pretty quick since it only needs to do two modules. Again, because of --module-only it's skipping the actual installation; it's expecting things to be there. It is doing the sanity check, so it is verifying that we have access to a working, in this case, Szip installation, and it's doing the same for HDF5 itself as well, which will be installed here, or at least the module will be installed in this location. That's all looking good; it's about to finish, and then we can check how things actually work from a user point of view. So that wrapped up the installation of those modules. Now EasyBuild has installed stuff, but we still haven't set up our modules tool; we still haven't told our modules tool where things are, so that's the starting point to get access to these installations. The module files are generated here, and you can see the three levels of the hierarchy: the core level, which is where compilers live, or anything built with the system toolchain; the compiler level, where things live that were installed with only a compiler toolchain, so no MPI; and everything else basically goes in the MPI level. So if we're looking for HDF5, that will be located here, in the MPI level, where things are located that were compiled with GCC 10.2 and OpenMPI 4.0.5, and in here we'll have the HDF5 module, which only has the version. Okay, but now how do we tell the modules tool about this? Step one is starting at the top of the hierarchy, so this is not the modules/all directory but the Core subdirectory in there. This one has things like our GCC compiler and the dependencies for installing that. So our starting point is this module use command: module use, the location of the modules, slash Core, the top of the hierarchy. If we now run module avail we'll get a couple of modules available, including our GCC compiler. So this is the gateway module that
we'll use, the one that will provide access to the rest of the hierarchy. And yeah, the other ones are basically either dependencies for this GCC or just things that we have installed, like the toolchains themselves. Those are not installed with a particular toolchain controlled by EasyBuild; these are just module files, there's no installation behind these, they're just collections of other modules. Now that's all good, but we actually want to have HDF5. If we run module avail HDF5, Lmod says: I don't have this, at least I don't have this available for loading straight away. But we can use the module spider tool that Lmod provides and ask spider: do you know of any HDF5 module anywhere in your hierarchy? And Lmod says: yes, I know about HDF5 1.10.7, and I know that you'll need to load these two modules before you actually get access to that. So it's smart enough to know about the structure of the hierarchy, and it's smart enough to know that there is an HDF5 built with GCC 10.2 and OpenMPI 4.0.5. Okay, that looks good, so we can follow those instructions. We can start with loading the GCC version. Well, this is not good enough, because we don't have MPI loaded yet, but if we check module avail again we now see, these were already there, that we have additional modules available. This looks like two levels of a hierarchy; it's really not, this is one whole level, but we have to use different subtoolchains here with EasyBuild, so things are a bit more spread around. But the one that we care about is the second gateway module, for OpenMPI. So remember what Lmod was telling us with spider, you can run that again here: the ones it's listing here are the ones that give access to the software we care about, so these are what we call the gateway modules. And right now, with module list, we only have the GCC one loaded, not OpenMPI yet, which is why, if we ask module avail what is available for loading, it doesn't know about HDF5 yet. If we do load OpenMPI and check module avail again, there's another
box of modules that is opened, in the MPI level, and now of course module avail HDF5 will come back with an answer that we do have an HDF5 available for loading. So we can go ahead and load this, if I can type; the tab completion helps. So this is available for loading. Module list looks very clean, just software name slash version; this looks a lot better than in a flat naming scheme. And we can actually access one of the HDF5 tools, for example. If we check where this is installed, this is coming from /easybuild/software, so nothing was reinstalled here: EasyBuild just generates a new module tree for existing installations. Yeah, so that's what this whole example is about. There's an exercise in here as well that basically redoes this for a different software package, this SciPy-bundle, which is a bundle of NumPy, SciPy and pandas, and which has a bunch of missing modules. So the installations again, let me check with eb, the installations again are there already, but the corresponding modules are not available in this module hierarchy yet. So if you do something very similar to what I did with HDF5, it should come back with an answer. It's screaming at me here for having modules loaded; in this case I don't really care. Oh, and it is finding everything, okay; that's because I have the prepared HMNS already in place. So if you do this, you should see things are still missing for doing this exercise, and you can try it yourself. So hopefully that makes it a bit clearer what the deal is with a module hierarchy and why it may be interesting, but again, there are pros and cons to both approaches, so there's no clear winner here, at least not as far as I'm concerned. There was a question about module naming schemes and about using packaging: if you're creating RPMs, how does the packaging of modules work? Okay, that's a very good question. I don't use the packaging feature myself a lot, so I actually have to think about it. I think in the packaging
support we only support basically one module tree per installation right now, so you can't really package modules separately from installations, I think, so that's something we would have to work on. So with packaging support, let me show it in the documentation, EasyBuild can talk to this other tool called FPM, I think, a package manager, where you can tell EasyBuild, here, eb --package. If you do this, EasyBuild will not only install this particular software but then also talk to FPM to get an RPM for it that you can install. And I think currently the RPM includes both the installation and the module file, and there's no easy way to pull that apart, so that's something we would have to work on. I hope that answers the question. The packaging support we're not really going to cover in here. It's a pretty stable feature, it's been around for quite a long time, and I know several people use it and are quite happy with it, but yeah, there could be gaps in there in terms of what is supported, so we could work on that. Okay, if there are no burning questions for now, I'll continue and look at how we can add support for additional software. So EasyBuild out of the box comes with support for a whole bunch of software. I didn't include the link here, I probably should have, but it's actually better, I think, through the website. We have a list of supported software here: if you go to easybuild.io there's an easy link here which brings you back to the documentation, and this has a well-organized list of all the software that is supported currently in the latest EasyBuild release, so over 2,300 software packages, including toolchains and bundles of things. And you will find something like TensorFlow in here somewhere, there it goes, it's sorted alphabetically, with all the versions that we support for TensorFlow and for which compiler toolchains we have easyconfig files included with EasyBuild itself. So this is definitely one of the things
that moves pretty fast. With every EasyBuild release we get easyconfig files for new software versions; TensorFlow 2.5.0, for example, will be supported in the next EasyBuild release. So this is a very long and detailed overview of what's already there. Let me jump here as well to the next part. But if there's something that you want to install that is not supported yet in EasyBuild, this could be a totally new software package that EasyBuild doesn't know about yet, or it could be a new version of something that's already supported, and that's what we're going to look at now. For every installation that EasyBuild performs there has to be a corresponding easyconfig file, so that's how you tell EasyBuild what to install. This could be included with EasyBuild itself, or somebody could give you an easyconfig file that worked for them and you can try using it. If not, if you're on your own, let's say, you could either take an existing easyconfig file and change it until it installs what you wanted to install, like a different version or a slightly different configuration, or you could create one from scratch yourself. When writing an easyconfig file you're going to have to tell EasyBuild to use an easyblock, and again I think for about 80% of the easyconfig files one of the generic easyblocks that we have is sufficient. Again, the documentation has a detailed overview of the generic easyblocks, there it goes, overview of generic easyblocks; that's, let's say, what is this, 20 or 30 different ones. For example, let's say ConfigureMake: that one does configure, make, make install, and you get a couple of easyconfig parameters specific to this easyblock. If you want to change make install to something else, you can define this install_cmd easyconfig parameter in your easyconfig file, and then EasyBuild, rather than using make install to install the software, it can
use a different command. So you get knobs to tweak, to steer the installation. Now, very often a generic easyblock is enough, which means you don't have to do any real Python coding to get something installed; that's very useful. But if the software is quite complex or has a very specific installation procedure, if they were reinventing CMake or have their own configure script that does something very different from a regular one, you may need to implement your own easyblock. Implementing easyblocks we can't really cover here in this tutorial; that's a bit too advanced for, let's say, a basic introduction like we have here. But this is well covered in the documentation, and a previous version of this tutorial has a separate section on implementing easyblocks, with an actual exercise, so you could take a look at that particular version of the tutorial. What we will do here in the demo: I will create an easyconfig from scratch, step by step, and basically show you how that process goes and how EasyBuild sort of pushes you in the right direction to make that work. Before we do that, to clarify a bit: when do we need a custom easyblock, and when can we use a generic easyblock and a relatively simple easyconfig file? That's a bit of a difficult question; the answer is that it depends a lot on the software you're installing. Sometimes you can get away with a very fat easyconfig file where you're defining lots of easyconfig parameters, and maybe where some of them may have to change if you change the toolchain or use a different version of a dependency. So the more you're, let's say, fixing or hard-coding things in an easyconfig file, the more you're maybe going towards the point where a custom easyblock could be a better solution. Or sometimes you really need to do things that are a bit too advanced for an easyconfig file, like running interactive commands: if you have a configure script that asks you questions and you have to answer those questions
supposedly interactively, that's too much for an easyconfig file. But in an easyblock, essentially a Python script, in Python code, you can do all of that quite well, and the EasyBuild framework gives you functionality to facilitate that as well. Sometimes it makes sense to add some logic in the easyblock that reacts to the toolchain, so if it notices you're using a GCC compiler or an Intel compiler it can basically create the corresponding configure options for you, so you don't have to do that manually anymore in the easyconfig file. Same thing for dependencies: if, for example, you have HDF5 loaded or included as a dependency, the easyblock could know, oh, that means I have to do --with-hdf5=<location of the HDF5 installation>, and then you don't have to do that manually yourself in the easyconfig file. So by creating a custom easyblock you can take away some of the logic and make it more dynamic, make it derive all that stuff itself. So yeah, there's a bit of a thin line here, but very often you notice yourself if you go too far, or sorry, too deep, in an easyconfig file; it may be better to write an easyblock, but then you'll need to do a bit of Python coding. For now we're going to stick to pure easyconfig files and leveraging a generic easyblock, so we don't have to do any real Python coding. What's important here: an easyconfig file is essentially a bunch of variables you define, which we call easyconfig parameters. It's in Python syntax, but it's not really Python code in the sense that you won't be doing any logic like functions or for loops in there; you're basically doing key-value definitions of easyconfig parameters, for telling EasyBuild what it should install, which dependencies, and maybe which configure options it should use for that installation. There are a couple of things you have to define in every easyconfig file, these five: software name, software version, homepage for the software, a short description of the software, and which
toolchain should be used to do the installation. Those are the bare essentials. You may say, well, stuff is missing here, like sources; don't you always need sources? Not really, because you can write an easyconfig file that just combines different modules, different existing installations, together to make them easy to load. Let's say the foss toolchain, so the combination of GCC, OpenMPI, OpenBLAS and FFTW, has a corresponding easyconfig file which doesn't install anything, it just bundles things together, so in that case there are no source files involved at all. Of course, having sources is very common, and telling EasyBuild where to download them from, so specifying source URLs, is very common. You'll often see patch files being specified as well, like we saw in the TensorFlow example. Usually for both sources and patch files we have checksums in place, so we make sure that the source tarball you obtain is actually the correct one, and it's not a corrupt download or something that was provided maliciously to try and make you install something that you don't want to install. So that's related to sources. Often, especially when using a generic easyblock, you'll have to tell EasyBuild which easyblock you want to use. If you don't specify which easyblock you want to use, then EasyBuild is going to try and derive the easyblock from the software name, which means it's going to look for a software-specific easyblock. If you do name equals TensorFlow, EasyBuild will say: okay, you're not telling me which easyblock to use, I'm going to look for an easyblock specifically for TensorFlow, and if it doesn't find that it will give up and scream at you. Specifying dependencies, or build dependencies like CMake, is very common. Using configopts or preconfigopts, via the easyconfig parameters, for controlling the configure command, and then the same thing for build and the same thing for install, is quite common as well, and we'll see that in the demo, and also
customizing the sanity check, so making sure that the things that should be there are there, or that simple commands like --help or --version are working as expected. This can be specified via the sanity check easyconfig parameters as well. So those are the more common ones. There's a long list, which I didn't actually mention here, but you can ask EasyBuild about all the easyconfig parameters it knows about with eb -a, which is short for --avail-easyconfig-params, like this I think, but eb -a is a lot easier to type. This will produce a lot of output, but this tells you, for example, let's take a good example here: prebuildopts says extra options that are pre-passed to the build command, and same thing for the configure step, same thing for the install step, and so on. easyblock is used to specify which easyblock should be used for the build, and if it's not specified EasyBuild will determine that via the software name, and so on. So this is a relatively long list, and it's not even a complete list, because there are more for specific easyblocks like ConfigureMake; let me pipe that through less. So here we're asking EasyBuild: give me all the easyconfig parameters you know about, and include the ones that are specific to the ConfigureMake easyblock. In that case you'll get a couple more in here, so a list of easyblock-specific easyconfig parameters, like the one I mentioned before, install_cmd, which lets you change make install to something else. So EasyBuild knows all the easyconfig parameters it can take into account. If you define anything else in an easyconfig file, and I'll show that during the demo, EasyBuild will scream at you again and say: look, you're defining a variable but that's meaningless to me, so I'm just going to ignore it, and that's probably not what you want. Yeah, we'll get back to the rest of the slides later; I'll check the example here. So there are the mandatory parameters, we've been over those, and commonly used parameters like source
files, patch files, checksums; that looks like this in an easyconfig file. Specifying the easyblock, so just giving it a name, typically of a generic easyblock. Examples are ConfigureMake for configure, make, make install; CMakeMake for the same thing but with CMake rather than configure; PythonPackage for installing a Python package, which can do python setup.py install or pip install, and you can control which one you want to use by setting the appropriate parameters; or Bundle for a bundle of things, like just slapping things together without an actual real installation. So those are pretty common generic easyblocks; they're just a couple of examples. You can also ask on the command line which easyblocks are known, and it will give you a nice overview of that. Custom easyconfig parameters we've just shown, so some of them are custom to specific easyblocks. And dependencies follow a syntax like this: dependencies is always a list of things, and each dependency uses this tuple syntax in Python, which has at least two string values, the name of the dependency and the version of the dependency. It could have additional values for the particular dependency, like this part where we say name, version, and this is the version suffix that we want to use for this particular dependency, but name and version is what you typically see. Version suffix I'll mostly skip; this is like adding a custom label. If you're doing something special to the installation, like using a different configure option, you could also give it a version suffix to discriminate between the different installations easily, like we sometimes do with Python, for example: Python 2 and Python 3 installations have a different version suffix. Oh, and here we're using what we call a template, so rather than hard-coding 2.7 in here we're basically telling EasyBuild: please replace this by the first two digits of the version of whatever Python dependency is being used. So we try very hard to not hard-code the version in multiple
places, which makes it a lot easier to change an existing easyconfig file. For customizing the configure, build, test and install commands we have these easyconfig parameters: configopts, buildopts or prebuildopts, and installopts. These tweak the corresponding commands in that particular step of the installation. With this configopts line we're basically telling EasyBuild: please do this, so add this option to the configure command. And this part is done by the easyblock itself, so --prefix will always be specified, but we're telling it to also add this additional configure option. prebuildopts is a bit different: this basically gets glued before the build command, and that's why we're using this double ampersand here at the end. So we're basically chaining multiple commands, and we're making sure that if this command fails, EasyBuild will give up at that point and not do the second part of the build command. So with this line we're essentially making EasyBuild do this: this part is specified in the easyconfig file, and the ConfigureMake easyblock in this case will do this by itself, run make after checking how many cores it can use to do the build. And very similarly, installopts leads to this. And then another important one is the sanity check. Here, being a bit familiar with Python syntax helps, I guess: this is a Python dictionary, so a mapping of a key to some value. And there are two parts in the sanity check: the paths that are expected in the installation, so files and non-empty directories, and commands that should be run and should pass correctly, so should produce a zero exit code; that's very useful as well. There are lots of other easyconfig parameters you can define, which we'll not go over; these are definitely the most common ones. So before I go to generating easyconfigs and copying easyconfigs, let me step through an example here, because I think that will help in making things a bit more clear.
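To make the parameters just described a bit more concrete, here is a small Python sketch. The parameter names (dependencies, configopts, prebuildopts, buildopts, installopts, sanity_check_paths, sanity_check_commands) are the easyconfig parameters mentioned above, but the package names, versions and option values are made up for illustration, and the compose_commands helper is a hypothetical, simplified stand-in for what a ConfigureMake-style easyblock does. It is not EasyBuild's actual code.

```python
# Easyconfig-style definitions: plain key-value assignments in Python syntax.
# All values below are hypothetical examples.
dependencies = [
    ('Python', '2.7.18'),                                  # (name, version)
    ('example-lib', '1.2.3', '-Python-%(pyshortver)s'),   # (name, version, versionsuffix)
]

configopts = '--enable-shared'
prebuildopts = 'export EXAMPLE_FLAG=1 && '   # trailing '&&' chains it before the build command
buildopts = ''
installopts = ''

sanity_check_paths = {
    'files': ['bin/example'],   # files that must exist
    'dirs': ['include'],        # directories that must exist and be non-empty
}
sanity_check_commands = ['example --version']   # must exit with code zero


def compose_commands(installdir, cores):
    """Simplified illustration of how the easyblock glues the pieces together."""
    # the easyblock adds --prefix itself; configopts is appended
    configure_cmd = './configure --prefix=%s %s' % (installdir, configopts)
    # prebuildopts goes in front of the build command, buildopts behind it
    build_cmd = '%smake -j %d %s' % (prebuildopts, cores, buildopts)
    install_cmd = 'make install %s' % installopts
    return [cmd.strip() for cmd in (configure_cmd, build_cmd, install_cmd)]


for cmd in compose_commands('/tmp/example-install', 4):
    print(cmd)
```

The point of the helper is only to show which part of each command comes from the easyconfig file and which part the easyblock adds on its own.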
What we've done is we've come up with our own small software package, which we call eb-tutorial, for which we provide a source tarball here at this location. So it's in the GitHub repository, but you can download it straight from GitHub using this URL. This is version 1.0.1, so we're going to try and convince EasyBuild to install this. Now, you can look at the sources unpacked in the repository, so this gives you some hints. It even has a nice README file that tells us to do sudo make install; that's probably not going to happen, but at least it gives us a hint in terms of how we can get this thing installed. We notice a CMakeLists file that's pretty tiny, so we'll probably have to use CMake for this, and it has a C++ input file and then a header file as well; hopefully CMake knows what to do with this. So this gives us some clues already on how we can get this thing installed. Manually it's pretty easy, but the question is how do we get this done by EasyBuild. So before we do that, let's make sure we have our environment properly set up. I probably still have the hierarchy set up here, yes, so let me start a new terminal and try to get in. This probably didn't work; oh yes, it did. Okay, that's a clean environment, that helps a bit. So I'm going to do the basic EasyBuild configuration: install stuff to $HOME/easybuild, and use a temporary directory for buildpath because we don't want to have this on the shared file system, and I'll make sure that we pick up the central software stack, the pre-installed stuff, so we don't have to start from scratch. The starting point is the mandatory easyconfig parameters: name, version, homepage and description, and then which toolchain we're going to use, so that's the bare minimum. So we create a file like this, and let me actually remove the toolchain part and show you what EasyBuild will do: it will just scream at us for not defining the toolchain part. Okay, it actually first looks at the easyblock, so that's problem one to fix. So it says: failed to
process easyconfig file, no software-specific easyblock found for eb-tutorial. That makes sense; we're not going to implement an easyblock here, we're going to ask it to please use the CMakeMake easyblock, since after looking at the README this seems to be a strong match. We just tell EasyBuild, please use this generic easyblock, and try again. And now, since I deliberately removed the toolchain definition, it says: okay, mandatory parameter not provided in, well, the pyheader thing is I guess a bit confusing, but it basically means we're not defining the toolchain, so I'll add that here as well. You see me adding things in a particular order here, so I put easyblock on top and toolchain at the bottom of what we have right now. That's just out of pure habit; there's actually no order here, so this could be all jumbled up, they're just key-value definitions and EasyBuild doesn't care too much. But usually what we do is we try to keep the same structure in easyconfig files, just to make it easier for humans: when they open an easyconfig file they always see the same thing, easyblock on top, name, version, homepage, description, toolchain, and then it more or less follows the order in which EasyBuild will use these easyconfig parameters. That just makes it a lot easier for us. So we now have these mandatory parameters and we've told it which easyblock to use, so that makes it actually go ahead and do something useful. So the first problem we hit is lots of screaming. It looks like it generated a cmake command but it failed to run it, so let's open up our log file here and see what happened. There's no clear actual error message in here; it just says I tried this, and it actually gives me a partial command, not the full thing. So we jump into the log file and we start looking for error messages, and this certainly looks like a problem: cmake: command not found. Yeah, that's not going to work; if we don't have CMake available we won't be able to get this thing installed. So the first problem to
solve is to tell EasyBuild that we want to use CMake and which version of CMake we want to use. Now luckily, in the prepared environment CMake is already installed, that's good, this particular version, so that's what we'll try to use. And the toolchain at least looks very close to the one we are using, though there's a difference between GCCcore and GCC which I won't go into here too much; it relates to the module hierarchy and where things are placed and what you can reuse across different toolchains. But let's just say that this CMake version is compatible with the GCC toolchain we're going to use here, so we should be able to tell EasyBuild to use this as a build dependency. The module is already there, that helps a lot. CMake is a build dependency, so it's added to builddependencies. That matters because build dependencies will be loaded in the environment that EasyBuild sets up when doing the installation, but it will not include the module load command in the resulting module file. So when you load eb-tutorial it will not load CMake, because you don't really need CMake to run this particular software package, you only need it to build it. So that's why we make this a build dependency. So that should be enough: specifying it as a build dependency means EasyBuild will load the CMake module when doing the build, so if this module is correct it should provide a cmake command that we can use for the installation. I'll enable trace mode so we have a better view on what's actually going on. So now we have to restart the build from scratch, because, yeah, CMake failed, so we didn't get very far. Right now it's again failing in the configure step. At least thanks to trace it's now giving us the full cmake command, but that's still failing, and failing quite quickly. So let's take a look again at the log file, because there's no trace of an actual error in here. So now it's actually CMake itself screaming at us and saying: look, you're telling me to
configure something but I couldn't find a CMakeLists.txt. That's strange, because if we look at the unpacked sources here, there is a CMakeLists.txt there, so why is it complaining? Let's take a look at the build directory to see what's in there. If we do --show-config again, where is our build directory located? We configured EasyBuild to use /tmp/$USER as build directory, and in here we still have an HDF5 and a Szip from a previous attempt, I guess. eb-tutorial version 1.0.1 will be in here somewhere, with a subdirectory for the toolchain that we're using, and in here there's an obj directory. This is created by the CMakeMake easyblock, so it doesn't do a build in the source directory but separately. But all of this is empty, so there's nothing there. So it seems like CMake was right for once about complaining about something. So why is it not there? Well, because we didn't specify any sources. Remember, sources are not mandatory, which is a bit strange but does make sense in some cases, but in this case of course we need to tell EasyBuild where the source files are. If we go back to the top, this is basically what we need: this is the location of the source tarball. I think I'm running ahead of myself a bit, but basically we want EasyBuild to use this as sources. That's good. Now there's one annoying thing here: I'm hard-coding the version even though it's specified already there. Can I avoid that? Yes, I can use a template for this, similar to the Python version template we were showing earlier. So this is basically a placeholder; EasyBuild will replace this with the value here, so we don't have this hard-coded. And actually this is a very common pattern, name-version.tar.gz, so common that we have a constant for this, which is basically exactly this name-version pattern. So what you'll often see in easyconfig files is a line like this that just says it's a standard source tarball, and EasyBuild knows what to do with this.
So that tells EasyBuild to grab the sources, unpack the sources and do the cmake command in there; that's good. This will probably fail again because EasyBuild doesn't have these sources yet. Come on vi, I hope this comes back. We jump here; this probably tells us, yeah, so the source directory, the same error as we saw in the log file, empty directories, which means we didn't specify the sources, and that's what we did right now. If I try this, unless I didn't clean up properly, it's going to scream at me again because it doesn't have the source file anywhere. And it takes a while to check, because it actually has a fallback mechanism to some kind of source mirror, but it won't be able to find this there, so it does scream at us for not finding this. We can fix this by telling EasyBuild where to download it from, so source_urls, without the source file name, and EasyBuild will use this to download the source file if it can't find it, and then happily continue. It will only download it if it can't find it on the file system already. Okay, if you try this again, it's again failing in the configure step, and again in the cmake command, so maybe you'll start understanding why I don't like CMake. No, it's actually our fault in this case. If you check out the error message, CMake is actually helpful in telling us: look, this configure option is required. And if we look at CMakeLists, this will probably be specified somewhere as required in here, so if this thing is not defined CMake gives up and says: look, you have to tell me the message to use for this particular software package. So that means we have to add a configure option for CMake; the configopts easyconfig parameter can be used for that. We need to define the eb-tutorial message, and here you have to know a little bit about CMake: you do -D and the name of the variable you want to set, equals a value. So let's do Hello ISC21; be a bit careful with quotes of course, and
then close the string. So this is the actual argument that will be passed to CMake, and we wrap this in single quotes because it's a string value, a Python string value. Hopefully now CMake is happy, but let me do this with trace mode again so we have a better view on what's going on. That looks better: configure step completed, build step completed, installation step completed, exit code zero, and now the sanity check is failing. We didn't specify anything in terms of how EasyBuild should check the installation. It has a fallback mechanism, which is the best we could come up with: that's looking for a non-empty bin directory and a non-empty lib directory or lib64, so either is fine. In this case it found a non-empty bin, but it didn't find a non-empty lib. If we check, oh, having trouble again, hopefully it will come back; so if we check in the installation we will indeed see only a bin directory, no lib directory. So the best thing we can do is come up with our own sanity check, customized to this installation. One thing we definitely want to look for is an eb-tutorial binary, so that's what we expect for this installation, and then there are no other directories we can really look for that would be useful for this particular case. And what also makes sense is trying to run the eb-tutorial command. Now, this doesn't have any options, but just running it should produce the message that we gave when doing the configuration. Let's see if I can revive this. Okay, and to make sure it's configured properly; it was failing in the sanity check, so we'll just copy these and assume it's happy. And I'll add the missing bit, which is not strictly required, but the moduleclass is usually specified as well. This is what's used for the different categories of modules you saw when I looked into the installed module files; this is what's used there to do the splitting in categories. Okay, so the sanity check is a bit more specific now, that's probably good, so we're actually checking the installation.
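The difference between the fallback sanity check and the customized one can be sketched in a few lines of Python. The sanity_check_paths and sanity_check_commands definitions follow what is described above for eb-tutorial; the moduleclass value is an assumption (the talk does not say which category was used), and paths_ok is a hypothetical helper that only mimics the path part of the check in a simplified way. It is not EasyBuild's implementation.

```python
import os
import tempfile

# Customized sanity check for eb-tutorial, as described above.
sanity_check_paths = {
    'files': ['bin/eb-tutorial'],
    'dirs': [],
}
sanity_check_commands = ['eb-tutorial']
moduleclass = 'tools'   # assumed category, not stated in the talk

# Fallback used when nothing is specified:
# a non-empty bin directory, plus a non-empty lib or lib64 directory.
default_paths = {
    'files': [],
    'dirs': ['bin', ('lib', 'lib64')],
}


def paths_ok(installdir, paths):
    """Simplified mimic of the path part of the sanity check."""
    def is_file(rel):
        return os.path.isfile(os.path.join(installdir, rel))

    def is_nonempty_dir(rel):
        full = os.path.join(installdir, rel)
        return os.path.isdir(full) and len(os.listdir(full)) > 0

    def any_of(check, entry):
        # a tuple means "any one of these is fine"
        options = entry if isinstance(entry, tuple) else (entry,)
        return any(check(opt) for opt in options)

    return (all(any_of(is_file, f) for f in paths['files'])
            and all(any_of(is_nonempty_dir, d) for d in paths['dirs']))


# Fake installation that, like eb-tutorial, only has a bin directory.
installdir = tempfile.mkdtemp()
os.makedirs(os.path.join(installdir, 'bin'))
open(os.path.join(installdir, 'bin', 'eb-tutorial'), 'w').close()

print(paths_ok(installdir, default_paths))        # False: no lib or lib64
print(paths_ok(installdir, sanity_check_paths))   # True
```

This mirrors what happened in the demo: the installation itself was fine, but the generic fallback did not fit a package that only ships a binary.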
And now we can either redo the installation from scratch or, because we actually know the installation completed correctly but EasyBuild disagreed because of the sanity check, all we need to do is --module-only: so just run the sanity check, and if that passes, generate the module, and we're done. So unless we messed up something, that should be good enough. Okay, I didn't use trace mode, but if you do use trace mode you'll see this: EasyBuild will check for the bin/eb-tutorial file, that should be okay, and it will try to run this command, and that should pass correctly as well. So it seems like that worked, because the sanity check didn't complain. If we do our module use correctly, not this one, the EasyBuild modules, so the location where EasyBuild is installing the modules, we should find eb-tutorial. That means we can load eb-tutorial and we can run the command ourselves, and that's working as expected. So this gives you the complete easyconfig file, which is basically what we came up with here as well. There's one small additional thing here: this has a checksum, which we don't have here, so it's missing in this part. We could add it manually by downloading the sources and running the sha256sum command to figure out the checksum, and then add it here in the checksums list. Or EasyBuild actually has a way of doing that as well: as long as you're sure that the tarball that you have is fine, EasyBuild can calculate the checksum for you and add it automatically to the easyconfig file. So this did an injection of the checksum in the easyconfig file, which should match exactly what you have here. So that's creating an easyconfig file from scratch and doing a couple of iterations to work through it and make sure that EasyBuild does the proper thing. There are additional exercises, which I won't do in the interest of time, to use this one as a starting point and tweak it a bit in several ways, so both a different configuration and a different
version, and then using this one as a dependency for the Python bindings for eb-tutorial. So that's also definitely a very nice example to go through. So try to create an easyconfig file from scratch for yourself, and if you're stuck, the solution is available as well. To wrap up this part: manually creating or tweaking an easyconfig file is certainly possible. Sometimes it's a bit annoying, especially if it's a very simple change, so just bumping the version for example. But EasyBuild has support for options like --try-software-version, which basically does a search and replace of the existing version and replaces it with the new version. Similar for the toolchain: whatever toolchain is used in here, EasyBuild will replace it with what you give it on the command line. And also for other easyconfig parameters there's --try-amend, but there you're already getting into a territory where things may not work as expected. So at least the try options could be of use when making trivial changes. And then to copy easyconfig files, you could search them, grab the full location and just use the cp command, or, what's easier, so let me... yeah, this one for example: if you have an existing easyconfig file and just want to copy it to my-SciPy-bundle.eb, you can use the --copy-ec option, and EasyBuild will go and look for this easyconfig file, locate it, and copy it to the location that you give it, and you can use this as a starting point for tweaking it. So that's a handy little option as well. And that's the hands-on I did, and there's more exercises that you can try yourself. Okay, that brings us to the part where we look at how Jülich and Compute Canada have been using EasyBuild. So we'll start with Alan, and I'll drive the slides, so Alan will just explain things.
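[Editor's sketch] Looking back at the checksum step from the hands-on part: the manual route (download the sources, compute the SHA256 checksum, add it to the checksums list) can be sketched in plain Python. This only illustrates the idea and is not how EasyBuild implements it; the function name sha256_checksum and the tarball path passed on the command line are hypothetical.

```python
import hashlib
import sys


def sha256_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA256 checksum of the file at 'path' as a hex string."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        # Read in chunks so large source tarballs do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == '__main__' and len(sys.argv) == 2:
    # Hypothetical usage: pass the path to a downloaded source tarball,
    # then paste the printed line into the easyconfig file.
    print("checksums = ['%s']" % sha256_checksum(sys.argv[1]))
```

As noted in the talk, a checksum computed this way only confirms that later downloads match the file you have now, so it is only worth recording once you trust that file.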
So I'm Alan from Jülich Supercomputing Centre, and yeah, next slide already, we're a bit behind time so I'll be brief. So Jülich Supercomputing Centre is pretty old and pretty big: it's been around since 1987, there's about 200 people in there, and we currently have three systems. JUWELS is currently, at least this week anyway, the biggest system in Europe, at number seven, and it uses a modular supercomputing approach. That means there are different partitions of the machine with different hardware capabilities. And JURECA, JURECA-DC in particular, is the newest generation, so that was the system that kind of introduced this modular supercomputing technology. That has a CPU partition that's about 3.5 petaflops, the GPU partition about 15, and another five from a KNL partition. And we have a third system then, JUSUF, which is AMD-based again; JURECA-DC is AMD-based as well, the CPU part at least. And it has some V100 GPUs as well, whereas JUWELS has the A100 GPUs, and we use that for interactive workflows and for community services and infrastructure as well.
Okay, and so, EasyBuild at JSC in particular. We've been using it in our production stack: I introduced it there in 2014 and we've been using it since then. It started on a single system, on JURECA, and now it's being used on all of the systems in Jülich. And in general, the way we approach it, or at least how people interact with it, is in the end geared towards the average user, right, so the people we get when they get allocated a project on one of our systems. What that means is, because we have such a large software stack and such a lot of tools that are dependencies, that are usually not required or explicitly needed by people, we hide a lot of indirect software. So we use the hidden dependency feature of EasyBuild a lot. And we have lots of toolchains, because we have different hardware on these machines, so the ideal compiler for these systems might change. So we have lots of different compilers and lots of different MPIs as well that we make available to people. So from very early on our choice was to go with the module hierarchy, and we also rename some of the modules that EasyBuild provides. So for example, in EasyBuild Intel MPI is called impi; we didn't like that name, we use IntelMPI instead. And so we make a couple of these kinds of changes, and a few tweaks as regards Lmod as well, so we implement this feature of compiler and MPI families. And we also have a custom module naming scheme, so, I mean, when I talk about injecting the families, that's the family feature of Lmod. And we have custom toolchains, we have custom easyconfigs and our own easyblocks as well. So things are customized in all aspects of EasyBuild, and especially after this amount of time we find that this is a maintenance and a contribution issue. Having our own stacks, our own toolchains, is problematic for us, especially in terms of contributing back, so we try to
control that now, and we're working hard to minimize this. As regards upgrading and retiring our software, our idea is that we have something called a stage concept. So we always want the projects who come online with us to use the latest available software, the latest available compilers, the latest available MPI, because a lot of our machines are the latest, right, so we want people to be able to use all the features that are available on them. And so we have this stages concept. It used to be that we updated twice a year, but now we have so many systems that this is actually quite cumbersome, so we reduced that: we make a large update of the stage once a year. And we always encourage our users to adopt the latest software and dependencies, and this is for many good reasons, including mainly performance and bug fixes, things like that. It doesn't mean that when we retire software it disappears altogether: it's still available indirectly. So from the first stage on, we always made people aware that they can access the old infrastructure, the old software, and this is important for a lot of users, right: they actually don't want to change their stack over the lifetime of the project, for various reasons. And so this is available to them, but what changes is the default view. The other thing that we do, that we're starting to use a lot now, is hooks, the hooks feature of EasyBuild, both for our users and for our maintainers, because now we allow people to install software on top of our stack. We find it to be a pretty powerful alternative to customizations of the easyblocks and the easyconfigs, especially because it's much more automated and in one location. This is very flexible and it'll make things easier to maintain for us. And this in particular applies to easyconfigs, so all these little tweaks, minor tweaks that we make so that things are installed with these different names, or the Lmod families: this is something
that can be carried out within a hook, which wasn't available when we started things. And the other thing we use hooks for is to actually guide user installations, right, so it enables them, but it also guides them and tells people how to do things properly. So for example, one of the features that we have in our hooks is that we don't allow people to install compilers or MPI runtimes, because installing a compiler, particularly something like GCC, usually involves installing an entire stack of software underneath as well, and you usually don't need to do this, right. So if someone tries, they get a little message that they should talk to us first. And also the same with MPI: all of our MPIs are heavily tuned for the system, so we wouldn't expect that people should install their own MPIs, they should be using the one that we provide. And the other thing that we do is, because we allow user installations, we do that in a hierarchy. So people can install on top of the system installations at the group level, so they can share installations with other people, or they can install for themselves, right, at the user level, or they can do it in a hierarchy of the two: so they can go from system, build on top of that with group, and then have their own installations on top of that group in the hierarchy as well. I think that might be it for me. Yeah, okay, thanks a lot Alan, it's over to Bart.
And wait for my camera to show up, I guess. Am I visible? Okay, yes. Welcome everybody, my name is Bart Oldeman and I'm talking about how we're using EasyBuild at Compute Canada. So we'll go to the next slide. Yes, so Compute Canada is a bit of a consortium, which is then divided into several regional organizations: so in the West there's WestGrid, then Compute Ontario, Calcul Québec, ACENET. I myself am employed at McGill University, which is part of Calcul Québec. And so we're collaborating a lot within this consortium of institutions, and within Canada there's about 200 technical staff who manage these
supercomputers and help the users using them. We have 15,000 user accounts, so this is Canadian researchers and collaborators who have access to our supercomputers. This is all free: anybody who works at a Canadian research institution can get an account, and then there is a contest every year where they can get more compute time than the default allocation. Next slide please. So we're basically working as a collaboration of large and medium systems, mostly. We have one cloud system in the West, Arbutus, on the left hand side, that's in Victoria in British Columbia. Then in Vancouver we have the Cedar system, then we have Graham in Waterloo in Ontario, and Niagara in Toronto, and Béluga in Montreal. Next one. So the goal was that when we introduced the new national supercomputers, we wanted to have a consistent user interface, so that basically people could log into Graham and get their jobs running there, and then if they would switch to another supercomputer like Cedar or Béluga, they would get pretty much the same interface. So they have the same modules, the commands work the same, and so when one cluster is down for maintenance they can just work on another cluster and things are pretty much the same, so you don't have a new learning experience with different module names, et cetera. So this consistency is important, and for that we have a mechanism where we actually distribute our binaries across Canada using the CERN Virtual Machine File System, CVMFS. And we also eliminate the OS factor, so we have what we call a compatibility layer that's in between the OS and the modules that we install via EasyBuild. And of course we also need a module interface, for which we use Lmod. Next slide please. So CVMFS is basically a distributed file system which works via multiple caching layers. So we have a Stratum 0, where we push the modules, the software, et cetera, which is then cached by Stratum 1s, and then there are supercomputers
which can also cache data locally. So once the software is loaded on the supercomputer and somebody else loads the same software again, it's cached locally, and so they don't have any network latency issues. Next slide please. So our design is basically built on a layered approach. This has caught some interest, particularly in Europe, because they started the EESSI project, where they've worked on an approach to do that Europe-wide. There are some technical differences between them, but this has been used as a template for the EESSI project. And so there's a multi-layer approach, where at the bottom we have things that we don't install via EasyBuild or anything: it's always local, of course, with the Linux kernel; some legally restricted software too, such as VASP. Then we have gray areas of stuff that we could install in CVMFS: sometimes we do, sometimes we don't. But the important part here is that we have a compatibility layer. So what we have is Gentoo Prefix at the moment, which provides basically one module for an enormous amount of tools: GNU libc, autotools, make, bash, coreutils, et cetera. That is all put in a single directory, the gentoo/2020 directory, which is loaded via a module, and by providing that, users can use a consistent set of tools which is then not provided by the OS. So they're not coming from RPMs: if you type ls, you get an ls from our compatibility layer, and you don't get it from, say, the Red Hat coreutils installation. So from that we have a consistent bottom layer that is the same across every cluster, whether they use CentOS 7 or CentOS 8 or anything else. And then at the top level we use EasyBuild to install modules. There's a lot of similarities with Jülich in the way that we're working with it, I'll get to that in a minute as well. But we also have something a little bit like stages: now with the Gentoo layer we moved to the 2020 directory, and we're also in between some updates. So they
have stages every year; we have more or less a major update every two years, so we're going even a little bit slower than that. Next slide please. So the number of modules has exploded over previous years; we keep installing things. We have different modules also for multiple architectures: SSE3 as the bottom layer for some very old clusters, then we have AVX and AVX2 for the major stack, and AVX-512 for Skylake and newer clusters. And overall we have more than six thousand modules installed in our stack, and you'll see by type that the most diverse group is bioinformatics: they just have a lot of different software packages that they're using, and there's a bit of everything else as well. Next slide please. So we are modifying EasyBuild much in the same way that Jülich has also been modifying some EasyBuild things, in the framework and easyblocks, et cetera. We also try to minimize these differences, but there's some major differences with how they're used upstream. So one thing is that the compatibility layer already provides new enough versions of many software packages, so we don't really have to provide easyconfigs for all kinds of boring dependencies. So in Jülich they are hidden, but we just don't provide them: we just let them be provided by the Gentoo layer. So these are tools like M4, most CMake installations, and there's a lot of libraries that have modules attached to them upstream, and the users don't really load them directly, they just load them as dependencies, because, well, they're kind of boring: they're not GROMACS or NAMD or some other packages that they're interested in. That's things like libncurses and other boring dependencies. We very much use MKL as a central math library, and that means that we're not using the foss or intel toolchains. So having MKL and Open MPI essentially means that we're pivoting to other toolchains: gomkl
and iomkl, that's just a combination of Open MPI and Intel MKL, and then we have variations of compilers and CUDA versions. So with that we can often use recipes directly from EasyBuild upstream, but then use options like --try-toolchain and --try-software-version, stuff like that, to install a piece of software without having to modify the easyconfig file. We also have a custom module naming scheme, actually based on the one from Jülich, but we do things like having lowercase module names, we have no version suffix, and we hide the toolchains. And we also use something called RPATH, so we eliminate the use of LD_LIBRARY_PATH so that we don't have surprises in libraries. We do that using a wrapper for the linker: so we have a script that wraps around the ld utility, and we make sure that the RPATH is injected properly even when users compile their own software. Next slide. We also use hooks, much like in Jülich. So this way we can use the original easyconfigs, and for instance put in custom configuration options for Open MPI: so the configopts in an easyconfig can be changed dynamically using a hook. We can put footer code, to put special Lua code in the modules, to support installations in users' home directories. We are even injecting via hooks: we put post-install commands to split the Intel compiler installation, because part of it can be redistributed and part cannot be redistributed, so we split it. And we do it also by stripping down extensions. So hooks can be really powerful if you just want to have a central place to modify things on the fly. Next slide please. We also handle Python a little bit differently. So we usually install things on the side of the major package: so we install PyQt5 with Qt5, the OpenCV Python bindings with OpenCV, when we can. And we try to make the module compatible with all versions of Python that we install, so we don't just have one Python 2 and one Python 3
version, but we have Python 2.7, 3.6, 3.7 and 3.8 all installed in parallel. And one of the biggest differences is that we don't really install many Python packages as modules, just the basics, but we mostly tell users to create virtual environments in their home directory, and then they can install stuff using pip from Python wheels, so that they can be very flexible in which Python packages they want to use. And there are always people who want to do their own thing, and because they use Anaconda on their own computer they want to use it on the supercomputer, and that often runs into issues, because Anaconda tries to install everything and just downloads binaries for Open MPI, for instance. So we really discourage the use of Anaconda. Next slide please. One thing you can do too is that you can actually mount our software stack on your own computer. It's particularly easy if you're using a Linux laptop or desktop; it's a little bit harder if you have a Mac or Windows computer, but if you use WSL2 you can even mount it on Windows; with a Mac you have to use a virtual machine. And so have a look at that if you're interested: then you can just mount it and you can load all the software that we use, and pretty much pretend that on your local computer you're running the same software as on a supercomputer. So for researchers that's quite helpful, because they can do a test run on their own computer without even having to log into the supercomputer, having to deal with login node limits, or having to start interactive jobs or stuff like that: they have the exact same software stack on their own computer. So this was a short world tour of the way that we're using EasyBuild within Compute Canada. Thank you very much.
Thank you very much, Bart. Okay, we're over time a little bit, but I think we'll still manage to wrap up in time, so let me continue. If there's any questions, we can try to take those at the end, or feel free to post them in the tutorial channel in Slack. So, a
little bit about the EasyBuild community. As I already mentioned, it has grown out to be a worldwide community. We're very happy with that; we never expected that, that was not our intention originally, it sort of happened by itself. The picture you see is from the last physical EasyBuild User Meeting we had, in Barcelona in early 2020. This year we also had the user meeting, but it was fully virtual, for reasons you well know. So our Slack is very active: it has close to 500 members, with about 100 active people who are there throughout the week, so there's usually always somebody there who can answer your question, or at least put you on the right path by pointing to documentation or giving some hints. It's used very actively in Europe, but also in the US, in Asia, in Australia, so we're very happy with that. And we have regular conf calls: every other week we have a one hour conf call where we look at recent developments, things we should be doing or are doing. And the user meetings have been taking place yearly for the last four or five years, and hopefully we can keep that going. If you're interested in contributing to EasyBuild, that's definitely welcome as well. There's different ways: you can just provide feedback or report bugs or join discussions, or you can get your hands dirty and try to contribute back easyconfig files, or maybe even improvements to easyblocks or the EasyBuild framework itself. If you know a bit of Python coding, that should certainly be possible as well. Extending and enhancing documentation is of course very useful too, even though I think we're doing quite well there. Now, one thing I do want to highlight is the integration we have with GitHub, which makes it very easy and very smooth to contribute back to EasyBuild. You don't even have to know Git at all yourself or have experience with GitHub, because we automate most of that for you, straight from the EasyBuild command line. So you can open pull requests, update pull requests, and EasyBuild maintainers can review pull requests
straight from the eb command line, without ever visiting GitHub. And that not only facilitates the life of the maintainers but also of contributors: they can make contributions easily, and things are, let's say, preprocessed in the way that we will want them anyway, and it does that for you automatically. So in the last couple of years we've been getting over 2000 pull requests for easyconfig files per year, so not including framework or easyblocks. So that's quite a lot, and we basically had to add some automation in there to make this feasible. This is all well documented; there's a part in the previous tutorial as well that explains it in detail, or I think it's even included in here. If we skip the Jülich part and the Compute Canada part... this was covered in detail in the slides... EasyBuild community, briefly discussed as well... yeah, I think this gives a demo of how to contribute something back to EasyBuild straight from the EasyBuild command line. There is a bit of setup that you have to do for this. First of all it explains how to do it manually, but I would not recommend doing it manually, because we have a way better way of doing that. There's a little bit of configuration you have to do, and you have to install a couple of extra Python packages in the Python version that you're using to run EasyBuild. So there's a few hoops to jump through before you can start using this, but once you've done that it becomes very, very easy. And if all goes well, I actually have that set up in here, so let me see. If you have things set up correctly, you can ask EasyBuild to do a check of the GitHub features, and then EasyBuild will go ahead and make sure that everything is in place. Which is not the case here, why not? Okay, let me make sure I don't have any modules loaded. Okay, so, the GitHub user: I'm using a separate test GitHub account for this, but all the requirements and the GitHub token, so like a password to talk to GitHub automatically, should be in place already. That
looks okay: it has GitPython, which it needs to do Git commands automatically, so that looks all right. It's now testing whether this all works, so it's actually talking to GitHub and trying to do some actions, and it should come back that this is all working okay. And once that works, we can actually open the pull request. Let me open GitHub, ebtutorial. So what I want to do is send a pull request to this easyconfigs repository, which is a fork of the central one, but I configured EasyBuild here to target this separate repository instead. And what I can now do is open a pull request, for example for the eb-tutorial easyconfig file we created earlier. All I need to do, once the configuration is done, is use eb --new-pr, give it the easyconfig file, and it will go ahead and open the pull request for me. Now, what it will do is: it will parse the easyconfig file, look what's in there, try to figure out the right location it should go into in the repository, it will create a Git branch for me, it will push that branch to GitHub, and it will do the equivalent of clicking around in the GitHub interface to open the pull request. So all of that is done fully automatically, and I don't have to worry about getting things right, making sure that the pull request title is okay, the branch is okay, the file location is okay: that should all be done automatically. So if I just refresh the page here, I can see it has opened this pull request for me with a nice title. If I look at the changed files, these are the contents of the easyconfig file. It was coming up with a note that it renamed and relocated the easyconfig file for me, because the repository has a certain structure, so it knows about that. It targeted the develop branch, which is what we like to do for contributions. And we can even take this a step further: let's say now I have my maintainer hat on, I see this contribution coming in, I want to test this contribution. I'll have to do a rebuild to reinstall it, but I can tell
EasyBuild: please pull the easyconfig file from pull request number five and try to install it. So I will do this of course in a test environment, not in the production environment, but this pulls in the easyconfig file from GitHub, from this pull request, and it will try to do the installation. That seems to work, and then to show that that works I can actually upload the test report back into GitHub. So I'll do the rebuild again, but now tell EasyBuild to do an upload of a test report, and you'll see a comment popping up here to confirm that the installation worked on this particular system. So that goes well, I won't even need to refresh. There it goes: this is a test report by ebtutorial, so this GitHub account, on this learnhpc node, using a Broadwell and a Python 3.3.8. And if I click the gist file, this will give me more details: how long it took, what the EasyBuild configuration was, the system details, how many cores. All of that information is available here. Now, this was a successful test report; if it's a failing one, I'll actually get a partial log file as well, so the maintainer can help the contributor to figure out what went wrong and try to fix the problems together. So this is a very, very powerful feature, and this is also why we get 2000 pull requests a year in terms of contributions. So again, this is well documented: how to set this up, how to start playing with this. You can use this to do contributions into the central repositories, or if you have your own Git repository for managing easyconfig files, you can also use it to add stuff in there. So all of this mess of Git commands and moving files around to the right place and making sure the names are okay is all fully automated. Okay, so let's wrap up. A couple of things we didn't cover, and actually there's things I didn't mention here as well. We do have documentation on each of those things. So implementing easyblocks from scratch is covered well in the documentation, and we have a tutorial section on that as well. Same
thing for hooks: good documentation, covered in the tutorial as well; somebody in the Slack channel was asking about that. We can tell EasyBuild to use RPATH linking. You can use EasyBuild as a Python library. You can ask EasyBuild to submit installations as jobs to a cluster: so if you have Slurm and you need to install 10 things, or EasyBuild notices that there are missing dependencies, it will submit 10 jobs to the cluster and make sure these things are installed in order, so you're not stuck on a single node if you have an empty cluster that you have access to. Also on Cray systems EasyBuild works quite well: CSCS in Switzerland has been using EasyBuild for years on their flagship system Piz Daint to manage their software stack, which is a Cray system, so that's a bit different from a regular Linux cluster. And there's also some support for building container images with EasyBuild, but that's experimental and I must say it's not really actively developed currently; that may change in the future. Now, one burning question you may have, especially if you've also attended the Spack tutorial yesterday, is how these tools are different. So from a bird's eye view they look very similar: they're both implemented in Python, and their main focus is on facilitating software installations for HPC. But how they work is quite different, and also some details, like the licensing, are different. Maybe, and I know the Spack developers disagree with me on this a bit, but in my view the focus of both tools is on a different use case and, let's say, different types of people. While EasyBuild is more focused on HPC support teams, and maybe HPC admins, who manage a central software stack for their users, I would say Spack is more focused on software developers who want to juggle lots of different dependency versions and compilers and MPI libraries and so on. That doesn't mean you cannot use Spack to do central software installations, you can. Same
thing: developers can probably also use EasyBuild and make their work a bit easier there, or end users can use both EasyBuild and Spack to manage their own software stack. That all works, but the focus of the tools is a bit different. The good things are: they have support for lots of software, and they're very actively developed and maintained by a supportive community. So if you're interested in Spack, definitely take a look at their Slack or their tutorials, they do a very good job there as well. And the developers of these tools are also talking to each other, so we're not enemies, and there's Spack maintainers even in this tutorial today as well, so we're both very open to talking to each other and even helping each other out. And then one more thing to wrap up: there's a new project, what we call the EESSI project, the European Environment for Scientific Software Installations. We pronounce this "easy", and that's not by accident, so EasyBuild is a big part of this. It is basically trying to replicate what Compute Canada has done, building a software stack that can be used anywhere, not only on HPC systems but also on workstations of researchers and in the clouds, Azure, AWS, all of that. We're trying to build this on the European level as well, and make it a bit more open, first to contributions from the outside, which I think is a bit difficult in the way that Compute Canada has done things. And we're also opening it up to, let's say, broader architectures, so we're not only looking at x86 but also looking at Arm and POWER, and if RISC-V ever comes of age enough that it's relevant for HPC, we will look at that as well. So this is really a fairly big project; it's fairly new, it's about a year old right now, where we're looking to build a common software stack that works anywhere. And yeah, this is a high level overview, and you'll recognize the very similar layers to what Compute Canada has. So we use CVMFS for the distribution of the software, and we have a compatibility layer in the
middle to make sure we're compatible with different types of Linux operating systems, where we use Gentoo, again very much like Compute Canada does, and then a software layer where we use EasyBuild to install software. So yeah, from a high level overview it looks very different, but if this sounds interesting to you, definitely take a look at our documentation and our website. We have some recent videos, recorded talks, on there that explain things in detail, and go into why this is interesting. And again, this is sort of a spin-off from the EasyBuild community: a lot of people active in EasyBuild are also active there. So yeah, maybe this is even the way of the future, we'll see how that works out in the next couple of months and years. So that wraps it up, more or less on time, so that's very good. There's a couple of links here: website, documentation, this tutorial, and previous versions of the tutorial that have a couple of additional topics, like hooks, that are covered there. We have a yearly EasyBuild User Meeting where everybody's welcome to join: both experienced EasyBuild users and people new to it are definitely welcome. And if you need help, the mailing list, Slack and the conference calls are definitely the best places to go to. So with that, we can wrap up here, or take any questions if there are any questions in the Zoom session. And if not, yeah, feel free to pass by anytime in the EasyBuild Slack, and if you have any questions later, somebody will definitely help out and try to get them answered. If there's no more questions, thanks a lot for joining. One thing, very important: ISC has asked us to try and make sure you fill out the survey. So if you go to the ISC website, in the schedule, if you find the EasyBuild tutorial page, you'll see a "rate this tutorial" button. So please click this and tell us how you liked it, or didn't like it perhaps, or what we could do better the next time. So please make sure you do the evaluation, that's very important
for ISC, and for us as well. And with that, yeah, I just want to thank everybody who joined us today, and I hope it was useful.