So let's get started with this. This is a high-level introduction to EasyBuild. There's no hands-on here yet, so don't worry too much about the AWS environment for now. Maybe I should make this a little bit bigger. So what is EasyBuild? Hopefully all of you already know that EasyBuild is a software framework for building and installing scientific software in HPC environments. That's really what it was tailored for, and it does a whole bunch of things for you. It automates the installation of scientific software on HPC systems, and it does this in a consistent way and with a lot of attention to the performance of the software that you get. You can basically see EasyBuild as a uniform interface for installing software. So no matter what software you are installing, you ask EasyBuild to install it and you always follow the same workflow. Whether you're installing TensorFlow or GROMACS or whatever other software package, EasyBuild hopefully knows how to install that software for you. EasyBuild is typically used by HPC support teams who need to provide a central software stack for their users. But it can also be used by scientific researchers who want to manage the software themselves, or maybe manage a small software stack where they have some customizations, patched code, or dependencies or tools that they want to build in a particular way. And they can do that on top of what is provided centrally, so they don't have to reinstall everything from scratch. They can leverage what is installed centrally and only rebuild the things that they really care about or that they want to see installed in a different way. EasyBuild can also be used for building optimized container images, which is not something we'll cover in this tutorial, but this is well covered in the EasyBuild documentation. And over the years, EasyBuild has grown to be a platform for collaboration among HPC sites around the world, and we will get back to that later in the tutorial.
So what can EasyBuild do for you? It can fully autonomously install scientific software. So no matter how nasty the installation procedures are, even if they involve patching of code or interactive installers, EasyBuild can do that for you automatically. Next to installing the software itself, it will also generate environment module files for you, which is a very common interface to scientific software on HPC systems. You don't need any admin privileges for this. EasyBuild actually strongly prefers not to be run as root. All you need is a user account that has write access to the location where you want to install the software, and that's sufficient to run EasyBuild. So you don't need any special privileges at all. EasyBuild is highly configurable and we will configure EasyBuild hands-on during that part of the tutorial. You can use configuration files, environment variables and command line options, and we will show how that all works. EasyBuild can be extended dynamically through a plugin system and there's also support for using hooks to do site-specific customizations. As we will see in this tutorial, the installation procedure that EasyBuild executes for a particular software package is logged very thoroughly. You can also inspect the installation procedure without actually performing it through a dry run mechanism, and during the installation you can enable trace mode to see what EasyBuild is doing and which commands it is running. It has support for using a custom module naming scheme, and we will show hands-on how that works for one particular type, a hierarchical module naming scheme, and we will explain the differences between a regular and a hierarchical module naming scheme. It also integrates with a range of other tools, including Slurm and GC3Pie for submitting installations to a resource manager.
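As a rough illustration of the three configuration levels mentioned above, here is a minimal Python sketch. It assumes the usual precedence in EasyBuild (command line options override environment variables, which override configuration files) and the `EASYBUILD_` prefix for environment variables. The `resolve_setting` helper and all paths are hypothetical, and this is not how EasyBuild itself implements it.

```python
def resolve_setting(name, config_file, environ, cmdline):
    """Return the effective value of one setting, or None if it is not set anywhere.

    Hypothetical helper: command line beats environment, environment beats config file.
    """
    # a setting like 'buildpath' maps to the environment variable EASYBUILD_BUILDPATH
    env_key = "EASYBUILD_" + name.upper().replace("-", "_")
    if name in cmdline:
        return cmdline[name]
    if env_key in environ:
        return environ[env_key]
    return config_file.get(name)


# made-up example values for the three levels
config_file = {"prefix": "/apps/easybuild", "buildpath": "/tmp/build"}
environ = {"EASYBUILD_PREFIX": "/home/example/easybuild"}
cmdline = {"buildpath": "/dev/shm/build"}

effective = {
    name: resolve_setting(name, config_file, environ, cmdline)
    for name in ("prefix", "buildpath", "installpath")
}
print(effective)
```

In this sketch `prefix` comes from the environment, `buildpath` from the command line, and `installpath` stays unset because none of the three levels defines it.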
EasyBuild can also work with container tools like Singularity and Docker to install software in container images, and it works with the FPM packaging tool and so on. These topics are out of scope for this tutorial, but they are covered in the EasyBuild documentation if this is something you want to get more information on. The EasyBuild project is actively developed by a worldwide community, and we will show who is using EasyBuild and how contributing works hands-on. We release a new stable version of EasyBuild about every six to eight weeks, which we have been doing since 2012. The EasyBuild releases we make available are comprehensively tested. Not only do we keep a close eye on not making any breaking changes during the development of EasyBuild, but before making a new EasyBuild release we also do a thorough regression test, and we try to make sure that nothing that was working before has been broken in the new release. So it's a very stable, very reliable tool and it's generally very safe to update EasyBuild to the latest version. EasyBuild is not another build tool like CMake or Make, so we're not trying to replace any of the established tools here, we're basically just wrapping around them. So EasyBuild will run CMake for you rather than you having to do it by hand. It also doesn't replace the traditional package managers like YUM, DNF or APT. So some of the software on your system you will still need to install through a traditional Linux package manager. This includes glibc or OpenSSL, and InfiniBand and GPU drivers and so on. So EasyBuild is very focused on the scientific software that sits on top of these core libraries. Also, EasyBuild is not a magic solution. We're still working on that, so that part is not finished yet. But it will be very helpful for you, and we have a dedicated troubleshooting part in this tutorial to show you how to deal with the problems you may run into.
EasyBuild is implemented in Python and it supports both Python 2 and Python 3. For Python 2 it currently goes back as far as 2.6, so 2.6 or 2.7 should be fine, and any Python 3 version from 3.5 onwards is fine as well. It's released via the Python Package Index under an open source license, GPL version 2. The development is done on GitHub, and we will get back to that later in the tutorial as well. So let's look at some of the terminology that we use in EasyBuild. You will hear some of these terms come back throughout the tutorial, so it's important to spend a little bit of time explaining them. The first one is the EasyBuild framework. That's really the core of EasyBuild. It's a bunch of Python packages that work together and provide functionality that you may need when installing scientific software from source. This includes functions for unpacking source files, applying patch files, running shell commands and checking their exit code or their output, and generating environment module files. All of this is supported in the EasyBuild framework. The framework is very general. It doesn't know about any particular software installation procedure, and all of the functionality here can be used in a variety of contexts. The actual implementation of a specific installation procedure is done through separate Python modules, which we call easyblocks. So an easyblock is a Python module that implements one specific installation procedure, and in some sense you can see it as a plugin to the EasyBuild framework. Easyblocks can be either generic or software-specific. A generic easyblock implements an installation procedure that can be used for multiple different software packages. So anything that follows some kind of standard installation procedure would typically use a generic easyblock. One very common example is the ConfigureMake easyblock, which implements the well-known configure, make, make install installation procedure.
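To make the configure, make, make install procedure a bit more concrete, here is a minimal Python sketch of the three shell commands such a generic procedure boils down to. The function name, its parameters and the paths are assumptions made for illustration; this is not the actual ConfigureMake easyblock code.

```python
import shlex


def configure_make_install_commands(install_dir, configure_opts="", parallel=1):
    """Build the three shell commands of the classic procedure, in order.

    Hypothetical helper for illustration only.
    """
    # step 1: configure, pointing the installation at the target directory
    configure = "./configure --prefix=" + shlex.quote(install_dir)
    if configure_opts:
        configure += " " + configure_opts
    # step 2: build in parallel; step 3: copy the results to the install directory
    return [configure, "make -j %d" % parallel, "make install"]


commands = configure_make_install_commands(
    "/apps/example/1.0", "--enable-shared", parallel=4
)
for cmd in commands:
    print(cmd)
```

The point of a generic procedure is that only the inputs change from one software package to the next (install location, configure options, parallelism), while the sequence of steps stays the same.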
We will see a couple of generic easyblocks later in this tutorial. Another one is the PythonPackage easyblock, for example, which as the name suggests can be used to install a Python package. Next to generic easyblocks, we also have software-specific easyblocks. These implement an installation procedure that is very specific to a particular software package. So we have a separate software-specific easyblock for installing GCC, for example, or for installing OpenFOAM, for installing TensorFlow from source and so on. These installation procedures are too involved to be handled through a generic easyblock, so we have separate ones for these. An easyblock can be controlled through easyconfig parameters, and these are specified in what we call easyconfig files. So that's the next term. Easyconfig files are basically simple text files which are written in Python syntax and which tell EasyBuild what should be installed. Each definition in an easyconfig file is what we call an easyconfig parameter, and it specifies one particular aspect of an installation. We have some mandatory easyconfig parameters, which we will show during the part of the tutorial on writing easyconfig files. The software name and version, for example, always have to be specified. The homepage and description always have to be specified, so this is some metadata for the installation. And we also always need to know the compiler toolchain that should be used for installing the software. There's a whole bunch of other optional easyconfig parameters. Some of these are mentioned here, and we will get back to this when we get to the hands-on part on writing easyconfig files. Then we also have extensions. Extensions are also software packages that can be installed, but they are installed on top of something else. Some well-known examples of this are Python packages, for example.
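As an illustration of the easyconfig parameters described above, here is a small sketch of what an easyconfig file could look like. The software `example`, its version and its URLs are made up; the parameter names (`easyblock`, `name`, `version`, `homepage`, `description`, `toolchain`, `source_urls`, `sources`) are the ones EasyBuild uses, and the toolchain version is just an assumed example.

```python
# example-1.0-foss-2020a.eb (hypothetical software, for illustration only)

# which installation procedure to use: the generic configure/make/make install one
easyblock = 'ConfigureMake'

# mandatory parameters: software name and version
name = 'example'
version = '1.0'

# mandatory parameters: metadata for the installation
homepage = 'https://example.org'
description = "Hypothetical package used to illustrate easyconfig parameters."

# mandatory parameter: compiler toolchain to install the software with
toolchain = {'name': 'foss', 'version': '2020a'}

# optional parameters: where to download the sources, and what the source file is called
source_urls = ['https://example.org/downloads']
sources = ['%(name)s-%(version)s.tar.gz']
```

Since the file is plain Python syntax, each line is just an assignment; EasyBuild reads these definitions and hands them to the easyblock that performs the installation.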
So a Python package needs an underlying Python installation to work. R libraries or Perl modules are very similar. So we came up with the general term extensions for these types of software packages. EasyBuild supports installing extensions in two different ways. You can either install them standalone, so as a separate installation on top of something else, or you can install them as a bundle, so multiple extensions installed together in a single location. They can also be installed into a particular Python installation, for example. So you can have a Python installation with additional extensions that are included in that installation itself. Rather than having separate installations on top of each other, they can be together in a single installation directory, and we refer to this as a batteries-included type of installation. Then we also have dependencies, which should be a familiar term. A dependency is a software package that is either strictly required by other software, or that can be used to enhance other software. So it can be a required dependency or an optional dependency. There are three main types of dependencies for computer software. There are build dependencies. These only need to be available when building or installing the software from source, and once the software is installed, this particular dependency is no longer needed at runtime. Or we have a runtime dependency, which has to be there when running the software, and maybe also when installing the software. There's also a link-time dependency, which is somewhere in between build and runtime, but we will not run into these throughout this tutorial. So keep in mind that we have build dependencies and runtime dependencies that are supported by EasyBuild. Then we have toolchains, which I already briefly mentioned. A compiler toolchain, or just toolchain for short, is basically a set of compilers. Typically these are for C, C++ and Fortran.
And we use these compilers, of course, to build software from source. In a compiler toolchain, we can also have a bunch of additional libraries to provide additional functionality. All the software that is part of a toolchain we call toolchain components, so you may hear me mentioning these throughout the tutorial. We've covered the compilers for C, C++ and Fortran. A CUDA compiler, for example, for GPU software can also be part of a toolchain. And we may have additional toolchain components, for example an MPI library to provide support for distributed computing. One example is Open MPI. Or we may have libraries for linear algebra routines, like a BLAS or LAPACK type library, and maybe also additional ones for fast Fourier transforms, for example the well-known FFTW library. All of these can be part of a compiler toolchain. In some sense, a toolchain is similar to a dependency, a build dependency maybe, but it has some special aspects to it and we will get to that later in the tutorial. When we have a compiler toolchain that includes both compilers and an MPI, a BLAS, a LAPACK and an FFT library, we say we have a full toolchain. If any of these additional libraries are missing, we only have a subtoolchain, so a partial toolchain in some sense. If we only have compilers in the toolchain, we say we have a compiler-only toolchain. There's a couple of specific toolchains worth mentioning. The system toolchain is a concept in EasyBuild where you can specify in an easyconfig file that the system toolchain should be used, and this basically corresponds to using your system compiler for installing software. So here we wouldn't use a compiler that is provided through EasyBuild, but one that is just already available in the operating system.
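The distinction between a full toolchain, a subtoolchain and a compiler-only toolchain described above can be sketched in a few lines of Python. The role names (`compiler`, `mpi`, `blas_lapack`, `fft`) and the `classify_toolchain` function are hypothetical and only serve to illustrate the terminology; EasyBuild does not expose such a function.

```python
# roles beyond the compilers that a full toolchain is expected to provide (assumed names)
EXTRA_ROLES = ("mpi", "blas_lapack", "fft")


def classify_toolchain(components):
    """Classify a toolchain given a dict mapping role -> component name."""
    if "compiler" not in components:
        raise ValueError("a toolchain needs at least a compiler")
    present = [role for role in EXTRA_ROLES if role in components]
    if len(present) == len(EXTRA_ROLES):
        return "full toolchain"
    if present:
        return "subtoolchain"
    return "compiler-only toolchain"


full = {"compiler": "GCC", "mpi": "Open MPI", "blas_lapack": "OpenBLAS", "fft": "FFTW"}
partial = {"compiler": "GCC", "mpi": "Open MPI"}
minimal = {"compiler": "GCC"}

for toolchain in (full, partial, minimal):
    print(sorted(toolchain.values()), "->", classify_toolchain(toolchain))
```

Note that in this sketch a compiler-only toolchain is simply the smallest possible subtoolchain; it gets its own label because that is how it is referred to in the text above.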
We try to limit the use of the system toolchain, because then we're sort of at the mercy of the operating system: the compiler we get there may be a very old or a very new version, and this may have an impact on how easy it is to install the software. So we typically only use the system toolchain to either compile our own compiler from source or to install binary software where no compilation is involved at all. And then there are also the common toolchains. We have two toolchains that are particularly popular in the EasyBuild community. The first is the foss toolchain, which corresponds to the GCC compilers, Open MPI, OpenBLAS, ScaLAPACK and the FFTW library. This is all open source software, hence the name: FOSS stands for free and open source software. Next to this, we also have the popular intel toolchain, which is basically all the Intel tools: the compilers, the Intel MPI library and the Intel Math Kernel Library. We update these common toolchains roughly every six months, and we have an overview here in the EasyBuild documentation if you want to take a closer look at these. In this tutorial, we will only play around with the foss toolchain, because that's all open source software and it's easy to do in a tutorial like this. And then finally, we have modules. Modules is a very overloaded term, so we clarify this a little bit here. When we talk about modules, we generally refer to environment module files. This is a well-established concept in the HPC community. Environment modules are shell-agnostic files that express what type of changes need to be made to your environment, for example prepending locations to the PATH variable or setting specific environment variables that are important for a software package to work well. For this, we will use a modules tool. The modules tool will basically ingest the module files, interpret them and figure out which changes need to be made in the environment.
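To give an idea of the kind of environment changes a module file expresses, here is a minimal Python sketch that applies "prepend to a path variable" and "set a variable" operations to a copy of an environment. The `apply_module` function and the installation paths are hypothetical, and a real modules tool does considerably more than this (for example, undoing the changes when a module is unloaded).

```python
def apply_module(env, prepend_paths, set_vars):
    """Return a new environment dict with the module's changes applied.

    Hypothetical helper: prepend_paths is a list of (variable, path) pairs,
    set_vars is a dict of variables to set outright.
    """
    new_env = dict(env)  # leave the original environment untouched
    for var, path in prepend_paths:
        current = new_env.get(var)
        # put the new location in front, so it is found before existing entries
        new_env[var] = path if not current else path + ":" + current
    new_env.update(set_vars)
    return new_env


env = {"PATH": "/usr/bin"}
loaded = apply_module(
    env,
    prepend_paths=[
        ("PATH", "/apps/example/1.0/bin"),
        ("LD_LIBRARY_PATH", "/apps/example/1.0/lib"),
    ],
    set_vars={"EBROOTEXAMPLE": "/apps/example/1.0"},
)
print(loaded["PATH"])
```

The `EBROOTEXAMPLE` variable mimics the `EBROOT<NAME>` variables that point to the installation directory in modules generated by EasyBuild; the software name `example` is made up.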
For this tutorial, we will only play with Lmod, which is the standard modules tool that we have in EasyBuild. And remember, the module files will be generated automatically by EasyBuild, so you won't have to write these by hand as some people may still be doing. So to bring all of this terminology together: we basically have the EasyBuild framework, the core of EasyBuild, which uses separate Python modules which we call easyblocks. Each of these easyblocks implements a particular installation procedure for scientific software. These installations may include additional extensions, so additional packages on the side, like Python packages, for example. For the installation, we use a particular compiler toolchain, which is a set of compilers together with maybe an MPI and a BLAS/LAPACK library. And the specifics of these installations are specified in what we call easyconfig files. EasyBuild will make sure that all the required dependencies are in place before performing the installation, and it will generate the environment module file to provide access to this software. So to wrap up this introduction, we look at some of the focus points we have in EasyBuild. First of all, since we're in an HPC context, performance is something we pay close attention to. So whenever we can, we build software from source using EasyBuild, and we don't use binaries if we can avoid it. And EasyBuild will, by default, optimize for the processor architecture of the build host. So if you're installing software on an Intel Haswell host, for example, you will get a software installation that is optimized for Intel Haswell. That's what EasyBuild does by default. You can change this if you want to, but that is not something we will do here in the tutorial. We will stick to the defaults. Then we also pay a lot of attention to reproducibility. So we try to make sure that the installations that EasyBuild supports can be reproduced on other systems.
And we do this in a number of ways. Most of the easyconfig files will specify a particular compiler toolchain to use, and we will see this when we do the hands-on exercises. We also make sure the dependencies are specified through EasyBuild as well. So we try not to use most dependencies from the operating system, but we install these ourselves. And we also fix the versions of these dependencies, so the easyconfig files have fixed versions specified for the dependencies that are required. That makes it very easy to share easyconfig files with other people. If it worked for you with a specific easyconfig file, it's very likely to work for somebody else as well, since basically all the versions are fixed. And then over the years, EasyBuild has grown to be a community tool. We will talk about some of the history behind EasyBuild at the end of the tutorial. But today it's certainly a tool to help each other out and to collaborate on getting scientific software installed in a good way and in a well-performing way. People do this by opening pull requests to the different GitHub repositories we have for EasyBuild. We also do this through the common toolchains, which focuses the effort a bit on the foss and the intel compiler toolchains, and we collect easyconfig files for these toolchains in the central GitHub repository. And as we will see in the part about contributing to EasyBuild, we have powerful integration with GitHub that makes it very easy to contribute back to EasyBuild. That wraps up the introduction to EasyBuild part. I'm going to give one or two minutes to let the people who are helping out pass on any questions that may have been raised in this part of the tutorial. Let me go back to the slides. If there are any questions, I would like to see them pop up now. And if not, I'll just continue with the rest of the tutorial. So I'm not seeing any questions being raised to me. That's fine.
If any questions pop up, don't hesitate to ask them in the tutorial channel in Slack, and hopefully one of the people helping out there can answer them for you. So one question that I'm getting is to clarify "on top of a GCC version". I guess this refers to the Intel compilers, so let me jump back to that part. In the common toolchains, here it says that in an intel toolchain, we have the Intel compilers on top of a GCC version. With this, we mean that the Intel compilers always need a GCC available, mostly for the C++ runtime library. And generally, the Intel compilers just use whatever GCC version they can find in the operating system. That's not something we like in EasyBuild, because that makes the reproducibility of installations a bit more tricky. So rather than relying on whatever GCC version is provided by the operating system, we build our own GCC and we make sure that this is used as a base for the Intel compilers. So in an intel toolchain, you actually have both the Intel compilers and a version of GCC included. And if you look here at the documentation for the common toolchains, you see the versions that we have for the intel toolchain. The most recent version, for example, is the bottom line here, 2020a. It tells you that in this intel toolchain, we have the 2020 update 1 version of the Intel compilers, but we also have GCC 9.3 included as a base. So this GCC is not used to compile the software when we're using this toolchain, but it's used mainly for the C++ runtime library, as is required by the Intel compilers. Okay, so I hope that clarifies things a little bit.