I forgot to unmute us after that. Okay, so software modules. Now we get to the interesting part. It's interesting in that we're getting hands-on and finding the software on the cluster, but we hope you're not too bored, because you're not actually using it for anything yet. With so many people using the cluster, and so many different versions of things installed at the same time, you may wonder how we keep track of it all. That is done with something called Lmod. Lmod is just the name, and it's pretty standard, but it provides the way to define the different software that's available and then activate it in your own shell.
Yes, so these modules are basically like shortcuts, in a sense, like you would have in Windows or on a Mac. When you load one of these modules, it defines environment variables so that the operating system can find the correct software and libraries in the correct places. Because we have so many different use cases, it's not realistic that we would install all the software in the operating system itself. The operating system is quite old, for security reasons and various other reasons, because it needs to be secure and compatible. But the modules enable us to have lots of different software, including new software, available for the users.
If you look at one of these modules with module show, for example the GCC compiler, you will see that it contains lots of directives that change the environment variables so that the compiler can be found. It's not something you need to think about, but this is what happens underneath. When you load a module, it changes your environment so that the software is now available in your environment. And that's why we talked about environment variables in the last shell course.
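A minimal sketch of the mechanism just described, written in Python rather than with Lmod itself: a module essentially puts extra directories at the front of search-path variables such as PATH, and that changes which executable gets found. The tool name `demo-tool` and the two temporary directories are hypothetical stand-ins for an operating system installation and a module installation, and the sketch assumes a Linux or macOS system.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory, name):
    """Create a small executable file so that shutil.which can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def prepend_path(search_path, directory):
    """Return a new PATH-style string with `directory` placed first."""
    return directory + os.pathsep + search_path


# Hypothetical stand-ins: one directory plays the operating system,
# the other plays a software installation provided by a module.
system_dir = tempfile.mkdtemp(prefix="system-")
module_dir = tempfile.mkdtemp(prefix="module-")
system_tool = make_fake_tool(system_dir, "demo-tool")
module_tool = make_fake_tool(module_dir, "demo-tool")

# Before "loading": only the system directory is on the search path.
before_path = system_dir
found_before = shutil.which("demo-tool", path=before_path)

# After "loading": the module directory is searched first.
after_path = prepend_path(before_path, module_dir)
found_after = shutil.which("demo-tool", path=after_path)

print("before:", found_before)
print("after: ", found_after)
```

The same name resolves to a different file once the second directory is placed first, which is the effect seen when comparing the output of which before and after loading a module.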
So when we're debugging stuff, we need to know about these variables, but otherwise you can sort of just hope it works and ask us if it doesn't, and then it's usually our problem. Loading modules is quite simple. If you're at some other university, some of the software is commonly available through the FGCI consortium, and you can load it using module load fgci-common, but your cluster might have different software. CSC also uses modules. They have their own installations, but basically the same kind of commands work on every cluster.
So let's say you want to load a Python environment with a newer version of Python. First let's see what version we get from the operating system. If we run which python3, we can see that Python comes from /usr/bin/python3, which is the general operating system installation. Yeah, and its version is the one mentioned here, 3.6.8. But if we load the Anaconda module that we have, and then look at which Python we have now, you can see that it comes from the following path. These are installed automatically by us through the installation system that we have. So you don't have to care where it comes from, you just know that it's installed by us.
And I'd like to point out to everyone watching that Simo was basically the one who built this automatic installation system. Yeah, so you can blame me if anything goes wrong. I didn't build it completely myself. But basically, you can see that the Python version here is newer. So we have a different Python when we have this module loaded.
This means that we now have this one module loaded, and we can see that it's this 2020-05 TensorFlow 2 one. The module naming scheme can be different on every cluster. We have our own schemes, and we try to describe them. For example, on the Python page we have described why we have named it in such a way.
But basically module load and module list can also be done with ml. So ml is a faster way of using module, if you don't want to type the full commands all the time.
Okay, so now you have this Python environment set, but sometimes you might end up in a situation where you have software loaded that conflicts with some other software, and you want to start from scratch. Sometimes you can just open a new shell, but you can also purge the modules. What module purge does is undo the settings that the loading did. You can unload single modules as well, but it's usually better to just purge the whole thing. And then when you look at the software that you have loaded, you no longer have anything loaded.
Yeah, like mentioned here, you can unload one module, but you should be careful. It should be deterministic what it does, but if you do various unloads and loads, you might end up in a state that is not the same as the one you started with. So you might want to purge sometimes just to be on the safe side.
Yeah, this idea of reproducing your environment is actually quite important. You should always be able to purge everything and then reload stuff in the same order, or have a list of the stuff that you've loaded, to get back to the same point.
Yeah, and this brings us to a common thing that people do, which is to put their most commonly used modules into .bashrc or .bash_profile, so that whenever they log in, they get the same modules. This is of course very useful, but it comes with the risk that you have a certain version of software loaded and you don't necessarily think about it that much.
So you just have it there in the background, and when stuff breaks, you don't necessarily know whether it's because you have a certain version of software loaded, whether the problem is caused by the software itself, or whether it's something that could be avoided by just not loading everything. With some of the software installed here, you can't have multiple versions at the same time. For example, you can't have multiple versions of Python loaded, because then Python doesn't know where to get its stuff. It will try to work around it, but eventually you might get into a complicated situation. So it's usually a better idea to have specialized environments where you load just the stuff you need, instead of loading everything that's available. There are some conflicts defined in the module system, so it doesn't allow you to load certain modules at the same time as other modules, but it doesn't do a complete dependency check or anything like that. So you might get conflicting modules if you load everything that's available.
Okay, let's try it out. You can try it yourself if you have a terminal open as well. So where is MATLAB on Triton, for example? Let's try matlab: command not found. Okay, this is a bad cluster. I don't want to use this. There's no MATLAB here.
So obviously we do have MATLAB installed. And to find MATLAB, we recommend using the module spider command. Spider basically goes through the web of all of the software. I don't know why it's called spider, but maybe it hunts things down in the web of software. Anyway, it will give you the available versions of software called MATLAB. When you run this, it will present you with a bunch of MATLAB versions that are available.
So now a good question is, what happens if you load just MATLAB, without a version? You can see that there are plenty of versions there. So which version do you think you will get? Let's see. You get the newest one. Well, one would think that you will always get the newest one.
Going by the sorting, it will usually take the one with the highest number. Here 2020b is the highest number, so it will give you that version. But it might be a good idea to check, because sometimes you might have, let's say, software built against MATLAB, like a MEX file or something, and you want to make certain that the MATLAB version doesn't change when we install a newer version of MATLAB. So when you load a certain version of MATLAB, you always want that certain version. You might then say module load with, let's say, MATLAB 2019b, and this will always give you that version. And you can see here that the module system switches the software. It sees that you already had a MATLAB loaded, so it switches it to the other version of MATLAB.
Basically, if you first want to try out stuff, it's fine to just load the software. But if you intend to do some work against that system, and you think it's important to keep the system reproducible, then lock in certain versions so that you know you will always get those versions.
Like, do you want things to randomly break when we install a new version, which then forces you to fix things and keep up to date, which might be a good thing? Or do you want it to be the same forever, and then possibly several years from now you'll have to do a big update when you need newer software? Both are reasonable.
Anybody who has used a smartphone for a longer time has probably run into a situation where you have a certain version of, let's say, Android or iOS, and then an update breaks everything and you can't use the apps you want anymore. You updated the operating system, everything breaks, and you're mad at the makers of the software. We want to avoid this.
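The difference between loading a bare name and loading a pinned version can be sketched roughly as follows. This is an illustration in Python, not Lmod's actual resolution logic: the `resolve` function, the `matlab/...` module names, and the plain string sort used to pick the highest version are all assumptions made for the example, and the 2021a entry is hypothetical.

```python
def resolve(request, available):
    """Pick a module for `request` from the list of available modules.

    A request such as "matlab/2019b" is pinned and must match exactly.
    A bare request such as "matlab" takes the highest version, here
    decided by a plain string sort (a stand-in for the real ordering).
    """
    if "/" in request:
        if request not in available:
            raise LookupError(f"module not found: {request}")
        return request
    candidates = [m for m in available if m.split("/")[0] == request]
    if not candidates:
        raise LookupError(f"module not found: {request}")
    return max(candidates)


# Versions as mentioned in the session (the module names are illustrative).
installed = ["matlab/2019b", "matlab/2020b"]
unpinned_now = resolve("matlab", installed)
pinned_now = resolve("matlab/2019b", installed)

# Later the administrators add a newer, hypothetical version.
installed_later = installed + ["matlab/2021a"]
unpinned_later = resolve("matlab", installed_later)
pinned_later = resolve("matlab/2019b", installed_later)

print("unpinned:", unpinned_now, "->", unpinned_later)
print("pinned:  ", pinned_now, "->", pinned_later)
```

The unpinned request silently moves to the new version once it is installed, while the pinned request keeps returning the same one, which is the trade-off between staying up to date and staying reproducible.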
So we want to keep as many versions as possible in the background so you can still use them. But you need to say which version you want, because otherwise it will usually just get you the newest one.
Okay, another example: let's check the R versions. You can see that there are plenty of these R versions. You can see that there are also some inconsistencies in the module naming. This is for historical reasons. We have switched module naming conventions, and we are going to deprecate many of these modules. When we do this deprecation, we will let you know. You will definitely see when they are going to be deprecated. But usually you will only need the newest version. So module load R, and you will get the newest version of R.
And you can see that compared to MATLAB, this brings a huge bunch of other software with it. MATLAB comes as one installed package from the MATLAB people, and it contains everything inside of it. But this R version is compiled against optimized versions of different operating system libraries, and also libraries that are not present in our base operating system. So they have to be brought in by the module as well, and it does so. So you can see that there's a huge bunch of stuff.
And you can actually see that the version we got here wasn't actually the newest one. So you definitely should check what version you get when you load. You probably would have wanted the newest one when you loaded the module, and you didn't get it. So it's a good idea to make certain that you get the version you want.
Yeah. So what happens if you load that one now? It would probably say that it's switching a huge bunch of versions. Let's see. Yeah, it changes a bunch of software in the background. But what you probably want to do is just purge and then load the version that you actually want. Yeah.
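Why purging and then reloading is the safest way back to a known state can be illustrated with a toy model. The `ToyModules` class below is a hypothetical Python stand-in, not how Lmod is implemented: it only tracks a list of search-path entries, and the module names and directories are made up for the example.

```python
class ToyModules:
    """Toy model of a module session that only tracks a search path."""

    def __init__(self, base_path):
        self.base_path = list(base_path)   # state of a fresh shell
        self.path = list(base_path)        # current search path
        self.loaded = []                   # load order, oldest first

    def load(self, name, directory):
        """Record the module and put its directory first on the path."""
        self.loaded.append(name)
        self.path.insert(0, directory)

    def purge(self):
        """Undo every load and return to the fresh-shell state."""
        self.loaded = []
        self.path = list(self.base_path)


def build_environment(session, recipe):
    """Purge, then load the modules of `recipe` in the given order."""
    session.purge()
    for name, directory in recipe:
        session.load(name, directory)
    return list(session.path)


# Hypothetical module names and install directories.
recipe = [("gcc", "/opt/toy/gcc/bin"), ("r", "/opt/toy/r/bin")]
session = ToyModules(["/usr/bin"])

first = build_environment(session, recipe)
session.load("something-else", "/opt/toy/other/bin")  # ad hoc change
second = build_environment(session, recipe)           # back to the recipe

print(first)
print(second)
```

Because the recipe always starts from a purge, whatever was loaded by hand in between does not leak into the rebuilt environment.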
This is probably the best way of going about it. Yeah, so let's see. What else do we need to talk about here? I think we've covered most of the things.
If you are not at Aalto, the same general concepts will apply, but there will be different names for all the software. For example, it might be MATLAB in uppercase, or MATLAB with something else, or R might be called something else, and so on. That's something you'll need to find out yourself.
Yes, there's also the possibility of creating module collections. Let's say I want R and GCC, and I want to save my modules as a collection so that I don't have to remember what I needed to build certain stuff. We don't need to go into much detail, but basically you can save these collections so that you can restore them afterwards. If you have many version numbers that you need to set up, it might be a good idea to save a collection so that it's easier for you to load the same modules.
Yeah. It used to be that loading some modules would take several minutes to resolve all of the dependencies, but most modules these days are a bit faster.
So yeah, any final notes here? At the bottom there's a big reference of all the different module commands. You may not need them right now, but know that you can get this reference. One thing you might want to try, as well as looking at the documentation, is module avail if you're looking for some software. You will see that there are many different folders, different sources from which we have software available. And you will see that this list is quite exhaustive.
So there's a huge bunch of software, and I'd say that a lot of it will be deprecated at some point, but if there's a certain kind of software that you need, you might find it in this list as well. It might also be a good idea to ask us which versions you should use, whether you should install it yourself, and so forth. Like, this still continues. So there's quite a bit of software. You might find what you need here, or you might need to ask us. So let us know. Does this list ever end? Okay. Yeah, it's quite long.
Yeah. So now we've got some exercises here. People in the Zoom session, can you let me know by chat how long or how interesting you think these exercises would be? This gives you some time to play around with module, see what's available, and get used to it. If you don't care about modules, then this will be boring to you. I'm sorry, but we need to go slowly for everyone who is following along. So, how long an exercise session do you think we should take here?
I think maybe 10 minutes or something. I'd highly recommend checking at least exercises one and two, or maybe three as well. At least do the first ones, just to get a hold of these commands: check their output, or run the commands as I just presented, so that you get an idea of how to load these modules, because they are the interface you are going to have with the software. Otherwise you will need to install everything yourself, and that's going to be a lot of hassle. It's highly recommended that you use these versions, because they're already present there. But I'd say 10 minutes, so maybe 22 to 22.
Yeah, okay, sounds good. So let's go to the Zoom sessions. Remember, you can continue using HackMD to ask questions. Please always scroll to the bottom and ask there, rather than adding new things at the top where they might not be seen.
And good luck. If you don't finish everything, don't worry. At least numbers one and two would be good.