But okay, applications. So really you don't want to be running bash scripts or simple Python files. I mean, that's sort of boring, isn't it? You're here to run really complicated stuff like big Python libraries or big HPC codes or things like that. And it has to be installed. And I'd say about half our time is spent helping people install software on the cluster. So in this little 10 minute talk here, we can't possibly tell you everything you need to know, but there is one main point we want to make. And if you scroll down, we can see these four ways of... I think we are sharing the notes. Right, sorry. Okay, that was not good.
Okay, so applications here. If we scroll down a little bit, there are four main ways of getting software on the cluster. So one: stuff can be installed through the operating system, like you take the package manager and install program X, or you download an installer and run it. This works on your own computer because you're the only one there. It doesn't work on the cluster because once we did this, it would change the version for everyone. Would you want the version of Python or TensorFlow you're using to suddenly change in the middle of your project, so that you have to redo everything? No, clearly not. So that's why almost nothing is installed through the operating system, or what is there are basic utilities and we don't update them.
This also means... For example, at Aalto, all of our compute nodes start from a minimal operating system image which contains the minimal amount of software, and this is loaded into memory, so there's really no place to even install software. So they only have the few Linux tools that you normally use and a few other applications, but most of the stuff is not there. Can you point the cursor at the point we're talking about, by the way? Yeah.
The next point: we can install stuff for you, and we will install different versions of different things, and you can load just the version you need using something called the module system. This is, well, the typical way of doing things. So you ask us, say, I need the OpenFOAM development version as of right now, and we've got a system where we can go and click a few buttons and it will start installing. Different clusters do this differently. Some clusters install a lot of the things that users need, and some clusters install the compilers and leave it mostly up to people to install their own things.
Okay, so the next step. Someone has installed the base of what you need, for example Python, but you need different libraries on top of it, like PyTorch or TensorFlow or whatever it may be. In that case, you need to do a little bit more work. You can take their base and create a virtual environment or a conda environment, and then install the extra libraries you need just for your user. So you have full control, you can take the versions you need, and you can even have different versions in different projects.
I would mention here: think about when you have a program you need to install and need help with. Like you try to install some program on your laptop, and what do you do when problems occur? You Google it, of course. That's what we do as well. If we encounter a problem, we Google, and then we end up on Stack Overflow somewhere, somebody shows some commands, okay, run these, and that's one way of solving the problem. But cluster environments are usually so specialized, because it needs to be this shared environment where stuff can be run through the queue, that there might be completely different software and completely different problems that don't happen on normal laptops and stuff like that.
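Going back to the virtual environment idea above, here is a minimal sketch in Python using the standard library `venv` module. The directory names are made up for illustration, and `with_pip=False` is chosen only so the sketch runs quickly and offline; in practice you would usually want pip available so you can install your libraries into the environment. How your own cluster recommends creating environments may differ, so check its documentation.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical project location. On a real cluster this would be your own
# project or work directory, not a temporary directory.
project_dir = Path(tempfile.mkdtemp(prefix="myproject-"))
env_dir = project_dir / "env"

# Build the environment on top of whichever Python is running this script
# (for example, one provided by the cluster). with_pip=False keeps the sketch
# fast and offline; with_pip=True would also set up pip inside the environment.
venv.create(env_dir, with_pip=False)

# pyvenv.cfg records which base Python the environment was built from.
config_text = (env_dir / "pyvenv.cfg").read_text()
print(config_text)
```

Each project can get its own environment directory like this, which is what lets you keep different library versions in different projects.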
So usually the best way of finding information about software that you need on a cluster is through the documentation. It's unfortunate, nobody wants to read documentation, they want to do work, but that is usually the best way of doing it. And the second best way is to contact the people managing the cluster, because they might know if there's a problem with a certain kind of software, if it's complicated. Maybe they have experience, and maybe somebody else asked about this software before, and this will speed up solving your installation problems a lot. You can find the solutions much faster. Of course, you can try Googling stuff and solving it that way, but often you reach dead ends if you do it that way. So I would recommend asking for support in these cases. Yeah.
Okay, so if we scroll down through the page, we won't talk about anything more. We just covered what the basics are, but there are some links here, for example on how to do this with Python and things like that.
Yeah, I'll quickly show this view from the talk we gave yesterday. Yesterday we talked about the hardware and what the operating system is and so forth. And then we talked about how you're most likely doing something with these applications here. The applications can often have a graphical element, but on clusters you often use them in a non-graphical mode. For example, there's been a lot of talk in the notes about Jupyter, which is a web interface that you use. On the cluster, you often want to convert these Jupyter notebooks into Python code that you can run without the web interface, because the actual program that you're running is not the Jupyter notebook, it's the Python code, right? The Jupyter notebook is the GUI, the graphical user interface.
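To make the notebook point concrete: a notebook file is JSON, and the code you actually run lives in its code cells. The sketch below pulls those cells out into a plain script using only the standard library. The function name and the tiny example notebook are made up for illustration, and real notebooks can contain things this ignores (such as `%magic` lines); in practice the Jupyter tool `nbconvert` is the usual way to do this conversion.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Join the code cells of a notebook (given as JSON text) into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# Made-up notebook with one markdown cell and two code cells.
example_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Analysis\n"]},
        {"cell_type": "code", "source": ["x = 2\n", "y = x * 21\n"]},
        {"cell_type": "code", "source": "print(y)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

script = notebook_to_script(example_notebook)
print(script)
```

The resulting text is ordinary Python that can be saved to a file and run from the terminal or from a batch job, with no web interface involved.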
So it's usually a good idea, whenever you start to use a program, to figure out what the command line interface for it is, or what the way of using it from the terminal is, because then you can more easily adapt it to these cluster environments where you can run it via code instead of clicking buttons. Of course, the interface might be annoying, but that is the best way of automating the stuff again.
Okay, so if you scroll down, there are some links you can read yourself, like information on some of the main languages that people use, but we should go on now because we're quickly running out of time. Should we briefly show the module system? Yes. Which is the next page, and I will add it to the notes.
So module is a tool that lets us manage multiple versions on the cluster. Basically, instead of making Python 2 or 3.8 available for everyone, you can do module load python/3.8 and it will load it, or module load openfoam/ whatever version number it is. Yeah, can we show how it works? I can demonstrate. So here, if I run the command because I sent it here. So here we have these modules. They depend on the cluster again, but the module system itself is used on all clusters, because it's the best way of making it so that multiple different software can be used by multiple different users. Yeah.
For example, on our cluster we have this Anaconda module which contains a lot of Python stuff. So if I run module show, it will produce this kind of output. This shows what's happening inside of the module. The module does this stuff, and let's not worry about what it all means. It sets a bunch of environment variables so that your terminal will then find the software. So if I load the module here... let's try this example first. I run type python3. This program, type, tells us where a program is, like what python3 would be. And if I check the version, okay. So fairly old.
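What "setting environment variables so that your terminal finds the software" means can be sketched in Python. This is only an illustration of the idea, not how the module tool is implemented: the program name and directory are made up, and it assumes a Linux-like system. A real module typically changes several variables, and `PATH` is the one that decides which program a name like python3 refers to.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_install(program: str) -> Path:
    """Create a directory holding a tiny executable script named `program`."""
    bin_dir = Path(tempfile.mkdtemp(prefix="fake-module-"))
    exe = bin_dir / program
    exe.write_text("#!/bin/sh\necho 'hello from the fake module'\n")
    exe.chmod(exe.stat().st_mode | stat.S_IXUSR)  # mark it executable
    return bin_dir


program = "mytool-demo"  # hypothetical program name, not a real tool

# Before "loading": the shell-style lookup finds nothing with this name.
before = shutil.which(program)

# Roughly what loading a module does: put the software's directory at the
# front of PATH, so that lookups by name now find it first.
bin_dir = make_fake_install(program)
os.environ["PATH"] = str(bin_dir) + os.pathsep + os.environ.get("PATH", "")

# After "loading": the same lookup now resolves to the new directory.
after = shutil.which(program)
print("before:", before)
print("after: ", after)
```

`shutil.which` plays the role that type plays in the terminal here: it reports which file a program name would resolve to with the current `PATH`.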
So now if I run module load anaconda. Yes, and now we run type python3. And what do you know? It's something different. And then if we check the Python version, it's now Python 3.10. Yeah. You can use module list, for example, to list what modules you have loaded. Often you load these modules in the Slurm script that you have. So in the script itself, you load the stuff you need and then you use the stuff.
So is this basically what we need to teach them? I guess there are different commands, like module purge, that can clear everything. You can unload certain modules. Can you do module spider anaconda? And let's look at the different versions. Yes. It takes a while, because it's looking for everything that's there. Okay, here we go. So there are a lot of different versions available, ranging from 2020 to 2023. Yeah, okay.
I'll quickly mention that in these cluster environments, it's usually not a good idea to set defaults. So it's not a good idea to always load something. In many software installation instructions, you're told to set these environment variables in your .bashrc, which is a file that applies these settings every time you log into the system. And this can easily become problematic because, as you have probably realized, these cluster systems are complicated. And when you have a complicated system and you add extra stuff into that complicated system, you can create unforeseen consequences. So it's usually not a good idea to load everything by default, especially if you are planning on using multiple different pieces of software, because then these might clash with each other, and they might clash with the operating system that you're using, and all sorts of hell might break loose.
So as a general rule, it's usually not a good idea to load everything by default, because that can create these ripple effects that you don't see coming until suddenly your stuff doesn't work anymore. Usually it's a good idea to activate or load the stuff that you need whenever you need it. And usually it's done in the Slurm scripts, so that the specific stuff is loaded for the specific script that you're using. Yes.
So yeah, that's basically it for the modules, I think. Yeah, I mean, people can read this themselves. Yeah, check the commands, and there's a lot of extra information there. Yeah. Okay. Should we go on to storage then?