There you go. And with that, I will be off for a few sections, so it's over to Enrico and Simo now. See you later. Bye. Thanks.

So, for these short ten minutes: the link to the materials you're watching right now is in the schedule. I will really not go through everything. Basically, we will just go through the pictures to introduce the basic terms and the basic idea of where you normally are when you do scientific computing, and where you want to be when you need to scale up your resources.

It's always good to start with an XKCD comic. If you're not familiar with them, you can spend a few hours, or a few days, checking them all. This one is the typical case where you ask someone, "Is this your machine learning pipeline?", and it's basically just a blob of various stuff: some data comes in, something happens, some linear algebra happens, and you get something out. And if it's not good, you stir the pile a bit. The joke is that most of you are doing, or will be doing, some sort of computing, in the sense that you have some data, you need to process it and extract something meaningful from it, and then eventually, if that's your task, you might write a paper about it and publish it.

Yes, so you might take some numbers, use computers to turn those numbers into different numbers, and then you publish them in a paper, and then you get more citations and that sort of thing.

Exactly. So if that is the average process, the point of view we are taking here is: in this "process" part, what is actually happening? Most likely, the configuration you have, or have been having, is a laptop or a desktop computer, which is a piece of hardware. And inside, that hardware has different parts. There is the CPU, the central processing unit, which is basically where the math happens, where the reasoning happens.
But of course, the CPU alone doesn't really do much. The CPU needs to access data, and that is what the memory is for: it holds the data that the CPU can access fast. Then there's the hard disk in your machine, where you might have larger files. Having these three things would already be almost enough. Well, I didn't draw the screen; you also need a screen and a keyboard to interact with this hardware.

Human interface devices.

Exactly. And a chair, maybe, because you don't want to stand all day. But of course, you might also have a network connection, which brings you to the internet, so that you can pull data that is stored in Google Drive, Dropbox, or whatever cloud system. Then some computers, not all of them, might also have GPUs, where the G stands for graphics: graphics processing units. They are like many, many small CPUs packed together, and they can be very efficient for many kinds of parallel computations. We don't need to go into the details right now. And then, on top of that, you have your operating system and the software you've been using: Python, R, MATLAB, whatever. And then, yes, the user.

So this is what you're very familiar with. And now think that maybe you want to, or maybe you need to, move away from your laptop: because your hard disk is not big enough to keep all the data; or because the data in cloud storage is so big, and your internet connection so slow, that downloading it takes forever; or because you really need lots of RAM, to keep, I don't know, big matrices in memory. So you understand that at some point, if things are scaling up, your laptop or your local desktop is not good enough anymore. This is where we are with this course, and this is the idea here. As I said, I skip all the words. What you should take away is that there is a map; this is the map we will be moving around in over the next days.
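To make those parts concrete, here is a hedged sketch of how you can inspect the CPU, RAM, disk, and GPU from a terminal. These are standard GNU/Linux tools; command names may differ on macOS or Windows.

```shell
# Inspect the hardware parts described above (standard GNU/Linux tools).
nproc                                           # how many CPU cores do I have?
command -v free >/dev/null && free -h || true   # RAM: total / used / free
df -h "$HOME"                                   # disk space where my files live
# GPUs, if NVIDIA driver tools happen to be installed on this machine:
command -v nvidia-smi >/dev/null && nvidia-smi \
  || echo "no NVIDIA GPU tools found"
```

Running these on your laptop and later on a cluster is a quick way to see the difference in scale between the two.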
So, right now, most likely you are on some desktop or laptop machine connected to the internet. But then you are not really interested in running things on your machine anymore: your machine becomes basically just a keyboard, a screen, and an internet connection, so that you are able to access some remote computational resources. This of course also means that you stop looking only at the data on your local machine. You start moving your data to the big cluster you might have access to, or you have data in other systems that are not physically in your machine, so that you can run your processes on other clusters.

Now, with great power comes great responsibility, which means that most of the time you log into one of these clusters through the so-called login node, which is like an entry point. You see the drawing there, the sign that you're not even supposed to stop your car: the login node is just an entry point, and from there you request what you need. Do you need 20 CPUs? Do you need 200 CPUs? You request them, you wait for the request to be granted, and then you can start processing all your numbers.

It's important to get this idea of the map and of where you are on the map; this will be a recurring theme in all the lessons we cover here. Simo, in your experience, is it familiar to you to immediately know where you are: whether you're on a single node, on the login node, or in your home directory? Is this what you do daily, basically?

Yes, yes, it's completely natural.
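The "request what you need" step looks roughly like this on clusters that use the Slurm scheduler. This is an assumption: your site may use a different batch system, and the exact flags vary between clusters.

```shell
# Hedged sketch, assuming the Slurm scheduler (your cluster may differ).
# On the login node you only *request* resources; an interactive request
# for 20 CPU cores for one hour would look like:
#
#   srun --cpus-per-task=20 --time=01:00:00 --pty bash
#
# The request waits in the queue until the scheduler grants it.
# You can check whether you are on a Slurm cluster at all:
if command -v sinfo >/dev/null 2>&1; then
  sinfo --summarize       # one-line overview per partition
else
  echo "no Slurm commands found: probably not a cluster login node"
fi
```

On your own laptop the check above will tell you there is no Slurm, which is itself a useful reminder of where you are on the map.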
How I often think of it, I used this analogy once: if you go to a different place, say a summer cabin, and you haven't seen it before, you first have to figure out where everything is: where's the sauna, where's the lake, that sort of thing. But if you own a summer cabin, every time you go there you're not checking where the sauna is and where the lake is, because you live there; that's your place. And this is the kind of thing that happens once you start using these systems: you become acclimated, you start to feel that you are somewhere, even if you're just using a terminal, which we'll talk about later on, just a client without any graphics. You still look around and get a feel for where you are. This is very important, because it will enable you to move part of your workload to a system that has something your laptop doesn't have, like Enrico spoke about. And to make that transition easier, it helps to get a grasp of where you are, where you are placed.

So, throughout this course, it's a good idea to ask us, and ask yourself: okay, where am I now? Of course, you are physically in your office or at home, looking at your laptop screen. But where is the code running? Where is your data? Where is stuff coming from? Where is the cloud? If you get a grasp of these questions, it will make your life a lot easier.
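A few quick commands answer the "where am I now?" question from any terminal; this is a minimal sketch using standard Unix tools.

```shell
# Quick orientation: run these whenever you are unsure "where" your
# terminal session currently is.
hostname    # which machine am I on: laptop, login node, compute node?
whoami      # which user account am I using?
pwd         # which directory am I in?
df -h .     # which filesystem holds this directory, and how full is it?
```

Making these a reflex is the terminal equivalent of glancing around the summer cabin to check where the sauna and the lake are.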
Yeah, and this is very important, because some people might not be familiar with a system where you actually know what's going on under the hood: where you are super sure that this part of the disk is physically there with you, with the data, while some other parts are actually on some remote system. There are devices, like iPads or smartphones, where it all feels seamless, but in the background some data is on the actual phone and some data is in the cloud. Here, at this stage, it is important to know where you are, where your data is, and where your processes, meaning the actual computations, are. Are they happening on my laptop, even though the data I'm pulling is somewhere else on the planet? Or are they happening close to the data, so that they can be as fast and efficient as possible?

In general, the final comment, because we only have two minutes left for this part, is that using remote computational resources is also a sustainable way of doing computations, for two reasons. First, for yourself: you don't need to keep your laptop on during the night to run whatever machine learning thing you need to do. You can start your processes, put them in the queue, let them wait for their turn; then the processes start, they do all the work, and in the morning you check that everything is done, and you can spend the daytime writing your paper. And the other thing, of course, is that when your process is over, someone else will use that same computational resource. If you leave your laptop on all night to do something that actually only took one hour, it's a bit of a waste of energy to leave it on all night. Of course, laptops have some systems to save energy, and every computer has them.
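The "start it, queue it, check in the morning" workflow is typically done with a batch script. Here is a minimal sketch, assuming the Slurm scheduler; the `#SBATCH` directive names are Slurm-specific, and `train.py` is a hypothetical placeholder for your own program.

```shell
#!/bin/bash
#SBATCH --time=08:00:00        # allow up to 8 hours, e.g. overnight
#SBATCH --cpus-per-task=4      # request 4 CPU cores
#SBATCH --mem=16G              # request 16 GB of memory
#SBATCH --output=run.%j.out    # write output to run.<jobid>.out

# "train.py" is a hypothetical placeholder for your own program
python3 train.py
```

You would submit this with `sbatch job.sh`; the scheduler queues it, runs it when resources free up, and leaves the output file for you to inspect the next morning.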
But the idea here is that if there is one big machine that we can all share in a fair way, then it's more efficient for everyone, and, one can argue, even better for the planet. I don't have any other comments at the moment. Simo, if you want to add something, we basically have one minute left.

I mentioned that there was a question in the HackMD about how much, nowadays, you need to know about what happens below the surface, and what sort of hardware things you need to know about. And this is a good question. I would say that we are going to be talking about these things because they relate to what your program is doing, and your program is ultimately the thing that will deal with the hardware. So if you know a bit about how your program works and what the hardware is, it will help you match them together, basically. Say you bought a big table from IKEA and you need to take it home: you need to know the size of the package you are taking. Can your car fit that table in? These are the sort of questions you need to be thinking about, and we'll be talking about them later: how to match what you want to do with the resources provided by these HPC systems. This is something where you need to know a bit about the hardware, but it's not something you need to constantly think about. But it's good to get a sense of scale for these kinds of things that we'll be talking about.
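The IKEA-table question for memory can often be answered with back-of-the-envelope arithmetic. Here is a sketch using shell arithmetic; the matrix dimensions are just example values.

```shell
# Will a 10000 x 10000 matrix of double-precision (8-byte) numbers
# fit in RAM? (Sizes here are example values.)
rows=10000
cols=10000
bytes_per_element=8
mib=$(( rows * cols * bytes_per_element / 1024 / 1024 ))
echo "matrix needs about ${mib} MiB of memory"    # about 762 MiB
# Compare against the memory actually available (Linux):
command -v free >/dev/null && free -h || true
```

This kind of estimate, data size versus available RAM, is exactly the "does the table fit in the car" check before you request resources on a cluster.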