All right, hello. I'm Kyle Hudson. I'm one of the system administrators for Beocat. We have a couple more Beocat personnel over here; I'll let them introduce themselves.
I'm Adam Tricke.
Adam's another system administrator. Dave is our application scientist. He's the guy who helps out both with programs and with optimizing what's going on in the queue. One person we don't have here today, who probably won't be here at all but might be toward the end, is Dan Antriesen. He is our boss, and he is teaching a class right now, so he's not going to be able to make it for this part. And then he goes into office hours directly afterward, so you don't get to see him. He'll be around tomorrow if you need him.
This first hour, we're just doing an introduction to Linux. So this is just getting you familiar with the Linux system, how things work, and the concepts behind that. I'm going to share my screen here. And I'm going to do something a little different than we've done in the past: I'm going to run this whole thing through a web browser. I'm on a guest machine here where I don't have all my tools installed, so I'm just going to show you that it can be done with nothing more than a web browser.
So I'm going to go to OnDemand, which is a web application that we have, at ondemand.beocat.ksu.edu. It's going to have me sign in with my EID and password. No, I do not want to save that. Now it's sending my Duo. If you're Duo enabled, it will ask for Duo at this point. If you're not, it'll just go right on in.
All right, and we are in OnDemand. OnDemand is a web front end to our system. We can do quite a few things here. We can go to any one of the head nodes themselves. There are several applications we can start here: Mathematica, COMSOL, some of these interactive things. RStudio tends to be a real popular one.
We're going to go through some of these things a little later on. But just for today, I'm going to connect to a head node. So I'm going to go up here to Beocat Shell Access. We have two head nodes. Right now they are named Eos and Selene. So we can actually say, hey, I definitely want to go to one of those or the other, which can have some advantages if you're going to try to restart things. We're going to talk a little bit later, probably tomorrow I think, about tmux and how I can basically go away, come back later, log in from my machine at home, come to school, whatever, and kind of pick up where I left off. For that it's really handy to know which head node you happen to be on. Right now, I don't care. So I'm just going to say Beocat Shell Access. And here I am. And it even knows who I am and all that kind of thing.
So right now I am in what's called the Linux shell. The shell is just an environment that takes commands, interprets them, and lets you know what's going on. This is the default. And it has my username, Kyle Hudson at Selene. I wonder if I can enlarge this screen a bit here. Oh, it did resize the screen. Good. I thought the text was going to scroll off the screen because of the web browser, but it seems to be all right. So we're happy there.
We do have a chat question here real quick: "Several of my students did not receive approval for their accounts. Is there something you can do so they can follow up better?" I have nothing to do with that, Sue. This is in Wichita. We are doing this both for Beocat here at K-State and BeoShock in Wichita. That is a process through Wichita State, and we don't have any control over that whatsoever. I'm sorry that I can't give you any better news than that. That goes through Wichita State's IT services.
All right, so now I am logged in to the machine. You'll see the same kind of thing as my Kyle Hudson at Selene. At Wichita, it'll be head node 01 and head node 02. Here it'll say at Selene or at Eos. That's telling you which head node you're logged into. I said Beocat, which is going to randomly select one of the two. We only have two head nodes. If you say Beocat, you're saying, I don't care which one, just put me on one. Or you can choose either one individually.
All right. So the first thing you'll see is that I have this little squiggle here. Technically it's called a tilde, and it is a shortcut to your home directory. You kind of have to see how things are organized here, so let me show you through a graphical interface first. I can go up here to Files and go to my home directory. And you can see what I have here. I have a whole bunch of folders; that's the way I work. And then down below all that, I have some files. I'm going to work out of a folder called Beocat intro that I've set up for this, just to limit things a little bit. I've obviously reused this from previous sessions. You can see I have things from 2018 and some examples, that kind of thing. So I have a directory here. This way I only have 20 or so files I'm dealing with, as opposed to the few hundred I normally have.
In Windows and Mac you have what are called folders. In Linux and Unix, we have what's called a directory. Same thing. I kind of use those words interchangeably. The commands I'm going to give you use the word directory, just as a mnemonic for what I'm talking about here. So the first thing I want to do is change into that directory. Now Unix was written back in the late 60s.
So they tried to save every bit of space that they could, because memory was on the order of a few kilobytes and storage was maybe a couple of megabytes. We didn't have a lot of space, so everything was as short as possible. So to change a directory, we're going to use the short command for change directory, which is just cd. So I'm going to say cd, then a space, and Beocat intro. And now you can see that it's changed my command prompt here to Beocat intro.
Now we want to see what files are in there. I'm going to do an ls, and ls means list. Just make a list. I'm going to move this out of the way so you guys can see. This gives me all the files that are in the directory. If I look at the list of files in this nice web interface and this list of files, they're the exact same thing. I'm just looking at them two different ways. I need to be able to navigate around between them.
So the cd command changes the directory. The prompt gives me the last part of where I am but doesn't tell me where exactly I am in the whole directory structure. That is sometimes handy, but sometimes you might have a subdirectory for scripts, and that could be my R scripts, my C scripts, or my Python scripts. This is an example. If all I see is the last part, I'm confused. I don't know where I'm at. So we can use the command pwd, which is print working directory. That will give me the list of everything all the way down the line. This tells me that I am now in homes, Kyle Hudson, Beocat intro.
Pretty much everywhere you go, including at Wichita for those of you on Wichita, it starts with home and then your username. That's the standard format. Back in the early 90s, K-State got into Unix computing and they set everybody to homes, because home was for the local users and homes was what they used for their network users. And we've just kind of fallen into that.
So that's there for a historical reason, and it throws me off every time I go to any other system, because I'm used to using homes since this is where I primarily work. But it's home pretty much everywhere else that you're gonna go.
So when I first log in, I'm in my directory, which is homes, Kyle Hudson. If I wanted to go to Adam's directory, I could say cd, space. And now I don't wanna go from where I'm at right now. I'm not going forward from this homes, Kyle Hudson, Beocat intro. I don't have a directory underneath there that I'm going to. I wanna start at the beginning. So I wanna say slash homes slash Moses, which is his EID. For most people, you're not gonna be able to look into their stuff. You can look at mine and you can look at his, because we have ours opened up to the world so other people can look. And now I can do an ls here and you can see all the files that he has in his directory also.
Now I wanna go back to my own. Fortunately there's a real easy shortcut to get back to my own. I say cd, space, and then the tilde that we saw at the beginning. It's usually right next to the 1 key on your keyboard; on this keyboard it's a shift-backtick. Keyboards might vary a little bit. But I say cd and then the tilde, and that takes me back to my own home directory. So now when I print my working directory I have homes, Kyle Hudson. I'm back to where I started from. A more recent innovation is that you can also do cd with nothing behind it and it'll take you back to your home directory. You don't have to put the tilde anymore.
But let's say I am in another folder. Several of us have bulk directories set up here at K-State. How many of you are in a working group that has a bulk directory? Any of you by chance? No, okay. You'd know if you were. We charge for that space, but it lets you use more than the terabyte that we allow you. Some working groups have some big genomic data, that type of thing. So we have bulk directories.
So I'm gonna change directory. Instead of homes, I'm gonna go to bulk, Kyle Hudson. And there you can see I have more files that I have set up there. Now let's say I wanna go straight from here into my Beocat intro directory. This is not anywhere even close, right? We have the root up here. We have homes and we have bulk. Under each of those we have my Kyle Hudson. And then I have the Beocat intro under the one, right? I'm just trying to visualize how things would lay out if we were printing off all the directories here.
We can access that, right?
The bulk one? I might have my bulk open, but generally speaking, no.
All right, I just tried.
Okay, I didn't remember if I had that one opened up to the world or not.
So let's say I wanna go back to my Beocat intro now. That's under my home directory. I know where that is. So I'm gonna say cd, and then I'm gonna use that tilde, then a slash to separate down to the next level, then the Beocat intro. Now watch what I do here. I'm gonna type b-e-o, and now I'm going to hit the Tab key, and it fills out to Beocat. I have multiple things that start with Beocat. So it says, hey, if you start with b-e-o, everything I have matches up through Beocat, so it fills out that far. Now, if I hit Tab again, and again, it says, oh, these are what I have. I have Beocat admins and Beocat intro. Those are my two options at this point. And it leaves me where I was. So I can add an i for Beocat intro, hit Tab again, and it auto completes. The Tab key is your friend if you're using the command prompt. And now I'm back in my Beocat intro directory.
If I was to look up here in the web interface and go back to Kyle Hudson, you would see these same things. You see Beocat admins and Beocat intro. That's why it wouldn't let me go any further than that. If I typed Beocat and then i, all by itself, and hit Tab, it'd just say, yes, I know what that is, and it would auto complete.
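The navigation steps described so far (cd with an absolute path, pwd, ls, the tilde shortcut, and cd with no argument) can be tried safely in a scratch area. This is only a sketch: the directory names beocat_intro and beocat_admins are assumed spellings for illustration, and HOME is pointed at a temporary directory so nothing in a real home directory is touched. Tab completion is interactive, so it is not shown.

```shell
# Sketch only: run in a throwaway directory that stands in for a home directory.
# "beocat_intro" and "beocat_admins" are hypothetical names, not the real paths.
set -eu

scratch=$(mktemp -d)            # temporary stand-in for /homes/<username>
export HOME="$scratch"          # make ~ and a bare `cd` point at the scratch area
mkdir -p "$HOME/beocat_intro" "$HOME/beocat_admins"

cd "$HOME/beocat_intro"         # absolute path: starts from the root, works from anywhere
pwd                             # print working directory: the full path, not just the last part
ls                              # list the files here (the directory is empty)

cd /                            # move somewhere unrelated
cd ~/beocat_intro               # ~ expands to the home directory
intro_dir=$(pwd)

cd                              # cd with nothing after it also goes home
home_dir=$(pwd)
echo "intro: $intro_dir"
echo "home:  $home_dir"
```

The point of the absolute and tilde forms is that they do not depend on where the shell currently is, which is why they work from the unrelated directory in the middle of the sketch.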
So Tab is a good thing to know for getting in and out, looking at not only directories but files also.
Or commands.
Or commands, yes. For instance, let's look at OMP hello. So if I do cat, then omp underscore h, and I hit Tab, it auto completes that. And you can look at this; it's a little C program. There's not very much there.
You'll see me do ls a lot. It just helps me to say, oh, this is where I am. So when I'm at a command prompt and I'm like, what was I doing? I do ls. Oh yeah, that's what I was doing. So you'll probably see me list my files a lot of times today.
cat is short for concatenate. It will show all the files that you list. I only gave it one file. As a matter of fact, there are a couple of shorter ones in here. I'm gonna say cat myhost.sh. That's gonna show me what's in that file. And it's only two lines. It has this one and this one, and it doesn't even have a line ending on that last one, so my command prompt came up right after the end of it.
Okay, that's a good place to bring up how to view what's in these files. Before we edit them, we're gonna view them. So let's go back to that other one that we saw, the OMP hello. You don't need to know C for this. This is just an example to show how to get into and out of a file. A long time ago, they said, oh, if I cat a file, like this, cat omp_hello.c, the top went off the screen. I can't see the top of my file. That's no good. So the solution was, they said, let's make a command called more. Just say more omp_hello.c. Now it goes through one screen's worth of stuff, and then it says More at the bottom, and that I'm 69% of the way through the file. I hit the space bar and it shows me the rest of the file. That worked pretty well for being able to look through a file. Problem is, I'll get to the end and say, oh shoot, what was that line right above that?
So they came up with another command called less, because less is more. Programmers have a weird sense of humor. So we say less omp_hello.c. It shows the same kind of thing it did under more, and it tells you where you are down here. But now I can use my up arrow and down arrow keys to go up and down through the file.
Another handy feature of this: there are commands in here called printf, right? So let's say I'm all the way at the top, and I wanna look for the print command. I'm gonna use the slash, and a slash says go look for this. So I'm gonna say slash print and hit Enter, and it jumps me down. Oh, in fact, there was one up top, because it said "only prints the total." So it actually found one that I didn't even have in mind. So I hit slash again and Enter, because I wanna see the next occurrence of print, and it jumps me down here where it says print hello world. I do it again. It goes to the last one, and then it says pattern not found when it gets to the very end. So that's how you can look at files.
Now let's say I wanna edit files. That gets to be a little trickier, because there are all kinds of text editors we can use. I'm gonna show you the way to do it through the web interface in a little bit. But there are a couple of ways we can edit files. Now, if you're going to edit lots of files on Linux systems, I highly suggest that you learn to use a program called Vim, V-I-M. I spend probably half my life in Vim, because I work on these systems all day, every day. It has the learning curve of a brick wall. The most often asked question on Stack Overflow, at least as of a couple of years ago, was how do I get out of Vim? It is completely non-intuitive. However, it's also very powerful. You can go in and do a global search and replace. You can reformat things. You can search for strange patterns.
This, followed by some unknown number of characters, and then something else. It's very powerful for that kind of thing. So if you're gonna spend a lot of time in it, use Vim. If you're not and you're just trying to edit a file or two, there's a program called nano. Let's look at the easier file here, because we don't need to see multiple screens.
Can I see the syntax of the printf command you just used?
Yeah, that's in my C file. I'm not even covering C, but sure, there's my printf.
How do you get out of less?
Q, sorry, I should have said that. Q will get you out of that one.
Let's look at the other one here. This is, again, one of my files that I use for demonstration purposes. This is usually what I give people when you go to submit your first batch job on Beocat, myhost.sh, because all it does is go out and run hostname. A shell file, which is what this one is, is just a bunch of these commands strung together. So if I wanted to change directory and then cat a file, whatever, all these kinds of things, you just put them in a file all together, and that's called a script or a shell script. So that's what we have here. We have cat myhost.sh. Again, really tiny. It's just one line that tells it that we're running it under the shell and another one that says I'm running hostname.
But let's edit this file now. I wanna say nano myhost.sh. Nano is a quick and easy text editor on Linux. If you're doing this on your machine, you won't be able to write to my file. If you tried to write to my file, it would say you don't have permission to do that. So feel free to try. You won't be able to. So here's the file. Now one thing you may have noticed: how much have I used the mouse? None, because the mouse has no effect on what we're doing here.
Use Emacs.
Yes, and that's fine too. We have that installed. I'm not an Emacs guy, but good for you.
Emacs is a nice little operating system, if it just had a decent text editor.
So this, nano, is a very easy text editor to use. You can't do a whole lot with it other than arrow around and type. Everything I'm doing here, I'm using arrow keys. I'm not using the mouse at all. If I click the mouse over here, it doesn't change where I'm at. I have to use the arrow keys to go to the end. So here I'm going to add another line to the end, and I'm going to say sleep 3600. Again, I'm just doing two commands in a row. I'm running the hostname command, and sleep says don't do anything for however long I tell it. 3,600 seconds is an hour. This is a good example if I want to say, hey, look, the program's still running even though it's not really doing anything. I just say sleep for an hour. Just hang out, don't do anything.
Now that I have this here, I want to save this file and exit. You see these little carets down here. These are shortcuts using the Control key. So if I had multiple pages, I could do Control-V for the next page and Control-Y for the previous one. I have to look at the bottom for everything except exit, because all I ever use nano for anymore is showing examples like this. So I'm going to do Control-X to exit, and it asks if I want to save. I say yes. It asks for the file name, but it defaults to the one you had already, and we're done.
Let's move some files around. So let's say that this myhost.sh file is there, I like it, and I want to use it, but I don't have it in the right directory. I want to put it back in my home directory. So I'm going to use the copy command. Once again, they shortened everything. So it's cp. It's not copy, because that would be too easy. So cp to copy. And I'm going to say this file, myhost.sh.
And notice I did the tab complete so that I don't typo anything. If I put the wrong start in there and hit Tab, Tab, it's just going to beep at me and tell me that I don't know what I'm doing. So I'm going to tab complete through there and make sure I don't typo anything. And then I want to put it in my home directory. What's the shortcut for my home directory? Yeah, the tilde. And there we go. It's now copied. If I do an ls, tilde, slash, myhost.sh, that'll show me that the file now exists in homes as myhost.sh.
Most Linux commands have what are called command flags, and that means at some point in there, it has a dash something. In this case, I want to see more information about this file. I do ls -l, where l means long, then tilde slash myhost.sh. And now it's going to tell me a lot of information about this file. First here, these are the permissions on the file.
Now we're going to talk about some file access. You're covering that in your session, right, Dave? The file permissions and ACLs?
Right, tomorrow.
Okay. Traditional Unix permissions don't use ACLs, and that works out really well for some things. For Beocat, a big multi-user system where you want to share your home files with him but not with her, it gets to be pretty complicated. So we don't use these much in here.
So you see this rwx. That stands for read, write, and execute. These are standard Linux permissions. You see this three times here. It's read, write, execute; read, write, execute; read, write, execute. The w is there on this first one. These others don't have it, which means they don't have that permission. So right now I have read, write, execute. Then comes my group, and you can only have one group that's the owner per file. In my case, the group is Kyle Hudson underscore users. It tells me right there. That group has read and execute permission but not write permission.
And the rest of the world has read and execute permission but no write permission.
There are times that you're going to want to run something like a script. This is the most common thing we see here: people want to run a script, they type it, and nothing happens. It says permission denied. That's what I was trying to think of. I see it so often that I don't even think about it. Permission denied. If you do change mode, chmod +x, that's saying add executable permission to myhost.sh. In this case it's not going to do anything, because I've already got it on there. But I could say chmod -x. And now if I look at that same file, you'll see that the execute permission has been taken off. So as opposed to up here, I still have read and write, group still has read, world still has read, but nobody has execute permission.
Before I try to run that one, let's first run the one that we have in this directory. I do ls -l ./myhost.sh. Dot means my current directory, so this is myhost.sh in the current directory. Yeah, it would help if I didn't typo it. You'll see this one, and you'll notice it's also in green. That also tells me that it's executable. So the one in my current directory, the Beocat intro directory that I'm in now, does have execute permission. So I can do ./myhost.sh. It ran the command. It tells me I'm on Selene, and now it's waiting that hour that I told it to. I'm going to quit that, so I'm going to do Control-C.
If I want to run the one that I put in my home directory, which doesn't have execute permission, I'm going to say ~/myhost.sh. And there you'll see it says permission denied, because I don't have that execute permission on there. Now I'm going to arrow up. This is another handy piece of information. If you arrow up and arrow down, that'll go through the previous commands.
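The copy, permission, and run sequence above can be reproduced with a stand-in script. This is a sketch under stated assumptions: the first line of the real myhost.sh is never shown in the session, so #!/bin/sh is assumed; the directory fakehome is a hypothetical stand-in for the home directory; and the sleep line is left out so the sketch finishes immediately.

```shell
# Sketch only: a stand-in for myhost.sh, copied and run in a scratch directory.
set -eu
work=$(mktemp -d)
cd "$work"

# Assumed contents: a line naming the shell, then the hostname command.
printf '#!/bin/sh\nhostname\n' > myhost.sh
chmod +x myhost.sh                     # add execute permission

mkdir fakehome                         # hypothetical stand-in for ~
cp myhost.sh fakehome/myhost.sh        # cp SOURCE DESTINATION

chmod -x fakehome/myhost.sh            # take execute permission away again
ls -l fakehome/myhost.sh               # the permission string now has no x in it

# Without execute permission the shell refuses to run the file.
if ./fakehome/myhost.sh 2>/dev/null; then
    denied=no
else
    denied=yes                         # this is the "Permission denied" case
fi

chmod +x fakehome/myhost.sh            # put the permission back
ran_on=$(./fakehome/myhost.sh)         # now it runs and prints the host name
echo "denied before chmod +x: $denied; ran on: $ran_on"

rm fakehome/myhost.sh                  # rm removes the copy
```

The check in the middle is the situation described as the most common problem: the file is there and readable, but it will not run until chmod +x has been applied.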
So I'm going to go back, and I'm going to chmod and put the execute permission back on there again. And now I can run it, and it does the same thing. Now I've decided, though, that I don't want it there after all. I like the one in the Beocat intro directory. I don't want things cluttering up my home directory. So I'm going to remove the file. That is rm ~/myhost.sh. And now if I do the ls from before, it doesn't exist. Do you have a question?
So the x stands for execute? You added it and then you can run it?
Yes. For purposes of showing what happens, I took the permission away, then I tried to run it, and you saw it doesn't work. I had to add that permission back in. That's probably the most common thing we see. People try to run scripts and it says permission denied, and you have to give yourself permission to execute that file.
Questions? I know we're covering a lot of ground, and text interfaces are not a lot of fun. I understand that. Once you get used to them, and like I say, I live all day long in these text interfaces, for productivity they're great. For ease of understanding and learning, they pretty well suck. So that's the trade-off there.
Let's go through the OnDemand interface. You saw where I was here before, just to show you. I go to Files, I go to my home directory, and I scroll back down to that Beocat intro again. And there's that myhost.sh. This is through the web interface, through OnDemand. I can bring down this menu and do all sorts of things. I can rename it, download it, edit it. I'm going to edit it in this case. And there you'll see the file I just had before. I'm going to take that sleep command back off. So I highlight that. Since I'm using this in the web browser, my mouse works, all that kind of stuff. You can interact with it a whole lot easier. I click on the save button and close that file out.
And now, if I look at this file again, myhost.sh, you'll see that the sleep command is off the end. And now, if I try to run it, see how it ran? It just came right back to me. It's not waiting for anything.
Have I completely bored and lost everybody? Because like I said, there is no good way of going through this. Dave, hush.
So I copied your myhost.sh to my home directory. And now I'm trying to run it.
Okay. So to run it, you do dot slash. That's a good point right here. If I just type myhost.sh, which is what you'd normally do to run a command, right? It says command not found. That's probably what you've got. Because that's not a command. It doesn't know where to look.
There is what's called a path here. I have a list of topics I want to cover. That's why I keep looking at my phone. And one of those is the path. So if I do echo $PATH, with PATH all in uppercase, this is where it's looking for files for me. I added a couple of things at the beginning that you're not going to have. So I think yours by default will probably start with this, /usr/local/bin, /usr/bin.
She doesn't have all of that. That's the default, actually.
Is it? I thought I had to add that. Okay. All right, at least the default one is on there.
So I have a directory that I keep my binaries in. That's bin. That's one of my few that's not world readable on my system, or at least most of the files in there aren't. So that's where it's looking for files. There's nowhere in here that has Beocat intro in it, right? So if we want to run something that's outside of this, we have to give it the path to the file we're trying to run. So when I want to run myhost.sh, which is in Beocat intro right now, I can do it this way. I can do slash homes, Kyle Hudson, Beocat intro, myhost.sh, and it ran it. Now, as a shortcut to that, I don't have to type all that stuff out.
I can say current directory, which is the single dot, so ./myhost.sh. But you have to be explicit with it. You're saying, I want to run the one in this directory.
So I am in my homes, which is where the copy of your file is. And when I do the dot slash on the file, it doesn't want to do that.
Is it executable? Yeah, so chmod, space, +x, and then the file name, and I bet it works then.
Okay.
You can add to your own path. You can say PATH... I have to export, right? export PATH equals. I don't want to erase what's out there. I just want to add to the end. So I say $PATH, that's the previous path, and then dot. And now when I look at my path, you'll see dot. So now I can do myhost.sh, because the current directory is in my path. That's not saying homes. It's whatever directory I'm in at the time, so if I was to go somewhere else, it wouldn't work anymore. I generally don't like doing this. I like to make it explicit that you're running from the current directory.
There's another nice little command called which. If I say which myhost.sh, it's going to tell you it's running this one. If I say which hostname, it's going to tell me it's in /usr/bin/hostname. That's the actual command that it's running. And it's basically going through this whole path. So it's going to look in the first one we have here. It's going to look in homes, Kyle Hudson, bin, and see if there's a hostname there, and there's not. So then it's going to go to opt, Beocat, share. Not there. /usr/local/bin, not there. /usr/bin. Hey, I found it. So that's the order it's going to search in. When I went to do myhost.sh, it said it's not here, it's not here, it's not here, all the way to the end. And it finally got to the end and said dot. Oh yeah, that's the one I'm going to run. So that's what the which command does. It tells you which one I'm running, where I'm running this from.
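The path search described above can be sketched in a scratch directory. Two assumptions are worth flagging: the stand-in script just echoes a fixed string rather than running hostname, and the shell builtin command -v is used in place of which, since it reports the same kind of information (what would run for a given name) and is available even where which is not installed.

```shell
# Sketch only: how a bare command name is resolved through PATH.
set -eu
work=$(mktemp -d)
cd "$work"
printf '#!/bin/sh\necho "hello from myhost.sh"\n' > myhost.sh   # stand-in script
chmod +x myhost.sh

echo "$PATH"                    # colon-separated directories, searched left to right

# The scratch directory is not on PATH, so the bare name is not found yet.
if command -v myhost.sh >/dev/null 2>&1; then
    before=found
else
    before=missing
fi

./myhost.sh                     # an explicit path always works

PATH=$PATH:.                    # keep the old path and append the current directory
export PATH
after=$(command -v myhost.sh)   # now something is reported for the bare name
out=$(myhost.sh)                # and the bare name runs
echo "before: $before; after: $after; output: $out"
```

Appending the dot last means every other directory on the path is searched first, which matches the "not here, not here, all the way to the end" order described above. It also shows why the explicit ./ form is the safer habit: the dot entry follows whatever directory the shell happens to be in.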
That can be handy if you have several different versions. If you look in this directory, actually, the very first thing up here is a.out. When you compile a C program, by default it goes to a program called a.out. So there might be a dozen different a.out files on your system, and which is a really handy command for something like that, if you're using something generic like that, or like I said, for just figuring out where things are actually running from.
If I run this command the way I did, which hostname, it doesn't actually run the command. It didn't tell me Selene. It just told me, that's the one I'm going to run if you run that command, before you ever run it.
My host... I forgot to change it.
Oh, that means it's still running, right? So it's still running. That's easy enough to fix. Control-C will cancel out of pretty much everything. Control-C just sends a signal to that program that I want it to stop.
Going through this ls -l ./myhost.sh again, we kind of stopped at permissions. So this part tells me the permissions. This tells me the user that owns it and the group that owns it. In most cases, that's going to be your name, and then your name underscore users, which is the default group that you're in. There are some people on campus that are in project groups, and you might have your project group listed there, especially if you're dealing with bulk folders and things like that. We already established nobody here in this classroom has that, but there might be somebody on Zoom or somebody watching this later that does. It'll also tell me the time that the file was last modified. And the file size; this one is 17 bytes. So you can tell how big these files are and when they were last edited.
So I have one here from back in our old system, a QCED file. That's how we used to do this. So we have everything from 2013 on up to the file we just edited today, which doesn't even show the year, because for recent files it gives you the time instead.
Again, the switches. I said l is long. I can do -lh for human readable. Now instead of telling me how many bytes it is, it'll tell me it's 577K. That's really useful when you get strings of numbers that are this long. Is that one megabyte or 10 megabytes or a hundred megabytes? How many zeros? The h is really handy for that kind of thing.
I can also say -lah. The a means all the files. By default, anything that starts with a period is left out of the file listing. So this will show me those files also. We only have a couple of them here. I have dot, which is my current directory. I told you dot is the current directory; there's an entry for that in every directory. And it tells me I have 841K worth of files in this directory, which is also a useful thing to have. That doesn't work universally. As a matter of fact, at Wichita, it does not work. So for anybody on Zoom or watching this from the Wichita State system, their file system doesn't work the way ours does, so that part doesn't work for them. It also shows me dot dot, which is the reference to the directory above mine. So that's my homes, Kyle Hudson, without the Beocat intro. And I've got 162 gigs in there. I should probably trim that down a little bit. Got a lot of files there.
If we were to look in my home directory, you'd probably see a lot more of these. There are files that are created for you all the time. I'll show you what they are, but I won't show you the contents, because a lot of times programs put hidden information in there, things that are specific only to you. So we don't want people mucking with those. There might be some password kind of things in there.
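The listing flags discussed above can be compared side by side in a scratch directory. This is a sketch; the file names visible.txt and .hidden_settings are made up for illustration.

```shell
# Sketch only: what -l, -h, and -a each add to a listing.
set -eu
work=$(mktemp -d)
cd "$work"
printf 'hostname\n' > visible.txt
printf 'per-user settings\n' > .hidden_settings   # hypothetical dotfile

ls -l       # long: permissions, owner, group, size in bytes, modification time
ls -lh      # long, with sizes in human-readable units such as K, M, G
ls -lah     # -a also lists names starting with a dot, including . and ..

plain=$(ls)       # what a default listing shows
all=$(ls -a)      # what shows up once dotfiles are included
echo "default listing: $plain"
```

The flags can be combined in any order after a single dash, so -lah and -hal mean the same thing.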
Mine, I know, are shut off all right, but just to show you. So I'm gonna cd and do ls -lah. Now that's gonna be a whole bunch of stuff, right? What command would I use to look at parts of a file? I don't wanna see all those files at once. If I do this, I get, and I see the last little bit. I don't wanna do that. less. Now, here's the thing: if I do less, what file name am I looking at? Oh, that's a tricky one, right, see? So we have what's called a pipe. And this is one of the most powerful things that Linux does: it lets you take the output from one command and use it as the input to another command. So I have this ls command that I wanna run, but I want to send it to less. Instead of the output going to the screen, I want the output to go to less. So I'm gonna use a vertical bar, and that says take the output from this and send it to the input of this. You can see all the files I have here. And you can see some things that you wouldn't be able to see from my ls command before because they start with a period. There's a .ansible directory, for instance. .ansys, I've got all sorts of things here. Apptainer, aptitude, .bash_history. This is a really interesting one. If you look in this file, you'll see all the commands that you've run. That's how it knows them when you up arrow and down arrow. If I was to log out and log back in, it would remember that from a previous session because it saves that in the bash history. .bash_profile and .bashrc, those are files that are run automatically when you log in. Don't mess with those unless you have a good reason to or know what you're doing. But there are things that we can do there. You can see a whole bunch of files here that start with dot. And those are files that don't normally show up. They're hidden on purpose. We don't want to see them unless we're explicitly looking for them. Yeah, all I see is that much of it. Because I... Just even with less.
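A minimal sketch of the pipe idea follows. The interactive form from the session is left as a comment, because less waits for keystrokes; the rest uses a scratch directory with made-up file names so it can be pasted anywhere.

```shell
# Interactive form used in the session (press q to leave less):
#   ls -lah | less

# Scratch directory with a few stand-in files.
workdir=$(mktemp -d)
touch "$workdir/.bash_history" "$workdir/.bashrc" "$workdir/notes.txt"

# The vertical bar feeds the output of ls into the input of the next command.
ls -lah "$workdir" | head -n 3

# Pipes chain with any command that reads standard input. Here grep counts
# the names that start with a period (skipping the . and .. entries).
hidden_count=$(ls -a "$workdir" | grep -c '^\.[^.]')
echo "hidden files: $hidden_count"
```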
Yeah, well, right, because I can scroll up. That's through my web browser, I'm scrolling up. But basically it just spit everything onto the screen, and there's a couple hundred lines there that just scrolled down the screen. It's okay, you get to see this much of it because your screen's not 10 feet tall. When I'm working with my normal desktop, instead of seeing this many, I'll probably see 50 lines or so. Pipes, like I say, are one of the most useful things. I do this all the time. One of the things that we do to run different programs is we use modules. We'll do module avail. And we're gonna talk a little bit about some other ways of doing this. But module avail shows the programs we already have installed on Beocat. And if I do module avail, you'll see that there are lots and lots of these. I think we had 800 last I looked. Yeah, so these are all the programs we have installed for people to use on Beocat. And you can see that by default, the module avail command goes to more. Now the module system is pretty cool though, because I can say module load Python. And now look at what my path does when I do that. We talked about the path. And this is as far as I'm gonna go into modules. Dave's gonna cover a lot more of this kind of thing. But now you can see there's a whole lot more stuff in here. Instead of just the few things I had, the first place it looks now is in our software directories for Python. And it also looks for some stuff that Python has as prerequisites. So it adds to that path. And when I say module unload Python and look at my path, it's back where I left it before. To see what? To see the software, that's module avail. Module avail. And like I said, I think Dave's gonna go over that in some detail probably. Anything else we really need to talk about? Pipes, filters. There are a few sort of standard commands that are really useful to know. And I don't expect anybody here to become an expert in them in this class. It takes some working with them.
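The effect on the path can be imitated by hand. This is only a sketch of the idea, not what the module system literally runs, and /opt/example/Python/bin is a made-up location rather than a real Beocat directory.

```shell
# On the cluster the real commands are (left as comments because the
# module command only exists where a module system is installed):
#   module load Python
#   echo $PATH
#   module unload Python

old_path=$PATH

# Roughly what loading does: put the package's directories at the front,
# so the shell searches them first.
PATH="/opt/example/Python/bin:$PATH"
first_entry=${PATH%%:*}
echo "searched first: $first_entry"

# Roughly what unloading does: put PATH back the way it was.
PATH=$old_path
```

A real module load usually adjusts other environment variables as well, which is one reason to let the module command do it rather than editing the path yourself.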
Beocat intro, there we are. So we'll take that print command, for instance. If I wanted just to look in that file and only look for those print commands, I can use the grep command. Very handy. The name is something obscure: something, regular expression, print. So I can give it a pattern, even a strange pattern, and say go through this file and find it. So I can say grep print, in MPI example.c, 'cause we saw that file earlier. Oh, that was OMP. OMP hello is the one we used, wasn't it? There we are. And for every line in that file that has print in it, it'll print it out, and it'll highlight the match by default. So that line had a print in it, this line had a print in it, this line had a print in it, this line had a print in it. Yes. Module avail will give you a list of, like I say, something like 800 of them, last I knew. Or what could be added in addition to that? Is that what you're asking? Okay. Okay, that's a good one actually. So I'm going to do module avail. Now, this is kind of a tricky thing to do because it's kind of a pain in the butt. By default, this sends everything to more. We don't want to do that. What was the command to do that? Minimal? And that's the... I'm going to do it the way I normally do it, because there is a way that you can tell it to do this automatically. I'm going to show you the way I do it. Your output goes to what's called, well, we have three devices we look at a lot. Standard input, which is your keyboard. Standard output, which is your screen. And standard error, which is also your screen, which makes it kind of tricky because we have to differentiate between the two. When I do module avail, because it's going through more, it's actually sending its output to standard error. We want to send it to standard out. So this is a weird one. I have no idea why it works this way. I just know that it works.
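Here is a small grep sketch along those lines. The file hello_demo.c and its contents are invented for the example; it is not the file from the session.

```shell
workdir=$(mktemp -d)

# A tiny stand-in C file (hypothetical contents).
cat > "$workdir/hello_demo.c" <<'EOF'
#include <stdio.h>
int main(void) {
    printf("Hello\n");
    printf("Goodbye\n");
    return 0;
}
EOF

# Print every line that contains the pattern "print".
grep print "$workdir/hello_demo.c"

# -c counts the matching lines instead of printing them.
matches=$(grep -c print "$workdir/hello_demo.c")
echo "lines containing print: $matches"
```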
So we say 2, which is standard error, and we're going to redirect that to 1, so ampersand 1: 2>&1. That's saying take everything from standard error and send it to standard out. This is complicated, I know, and it's weird. But if I do that by itself, you'll see that instead of going through more now, it should just show everything. It did put it into more. I didn't do that. Well, I was going to say grep -i dyna, 'cause I know I do this a lot. This is how I find modules. We don't have LS-DYNA here. What is the other commercial one? COMSOL? Is that the one we do have? Yeah, we have it installed in Wichita. That's why I thought it might be there, but apparently we don't have it here at K-State. Okay, yeah, we do have some ANSYS stuff. So what that does is it says, look for those lines that contain this. grep -i means case-insensitive. I told you everything has switches. By default, grep is case-sensitive. Dash i makes it case-insensitive. So I said module avail, look through those 800 packages for lines that contain COMSOL in a case-insensitive manner. So there's your COMSOL. All the lines that contain COMSOL in there. And I know it's confusing. I get that. It took me a long time to wrap my head around this. Like I said, I'm not expecting anybody to be an expert on it, but I want you to know that it exists. And if you look at our support site, we have several examples of what to do and how to make this kind of thing work for you. I think I was hiding mine there. And we don't have licenses for it, but we have the software installed. So you have to point to your own license server back in your department or whatever. Yes, we do that a lot. We say your license is over on this department server. You'll have to get the name and the port and all that stuff from your department people. But yes, we can use it that way. Yeah, we can work with you if you need more versions of it. Yeah.
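The redirect is easier to see with a stand-in. In this sketch the function list_modules plays the role of module avail by writing to standard error, and the module names and version numbers in it are made up. On the cluster the equivalent command would be module avail 2>&1 | grep -i comsol.

```shell
# Stand-in for module avail: prints made-up module names to standard error.
list_modules() {
  printf '%s\n' 'ANSYS/0.0' 'COMSOL/0.0' 'Python/0.0' >&2
}

# A pipe only carries standard output, so grep receives nothing here.
# (Standard error is thrown away so it does not clutter the screen.)
without=$(list_modules 2>/dev/null | grep -i comsol || true)

# 2>&1 sends standard error to wherever standard output is going,
# which is the pipe, so grep can now search it.
with=$(list_modules 2>&1 | grep -i comsol || true)

echo "without 2>&1: '$without'"
echo "with 2>&1:    '$with'"
```

The -i switch is what lets the lowercase pattern comsol match the uppercase name.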
With that, we're ready to take a break here and give you guys a quick bio break. So come back, and we want to start right at 3:30. Dave, you've got a full hour. Do you want to go 10 minutes from now? Five minutes, and I'm going to pause the recording. I'm not going to stop it, but I'm going to pause it so that we can pick up. I'll resume, there we are. Now we're going. Okay, Kyle just gave you an overview of how to use Linux. What I'm going to cover is an introduction to our Beocat supercomputer itself, and then I'll talk to you about how we go about using it. And I know that some people are also probably attending this via Zoom from Wichita. We manage the cluster down there as well. So a lot of the software that I'm going to be talking about and the methods are going to be very similar. The hardware is what's different there, and I'll try to point out a few of the differences. So this is a diagram of what Beocat looks like. When you log in, there are two head nodes, and you'll get put on one of those two depending on what your IP address is. If you're logging in from home and your IP address changes, you may get put on one or the other on subsequent days. They're both the same. It really doesn't matter which one you're on. The head nodes are there to allow you to get your jobs ready to run, move files around, compile codes if you need to, and things like that. They are CentOS machines just like the compute nodes. They give you access to the Slurm scheduler, and the Slurm scheduler is how you run your codes. Once you get them ready and get a job script ready, you submit that job to the Slurm scheduler. You tell the scheduler what resources you need, and the scheduler then decides which compute node, which are at the bottom of this diagram, to put it on. On Beocat, we have about 10,000 Intel or AMD cores on over 350 compute nodes. So that's somewhere around $3 million worth of hardware. This is where BeoShock is gonna be different.
BeoShock has, I think, 20 compute nodes with 36 cores each, if I recall correctly. Yeah, most of these are Intel processors. The AMD systems we have are these ones in red on the left that we call warlocks, which might have up to 128 cores. Most codes, when you compile and run them, will run fine on either one of those. So you don't really have to worry about that. There are a few that would have issues with one or the other, but that's a rarity. One of the things that we get asked a lot is why do we call this a supercomputer? It can be called a cluster computer because it's a cluster of workstations. Supercomputer kind of describes it as well because it's better than just running on a laptop or PC. A compute node, and I said we have 354 compute nodes on Beocat, is a computer in itself. It can be just like your laptop or your desktop machine. It's just usually a lot more powerful. Your laptop might have four or maybe eight compute cores. And again, I said that some of these new AMD systems could have as many as 128 compute cores in one system. The memory is also typically much larger. You might have 16 gigs in your laptop, for example, where these compute nodes could have 128 gigs, and we have some that have one and a half terabytes of memory. A terabyte is a thousand gigabytes. So we're just dealing with a lot more powerful computers here, but there are similarities to your desktop or laptop system. Another thing that makes a supercomputer super is the network connections. We provide 30 to 100 gigabit per second links between compute nodes. And I have a couple of the technologies listed as InfiniBand and RoCE. RoCE stands for RDMA over Converged Ethernet. You really don't need to know what those mean. It just means they're very fast and they're low latency. Latency is the minimum amount of time that it takes to send even a small message.
And in the case of InfiniBand, it's about a microsecond to send even a small message from one computer to another. And again, we're sending large amounts of data at 30 to 100 gigabits per second. That's somewhere around a thousand times the communication rate of your internet access at home. I have T-Mobile 5G internet at home that gets me up to 50 megabits per second. And again, we're up in the gigabit per second range for our communications between nodes. So about a thousand times faster than your home internet access. All these are also connected to a petabyte file server. A petabyte, does anyone here know how much data a petabyte is? Okay, a petabyte is a thousand terabytes and a terabyte is a thousand gigabytes. So it's a lot of storage, and we have it mostly full. Scientists have an unlimited ability to fill up whatever disk space you provide. So if we added another petabyte, we would fill it up in about a year and a half. That's a pretty good rule: it takes a little while to fill it up, but it will get filled up again. The access rate to that petabyte is very fast. Again, I have it listed as 1/10/40. We have some machines, the moles, that have only one gigabit per second access, which is still pretty good. We got a really good deal on those, so they don't have as fast an access to the file server, but everything else is 10 to 40 gigabits per second, very fast access to the file system. We also have a fast scratch, which I'll talk about more tomorrow. We also have a lot of graphics processing units, or GPUs, in there, almost 150. And 145 of those are 32-bit NVIDIA GPUs. These are the same kind of GPUs that you could buy at Best Buy. They're just the very high-end ones. So a lot of these might be a $1,500 NVIDIA 3090, for example. And we have a couple of systems that might have eight of these GPUs all in one system.
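The thousand-times figure is a round number, and a quick check with shell arithmetic shows the range behind it. The sketch only uses the speeds quoted in the talk, converted to megabits per second.

```shell
home_mbps=50             # home connection quoted in the talk
link_low_mbps=30000      # 30 gigabits per second
link_high_mbps=100000    # 100 gigabits per second

low_ratio=$(( link_low_mbps / home_mbps ))
high_ratio=$(( link_high_mbps / home_mbps ))
echo "node links are ${low_ratio}x to ${high_ratio}x the home connection"

# 1000 terabytes per petabyte, 1000 gigabytes per terabyte.
gb_per_pb=$(( 1000 * 1000 ))
echo "one petabyte is $gb_per_pb gigabytes"
```

So the links come out between 600 and 2000 times the home connection, which brackets the rough figure of a thousand.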
And when we tell people that, the gamers, the people who play games on PCs, just start drooling because, oh, what I could do with eight GPUs in one system. But the thing is, we don't have HDMI cables coming out of any of these. We're not using them to do graphics. We're using them to accelerate scientific codes. These are streaming processors. They're good for doing graphics, but they're also good for accelerating some scientific codes, definitely not all. Again, 145 are the 32-bit GPUs, and only four are the more expensive Tesla-class GPUs, which are 64-bit capable. And these are even a little bit older. The newer ones of those might cost like $11,000 each. So they're very pricey. We name our classes of machines. So the elves on the right are the oldest ones. They're still very good. They mostly have 16 cores each and about 64 gigs of memory. So about four gigs of memory per core. And as you work your way over towards the left here, these are our newer nodes. The wizards are our newest Intel systems and the warlocks are our newest AMD systems with the EPYC processors in them. And I don't wanna go through all the different numbers of cores and the memory sizes on each. When you submit a job to the Slurm queue, what you wanna do is tell it, I need this much memory and this many cores, if it's a parallel job, and it will go out and find a matching compute node and say, hey, I have room on one of the dwarf nodes for that, so I'm gonna go ahead and start that. If it doesn't have room, then your job will sit in the queue for a little while. Okay, so Kyle kind of introduced our group a little bit. So I'm not gonna go through this too much, other than here's a picture of Dan Andresen, our director. I'm the application scientist. So if you have problems with your job, or with getting your code optimized or parallelized, I'm a good resource for that. Adam and Kyle are our system administrators.
If you have any problems at all, you want to email beocat@cs.ksu.edu. That gets to all three of us. Some people email us directly, but it's better to send it to the Beocat help address because then all three of us get it. I commonly have people who I work with regularly email me directly, but then all of a sudden they'll say, well, the compute node that my group owns is down, can you reboot that? And I have to say, well, no, I'm not a system administrator. I don't have the root password. So if you send an email to the support address, then it gets to all three of us and we can help you a little bit better. Okay, does everyone in the room at least have a Beocat account? Is there anyone that doesn't? Okay, okay. Adam can check on that if you give him your EID. So I'm not going to spend too much time on this because this is all in our documentation. If you need an account, you go to the documentation at beocat.ksu.edu, you choose get an account, and you follow that. Dan will have to approve it, and once he does, your account is created and you can log in. Getting in from an Apple system or Linux is a matter of opening up the terminal and using SSH. I give you an example here of how I would get in using SSH. I put my Beocat username at beocat.ksu.edu and I can get in that way. If you are faculty or staff, then you have multi-factor authentication, so anytime I SSH in, I get a message on my phone and I have to approve my access. I'll talk more about that tomorrow. If you're on a Windows system, we recommend MobaXterm because MobaXterm gives you a terminal interface. It also helps you transfer files pretty seamlessly between your Windows system and Beocat. So if you're editing files on the Windows system and wanna push them down, MobaXterm is actually pretty slick with that.
You certainly can use PuTTY, it's another common one, but then you have to scp things down manually or use something like FileZilla, so you have to use two different things. Yes. Do we choose the portable or the installer? I would say use the installer, unless you can't. Like if you're installing on a lab machine, it wouldn't let you do that. It's all right, but it's not a big deal. So I'm not gonna cover this much either because Adam's gonna talk for a full hour tomorrow on the graphical interface. Kyle showed you some stuff through getting in on the web browser; that's Open OnDemand. And again, there's a lot more that you can do with that, but Adam's gonna cover this tomorrow. So I think I will leave all that up to him. What I wanna move on to is, well, when you get into Beocat, what do you really wanna do? And the first thing that I recommend is using the kstat tool. It can tell you what compute nodes there are, what jobs are running there, and what jobs are in the queue waiting to run. So it's a good tool for all the information on what's going on. It also makes heavy use of color, so that if you're running something and it's not very efficient, it may warn you with a yellow background. If you're running something and it's flashing red, it means you should probably look and see what it's doing or contact us. kstat is actually a Perl script. Perl is a programming language. I've written this up and kept adding to it over the years. Slurm has its own commands like squeue and sacct, but what I do with the kstat script is take all that stuff and some information from elsewhere, combine it all together, and colorize it, so that it's a lot more helpful than the tools that Slurm provides. So I wanna show you what it actually looks like. Is that big enough for the people in the room? Let me bring my window down first. So a lot of commands have help information built into them. I did that wrong.
So if you take a typical command and give it dash dash help, like kstat --help, that'll commonly give you the help information, or at least a short version of it. So I give a usage statement here, and then I give you some common things that you can do with it. If you just type kstat, that's gonna give you a list of all the compute nodes, the jobs running on them, and what's in the queue. So that's very useful, but it's a ton of information, because remember we have 354 compute nodes and there might be thousands of jobs on those compute nodes. If you do kstat -h, it gives just the host information. I'll show you some of these in a minute. -q gives you what's in the queue, and -c gives you the core usage for each user, so you can see the users that are hogging the machine, for example. Another useful one is kstat with -d and then a number of days. It'll let you look at all your jobs that have finished in the last, say, 10 days if you give it the number 10, and it can tell you how much memory each one used and how it completed. Was it successful? Was it a failure? Did it run out of memory? Things like that. So let's just kind of go through some of these. If I do -h, again, this is going to show the hosts only, and I'm going to pipe this to more or less. It doesn't matter so much which. So it starts with the oldest machines, which are the elves, and then it's gonna run through to the newer ones. Let's look at what's colorized here and I'll try to explain things. If we look at elf seven, we see that it's using nine of 16 cores. The number of cores is in cyan, or light blue, because the node is not totally full. If you look at elf eight, it's in a darker blue, meaning that compute node is full. All the cores are used. If it's red or yellow, it means it's empty. Red actually means it's empty but that compute node is owned by someone, or in this case, it's owned by the group called reserve.
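The kstat switches named above can be collected into one short sketch. kstat is a site-specific script, so the block checks for it first and does nothing useful elsewhere; the switches are the ones mentioned in the talk, and the exact spacing of -d 10 is an assumption.

```shell
if command -v kstat >/dev/null 2>&1; then
  kstat --help            # usage summary
  kstat -h | head -n 20   # host (compute node) view, first 20 lines
  kstat -q | head -n 20   # what is waiting in the queue
  kstat -c | head -n 20   # core usage per user
  kstat -d 10             # your jobs that finished in the last 10 days
  kstat_status="ran"
else
  echo "kstat not found; these commands only work on the cluster"
  kstat_status="skipped"
fi
```

In an interactive session, piping to less or more instead of head lets you page through the full listing.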
We can put classes, for example, in the reserve queue, so they have priority to run there. Anything that doesn't have a group on the right in orange is unowned. Anyone can use it with the same priority. If groups have contributed money to buy compute nodes, they get priority on those compute nodes. You can still run there. It just means that if someone from that group wants to run, it'll kill your job off. Your job automatically gets put back in the queue to run on a different compute node, but you have to wait longer than you would on an unowned node or one you have priority on. So a lot of the elf nodes are unowned. So what does the yellow on elf seven mean? The load level is one even though there are nine cores being used. So that's not really good. Usually if you have nine cores allocated, you would like to see a load level of nine. Sometimes it's actually higher than the number of cores because you can run multiple threads on the same core, but this is an indication that something is wrong. Someone may have asked for nine cores but be running a job that's not multi-core capable, for example. So this is something I would look at and maybe contact them and say, hey, you're not using that resource efficiently. So that's where a lot of the colorization comes in. Here again, it says this compute node is down and it's not responding. That's where Adam or Kyle would look at it. These are like nine-year-old machines. So they might look at it and say, well, we're gonna say that one's dead and not revive it anymore. See, Adam's smiling. He likes killing nodes off, okay? I wanna run them forever. He likes killing them off. So here we see a few that again have very low load levels, zero essentially. They're allocating one core but running little to nothing. So again, when we get to looking at the queue, there's probably stuff in the queue waiting to run, but we have idle nodes. So why is that?
Well, anymore, a lot of what's in the queue is asking for a fair amount of memory, either more than these nodes can handle or more per core than they can handle. The other reason that these might be idle is because these are owned by groups now. You can see Jeff Comar owns these. Any of you can run there, again, in killable mode, but only if you submit a job that takes seven days or less to run. We don't want you to run too long of a job there because if it gets killed off, then it wastes that time. So yes. Yeah, if you have money. So the answer to that is, in your home directory you have one terabyte. That's free. If you need to go over the terabyte, then you set up a bulk directory. You have to request that, and you have to agree to pay $45 per terabyte per year, yeah, and you get billed monthly. So that's where the petabyte comes in. If you want to store 100 terabytes on there, there's access to that, but it is going to cost you money per terabyte. Now... We're going to make it on par with the cost of buying a USB drive. So the other thing is we do have scratch space. Scratch space is available to anyone, but it gets purged every 30 days in the case of fast scratch. So it's a place where you can pre-position data, and while you're running a code especially, there's lots of space there. Just realize it's going to be deleted after 30 days on fast scratch if you don't move it somewhere else after you run. So we give you a fair amount of time. And I think we have like 270 terabytes in fast scratch. So you can see there's a lot of actual room open. Yeah, our total utilization is very low at the moment, 20%. So, yes, the answer is a lot. The ownership of his data is a little questionable. He has access, we have access. So things like that. So if we just do a kstat -q to look in the queue, for example, let's see what's waiting. Okay, so not much at all.
And one person there has waited for five days, but they're asking for 248 gigabytes of memory and eight cores. The other thing is they're asking for a 552 hour run, which is more than one week. So they aren't going to run on any of the nodes that are owned by other people. You can also look on the right side there and see that they do have priority in the groups for the Kim MRI and Christine Akins. So they have priority access to those compute nodes. But right now, we just don't have a lot queued up. In the past, we've had a lot of people who did parameter sweeps and ran thousands, tens of thousands of small jobs. Each took a couple of hours and would fill up any little gap on our machine. So we commonly had utilizations up at 75%, 90%. More recently we just have fewer people doing that, and more people asking for lots of memory. Okay, so let's move on. Yeah, they have access to a lot of nodes that Christine and the Kim MRI own. They understand that when they ask for this much memory, that's usually a significant wait. So if the scheduler tries to a fair share, almost crying, but then it's crying. The good news is when the utilization is low, you can get in there and access it unless you ask for these huge runs. So kstat is a good way of understanding what's being run and things like that. Ganglia is another way. Ganglia is a web interface, and it's a very extensive resource for understanding what's going on. I don't use it much because I try to include as much as I can in kstat, seeing as that's right there. But if you ever wanted to see things like how much memory your job uses over time, it can pop up a graph of that. So if you ever wanna see anything over time, that's where Ganglia is much better than kstat, for example. And I'm not gonna really show a lot on this, other than I wanna make you aware that it is there. Okay, so let's talk about submitting your first job script.
So let's start with how many people in the room have submitted a job script on Beocat, show of hands. So a fair number of you. So I have this script called sb.hello. I name all my sbatch scripts with sb so they all kind of come up in the same place in the directory. So sb.hello is the script on the right here. Let's just go through what it's doing. Anything with a pound SBATCH is a directive to the Slurm scheduler. This is where we tell the Slurm scheduler the resources we need and how to run the job, or in the case of that first one, what job name to give it. So when you type kstat, it'll show information on each job, including the job name. In this case, I'm just naming the job hello. I'm giving it a maximum amount of time. The time is listed as days, dash, hours, colon, minutes, colon, seconds. So here I'm just telling it I wanna run a max of one hour. If my job takes one hour and one second, the scheduler will kill it off before it gets to the end. So make sure you always ask for a lot more time than you need. In the next two lines, I'm telling it I wanna request one compute node and one task per node. That's one core on that compute node. So I'm not doing anything fancy, like running a multi-threaded job that needs multiple cores. I'm not doing a multi-node job or anything like that. Just one core on one node. And then I'm telling it how much memory to request. So this is one of the most common questions that we get: well, I'm gonna run this job for the first time, how much memory should I request? And our answer is typically that we don't know either. Typically just ask for a lot of memory, run your job, and use kstat to look at how much memory it's actually using. Then next time you'll know better how much to request so that you don't over-request too much.
If you end up using more memory than what you requested, if it's a little more, it can start swapping your memory to hard disk, which will slow it down enormously. And if you go way over, it'll eventually kill your job off, and it'll tell you. When you run a job, it will create a file named slurm, dash, then the job ID number, dot out. So like this, but the pound sign will be replaced by a seven-digit number. That's where everything from standard out goes. So whatever your program outputs to standard out will go here. Also any error messages and stuff like that will go to your slurm .out file. So if you do encounter a case where it's running out of memory, it'll dump that information to your slurm .out file. Also, if you do a kstat -d 10, it'll give you the status of all the jobs you've run in the last 10 days. And that would also say out of memory or out of time. So yes, yes. If you don't provide anything, then you get one node, one core, is it one gig? Yeah, one gig of memory for one hour. So it's not much. So you always really want to provide this information. You can also provide this information on the command line when you submit an sbatch script. I highly, highly recommend putting it in your script, because then if you have a problem and contact us, we can look at your script and see exactly how you ran it. So there's a nice record of what you're doing. So always put this stuff, and all your module loads and stuff like that, in your script. I'll talk more about modules tomorrow. Okay, so down here is what we're actually going to do. I'm running the hostname command and putting the result in a variable called host. And then I'm doing an echo, hello from host. And I'll show you what the sleep does. This is a sleep 300, which is just five minutes, because I want it to stay in the queue so we can look at it while it's there. So I'm gonna cd to my test directory.
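The script on the slide is not reproduced in the recording, so what follows is a reconstruction from the description rather than the actual sb.hello. The directive spellings are standard Slurm options chosen to match what is described, and the 4G memory request is taken from the four gigs mentioned later in the session. The block writes the script into a scratch directory and only checks its syntax; it does not submit anything.

```shell
workdir=$(mktemp -d)

# Write a reconstructed sb.hello (not the speaker's actual file).
cat > "$workdir/sb.hello" <<'EOF'
#!/bin/bash
# Job name shown by the queue tools.
#SBATCH --job-name=hello
# Time limit as days-hours:minutes:seconds, here one hour.
#SBATCH --time=0-01:00:00
# One compute node, one task (one core) on it.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
# Memory request for the job.
#SBATCH --mem=4G

host=$(hostname)
echo "Hello from $host"
sleep 300
EOF

# Syntax check only; the directives are ordinary comments as far as bash cares.
bash -n "$workdir/sb.hello" && echo "sb.hello parses cleanly"
directive_count=$(grep -c '^#SBATCH' "$workdir/sb.hello")
echo "scheduler directives: $directive_count"
```

Keeping the requests inside the script, rather than on the command line, is what gives the record of how the job was run.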
And so again, that's exactly what I listed there, I think. I'm gonna submit it to Slurm now by using the sbatch command and the name of the script. So sbatch sb.hello, and now it gave me the job ID number there, which is unique to this job and this job alone. Now I'm gonna do a kstat --me. This is one I didn't show you yet. It's just gonna give you the information about your jobs and the nodes they're running on, including anything you have in the queue. So it's a good way of filtering your stuff out of what everyone else is running. And this is showing that we're running on wizard 23. It gives my username, it gives hello as the name of the job that we're running, and it says it's running on one core. The word run is in green, okay? Green means that I'm running on a node that I have priority on. If it was red, it would be in killable mode. Someone could come along and kill it off. I'm in this priority group for CS high performance computing. So I have priority there. That's our group, that's Adam and Kyle and Dan, I assume Dan's in it. He doesn't use it, so anyway. So this is one of the nodes that we own or have priority on, and my jobs go there normally. And again, it's just sitting there. If I do a kstat -l, well, -l when you do an ls gives you a long list, and with kstat it's kind of the same thing. It's gonna give us more information about what's on the compute node, like that it's 100 gigabits per second and these things. It's also gonna give the utilization. It only updates this every five minutes because it's running what's called a cron job on each compute node to gather this information. But it's already grabbed that. It says that my job's running 99.8% idle because I told it to sleep for five minutes, okay?
So say I want to kill this off now. If we just let it run for five minutes, that's fine, it'll finish on its own, but I can use scancel and the job ID number to actually cancel it. And if I look for my jobs, it should show me that I'm not running anything. Now, if I look at the list of files that I've got in this directory, the last one there is the slurm .out file that I just produced. So let's cat that. It says hello from wizard23 first, which is what we printed out. Then we sat in the sleep statement and I canceled it, so it says that it was canceled. If I do a kstat -d 1 to look at what I've run today, it says I've run one job called hello on wizard23, one node, one core, and I didn't use up any memory. I only asked for four gigs. It ran for two minutes, 47 seconds, and it says canceled. So if you don't know what happened with your code, kstat -d is a good thing to use to see whether it finished up, and it can give you some other information too.
Okay, so that's kind of how to submit jobs. I went over some of this. A lot of the most common scientific packages are already installed for you as modules. This will be common if you go to other supercomputing systems too. They all use module-based systems. The modules may just be named slightly differently, is all. So I'll go through this a little quickly because Kyle covered it. module avail gets you the long list of everything that's there. You can also do searches using module spider Python/, and I put a slash at the end. Python is a package in itself, but it's also used in a lot of other packages. So you might have a bioinformatics tool that uses Python. If we just want the Python modules to start with, you can put that slash in there. The good thing about module spider is that it's case insensitive. When you do a module load, you're going to have to specify the correct capitalization.
If you don't know it, module spider is a good way of finding it, or you can do a module avail and pipe that to grep -i. Adam or Kyle, can you look at the question? So when you get to using modules, I always recommend starting with module purge. That clears out all your loaded modules. In my .bashrc file, every time I log in, I have it load certain modules that I use commonly. But when I run things, I want to clean that out, load the modules that I need, and go from there. If you load them when you're on the head node and submit a job, those same modules will be available to the code that's running in your job script. But it's still good to put these things into your job script so that it's repeatable each time and you know exactly what modules you have loaded. And when we look at it, if you're having trouble, we know what modules are loaded and things like that. So in a script, if I'm using Python, for example, I would do a module purge and then a module load. module load Python will give you the default version. That's usually not what you want with Python. You usually want to specify a specific version, because you have to match your virtual environment. Also, if you're loading multiple modules, you want to use the same toolchain. So what's a toolchain? The main toolchains are foss and iomkl, or versions of those. foss is free open source software, so this is the GNU compiler and such. iomkl is the Intel compiler, and MKL is the Math Kernel Library that goes with it. And if you're loading multiple modules, you may also need to match up some of the version numbers to get some codes working.
If you don't find a software package that you need, ultimately you're responsible for installing that software in your own home directory, but we're always willing to help. That's part of what we do. It does mean downloading the software and reading through the documentation. Sometimes it's easy to do.
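Coming back to the module advice above, a job script that starts from a clean module environment might be sketched like this. The module name and version string are placeholders, not a statement of what is installed on the cluster; module spider Python/ shows the real ones. The script name sb.python and the virtual environment path are likewise hypothetical.

```shell
# Write a job script that sets up its own modules (sketch with placeholder names).
cat > sb.python <<'EOF'
#!/bin/bash
#SBATCH --job-name=py_demo
#SBATCH --mem=4G
#SBATCH --time=1:00:00

# Start clean so modules loaded at login (for example from ~/.bashrc) do not leak in.
module purge

# Placeholder version: pick one real version and keep every module on the same toolchain.
module load Python/3.10.4-GCCcore-11.3.0

# Hypothetical virtual environment built with that same Python version.
source "$HOME/virtualenvs/demo/bin/activate"

python --version
EOF

# Check the script for shell syntax errors without running it.
bash -n sb.python && echo "sb.python looks syntactically valid"
```

On the head node, module avail 2>&1 | grep -i python is one way to look up the exact capitalization; the 2>&1 is there because some module systems print the listing to standard error rather than standard out.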
Sometimes it's not so easy to compile. The standard way is configure, make, make install. Not every package makes it easy to install. A lot of them depend on a bunch of other software and things like that. So the bottom line is we're always willing to help with this stuff. If you have problems, or especially if you're just starting out, we highly recommend going through the Beocat documentation. There's Linux basics, Slurm basics, Slurm advanced. We'll talk about some file system stuff tomorrow. I wanna mention array jobs. If you're submitting hundreds or thousands of jobs that are similar, people do write scripts to do that automatically. We prefer that you use an array job, which is a single job script that you can program to essentially behave like tens of thousands of jobs. It may just be that each one operates on a different input file, or uses a different random number if you're doing some statistical analysis, or things like that. Array jobs are fully documented in there as well.
Getting help: again, I mentioned the support address is beocat@cs.ksu.edu for Beocat. When you email that, it gets to Adam and Kyle and myself. Give us lots of information, okay? Definitely give us the job ID number or numbers, for sure, plus the directory and command you used and a full description of the problem. We commonly get emails saying, hey, my code doesn't work, it worked last week. Okay, what do we do with that? We email them back and say we need more information. So start by giving us more information. For those of you at Wichita, one thing that we really, really need from you is also your EID. WSU ID. Okay. They won't know what an EID is. We need your WSU ID. Your emails come in not including your WSU ID, so we can't really trace back to your account and so forth without that information. So I will change that. Yeah.
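For the array jobs mentioned above, a minimal sketch might look like this. --array and SLURM_ARRAY_TASK_ID are standard Slurm; the script name sb.array, the range 1-100, and the input_N.txt naming scheme are invented for illustration. The last line runs the script once by hand with a task ID set, only to show what a single task would do.

```shell
# Write a small array job script (sketch; names and ranges are made up).
cat > sb.array <<'EOF'
#!/bin/bash
#SBATCH --job-name=array_demo
#SBATCH --array=1-100
#SBATCH --mem=1G
#SBATCH --time=0:10:00

# Slurm sets SLURM_ARRAY_TASK_ID to a different value (1..100 here) in each task.
input="input_${SLURM_ARRAY_TASK_ID}.txt"
echo "Task ${SLURM_ARRAY_TASK_ID} would process ${input}"
EOF

# Dry run of one task outside Slurm by setting the task ID by hand.
SLURM_ARRAY_TASK_ID=7 bash sb.array
```

Submitting it with sbatch sb.array would queue all 100 tasks from the one script, which is the behavior the speaker prefers over a loop that submits 100 separate jobs.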
So the other thing you can do to help yourself is that if you're running a job, you can SSH into that compute node and run htop. I'll just show that briefly, because I can SSH into any node. htop is very useful as well. So again, you won't be able to SSH into a compute node that you're not running a job on. But if I SSH into warlock17, let's see, there's a Trinity job running there. So I run htop, and then I type u for user. You can use the up and down arrows to choose the user that you want to look at. So I wanna look at this one job by this one user. And what I'm seeing here is that they've requested a lot of compute cores, but at the moment only this top line says that they're getting 100% utilization, out of CPU number six. So CPU number six is here. They're using that, but all the other cores that they've requested are not being used at all. So again, that's one I might want to look into further, because they're not using their request efficiently. I might contact them and say, hey, do you know that this is happening? Or, you know, use one core at times and then spin up more, things like that. And then I'm gonna hit q to get out of this, and exit. But htop is a very useful way of looking at your own code and seeing what it's doing.
So the other thing is that we have Zoom help sessions on Wednesdays from 1:30 to 2:30. If you go to the Beocat website and look at Support, way down at the bottom, it'll give you the Zoom address and things like that. This is a time when Adam and Kyle and I are usually all three on, and it's available to people down at Wichita as well. On average, we get maybe one person every other week. So it's a good way to drop in and make use of the access to us. There's a lot of other information out there. We've covered Linux in just one hour. That's not enough to get you up to speed on Linux.
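Going back to the htop tip above, checking on your own job could be sketched as follows. squeue and its -u, -t, -h and -o options are standard Slurm, and htop -u is a standard htop option; the function name watch_my_job is hypothetical, and warlock17 is simply the node used in the demo.

```shell
# Hypothetical helper: open htop on a compute node, filtered to my own processes.
# ssh -t allocates a terminal, which an interactive program like htop needs.
watch_my_job() {
    ssh -t "$1" htop -u "$USER"
}

# List the nodes my running jobs are on, if Slurm is available on this machine.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" -t RUNNING -h -o '%N'
fi

# Example (use a node where one of your own jobs is running):
#   watch_my_job warlock17
```

If only one CPU bar is busy while the job requested many cores, that is the under-use the speaker points out in the demo.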
If you just Google for these topics, Linux tutorials, there are very good ones out there. There's even some of the basic information on our support pages. Yeah. If you're gonna be using Unix or Linux much, I would advise learning to use vi or Vim. It's been around forever. It does have a learning curve. It's not quite as bad as Kyle makes it sound, I don't think. No, you said you could use it forever. Not forever, only three decades. Same. So, if you're just getting into Beocat and you're used to Windows, use MobaXterm, edit your stuff where you're comfortable on the Windows side, and move it back and forth. If you start using it more, you know, vi or Vim is available everywhere on Linux, so it's just a nice thing to learn in the long run. Well, I won't go into more of that. Man, I think that's about it. Are there any questions here? You all know everything about Beocat now? Yeah, I see some heads. Yep, a hundred percent. If we gave you a quiz, you'd ace it. That was my shift, it's okay. Support on Beocat, that's your idea. It's under Getting Started. We have postings of past videos, but I don't think we've ever spent more than an hour on intro to Linux. I will cover some advanced topics tomorrow, even for those of you who are fairly familiar with Linux, things that you may wanna try. Well, hopefully we'll get you started a little better this time. And the big thing is, you know, it's good to come here for the introduction, but if there are still things you don't understand, get ahold of us. Again, we do the Zoom meetings on Wednesdays, but that doesn't mean it has to be Wednesday. If you have a couple of questions, send us an email. If you have lots of questions, arrange a Zoom meeting with one of us. I'm more than happy to sit with you for an hour on a Zoom meeting and go through all your stuff, any questions you have, things like that.
I can't do that with everyone that wants to run on Beocat. That's why we're trying to do a workshop to at least get the basics in. And we ask you to at least go to the documentation and read through that. We don't expect you to read through every bit of the documentation or memorize it or anything like that. So, yeah, yeah. Licensing is a pain in the butt regardless of how you do it. That's the, yeah. There's nothing we can do to make it better. Yeah, there is, but they have lots of dollars, lots of zeros on the end of the dollar. I don't have that. Okay, so I think we'll say that's good for today. For those of you here, Kyle can give you a tour of Beocat and you can see the hardware. You guys wanna do it? I'll have it. He was just trying to. And then tomorrow we'll start up at the same time, 2:30. I'll give some overviews of some advanced Linux tips and techniques. And then at 3:30, Adam will talk more about the Open OnDemand web interface and Jupyter notebooks and that kind of stuff, more advanced things. We'll show off some of the.