Supercomputing, that's what we're talking about when we talk about Beocat: Beocat is supercomputing. There are really two types of supercomputing, and we cover both of them: HPC and HTC. That's high-performance computing, meaning it's running really fast, and high-throughput computing, which means we're processing a whole bunch of jobs. So what defines a supercomputer? This is where I ask the questions and you answer them. So what defines a supercomputer? Is this thing I have sitting down here a supercomputer? Is that a supercomputer? No, why not? What does it have to be? More than a regular computer, yes. Multiple processing units. You're hitting on all the right cylinders here. Big and fast. That pretty well defines what a supercomputer is. As a rule of thumb, we say it's about the power of 10 desktops, but the definition is fluid. The fastest supercomputer in the world in 1984, when I was a middle school student, has less processing power, by several orders of magnitude, than what I have in my pocket right now. So this phone is obviously not a supercomputer, but it would have been back in 1984. It's a moving target as to what defines a supercomputer. Now, like I said, we generally define it as about 10 times the speed of a desktop. Beocat has somewhere just north of 150 worker nodes, and these are high-end servers for the most part. And we have 13 terabytes of RAM on there, which is a lot more than you're going to find in any desktop anywhere. The largest supercomputer right now, I had to look this up today, is the Tianhe-2 in China, built by their National University of Defense Technology. It's got 3.1 million cores and over a petabyte of RAM, and it runs at 33 petaflops. A flop is a floating-point operation per second. So is that 33 billion operations a second? No, keep going. A trillion? Keep going. Peta is a quadrillion, so 33 petaflops is 33 quadrillion floating-point operations a second. Yes, it's very fast.
Ours is not that fast. But believe it or not, the fastest supercomputer in the world, the one in China, doesn't do as much work as the biggest one in the United States, which is at Oak Ridge National Lab. The one in China only works on their one particular problem, and when it's not working on that problem, it sits there and does nothing. Whereas at Oak Ridge, they open it up to everybody, so when it's not working on its main application, it's still running full-out doing somebody else's jobs. And if you need more computing resources than we have here at Beocat, there's also what we call XSEDE, which is the national network of big supercomputers. There are several of them around the United States: there's the one at Oak Ridge; there's one at UIUC, the National Center for Supercomputing Applications up there; and Stampede, which is the one at the University of Texas. These are all huge, huge supercomputers, and if you need more than we can provide, we can get you resources out on the bigger ones as well.

So, what types of problems are solved by a supercomputer? Just to start, tell me what kind of problems you solve on Beocat. What's your area that you work on? Okay, searching through everything to suggest you a friend on Facebook. Oh, like data mining? Yeah. Okay. We don't do a lot of that on Beocat itself, but we have some other resources that Adam's going to talk about a little later. What kind of problems do you solve? Big data. Data simulations. Okay. Data mining. Astronomy. So, what kinds of problems are you doing? The computational fluid dynamics kind of things? The genomics kind of things? Both. Genomics. What about you, Gavkilesh? DFT. DFT, okay. Genomics. Genomics. Genomics. Quantum simulation. Okay, that's quite all right. Pardon? DFT. Genomics. Parallel calculations. Genomics. Genomics. Genomics.
More genomics. More genomics. Are you starting to see a pattern here? Go ahead. Simulations. Big data. Big data simulations. Network simulations. Genomics. Genomics. Okay. So you see we have a wide variety of things that we deal with here. Now, we were just out at Oklahoma last week, and they have a bigger supercomputer than we do, about three times our size. They really run two applications, and they size their computers for that: they do weather simulations, and, what was the other thing they talked about... some bit of molecular dynamics. Right. So if it runs those two programs happily, they're happy; they don't care about anything else.

We, on the other hand, run everything from statistics, and I don't know if you're the one doing this or not, but there's somebody from statistics who will typically dump in 50,000 jobs. They'll all be really short, they won't take long at all, but they need a whole bunch of them: parameter sweeps, that kind of thing. Then we have the genomics people, and the genomics people will sometimes take the biggest machine we have and run it flat-out for three weeks. Those are diametrically opposed targets, and that's why Beocat is heterogeneous: we have some really big-memory machines for the genomics folks, we have lots of small cores for the statistics folks, and pretty much everything in between. So we optimize for large size, for fast speed, and for reliability. We get quite a few people coming onto Beocat because they've been running things on their desktops, and they'll be two weeks into a three-week simulation. First of all, they can't use their desktop during that time, which is kind of frustrating. And second of all, two weeks in, somebody will pull the power plug, or Windows will do an update, and they'll be cursing at the computer because it shouldn't be doing that, but it breaks whatever they're working on.
So, we do have downtime here too, but hopefully it's a little more predictable, and the system does stay up for long periods of time. It's not unusual to have at least several months of uptime on Beocat. Do you want to go take a tour of Beocat? I'm going to take two groups and do this. Yeah. Beocat? Somebody else is on... ah, gotcha, on the Zoom session. Okay. Yes, this is being recorded; we'll send out the link when we're done. Is that what you're asking? Okay. Yeah, we'll send out the link when we're done so you can come back and look at this again. So let's do it like this: this side, you go take the tour of Beocat, and we'll talk about some parallel programming here. Nope, I'm going to stay here and talk to you about something, and then when you go over there, we'll talk to them about the same thing. How's that? Just because it's loud out there.

Okay. One of the things that makes a supercomputer great is what we call parallelism. What does parallelism mean? Do multiple things at the same time. Exactly. As opposed to serial, which means you do one thing, and then you do another, then another, then another. In parallel, you're doing several things at the same time. The concept of parallelism is one of the more difficult things in computer science, and the reason for that is that the programming is tough to make it all work right. I'm going to say this probably more than once: no system, not ours, not anybody's, can magically make your programs run in parallel. If you have something that is inherently single-threaded, meaning it can only run one thing at once, and you put it on Beocat, it doesn't matter how many resources you throw at it, it's still only going to do one thing at a time, unless it's programmed to do more than one thing at a time. So I have some examples here. Notice I put it in bold.
This is probably the biggest thing that we get: people have a program that runs single-threaded, one thing at a time, and they'll say, hey, I need to use Beocat, and they'll put it on Beocat, and they'll say, no offense, but this is not any faster than my desktop, what's wrong with you? That's because the program wasn't written that way. Some programs are harder than others to run in parallel. Here are some examples. If this is your data set, and you're trying to make this computation, can that be done in parallel, or does it need to be serial? If you're just taking all these numbers you've got out here and multiplying each of them by four, is that something you can do in parallel? Can you multiply two times four? Yes or no? Can you multiply three times four? Four times four? Five times four? Each of you could give me one answer, and I'd have all of them at the end. I wouldn't really have to do any coordination to make that happen. You have to have the communication there, but the computation itself can be done in parallel. Now, here's something a little more tricky. We have 11 times the quantity a sub n squared, I think that superscript got moved on the slide, times e to the a sub n, plus log base a sub n of 17. How about that? Is that something you could do in parallel easily or not? Yes, it is, because there's still nothing more to it than one input and one output. It's a much harder problem than just multiplying by four, but there's nothing more to it than that. Now, here's a trickier one. We start with b sub 0 being 0, and we have b sub n equals a sub n minus b sub n minus 1. It relies on the previous value.
That makes it a lot tougher now, because you have to have the previous value before you can go on. Does that make sense? I think we're about ready to switch here, which is perfect, because that's as far as I wanted to get. One more thing. The typical usage we see, and this is true of most HPC programs: programs will have some part that is serial and some part that is parallel. You genomics people, do you use Perl to do your, what do you call it, your pipelining? You do pipelining? Well, not all of it is Perl. The pipelining is the serial part, and then you submit jobs onto Beocat to do the parallel part. Those are very well written; those are perfect HPC kinds of jobs, because they know which parts are which. But what we also see is somebody asking for a lot of resources and then only using a little bit of them most of the time: they'll use a lot for a short period and a little bit for a long time. They go back and forth between the parallel part, and there's actually a term for the easy part, it's called embarrassingly parallel, because it's really easy to split up, and the part that cannot be parallel and has to be done serially. So that's the way most HPC programs work. Some are all parallel, some are all serial, but for the most part that's the usage we see. Okay, perfect time to go. Next group.

All right, now you get to hear the spiel I just went over with those guys. The basis of HPC supercomputing is parallelization. What does parallel mean? Doing multiple calculations at the same time instead of serially. Exactly. As opposed to serial, which means you're doing one thing at a time. Now, you notice I put this up here in bold. I do this because this is the biggest misconception that we have.
No system can magically make your programs run in parallel. We have people, probably on average once every couple of months, who get new Beocat accounts, try their job out, and say, what the heck? This runs no faster than it did on my desktop. What's broken on your end? The problem is, it doesn't matter how many resources you throw at it: if a job is not made to run in parallel, it won't run in parallel. And the programming of that is difficult. We have some toolkits and things like that we'll talk about a little later, but that is the key: your programs have to be created to run in parallel before they can use more than one spot at once. And notice I said no system can do this. This is not a limitation on our part; that's just the way it is. There's no fast path to parallelism if the program isn't written to be that way.

Some problems are harder than others to run in parallel. So here we are: we're given a sub n equals 1, 2, 3, and so on up to n. Is this easily parallelizable? The first one, 4 times a sub n. Can you multiply 4 times 1? Can you multiply 4 times 2? 4 times 3? 4 times 4? 4 times 5? Yeah, that's pretty easy, and I don't have to coordinate much to do it. That's what we call an embarrassingly parallel problem. I can give you my data file and say: you pull out line 1, you pull out line 5, you pull out line 16. You can all work on parts of this problem separately. What about this next one? It's a little trickier. This is supposed to be a squared, I missed my superscript on there: it's 11 times a sub n squared, times e to the a sub n, plus log base a sub n of 17. Can you work on that in parallel? It's a much more difficult calculation, and you probably won't be able to do it in your head, but it's still what we call an embarrassingly parallel problem.
The reason I bring this up with an example like this is that the calculations we're doing on Beocat are usually not multiplying a number by 4; it's usually heavy-duty mathematics. However, the principle is the same: for any input we have here, we can create the output. Again, you can grab out your line, do your calculation, and be happy. What about this third one? I say b sub 0 is equal to 0, and b sub n is equal to a sub n minus b sub n minus 1. So for b sub 1, you have to know the value of b sub 0 before you can figure b sub 1. You have to know the value of b sub 1 before you have b sub 2. This cannot be parallelized at all. Even a simple calculation like that requires the previous result before it can go on to the next one. So again, it doesn't matter how much we want to, we can't make that run in parallel, because you always need the previous value before you go on to the next one.

The typical usage we see on Beocat: the genomics guys do what we call pipelining. You guys run pipelining with your genomics stuff; it handles this structure for you, and this is what we see. We love those kinds of pipeline jobs because they're a very efficient use of our resources. Typically, jobs come in with some part of the processing that needs to be serial, working through a chain of stuff, and then it says, okay, now break this stuff out in parallel: some part is serial, then some part is parallel and uses huge amounts of resources, and then it goes back to the serial part, and then back to the parallel part. The genomics guys really get this because they do the pipelining stuff, and that's why it's very well written.
But we have somebody right now who has asked for a huge amount of resources for a long period of time, and it sits there using very little of them 99% of the time, and then it explodes into a huge amount at the end. It has to ask for those resources because it is going to use them, but the program wasn't written to do that pipelining kind of thing, where it could run the small part for a long period and then scale up from there. Yay! Good timing on that; we're all at the same spot now. Let's give you guys about a five-minute break. Go to the bathroom, that kind of thing, and if you have any questions, we'll be around here for a bit. Get up, stretch your legs, and be back here in about five minutes. We'll be right back.

I might go snag a chair from somewhere; we had more people than I thought we were going to have, and we ran out of spots. You know where more chairs are? He's going to get you one. I think we'll be fine for now. My voice is just starting to kill me. Do you have any of the terminal apps for iOS? There's a terminal already built in to... oh, iOS, sorry, I was thinking OS X. I don't know; I don't have any Apple devices other than my desktop. Dave here does have an iPhone, I know; if anybody would know, he would. Yeah, well, I know I have one that I run on my phone. Hey Dave, is there an iOS terminal app that lets you SSH decently? An iOS SSH client that is reasonable? Dan would know too, since he does this all the time. What is it? Web? Okay. We do have a web-based terminal also; you saying that reminded me. We have this. Darn it, I don't have my password with me. That will... yeah, sure, that's good.
I just have to look it up here: gait1.beocat.cis.ksu.edu. Should... well, okay. Right. This is cool, though, because I can come back to this on a different device. I'm going to type it out again. On my own computer, it's not a problem. And this is all through a web page. So, sure. That's on our list of cool tools to show you guys. That's coming toward the end, right? Yeah, toward the end. Alright.

The next section here is on some parallel programming. There are two languages that get used a lot for parallel programming: there's Fortran and there's C. Just me speaking personally, I can't stand Fortran, so all the examples today are going to be in C. That's just my personal prejudice; if you like Fortran, I'm not going to dislike you or anything. Oh, yeah, I forgot to mention this; I didn't know I had this slide in here. There's a course that goes over a lot of what we're covering here today, the general material rather than the Beocat-specific stuff: this page right here, Supercomputing in Plain English. There's the URL for it, or if you just look up Supercomputing in Plain English, it'll take you to that site. A guy down at OU has put this together, and it's a class he does, like an hour a week. I'd go back to the 2011 version, if I remember right, to get the full videos. Because of ADA requirements, they weren't able to put the later ones online; they can't get them transcribed, because it costs money and they don't have a budget for it, so they point back to the old versions, which is 2011. But still, this is some really good information on supercomputing and how all this is put together, and as a matter of fact, I draw several of my examples from his. So that's a good resource. That other URL I didn't change, but support.beocat.cis.ksu.edu is actually our support pages; I'll fix that before I send this out.
And emailing us, the three of us: if you send to this email address, it gets put into our ticketing system. I know there are several people in this room, just based on the names I saw registered, who do this on a regular basis, and that's great. If you send email to this address, it goes into our ticketing system, and you'll get a response from whichever one of us can best answer the question, or whoever gets to it first if more than one of us can answer it. So that's how you get hold of us.

Parallel programming. Now, I have a copy of all of these examples in my home directory, and they're all open to the world. I put this on here because I'm planning on sending these slides out to you afterwards, so you can have them, and here's where all these examples are coming from, so you can look at them yourself if you want to edit them or whatever. So: shamelessly stolen, or adapted, from this other website. We're going to do some background first. How many of you are familiar with programming in C? Wow, okay, this is going to be... how about Fortran, does anybody do Fortran? A little bit. So what's your programming experience in, then? Just shout it out. Are you doing any programming? Java. In what? Python. Got it. R. R is not multi-threaded; that's why we call those high-throughput jobs, because generally they have a whole bunch to get through fast, but they're not running in parallel. Getting R to run in parallel is a royal pain in the rear; trust me, I've tried.
Okay, so we're going to go through some of these programming examples here. Yes, that is parallel; this next section is about programming it, though, how to write the programs. That's why I was kind of surprised that so many people signed up for this portion of the class, because most people are not interested in writing parallel code. If you have code that you want parallelized, send email to that address you saw earlier, and Dave will be delighted to have lots of work to do, because that's what he does. Now, Java will do parallel; however, Java is kind of inefficient, so you don't see a lot of Java programs written to take advantage of it. When you get to high-performance computing, being high performance, you want your code to be as efficient as possible, mainly because most places aren't like Beocat. Most places want you running at 100% CPU all the time doing useful work, and if you go out to, like I said, XSEDE, some of the national resources, you're going to have a very limited window to get your calculations done. If you have a limited window, you need that code to be as efficient as possible, and that means not Java; that means C or Fortran. Yes, the more low-level it is, the more efficient you can make it. Does that make sense? You can write some really bad code in low-level languages too; trust me, I've done it. Matter of fact, you'll probably see some of it here.

What a fork does: again, these examples are going to be in C. Most everything you see as far as parallel programming goes is in either C or Fortran, and if you sit through Henry's course, Supercomputing in Plain English, he brings up examples in both, but you won't see Java, and you won't see Python, and you won't see R and those kinds of things. So my apologies for that. Adam's really the Python guy in the room, and he just left, but I know Python will do some level of parallelization too if you jump through some hoops. I don't know what level that is; again, it's not very efficient, because Python is not a compiled language, it's an interpreted language, so that slows it down some all by itself.

So here I'm going to step through this code, and like I said, if you decide you don't want anything to do with C and you go hang out in the lobby for a bit, you're not going to offend me, but I'm going to go through it anyway. Here's what I'm going to do: I set up a global variable, and then I run this code here. I create a variable called childpid, which is a process ID type, I declare a local variable set to 0, and then I fork. A fork makes a copy of all the memory and variables and everything like that, so you basically have two identical environments. Then I check my child process ID: if it's 0, we know we're looking at the child process. Both sides run the same code, and if it's nonzero, we're in the parent process. Do you have a question for me?
How do you get these? These are all on Beocat, in my home directory. I wonder if I can... I can't hear you, but tell you what I can do here: is that a little better? I'll make it even bigger. This is starting right about where the last group left off. When you fork something, you have to have a way of knowing which process you are: the same code runs in both, we made a full copy, and we have two full copies of the program running now, so it has to know whether it's the child or the parent. Typically you're not forking just once, you're doing it several times, so you might have one parent and several children underneath it. So the first thing we do is check whether we're running in the child process. If we are, we increase the local variable and the global variable and print those out. If it's the parent process, we set the local variable to 10 and add 2 to the global. Let's see what happens when we compile and run this. Why didn't I put a loop in here? I haven't put a loop in there yet; that's a good point, because most of the time when you're doing a fork, you're going to loop and do this several times, and you need something in there to watch how many children you have so that you don't keep forking forever. This one forks just one time, and then the program is finished, so it's not going to cause any problems at all.

Okay, so in this case my parent process finished first, and you notice that the parent process set its local variable to 10, like we told it to, and increased the global variable by 2. The child process incremented its local variable to 1 and its global variable by 1. But notice the global never came out to 3; the 1 and the 2 never added together. Global variables are really difficult to handle with forking; that's the trickiest part of using fork as a parallel-processing mechanism. Fork creates a whole new process. This is the old way to do it; it's the only way I was even aware of when I was in school, and it becomes really tricky, because if you actually want to share global values you have to set up functions that basically work on shared memory locations. You kind of have to have some sort of master process that coordinates everything and does a really good job of collecting all that data.

So look at fork example 2, same kind of thing. This is the first attempt at getting around it: in C, we create a pointer to our global variable, glb2, and we set what it points to to 0. Basically the only thing I've changed is that instead of increasing the variable directly, I'm using a pointer. Again, these will be on the slides I send out, so hopefully it won't be quite so obtuse. So here I add 1 to my global in the child process and 2 in the parent process, and you notice I end up with the same problem: it seems like it should work, but it doesn't. This is the way you actually do it: we have this big ugly thing here called a memory map, mmap. Again, this will be on the slides, because I'm not expecting to teach you C in one day. If you were all C experts, we might go through this more in depth; as it is, this is going to be general concepts, so we'll probably have plenty of time left at the end, because this part goes fast. You have to make this big ugly memory-map call to make it all happen. Compile this one, and now you finally see that our global value actually ended up at 3, because we added 1 and then we added 2. It's difficult stuff; this kind of programming is tricky, and it's not fun if you've ever had to do much of it. The good news is, there are better ways.

Yes? What's that, a self-referencing program? Recursive programming? Yes, but you have the same caveats: you might have some master process collecting results as you work off child processes, and you have to keep track of which one is which and which one is sending data back. So it's certainly possible; it's still not easy. Anyway, we just created two processes; you just fork again to make 3 or 4, or do it in a loop and make 15, as many as you want.

Now, OpenMP. It gets a whole lot easier when you start using things like OpenMP; you can see the website I got this from. The only tricky thing is that you also have to tell the compiler you're using OpenMP. So we have three files back in my directory, and we'll step through this one. In this program, we create integers called nthreads and tid, and then we have this directive here, the one listed in purple, that says: parallelize this. Each thread gets its thread number and prints hello world from thread number whatever, and only the master thread computes the total number of threads. You can see there are a lot more comments in here than before, but if you look at the actual number of code lines, it's much smaller. It's a whole lot easier to deal with, and it makes your programs multi-threaded very easily in comparison. I created 8 threads here. Now, I want you to notice something: run this program twice, and what do you see about the output? It's not in the same order. It's nondeterministic: whichever thread happens to get there first, gets there first. The number of threads, from the master thread, always comes up with the same answer, but each of these threads, 0 through 7, they're all out there working, and they all eventually get there, but their order is just whatever happens to get there first.
Whichever one the processor happens to schedule first. That's the only caveat with these. Do I have anything else here worth looking at? The next example was just showing how you can do scheduling, where I can say, as soon as the first one is done, then do the third one, that kind of thing. If you're not into C programming, that's going to be very boring, so I think I'll skip it. Boring details. I think we're done with this part.

GCC, the GNU C compiler, is pretty much the standard among C compilers. Good question: the Intel compiler tends to make slightly faster code for Intel processors, but GCC is probably more widely used, because the Intel compiler is licensed, it's not free, and GCC is free, and pretty much every system you'll ever be on will have GCC available. So we see probably 90% of people using GCC. Now, one thing about OpenMP: if you ever do get to use OpenMP, I implore you to look at this last line, the call that sets the number of threads. By default, OpenMP will use every thread available. We didn't worry about that for a long time, and so you'd have somebody who asked for a couple of processors on a machine, and pretty soon they were using the entire thing. That's why you need to set the correct number of threads.

Okay, MPI. This is where it starts getting fun, and there are several different implementations of MPI. There's even an R version of MPI, and Python will use MPI, correct, Adam?
If you ask it to. So, this is from Wikipedia: Message Passing Interface, MPI. Now notice that the last one was OpenMP; here we use Open MPI. They're not the same. They have similar names, but they're very distinct: OpenMP is just for use on a single machine, while Open MPI, an implementation of MPI, is for using multiple machines, or a single machine, either one. So: the Message Passing Interface is a standardized, portable message-passing system designed by a group of researchers from academia and industry to function on a wide variety of parallel computers. The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs in Fortran 77 or C. It's been expanded since then, but that's what it was originally written for. There are several well-tested and efficient implementations of MPI, including some that are free and in the public domain. These fostered the development of a parallel software industry and encouraged the development of portable and scalable large-scale parallel applications. So MPI is really what lets supercomputers be supercomputers; that's the short version.

Now I'm going to use a contrived example from Supercomputing in Plain English; if you go through that class, this will look very familiar. This is how he explains it. Imagine you're on an island, in a little hut. Inside the hut is a desk, and on the desk is a phone, a pencil, a calculator, a piece of paper with instructions, and a piece of paper with numbers. Here are your instructions and here's your data. So, what to do: add the number in slot 27 to the number in slot 239, and put the result in another slot; if the number in slot 71 is equal to the number in slot 118, then call this other number and leave a voicemail containing the number in slot 962. Got it?
So all you've got is your phone, pencil, calculator, instructions, and data. There are a few more kinds of instructions, like: call your voicemail box, collect the voicemail from this number, and put that number in another slot. So these are pretty limited instructions, you can only do these certain things, and you're really isolated from the rest of the world. We've got two kinds of instructions here. We have arithmetic and logic: the addition, the comparison, that kind of thing. And we have communication: call this number and leave a voicemail, or call your voicemail box and collect a voicemail. That's the analogous situation here. If you're in a hut on the island, you aren't specifically aware of anyone else, and MPI applications are the same: while they're running, they don't know whether anybody else is running or not. You don't know whether anyone else is working on the same problem you are, and you don't know who's at the other end of the phone line; all you know is what to do with the voicemails you get and what numbers to send voicemails to. Now suppose that Horst, and I don't know where he comes up with these names, like I said, I just copied this, I think he's just trying not to offend anybody by picking a name that nobody in the class would have. Suppose that Horst is on another island somewhere, in the same kind of hut with the same kind of equipment, and suppose he has the same list of instructions as you, but a different set of numbers and different phone numbers. Like you, he doesn't know whether there's anyone else working on his problem. Now suppose Bruce and Dee are on other islands too; each of the four of you has the exact same list of instructions but a different list of numbers. And suppose the numbers you call are each other's: that is, your instructions have you call Horst, Bruce, and Dee; Horst's have him call Bruce, Dee, and you; and so on. Then you might be working together on the same problem. Does that make sense? Each person is very localized to what they have; they have their own instructions,
but they're not aware of what the other players are doing, and they still might be working on the same problem, even though it's unbeknownst to them. All this data is private: I can't see anybody else's, and they can't see mine. My numbers are private, and there's no way for anyone to share their data here except by leaving it in voicemails. So why use voicemail as the example? It's because of the cost of communication. Long-distance calls have two costs: a connection charge, which is the fixed cost of connecting to someone even if you're only connected for a second, and a per-minute charge. The connection charge is large, so you want to make as few calls as possible. Now, as an example here, I hope this is loud enough; I think we'll be alright. Great, an ad for an ad. [He plays a 10-10-220 long-distance commercial:] "Okay, people, this is a phone. This is a dollar. You still with me? That's good. Dial this number, and all your long-distance calls from home could cost less than a buck. That's right: with 10-10-220, all calls up to 20 minutes are only 99 cents. Talk longer and it's just 10 cents for each extra minute. No fees, no contracts. Am I right? Just dial 10-10-220, then 1, then the number. Bottom line: you get up to 20 minutes on this for less than this. You got that? Good. It's not a mistake, and I think nature's calling my dog." Okay, so how can 10-10-220 do this? How can they charge basically five cents a minute for calls up to 20 minutes? It's not directly related to MPI, but it's very similar; we're working on analogies here. How can they offer a 20-minute call for a dollar? That's probably less than their cost for the lines. No guesses? What's the average length of a phone call in the United States today? Take a guess. You're close: the average is two minutes. So people are paying a dollar for two minutes, and that's a pretty good rate they're getting on that. The point is that the connection charge is really expensive and the per-minute charge is pretty low. They did
some analysis on MPI and found that it's essentially the same as having a $150 connection charge with a one-cent-per-minute rate after that. The point being: when you're dealing with MPI, you want to make as few connections as possible. Obviously the beauty of it is that you can make those connections at all, but that's the overhead cost, so we want to keep the connections as few as possible and send as much information as we can while we're on the same line. So, MPI is about talking between machines. You can have two MPI processes running on the same machine, but they don't know they're running on the same machine; all they know is "I'm talking to somebody else." They're in a hut on an island; they don't know who they're talking to. Advantages of MPI: you have interaction among different programming languages. If you were really sadistic enough, you could go between R and Fortran. Nobody even thought that was funny? Jeez, I'm going home. Interaction among different machines, which is really handy on something like Beocat, where we have 150 different machines and you can be running your jobs across multiple ones. Data collection, and scaling: even if you go on to some of the biggest resources in the world, they'll all have MPI available, and you can talk to huge numbers of machines all at once. It does have disadvantages. There's a cost of getting started, not only in terms of communication but in terms of writing the code. It's not efficient for small amounts of data; typically, like we see with R code, it's all stuff that will fit in one process anyway, so why would you even want to separate it onto multiple machines? Coding is complex. And again, don't confuse it: OpenMPI, OpenMP. Did I say the MPI?
Don't confuse OpenMPI and OpenMP. Like I said, they sound a lot alike; I even sometimes stumble over it myself, as I just did. I know which is right in my head; what comes out of the mouth is a different issue. So, this tells you how to get started. There's an MPI example here; I called it mpi. Here we have some MPI code, and it's basically just meant to be an example. We declare the number of processes, the rank, the length of the name, and an ID, and I have a character string for the processor name. I call the initialize routine, ask for my communicator's size and my rank, and all that kind of stuff, and then I print "process whatever out of whatever," so I have several different things going on here at once. This is the code that I use, and again, I'm not going to go into a lot of detail on it; at the end I call finalize, so I'm done. Then I compile it with mpicc. You notice I'm running all of this on the same host: Athena is the machine I'm logged into, so it's telling you that on Athena I'm running process zero, and so on. Again, like with the OpenMP example, it's non-deterministic: this run had zero, three, one, three, seven, one, so they're finishing in different orders, starting up several different processes, each one doing its own thing and reporting back. If you're going to write programs that scale out, MPI is the way to go. It's more difficult to write to begin with, but it is the way to go if you're getting to large-scale problems, and again, Dave knows a whole lot more about this than I do. One last thing: like I said, writing this is hard, so my best advice is, if somebody else has already solved the problem for you, let them. We have lots of programs installed on Beocat already, and you can install your own. NAMD, BLAST, OpenFOAM: those are some examples, from different areas, of software that's already written to take advantage of the most efficient way of doing things, whether that's OpenMP or OpenMPI. If they've already written the code, all you have
to do is shove your own data in there. Use the toolkits that are out there; there's very little new under the sun. If you have to write your own code, do it, but if you don't, use the work of others; it'll be a whole lot better for you and for them both. And like I said, you can also install stuff in your home directory. At this point I'm going to go out to our website and show you how to install something in your own directory, because I had to do this for somebody and we said, you know, we should document this. There are our videos. Okay, the resolution on this is not fantastic, but you can follow along on your own. Here's an example with a program I was asked about, called OpenBUGS. The first thing I need to do is find the download link. You'll see that I log into Beocat; notice that I don't use a password, because I have SSH keys set up, but you may be asked for a password at this point. Okay, now we're ready to download the program, using a program called wget, w-g-e-t. I'm going to paste the link in, and here's the problem we sometimes run into with links: notice that there's an ampersand in the line. Linux doesn't like that; it sees the ampersand as starting a different command. So we're going to have to put the link in quotation marks: wget, then the copied link with quotes around it, and now I'm downloading the program. You notice it's saved under a really strange name, a file name with a whole bunch of extras, so first I'm going to rename it to what it's meant to be, which is just this last little part here. If you were downloading this straight from a web browser, it would do that for you; wget is not that smart. So now we have a downloaded file called openbugs-3.2.2.tar.gz, and we're going to unpack it. I use tar with extract, gunzip, and verbose, so I see the files going by on the screen, and then the file name, openbugs-3.2.2.tar.gz. This is going to take a little while, because it's a rather large file. At the end of that, I need
to go into the openbugs folder. Notice up here that there are usually some files in there giving hints as to what needs to go on. We have a README, which usually has general information, but we also have an INSTALL file, so let's look at that. It tells you, as most programs will, that the build is run by make scripts: configure, make, and make install. Look through here to see if there's anything else really strange this program does; I'm not seeing anything. However, there is something I'm looking for: installation names. That's not it; we're not doing anything clunky as far as compilers. Here we go: notice that it offers --prefix=directory for an installation prefix. Otherwise, it's going to try to install system-wide, and of course, since you don't have privileges, it's not going to allow you to do that. As a matter of fact, unless I specifically ask for them, I don't have those privileges either, so I'm going to install this in my home directory using this format. So we run what they said: ./configure with prefix equals, and here I put in my home directory, which is /homes/ followed by my username, and I give it a directory under the root of my home, which I'm going to call openbugs. What configure does is automatically adapt the build to our environment. They make these packages rather generic, so that whether you're installing on a different Unix system, a Linux system, whatever, configure can figure out what options it needs to put into the makefiles. It got done with that pretty quickly, so now we run make, and it starts compiling the program. The program itself is actually pretty small; the biggest part of what we were unpacking was the documentation. There, it's already done, and now it's compiled everything. Now, this is the important part: because I put that prefix in there, when I do make install, you notice that it's copying everything into the openbugs directory under my home, rather than trying to put it in the system
directories, where I don't have permission to do that. Now I change into the openbugs directory, the one without the extension, take a look, and you'll see I now have three folders in there: bin, lib, and share. The one where the OpenBUGS executable sits is called bin. If I look inside there, you see I have an executable file called OpenBUGS, and if I run ./OpenBUGS, it waits for me to tell it what to do. Here I'm going to quit, just like they tell me to. Now, if I wanted to submit a job, let's say I had a program out there, I could do qsub with the path to OpenBUGS in my home directory, and then I could give it the path to a data file, and that would submit it into the queue. Of course, you can put in your qsub script all the standard things: how long you want it to run, where you want it to run, that type of thing. But that's how you run a program from your home directory on Beocat. Thanks. So, we put that on the website after our last Intro to Beocat class; installing your own software was something somebody asked about, OpenBUGS specifically, and it was a fairly easy example. There's a lot of software that we don't have; I mean, we don't know every area on campus, that kind of thing, and maintaining it all is a pain in the rear, so what we do is have you go ahead and install these things in your own home directories, and that was an example of how to do it. That example is on the website, so please feel free to review it sometime on your own if you need to install your own stuff in your home directory, which is a very common thing to do. So, what kind of software do we ask you to install yourself? We get lots of requests of "could you install this software," and our answer is no, but you can do it yourself, and here's how: we point you to the video, we point you to the documentation, that kind of thing. And like I said, we've even had people who ran into problems; I spent a couple of days last month helping somebody who couldn't get their software installed. It was kind of a tricky thing, but we figured it out and I helped them out. We
had no problem doing that. But, you know, we do need you to be able to pretty much install your own software and keep up on the updates yourself. With this many packages, you might need version 1, you might need version 2, you might need version 1.5, and we can't keep up with all of that ourselves. That's the main reason we have you do it in your own home directories. Any questions?
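For anyone reviewing this later, the install walkthrough from the video boils down to a few commands. The sketch below is runnable without the network: the downloaded tarball and the result of "make install" are faked locally, since the real OpenBUGS source isn't fetched here, and the wget line in the comment uses an illustrative URL, not the real download link.

```shell
# On Beocat the real download would be (quote URLs containing & or ?):
#   wget "https://example.org/get?file=openbugs-3.2.2.tar.gz&mirror=1"
#   mv 'get?file=openbugs-3.2.2.tar.gz&mirror=1' openbugs-3.2.2.tar.gz

# Stand-in for that downloaded, renamed tarball:
mkdir -p openbugs-3.2.2
echo "demo source tree" > openbugs-3.2.2/README
tar czf openbugs-3.2.2.tar.gz openbugs-3.2.2
rm -r openbugs-3.2.2

# Unpack it: x = extract, z = gunzip, v = verbose, f = archive file name
tar xzvf openbugs-3.2.2.tar.gz

# In the real build you would now run, inside the source tree:
#   ./configure --prefix="$HOME/openbugs" && make && make install
# Here we imitate what that prefix install leaves behind:
PREFIX="$PWD/openbugs-prefix"   # stands in for $HOME/openbugs
mkdir -p "$PREFIX/bin"
printf '#!/bin/sh\necho OpenBUGS placeholder\n' > "$PREFIX/bin/OpenBUGS"
chmod +x "$PREFIX/bin/OpenBUGS"

# Run the installed binary; on Beocat, this path is what would go in
# your qsub submit script.
"$PREFIX/bin/OpenBUGS"
```

The point of the prefix is visible in the last few lines: everything lands under a directory you own, so no system-wide permissions are ever needed.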