Welcome to another edition of RCE. This is Brock Palen. You can follow me on Twitter at @brockpalen, all one word, and you can also find a link to it off of www.rce-cast.com. This is a special show, so I'm going to let everybody introduce themselves. There are three of us: our usual lineup of myself and my co-host, and we have another guest here. We're going to talk about regular HPC kinds of things going on in the world. So guys, go ahead and introduce yourselves.

Hey, I'm Jeff Squyres. I'm usually on with Brock here. Today we're going to be doing things a little differently, as Brock said; we're going to be chatting more about HPC news and a couple of outstanding HPC issues. I think most people listening will probably recognize our third person here at the round table, so why don't you go ahead and introduce yourself, Rich?

Well, thank you, Jeff. This is Rich Brueckner from insideHPC.com, and I'm really pleased to be on the show today, guys.

Well, I think this is great. We're kind of having a crossover event here, a little different from our normal interview style. We wanted to get another person in the HPC business and just chat about some things. We've got a couple of recent newsworthy items to talk about, but we've also got a couple of outstanding issues that are good to chew the fat about at the water cooler a little bit. So Rich, give us a little shameless plug on the stuff that you do.

All right, let's get that out of the way. InsideHPC just had its first-year anniversary. I took over from a guy named John West, who works at the DOD. He was doing insideHPC as kind of a hobby, and eventually he sold it to me because he got a big promotion.
So good for you, John West, and good for me. We recently expanded: we've taken the whole style of insideHPC and gone on with three additional publications, inside BigData, inside Cloud, and inside Startups. Using the same format, we see these three areas as real growth pieces of IT. I had all these stories coming across my desk that didn't quite fit in the HPC realm of things, but I thought they were important, so now I have a place to put them. So that's what's going on with me, guys.

Well, you've also got cross-publishing agreements with The Register, and you've got several authors now and stuff, too. I mean, you've really expanded in the last year.

Yeah, the Register thing has been great because they have so much reach in Europe. What we do is we publish one of their stories once a week, and they publish one of ours. It's great; we just did a handshake, and that's been going really well. We're looking to expand that with some other pubs coming up here, so I'm excited about that.

So why don't we cover some of the events that are coming up? SC: is it in Portland or Seattle this year? I always screw them up. Which is it?

It's in Seattle.

Yeah, it's in Seattle. Okay, so I'm going to be there. Coffee!

Yeah, coffee for everybody.

So I'm going to be there the entire week. I'm coming in on Saturday... on Sunday, actually, and I'll be leaving on Saturday. And Jeff, I know you'll be there; I don't know what days you're going to be there.

I don't know what days I'm going to be there yet either, but I'll definitely be there for the whole show part of it, and I'll probably be living mostly around the Cisco booth, who knows. And I've got the Open MPI BOF as usual as well; I think that's Wednesday at noon.
I haven't looked at the schedule yet. I hope they do not put us opposite the MPICH BOF, because that was kind of a bummer last year when we were both at the same time. You couldn't attend both, and I wanted to go to their BOF and hear what they had to say.

Yeah, and I'm going to be there all week, and we're actually going to have a booth this year up on the sixth floor. I don't know if you guys remember Seattle from whenever it was, 2005, but it's two floors, like a big Dagwood sandwich, right? The exhibits are on two levels.

To be honest, I'll probably remember it when I get there. So many Supercomputings now; they all just literally run together.

Yeah, I mean, Seattle's going to be a great town for this show, and we've actually got more exhibitors this year than they had in New Orleans, but in less space. So think of it as kind of a smashed-together supercomputing suite where sworn enemies are just feet from each other.

And also, something that I think we're all involved in one way or another is the Student Cluster Competition.

Yeah, yes. You know, I was a judge for it last year, and that was tremendously fun. It was really great.

Yeah, they take up a lot of space, and they're going to be crammed for space, because every one of those students effectively has a small booth. That's always interesting.

Yeah, so last year, I don't know if you guys remember, but the dark horse, the little quiet team from, I think it was Taiwan, won the competition, right? The local teams were talking a lot of trash, and they had to take second place. I can't wait to see who repeats this year, who comes back. This is going to be good.

So one thing interesting about the Student Cluster Competition: it's been at SC for years now.
It's actually expanded, and it's coming to the International Supercomputing show next June, and I believe that's still in Hamburg. They're going to do their own European version of it, probably with mostly European schools. I was really encouraged by that, and I know a lot of companies worked hard to make that possible.

So, on that note: what's been going on recently is that those same folks from ISC just had their ISC Cloud conference this week. What that's about is HPC in the cloud, and I think it's the only show that really focuses on that particular aspect of cloud computing. They had Ian Foster as their keynote, and the thing I wanted to bring up is that he had a very interesting point to his keynote. It was all about the missing middle: that cloud computing and HPC in the cloud are really enabling small-business startups to sprout up that wouldn't have been possible before. He sees it as a big enabler. So maybe HPC in the cloud has found its niche at a very different end of the spectrum. I just thought that was fascinating.

You know, I've got to kind of agree with that. I don't have any hard market data to back this up, and I'm just some random engineer, but you look at some of the cluster integrators, and even HP, right? HP made a killing selling, you know, 16- and 32-node clusters. They weren't the top of the Top500, but they were the bottom 50,000 for a long time. And Microsoft has made no bones of the fact that they're going after the bottom 5,000, 50,000, right?
So, you know, this is not necessarily a bad market move.

Absolutely.

Yeah, actually, I have some experience with that bit, and a lot of it is like a one-person firm or a two-person firm; really, really small is what I'm seeing going out to something like an Amazon service or a Penguin service. They can run something really wide and get it done really quickly. But I actually had a discussion with a not-to-be-named company in northern Michigan here, and again, it was a discussion about software licensing issues. I think there's a lot of space here for vendors of ISV apps to enable getting large quantities of licenses at a reasonable cost, just for the amount of time you use them, rather than having to purchase all those licenses and sit on them for all eternity.

Yeah, there's been peripheral discussion about this in the various blogs and news feeds and things like that, and Rich, I think you've even covered stuff about this. But for this to really be an enabler, it's got to be exactly what you said, Brock. Not only do I want the hardware and the electricity and the HVAC and all the other junk infrastructure that I need to run a job on 128 nodes for today; the other part of that is the licenses. I need licensing, some kind of pricing scheme, that is amenable to using it like a library, right? You use it for a little while and give it back.

Yeah, we've got the hardware down to, like, an as-you-consume resource, but we don't have the software in the same space. The software is still a one-year commitment or a five-year commitment; it's not this as-you-consume space. And some vendors have gone into that space, yeah.
Yeah, and some vendors have kind of done this, but they've all done it on their own. They have their own interfaces, so if you want to use X application, you go through their interface. There's no way, if I need five applications right now, to really get a consistent interface, just as if I'd bought all of those, but then get billed for what I use. So I don't know; I think there's space there, and I think we're going to get there. I think the market is going to demand that we move that way.

You know, I agree. This is not a new problem. They had this back with grid, right, in the early 2000s. They were starting to figure out how to crack this nut, and here it is ten years later and we're still talking about it.

Yeah, I have a little hope, though, because now there's actually hardware and dollars in this, and there are providers who are marketing this as a model, right? You have your Amazons and your Penguins and things like that. With grid, it was really just sharing between organizations, or at least that was my perception of it; you can get a bunch of different definitions for what really happened back then. But here, I think businesses are actually willing to fork out some amount of money so that they can run for a little while and then be done. So I think the market climate is at least a little bit different since then, and that gives me a little hope.

So guys, talking about cloud at the low end: some big news this week at the very, very highest end of the spectrum. NetApp just announced a big win at the DOE's Lawrence Livermore. They won a 55-petabyte storage deal for the Sequoia supercomputer that's coming out next year.

Sequoia is 55 petabytes?

55 petabytes, yeah.
So I mean, I think it's the biggest, I'm pretty sure it's the biggest single storage win ever, and certainly at the DOE it is. This big bad boy is for Sequoia, and Sequoia is going to be a 20-petaflop IBM Blue Gene/Q supercomputer at Livermore. Unbelievable specs on this for I/O: one terabyte per second. You just think about that.

Yeah.

Running the Lustre file system. So, huge news. What makes it exciting for me is that NetApp was not in the HPC space six months ago. They bought Engenio from LSI, makers of these disk enclosures, and suddenly not only are they in HPC, they're at the very top of capability in storage bandwidth.

Yeah, I think it's very interesting. It'd take forever to, you know, fsck that thing. Man, that is... that is ridiculous, having to deal with that. I'd be curious to actually speak with one of the guys implementing that, given the way Lustre works. We had one of the Lustre guys on RCE before, and I'm curious how they're actually laying that thing out.

Yeah, you know, the Whamcloud guys will be supporting that, so I bet they'd love to talk to you about what their plans are for supporting that bad boy.

Now, that fits right in that big-data category. I mean, one terabyte a second. If they spec it, they must have a use for it. A sustained one terabyte a second, that's just a boatload of data. Where's it coming from?

Probably the places we can't even talk about.

Well, I guess the machine will be used for a wide variety of research. You know, they gave the standard spiel, bioinformatics and academia, but you've got to wonder what else is going to be going on there.

Yeah, what's normally the benchmark for those really large systems is that you need to dump the contents of RAM to disk in a certain amount of time; normally that's the benchmark for checkpointing those massive systems. So if you're looking at a 20 petaflop... I don't know if that's peak or sustained.
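An aside on the checkpoint sizing rule Brock describes: it reduces to a one-line calculation, the time to stream the machine's entire memory out to the file system. A minimal sketch; the 1 TB/s bandwidth is from the NetApp announcement discussed above, but the 1.6 PB memory size is an assumed illustrative figure, not something stated on the show.

```python
# Checkpoint rule of thumb: time = total system memory / file system bandwidth.
# 1 TB/s is the announced I/O rate; 1.6 PB of memory is an assumed figure
# used purely for illustration.

TB = 1.0                      # work in terabytes
memory_tb = 1600 * TB         # assumed total system memory: 1.6 PB
bandwidth_tb_per_s = 1 * TB   # announced file system bandwidth: 1 TB/s

checkpoint_seconds = memory_tb / bandwidth_tb_per_s
print(f"full-memory checkpoint: {checkpoint_seconds:.0f} s "
      f"(~{checkpoint_seconds / 60:.0f} minutes)")
```

By that rough math, a full-memory dump takes on the order of half an hour, which is why file system bandwidth has to grow along with machine size to keep checkpoint overhead tolerable.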
This is the first I've heard about Sequoia.

Yeah, that's peak. Peak. But still, that's a really big box.

You need a lot of bandwidth if you want to checkpoint anything reasonably and get value out of that box. It's just keeping the whole system balanced.

Yeah, Blue Gene is a little different architecture, though. It's not quite the same thing as even a well-integrated rack of boxes, right? So, you know, checkpointing is not as big of a deal. Don't get me wrong, I don't know a whole lot about Blue Gene, so probably somebody will tell me I'm wrong, but my meager understanding was that you don't have to checkpoint it as frequently as you do on a typical cluster kind of thing, because the MTBF is a little better, simply because it's, quote unquote, one machine.

Yeah, but still, it's a really big one machine with a lot of components.

Yeah. So Jeff, you're an MPI guy. You've got to write, you know, an MPI for 1.6 million CPUs.

Believe it or not, we actually talk about that. There are a lot of issues involved there, and, you know, both us and MPICH are moving in those directions. Let's see, Open MPI has run at 50-60,000 cores, 50-60,000 processes, something like that. But, you know, that's still a far cry from a million, right? And there are still a lot of issues surrounding that. There are a lot of network and hardware issues too, right? Because MPI in some ways is just the thing that exposes how good or bad your hardware, and the abstractions you present upward, are. You know, like, do you really want a fully connected type of network? Oh, I could go off on an MPI spiel for a long time, so let's not go there.

Well, I guess the core issue here is that even on a machine like Jaguar today, which is, what, two petaflops?
There are only like five codes that really run at that scale, and they're all high-energy physics, right? So it isn't like you're going to write something new from scratch to go run on Sequoia that uses the whole machine. That probably isn't going to happen.

Yeah, I mean, even at our local resource here, our largest single fabric is 2,000 cores. The largest single thing run on it, besides the first LINPACK, is 480 cores so far. So, you know, it's really how you slice it up and what it's being used for.

Yep. And what hardware are you targeting? Are you going to be targeting accelerators, or just CPUs and trusting your, you know, magic software tools to fork it out onto something like an Intel MIC, right? I mean, the model is changing again, right? So what is the software going to do? What are these existing applications going to do to take care of that?

I mean, accelerators, to me, is a fascinating topic. They're really starting to finally come into their own, and for certain types of applications they are super great, right? And you can quote me on that: super great. How's that for a phrase? But you've got to adapt the software to do it. There are a few magic tools, and those magic tools work very well, but they don't, you know, fit everything. And so it's a challenge, right?

Yeah, well, we had some news on that this week, right, with accelerators. Intel's MIC architecture had a huge win. What was that, last week? The TACC Stampede system that's coming out next year.
This will be a 10-petaflop system, and it will be accelerated by MIC. Their whole spiel, as you know, Jeff, is that this thing is much easier to program than an accelerated system with GPUs.

In my gut, I feel that that is probably true, but I want to see it. I mean, I'm a vendor too, but, you know, I think we all have a healthy degree of skepticism about vendor claims when they meet the crusty Fortran codes that have got to run on this stuff.

Right.

I hope they're right, because that is a major problem with the accelerators: you know, changing the codes and changing your kernels and things like that. But I don't want to downplay some of the prior guests we've had here on RCE, writing tools that provide computational kernels that magically run on GPUs or CPUs and just do the right thing in the background, kind of like MPI does for networks. Those things work really well if those kernels are what your application is using. But the inherent programmability for all the random stuff, and all the stuff that the missing middle wants to do, right? They don't want to write custom GPU code. They want to just write some code that works. And so is it going to work well on a MIC? I don't know.
I hope so.

Yeah, I mean, I can throw out something about one of our... our second guest ever on the show, Josh Anderson and the HOOMD code. You know, they're here at U of M, where I'm at, and I support their users. They just did a huge buy of a large number of GPUs for their code. Their performance improvements over traditional CPU MD codes are still absolutely amazing, even as they add more restrictions and make the code more complicated. They've really become masters of doing these things in these massively threaded, simple-core kinds of ways that these types of tools have been requiring. It's really, really neat what they're doing, and the power envelope, the footprint, the cost, everything, it's just been absolutely amazing. I sit on their mailing list, and the traffic on their mailing list just keeps growing. I think this stuff's going to stick around. It's not going to be like some of the other tools we've seen in the past, where they look neat but they go away. I think these things are going to stick around this time.

Well, you know, again, coming back to Supercomputing: Jen-Hsun Huang, the CEO of NVIDIA, is the keynote this year, and there was some controversy about that. But you look at what's happened with accelerators in the past year; China and Japan are at the top of the Top500.
It's a force that you can't just ignore. And when I was at the GPU conference last year, there were like 3,000 scientists there, and they're using this stuff for their daily business. There were a hundred startups there that are basing business models on GPUs and accelerated science. So there's really a lot going on, and, you know, it may not be the answer for all codes, but it's certainly got a role to play.

It's just so accessible. I mean, your laptop comes with a GPU. Compared to the old days, this is now something that you can just go and try. The cost entry point is lower, and there are all these things; I think it was a perfect storm to make this work out, for people to actually develop on.

Well, let me be the naysayer, just to play devil's advocate here. Don't forget your laptop comes with two or four cores too, right? And, you know, that's a far cry from a MIC, but maybe someday grandma will have a MIC in her laptop just so she can watch YouTube. Who knows, right? But for me, the kicker is the memory bandwidth, right? Because that's still a huge problem for all GPU codes today: getting the data down from main memory onto the device and then back. And I say that's a huge problem mainly from my own foxhole, because of the whole MPI perspective. You know, we can't communicate directly with the GPU.
We can't network send and receive directly from device memory, right? And the same problem is going to be there with the MIC, unless that stuff gets memory on board somehow, or... I don't know exactly how that's going to go. But tying it into the network matters, because no matter how many you stuff in one box, you still have to talk to your friends across a network. And so how that network integration happens, and how the memory bandwidth integration happens, these are still unknowns, I think, or at least not publicly known.

Well, Jeff, you know where we see that a lot? A lot of people I've worked with have wanted to do parallel across GPUs, not even across hosts, but inside the same host. And you really can't do that right now and get desirable performance, a good time to solution. It's like: just run two cases independently, stick one on each GPU, and just leave them there. Don't try to use multiple GPUs on one problem right now. It's just that whole communicating back and forth inside the box, let alone going from device to other hosts to device. It's just not an option right now, really. So I have to agree with you totally on that.

Yeah, I'd love to see where the hardware vendors are going to go with this, because at this point there are software workaround solutions, but you lose a bunch of performance in doing so. So I think there are going to need to be some kind of hardware changes over time to make these things better, and it's going to be really interesting to watch where this stuff goes.

Yeah, I just don't see how they can continue to just go across the PCI bus and get the kind of scaling that they want. So we'll have to watch that.

Yeah, I actually have my own prediction for the GPU-type stuff, and we already see AMD doing what my prediction is. I won't go into it here, but I expect Intel to do similar things with MIC and other things in the future. It's just, I think it's now...

You can't drop a bomb like that and not give the prediction!
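An aside to quantify the PCI-bus bottleneck the guys have been describing: compare how long it takes to move a buffer over the bus against how long the GPU takes to read it from its own memory. A rough sketch; the 8 GB/s and 150 GB/s figures are ballpark assumptions for a PCIe 2.0 x16 link and a high-end GPU of that era, not numbers from the show.

```python
# Host <-> device staging vs. on-device memory bandwidth.
# Assumed ballpark figures: PCIe 2.0 x16 ~8 GB/s, GPU device memory ~150 GB/s.

GB = 1.0
buffer_gb = 4 * GB            # a 4 GB working set, for illustration
pcie_gb_per_s = 8.0           # assumed host <-> device link rate
device_gb_per_s = 150.0       # assumed on-device memory bandwidth

pcie_time = buffer_gb / pcie_gb_per_s      # time to stage data over the bus
device_time = buffer_gb / device_gb_per_s  # time for the GPU to read it locally

print(f"over PCIe: {pcie_time:.3f} s, on device: {device_time:.3f} s, "
      f"ratio ~{pcie_time / device_time:.0f}x")
```

That order-of-magnitude gap is why splitting one problem across two GPUs in a box, with every exchange staged through host memory over the bus, can erase the speedup, and why on-die integration looks attractive.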
Oh, okay. So all it is, as I see it: when we have eight-core CPUs, rather than each core having its own vector unit, its own SSE4, we're going to have a bunch of scalar cores with regular floating-point units, and then we're going to have a MIC-type thing sitting right on the die next to them. The individual cores will schedule their time on it when they can take advantage of it, rather than duplicating that silicon for, you know, eight or twelve cores in the future.

So you're talking, you know, normal dedicated CPUs, plus the specialized MIC stuff next to them, for on-demand use, so to speak?

Yeah, but I would actually expect that it would trap things like SSE instructions and other vector instructions, remove the individual vector units from every core, and push it all onto something like a MIC, because I think you get silicon back for that.

Interesting. I'm not enough of a silicon guy to know whether that's true or not, but that's an interesting theory.

Yeah, no, I'm not enough of a computer-engineering, hardware kind of guy to really know if that's feasible either, but I would not be surprised. And like I said, we already see it with AMD: they've got their Fusion or whatever; they've got a CPU and a GPU core all on one hunk of chip.

This is true. And let's not forget the whole Project Denver out of NVIDIA, right, where they got ARM licenses, and the supposition, which I think is pretty obvious, is that they're doing something there to put things on chip. So I don't know, we'll see.

Yeah, we're going back to vector.

Well, yeah, that's vector mainframes. A lot of people would love that, the big SMP again. A monumental, huge 20-petaflop SMP would just be very welcome, I think. No one could afford it, but...

Well, no, it would be the next model of TiVo, right? You need that to record all the...

Well, speaking of big, big monsters, did you guys want to talk about Blue Waters and where that's at?
Oh yeah, let's talk about some Blue Waters.

Well, of course, as of recording here today, it's still up in the air. I mean, IBM canceled the contract about a month ago. They weren't able to deliver. The story came out this week, thanks to the Freedom of Information Act, that IBM tried to delay Blue Waters by a year, as far back as December of 2010, and the NCSA guys were not going for that, and eventually that led to the thing collapsing. So the big crisis now is: can they replace IBM as the vendor for Blue Waters before the money goes away? So I want to get some quick Vegas odds, Brock and Jeff. Which is it going to be? Is the money going to go away, or are they going to pull it out and pick a Cray, an SGI, or maybe even an IBM as a vendor? What do you think, Jeff?

All right, so I'm going to take this, and I have no vendor knowledge involved here, so I am not giving anything away, because I am not part of this situation at all. But I would find it remarkably difficult to do what they want to do, in the time that's left, with any vendor. Because IBM had done so much work to push the envelope and to achieve the scales that they wanted to achieve in Blue Waters, and they were getting down that road, and finally they said, no, we're just not going to make it. And now you've got... what was it, a three-year contract, and they were two years into it, is that right?

Yeah, yeah.

So there's less than a year left for somebody new to come on the scene, and you can't take anything that IBM's done. You've got to do something new, and it's not like you can just throw a bunch of Sandy Bridge servers together and call it Blue Waters, right? It's going to take a lot more integration than that. So I think the money's going to go away. That's just my personal opinion.

Okay, Brock, what do you think?

Yeah, I mean, I've got to go with Jeff. There's not enough time.
I mean, there could be an extension. Also, oh man, the Cray guys are going to hate me, but I think IBM, in the recent past, has been the only company to show that kind of innovation to us. Fujitsu could probably do it, but given, you know, political perceptions of funding and everything else, I don't think that's going to happen; they're not going to be brought in as a vendor, I don't think. So I'd say there are probably only 20 percent odds that this whole thing is going to fly and we will have a sustained one-petaflop machine. Because, like that Sequoia machine you mentioned, that's 20 petaflops peak, but that's not one petaflop sustained. This is solving a different kind of problem.

You know what, hold on, I want to amend mine a little bit, because, and this is embarrassing, but I forgot about Cray, and they did just introduce some new model that I know nothing about. Is that the XK6 or something?

Yes. Yeah, that's their GPU-accelerated machine.

Okay, so it uses the NVIDIA processor. I don't know; maybe that's a contender, but I know nothing about it. So before somebody shoots me down, that's what I want to say. I'm sorry, Rich, go ahead.

All right, all right. So we've got two votes: the money goes away. I'm going to go out on a limb and say IBM comes back in with Blue Gene/Q, gets a revised contract that they can live with, and wins this thing before the money goes away. I have no inside knowledge to that effect, but I think that's what's going to happen. So we'll have to see.

All right, give us some rationale on that. Why do you think that's what happens?

Well, I mean, I think Brock touched on it. IBM has shown, you know, they've put out a boatload of patents on Blue Gene/Q. It's a totally different architecture than what they bid for Blue Waters, right?
That was POWER7. From what I understand, they had trouble building that thing at scale, with the interconnect, in the time frame they had; it would just cost them too much money. It was a business decision. IBM wants to make money. They pulled out, even though it embarrassed the hell out of them, I'm sure. This could save them face. Blue Gene/Q is a totally different approach. Can they do a petaflop sustained? I think they can convince those guys who are looking at a big empty computer room that they can. So that's my bet.

Interesting. Yeah, I'd be curious whether you can actually get to that one-petaflop mark with a wide enough variety of codes, using the Blue Gene/Q-style architecture. Because Blue Gene was exactly what I was thinking about with IBM demonstrating, recently, something new and fully integrated that was truly a changer of the way you can do a high-performance system.

Yeah. Okay, I'll revise mine up to 30 percent. How about that? I'll give you that.

I'm still going to stay with thumbs down, and I'd love to be proved wrong, because it's a crummy situation for everybody involved, right? You know, I know some of the guys over there at NCSA and whatnot, and everybody's bummed, and it just kind of sucks. So I'd love to be proved wrong, but maybe I'm just a pessimist.

Yeah, I mean, those guys would love to tell their story from their end and get on the record, but they just can't. So, well, let's hope for the best for those folks.

Well, guys, it's really been fun talking, and I'd like to announce that we've already decided to plan to do this again, live from the Supercomputing show floor. InsideHPC will have a booth up on the sixth floor, and Brock and Jeff have agreed to come up there and do a show summary on Thursday.
We'll maybe even try to do it live. So I'm looking forward to that, and to seeing you guys at SC11 in Seattle.

Yeah, with the red hat.

With the red hat. Do I get to tell that story before we go?

Tell us where the red hat comes from.

Okay, so I've been wearing this red hat for years, but it started at Supercomputing, actually. I was working at Sun, doing the booths there for years, and in 2003 we built a Rocks cluster in our booth with the SDSC guys. It was a 128-node cluster, started from bare metal, with nodes on the table, and by nine o'clock on Monday night we had that big bad boy up and running code. So we all got red hats for that, and I've worn it ever since. Part of it was that it really pissed off my bosses at Sun to be wearing a red hat, which is an excellent reason to keep wearing it.

Yeah. So, yeah, we all know where that went, but anyway...

Well, cool. Yeah, so we're all going to be at Supercomputing. We're going to do this live show, or some flavor of show, on Thursday, which I think will be tremendously fun. So stop by, come find us. We'll all be in our various places in and around Supercomputing. We'd love to say hello.

Okay, guys, thanks a lot. Yeah, we'll all be at SC, and we'll have the show up soon, and we'll give it to Rich to post on insideHPC so he can distribute it how he wants. We'll see all you guys around. Thanks a lot.

Thank you. Thanks.