Welcome to another edition of RCE. Again, this is Brock Palin. I have again Jeff Squyres from Cisco Systems and the Open MPI project. Jeff, once again, thank you for the show.

Woo! Yes, glad to be here. It's always fun to do.

Yes, so SC is only about a month away. Both Jeff and I will be there. Jeff, you'll be hanging around the Cisco booth or random other MPI partner booths?

Yeah, I'll be hanging around the Cisco booth quite a bit. A couple of other MPI things: there's an MPI Forum BoF and an Open MPI BoF. But generally I'll be hanging around the corporate booth, because that's who pays my paycheck.

Yeah, I don't have a booth. One of the benefits of working for a university that does not have a booth. I'll just be walking around, but this year we have a limited number of RCE t-shirts. So if you see one of us and we have shirts still and you say you like the show, we've got a t-shirt for you. So if you want a nice RCE t-shirt, say hi to one of us.

Collector's edition.

Yes, yes, limited quantity is the keyword. That's right. Also, you can find RCE online at rce-cast.com. You can also follow me on Twitter, where I tweet new shows when they come out, at @brockpalin, all one word. And you can find all those linked off of the RCE site, including Jeff's blog.

That's right. We just launched our new blog site, which we're very pleased with, because I think we were using the best blogging platform of about 1999 or so. And now we actually are using some modern stuff and it's just awesome, much better. But my Twitter is just @jsquyres. And you'll see that one of the people I follow is our guest today.

Yes, actually I follow our guest today too. Now, I'm hoping with our guest today we're not going to end up in some sort of buzzword bingo.

It's an evolving section of our industry.

So who we have with us today is Nicole Hemsoth of HPC in the Cloud. And so I'll let Nicole introduce herself and what she actually does with HPC in the Cloud. Nicole.

Hi guys.
Thank you so much for having me on. I love your tweets, by the way. By the way, I am on Twitter at HPC in the Cloud. I'm pretty easy to figure out. So it's interesting that you said the word buzzword, because even though I'm working with a publication called HPC in the Cloud, I do also believe that it's something of a buzzword. I believe that it's misused quite often. It's a marketing term. However, there are some really great use cases that we do try to cover in HPC in the Cloud, demonstrating strong uses of a virtualized environment for HPC applications. Now, our publication: we are a sister publication of HPCwire, and they were covering cloud computing for some time and it generated a ton of interest, enough so that they felt the need to spawn a new publication out of it. So that's sort of our roots.

Yeah, it's interesting, because over on the industry side of HPC kinds of things, I do hear people talk about Cloud quite a bit. It's still eschewed quite a bit by many academics and researchers, who say, oh, that's just something that's going to rob me of performance. But I have to say it is gaining some non-zero traction on the industry side for people who find very practical uses, and it fits very well in their environment. So, Nicole, could you give us some of the common or better definitions of HPC in the Cloud?

To be honest with you, I think if we're talking about HPC specifically, the best definition for Cloud is simply on-demand resources. So it's not a VMware-type definition of Cloud computing that we're talking about. It's this idea that you have capacity whenever you need it on a pay-as-you-go basis. I think the pay-as-you-go and the on-demand are two of the most important components here. And so when the publication first launched, I went through this whole identity crisis. How do I define for the publication what Cloud computing is? Do I only stick with that which is virtual?
Do I carry over things from grids? After all, this is all a carryover from grid. So right now, I'm just sort of taking the approach that if it's on-demand, if it's off-site, then it's Cloud to me. Now, lots of people could take issue with that definition. I think if there's any good definition, it's the NIST definition, but that one too sort of lets something slip through the cracks.

And what is it in that definition that's not complete?

The NIST definition is pretty comprehensive, and it does invoke some of the virtualized environment as critical. I don't necessarily believe that it has to be all virtualized, again in the VMware-type definition, the commercial mainstream definition. On-demand companies providing HPC, HPC on demand, are to me Cloud. Some of them call themselves Clouds, some of them don't, but the fact is it falls into that definition of being available whenever you need it on a pay-as-you-go basis, which makes it ideal for those bursty demands, which to me seems like a really great use case for Cloud right now. A lot of HPC users are not moving every aspect of everything they're doing into the Cloud. This is being used when they've exceeded their cluster capacity. So it's not an across-the-board thing. It's not like HPC is all moving into the Cloud. I think all of you know what problems there are with this. I think in the next few years, however, some of those issues will be mitigated.

So another use case scenario, a logical extension of what you said, I think, is that I have no HPC capability at all, but maybe three times a year or 10 times a year I have a big honking job that I need to do, and I can just farm it out to someone. Would that be considered the same kind of classification?

Yeah, absolutely. And again, that's one of the best use cases I've seen, and the most prominent type of case I've seen, for large-scale enterprises and even research centers using Cloud resources. It's certain workloads that they're moving into the Cloud.
However you do want to define it, whether it's just an on-demand resource or if it's EC2, for instance, it's just when they hit their maximum capacity that they need to take a certain workload that is maybe not security dependent and then offload that into the Cloud. That's the most common use case that I've seen from the range of companies and researchers I've talked to.

So you've talked about what Cloud is good for: this bursting in demand, this excess demand. Or I think it would also work for a small researcher who pretty much almost never does anything, but needs to do some computation every now and then. What do you see, though, as where people are using Cloud for HPC when they really shouldn't be? What's the biggest case not to use Cloud in some situations?

That's an interesting question. There are certainly lots of HPC applications that have no business being in the Cloud, of course, because they rely on that fast communication. They require it. So usually the people that make that mistake, and it is a mistake, are only doing it because they don't have the financial resources to buy their own cluster, and so they think that they can push some of these things out into EC2. And they do; I mean, they can do that. It's just that the performance difference is pretty massive.

I understand. So you mentioned EC2 in there. EC2 is a virtualized environment. What about some of these, I don't know if you call them cloud providers, but HPC as a service, where it's actually a hardware provider? It's not a virtualized OS. You put your workload over there, they run it, you get the results back. In that case, it should just look like you're running on hardware. Is that happening much? Do you consider that HPC in the Cloud or not?

I consider that HPC in the Cloud, and that's where that definition issue becomes pretty sticky. And this is nothing new.
Certainly renting out time on a supercomputer or a large cluster is absolutely nothing new, but now that there's this marketing buzz phrase to slap on it, the ecosystem has sort of changed, and some companies have adopted it, others haven't. HPC on-demand companies, I've talked to several of them in the past. The major providers: you've got Penguin, you have a company called Sabalcore. I was just at an event last week sponsored by R Systems. They're a Champaign-Urbana-based HPC as a service provider. They have a much easier sell for their customers, let's put it that way, because they have an easier time defining what the resources are. There's none of that concern that gets invoked in the virtualized environment, of course. But again, this is nothing really new. It's national labs, supercomputing centers. They've all been following the same sort of business model for years and years. It's just that there are a few more companies offering this to smaller researchers.

So you've defined what some of the good use cases and some of the good reasons are, and some pretty good definitions. What makes this hard? Why is there so much buzz about it? Why is there so much press about it? Why are there pundits and naysayers? What are some of the challenges that people face even when they have a strong case for moving their workload into the cloud?

Okay. Well, one of the biggest and primary issues is porting applications into the cloud. Sometimes that is a very, very big challenge. It's often overlooked in lots of what you read about the cloud. It's not like you just call up Amazon or secure yourself a couple of instances and throw whatever you have on there. There's a lot of work to be done in advance. Furthermore, simply getting your data into the cloud is a big challenge, especially if you're dealing with large data sets.
And this is such a big problem that I think it's prohibitive for a lot of companies looking at a resource like Amazon. Maybe not so much with an HPC on-demand provider, but just getting your data there, just getting your application to function properly, those two issues can be extraordinarily tricky for lots of companies and researchers.

So it sounds like in a lot of cases you almost need a cluster admin to help you out with your application, because you get an image over there and you need to install your application and run your application, which your local admin might normally help with. But if you don't have a cluster, you don't have a local admin. Are there any service providers out there providing this middle layer, like middleware people: people sitting in between who are comfortable with the Unix environment and who can help researchers who are maybe not so much to actually get their stuff onto the cloud?

There are a number of providers, from smaller companies to individual consultants. Anybody that I've talked to on the corporate level that's made this jump, and oftentimes, again, this jump was made simply for bursting capacity, did bring in a consultant just to handle these two sticky issues. I think it's probably a good idea, actually, to have somebody who is experienced with whatever the cloud is that you're jumping over to. So, experience with the Amazon offering or with the Microsoft Azure offering, to sort of guide you through the process. It is not as simple as it's made out to be.

Has anybody tried doing an actual application service where, instead of providing cycles in this on-demand environment, we provide a Hadoop farm or we provide an Abaqus farm or we provide an R farm, and it's much more specialized to a specific application? So the application is already set up.

I think some of the smaller on-demand companies are able to do that. Whether or not they have that ready to go for you remains in question.
But I think if you had very specific demands and you approached one of these companies and made it clear what you needed, they would often work with you. They do have some sort of flexibility. Now, one of the issues that a lot of people have with something like EC2 is that there is zero in the way of support, for the most part. And I think that's why a lot of people I've talked to have chosen on-demand resource providers over the public cloud. Again, unless they were just running a very quick job, the embarrassingly parallel stuff that's an ideal fit for cloud, I think a lot of them make a decision to go with a service provider because they get that sort of looped-in support.

This is actually a term that's very near and dear to my heart that you're harping on, because as a maintainer of an open source software stack that's fairly commonly used for HPC applications, we get a lot of questions on our public mailing list. Sometimes there are local people around, like Brock, who I believe serves as a good advisor at his university, and just a bunch of local support people that help get this stuff going, because it's complicated and it's hard. But there are a lot of places out there that just don't have local support. We end up talking to the civil engineering grad student who was charged with getting the cluster running. And it's a real challenge, because this is complicated stuff. Do people have this kind of support out on these on-demand kinds of resources? Is it good support, bad support? What's your experience with that?

That's a really good question. And in fact, that's something I feel I definitely need to start addressing more in our publication, because you definitely don't get that side of the story in a lot of the literature you read about cloud computing. And when you talk to users across the spectrum, one of their biggest concerns is this lack of support they get from the public cloud. So Amazon EC2, while it's a great resource, you need to be pretty specialized to use it.
I do not care what they say. It is complicated for researchers to figure out, especially researchers who want to concentrate on whatever it is that they are researching. Once they have the hang of it, it's not so much of a problem. However, with HPC on-demand providers, part of what makes them so attractive, especially to smaller research and development companies, smaller organizations that don't have the resources to hire somebody to take care of this on a regular basis, is that they like the idea that there is some sort of support network there. They can make alterations as needed, if it's possible. Oftentimes, I think, most of the major providers have around-the-clock support. And I've never heard a sad story about trying to get a hold of support from one of these companies and not getting a response, like I have with both Microsoft's offering and Amazon EC2.

So, you mentioned before that there are apps that rely on high-speed networking. And certainly, I see that in the MPI environment, too: there are just a lot of apps out there that are dependent upon the latency and/or the bandwidth, particularly as core counts are going up. That's a whole separate conversation in itself. But we've seen that EC2, for all of its pluses and minuses, has deployed these HPC resources where they have 10 gig in there. And I think Penguin has InfiniBand networking available. I'm not 100 percent sure of that. But do you see some of these on-demand people, if they're particularly trying to hit the HPC ghetto, deploying some of these faster, better networks to service this kind of stuff?

Yes, I do. And I think that moving toward InfiniBand is a big goal for a lot of these companies that haven't done it yet. InfiniBand and GPU, that's like the golden combination for a lot of these companies.

Ah, so you mentioned GPU. Let's talk about that a bit. So, again, that's a whole separate conversation in itself, the pluses and minuses of GPU.
But let's assume for the moment that it's a net gain for HPC applications. How do you see the uptake of this going, both in providers and/or cloud-ready applications that can take advantage of on-demand HPC resources?

To expand a little bit on what I just said about the coupling of InfiniBand and GPU, every on-demand company that I've talked to has at least hinted at having at least two of those before the end of the year. So, I think around SC time, we're going to see a lot of announcements in the InfiniBand range and also in the GPU range. And I think what a lot of these companies are going after, and this is something I just gleaned from reading between the lines, is that a lot of their business is coming from folks with big-time modeling and simulation needs, who aren't even bursting necessarily, but who are using HPC on-demand resources as the backbone of their IT, basically. So, being able to offer that certainly caters to that big customer base that's been, I think, overlooked in some ways by the public cloud providers.

So, you keep saying public cloud, public cloud. So, you're focusing on companies you can go to. Do you see any large organizations building what you'd call an internal cloud, a resource provider where your business unit or your research group can go and say, hey, here's 50K, we need this much time? What kind of model are they following?

So, that's the essential concept behind the private cloud. And if you want to get into definitions about cloud, I think nothing generates more debate than the very concept of a private cloud. So, this is something functioning within the corporate firewall. So, everyone's comfortable with it, because there are supposedly none of those security concerns that are floating out there if you send everything into EC2, for instance. So, private clouds are definitely gaining traction in the enterprise.
However, you don't see a lot of that, or even discussion about private clouds, in research centers, outside of maybe Berkeley with its Magellan project. Private clouds are sort of just an extension of what was already happening. However, again, we have a lot of marketing buzz around these words, and private cloud is certainly not immune. Companies think that this removes risk while saving money. And in some cases that's true. In other cases, it isn't. Cloud is not a guaranteed score in terms of saving money. For some applications and for some companies, absolutely, there's no question. Not for everyone.

Okay, so there's a lot of talk around security with clouds. What's the common worry about clouds when there are VMs being spun up and torn down, or stateless machines being recycled between users? What's the worry about security?

Well, you just kind of hit on the major concerns. So, when you have a multi-tenant environment, the fear is of course that your data is somehow going to leak into someone else's, or vice versa. There's also this sort of fear about hypervisor attacks. And I just want to say a word about this, because this is very, very rare. However, there are a lot of security companies out there that are marketing products aimed at protecting the hypervisor, which is important, but these attacks do not occur all that frequently. So, a lot of the buzz about cloud security, some of it's founded, and some of it, like anything else that generates fear among businesses, is generated by companies who have something to gain out of perpetuating the notion that cloud security is a big issue. I'm not saying it's not, certainly not. Companies, researchers, everyone needs to be vigilant, but the fact is that with the cloud tools that are already built in, for instance, to Amazon's offering and Microsoft's offering, you can make modifications.
Most companies with qualified teams do make lots of modifications. But it's not the biggest issue with cloud. Everyone talks about it like it is, but I don't necessarily think it is. And certainly not for HPC; certainly maybe for businesses that are dealing with sensitive information. But if you're just running some code to get to the bottom of whatever your research question is, this is not something that needs to be at the top of your mind.

So, you mentioned a couple of times the price point. People do this for financial reasons and whatnot. How is that working out? Are there actually scenarios that save people money, or is it deceptively expensive, in that you get launched into it and then you find out that, oh my goodness, we could have actually bought our own cluster for this kind of price?

The build versus buy question is, again, one of those issues that not enough resources are talking about these days. As I said earlier, this does not work for everybody. This is not always the cheapest solution. What companies need to do in particular is look at this on a per-node basis, and you can bring in an outside consultant if you need to. It depends on your application. It depends on how much data you're sending and receiving. It depends on all of these different factors. There's no hard and fast rule here on whether or not this will benefit your organization. The same thing goes with HPC on-demand providers. If you aren't just using them for bursting capacity or maybe seasonal demand, this one in particular does not generally work in favor of the enterprise. Let's put it that way. It is a lot more expensive, but you have to look at the fact, too, that you don't have full-time people on staff on a daily basis to manage and monitor the machines. Again, there's just no hard, fast rule here on cost.

Now that we've got the cost side out of the way, let's talk about the growth. There have been companies popping up all over providing these clouds or cloud-type services.
What is the growth of HPC cloud, or what do you see cloud being? You said you were expecting some announcements at SC. What do you expect to see after that? Do you see an uptake? What do you see in a year or two years?

Well, certainly cloud computing, unlike grid, which came before it, has traction; the buzz has held on. It's taken hold in small to mid-size enterprises, and it's trickled down from small businesses actually into research, which is a total flip-flop, by the way. It used to be that it was the research centers that were responsible for the most innovative inventions if you were on the computation side. All this is coming up from small and mid-size businesses. There's a lot of buzz about it. It's not going to go away. With that said, there are going to be a lot of new variations on the same old thing. For HPC, however, there are some interesting things coming up. I have to say, as much as I feel like I've slammed Amazon, I wish I hadn't done that. I'm sorry I came across that way, because the Cluster Compute instances are, I think, very valuable for a large number of researchers. Also, I do think other cloud providers, public cloud providers, are going to have a similar, if you want to call it that, instance type. Like I said, for SC, I think this year we're going to see a lot of HPC on-demand providers offering that golden combination I talked about earlier, the InfiniBand and GPU. But the space is growing. It's growing far faster for small and mid-size businesses. HPC takes a little bit longer for lots of very good reasons, including the latency issue, which you talked about earlier. Just in general, a lot of research centers spend a lot longer doing this kind of cost-benefit analysis. They've got huge investments in on-site resources. Things are going fine for them. Why switch? We see a lot of press. You see a lot of conflicting PRs, to be honest.

What's the largest and/or the most unusual deployment that you've seen in cloud or on-demand resources?
I'm going to take this in a different direction and look at the most comprehensive cloud deployment. Actually, it's for the purposes of research. Berkeley National Lab, to me, is one of the most interesting places in my space due to how they're making use of cloud, how they're testing it, and what types of applications they're using in the cloud. This isn't just Amazon's standard cloud offering. They were one of the first to try out the Cluster Compute services. They've also built their own private cloud called Magellan. I think for anyone interested in learning more about how clouds actually fit into HPC, it is definitely worth a visit to Google Scholar to just type in "Berkeley cloud computing" and take a look at what they're doing. There are tons of papers coming out of Berkeley related to all the different experiments they've done, using applications that range from the Monte Carlo type applications to really, really complex applications that are certainly reliant on latency. Their findings are across the board. Some papers find that performance is actually the same. Others find completely the opposite. They've made all sorts of tweaks to try to figure out how to optimize that environment: the Amazon environment plus the environment they created for Magellan. I don't have any crazy cloud story for you. It's a "krazy kloud," with two Ks. I do think that for anybody truly interested in this and in finding out what the cloud can do and what it can't, checking out Berkeley is a really great idea.

Okay, Nicole, thank you very much for your time today. This show will be going out actually at the end of the MSU, Michigan State University, versus University of Michigan football game. It's crazy this year because both teams are undefeated.

Hey. Yeah, nobody wins in that game. Nobody wins in that game.

Yeah, I know. I know. So thank you very much. Jeff, again, is @jsquyres on Twitter. I'm Brock Palin, at @brockpalin on Twitter, all one word.
You can find us online at rce-cast.com. And Nicole, thank you very much for your time. And again, where is HPC in the Cloud online and on the Twitters?

On the Twitters, we are at HPC in the Cloud. And online, we are HPCintheCloud.com. And I do encourage everybody to sign up for our weekly newsletter if you don't have time to sort of catch up and you don't use feeds, because for some reason people in HPC, as technologically advanced as they are, don't use feeds. I've found the newsletters are a great way to keep up on weekly coverage, to kind of give you a best-of, our weekly wrap-up. So thanks a lot, guys. I really had fun.

Well, thank you.

Thank you.