Live from the Austin Convention Center in Austin, Texas, it's theCUBE at Dell World 2014. Here are your hosts, Dave Vellante and Stu Miniman. We're back, welcome to Austin everybody. Stu Miniman and I are really pleased to have Jimmy Pike, he's Vice President, Senior Fellow, and Chief Architect of Dell Engineered Systems. Jimmy, welcome to theCUBE, it's great to see you. So, Engineered Systems is in your title, but really we're going to talk about HPC. Let's start with your role inside of Dell. Interesting title, Senior Fellow, Chief Architect, you've got VP in there. Did they have to make bigger business cards for you? Yeah, actually. Well, so my title is VP and Senior Fellow, Chief Architect of Engineered Solutions. That and five dollars will get you a cup of coffee at Starbucks. So, actually, we have a technical ladder, okay, and there is Senior, Distinguished Engineer, Fellow, and Senior Fellow. So I'm at the top of the technical ladder, as far as the Dell-recognized technical ladder goes. It's an easy way to get there, okay. The senior part is I got old. The fellow part is being recognized as not only a Dell leader, but an industry leader, someone who can influence the industry and certainly has their finger on the pulse of the industry. That's the role that I'm expected to play. And so I'm responsible for the technology that goes into the products that come out of the Enterprise Solutions Group. So, we love having folks like you on, because we can pick your brains and you can help us squint through what's really happening in the industry. So I wonder if we can talk specifically about HPC, where it's been, where it's come from, and how it's related to this big data meme. Everybody's talking about big data, and the HPC guys say they've been doing big data for a long, long time, but give us your perspective on that. Well, sure. So, I think there is a general perception that HPC is for the geeks. 
It's for the research guys, the guys that are doing new science, the guys who are looking for solutions to enormous problems that have never been solved before. Geeks and spooks. Yeah, and that's true, it is for those. But it also has a lot of practical applications that are much smaller than that. For example, one of the things we've done is we've partnered with an organization called TGen, and we've created a genomic sequencing solution. And so, when this is out in the world, you'll be able to go to your doctor's office and have your personal genome sequenced in about four hours. And that uses the techniques that the high performance computing community has developed. Now that's an example of the smaller end of HPC. Certainly there's the large end too. HPC is really high performance computing. So it's not necessarily high performance computing in the large national-lab kind of model, but it does scale to that. Now the significant thing is that the techniques we have used in high performance computing up to now are being applied in lots of other places. For example, if you look at big data, big data is a scale-out model for doing data processing, for understanding things that happen in large amounts of data. In fact, if I look at the technologies that are used in HPC, and I look at where big data is going, it's becoming quite difficult for me to tell the difference between the underpinnings. The difference is the use and the application of them. And by the way, if you look at the enormous amounts of data that we're starting to deal with in the world, to process that data you need those high performance computing techniques to go through it and make sense of it. So I think maybe a better term going forward is big computing. And big computing is not only high performance computing in the way we've known it, but it's big data too. 
And you can look at the merging of those two, because the IDC guys say that by the end of the decade we'll have 40 zettabytes of data. That's a lot of data. And if you're going to go through enormous amounts of data, specifically unstructured data, which is really what Hadoop and the MapReduce model are all about, you need those techniques to be able to do that. So I see a real convergence between high performance computing and big data, and for lack of a better term, I'd call it big compute. So that's an interesting narrative. I remember when I first came into the industry, there was a lot of buzz from Danny Hillis with Thinking Machines, and Kendall Square Research, where my friend Ed Gershenson, who built the IO subsystem, used to educate me on all the cool stuff that you could do, and of course, Cray. Like you said, Jimmy, it was always sort of confined to that little niche. And so when you think about HPC and now what you're calling big computing, scale-out, shared-nothing, the databases and so on, help the layperson understand what the characteristics of that big computing are and what it means. Fair enough, let me give you a couple of data points. I'm not sure of the exact numbers, but there are around six billion cell phones in the world today. And interestingly enough, only about half of them, around three billion, actually belong to people. The rest of them are vending machines, traffic sensors in the roadway, tracking devices on trucks, security cameras, and what this does is create this huge amount of data that we have to deal with. The key thing about it is it's not the database models that we're used to, it's unstructured data. And what that means is just big globs of data, and you have to be able to go deal with that data. And that's really what MapReduce is about, what Hadoop's been about. 
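To make the MapReduce model Pike is describing concrete, here is a minimal sketch in plain Python (an editorial illustration, not Dell's or Hadoop's actual code): the map phase emits key-value pairs from raw, unstructured records, a shuffle groups them by key, and the reduce phase aggregates each group. The sample records are invented for the example.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit (key, value) pairs from raw, unstructured text records.
    for record in records:
        for word in record.lower().split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into a final result.
    return {key: sum(values) for key, values in groups.items()}

# Unstructured "globs" standing in for sensor or camera log data.
logs = ["sensor reading ok", "sensor reading high", "camera frame ok"]
counts = reduce_phase(shuffle(map_phase(logs)))
print(counts["sensor"])  # 2
print(counts["ok"])      # 2
```

In a real Hadoop cluster the map and reduce functions run in parallel across many nodes and the shuffle moves data over the network, but the three-phase shape is the same.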
In fact, I'll tell you one of the things we've done to make this something people can consume: we actually have an appliance that we created. It runs Cloudera on Intel, with YARN, which is sort of the evolution of MapReduce, using scale-out capabilities. You said YARN, right? YARN stands for Yet Another Resource Negotiator, for the geeks out there. But it uses scale-out capabilities. And it really takes Hadoop from batch, helps it get more real time, right? Right, and then we added Spark on top of that. Which is the in-memory piece. Exactly, and so now you can go do data analysis in that model without being a data scientist. For example, Spark has about 80 intrinsics built into it, so you can go use those built-in intrinsics. In fact, you can write your programs in Python, which, looking back, I never thought would be such an important language in the world, but it really is. But the idea of doing that in a way that you don't have to be a data scientist, that's one of the things we've done. And the idea is it takes the concepts of big data, of course the big data is there, and we have three sizes, small, medium, large, really creative sizes. I like it, like theCUBE. And it is really the merging of the things that we've been doing with high performance computing, the way that you connect things up, the things that we've done to speed up computing, and the emergence of this vast amount of data. And last question before I let Stu jump in here. What about the data aspects? I mean, traditionally high performance computing had its data types, versus what you're just describing with all this unstructured, sort of schema-on-read, or no schema-on-write. Yeah. Where are the similarities and where are the differences? Well, high performance computing has really dealt with big data for a long time, especially if you go look at the large models that exist. 
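The Spark layer Pike mentions exposes chained, in-memory transformations you can drive from Python without writing low-level MapReduce jobs. A rough sketch of that style, using plain Python to stand in for the distributed dataset (real code would use PySpark's RDD or DataFrame API; the sample readings and names are invented for illustration):

```python
from functools import reduce

# A toy in-memory dataset standing in for a distributed RDD or DataFrame
# of (sensor_id, temperature) pairs from trucks in the field.
readings = [("truck-1", 72), ("truck-2", 95), ("truck-1", 88), ("truck-3", 65)]

# Chained transformations in the Spark style:
# filter out normal readings (like rdd.filter(...)),
high = [r for r in readings if r[1] > 70]

# then aggregate per key (like reduceByKey(max)).
by_key = {}
for key, value in high:
    by_key[key] = max(by_key.get(key, value), value)

# Finally, find the overall hottest sensor (like a final reduce action).
peak = reduce(lambda a, b: a if a[1] >= b[1] else b, by_key.items())
print(peak)  # ('truck-2', 95)
```

The point Pike is making is that these high-level, built-in operators let an analyst express the whole pipeline in a few lines of Python, while the engine handles distributing the work and keeping intermediate results in memory.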
I was trying to think of an example of an extreme one, and I can't think of one right now. But if you look at modeling and simulation, specifically around things like jet engines, rocket engines, the automotive industry, what they do with their... Weather. Weather's a great one, right? Those have dealt with large amounts of data for a long time. The difference is the data that high performance computing has dealt with has been largely batch data, where you have your solid source of data, and you go out and you perform these big modeling runs on it, and you move the data to the places where you want to do that. That's a real similarity with where we go with big data. The difference is the data is much more fluid in the big data model than it has been in the past. So the idea of bringing big math to the data, which is something that has happened for years in HPC, that's what we're doing with Hadoop. We're bringing the code to the data, not the data to the code. Yeah, that's a pretty good way of describing it, yes. Okay, good. Stu? Yeah, Jimmy, I wonder if you can help connect the dots for us on what we were just talking about, from the application and from the infrastructure standpoint. I think we're in full agreement with what you were saying. We actually did a market definition and market forecast of what we call Server SAN, really a scale-out architecture where storage pulls back into the compute and allows for some of these environments. If I look at the state of the marketplace today, solutions like what you have with your partnership with Nutanix and others, those are going for virtualization and VDI. They're not running the kind of NoSQL and big data environments. Most of those today, it's a lot of DAS because it's a smaller environment. Some of it's going into the public cloud. 
Some of it is going to scale-out NAS-like architectures, and people are playing around with some of the new ones, but help connect the dots for us. I think the best thing to understand is that, especially in IT, one size never fits all, one technique never fits all. If I told you that we were going to converge to a single model, you should probably shoot me right there, because that's not the way it's going to happen. So if you look at what people are doing, there are companies out there that have basically built their business on doing their compute a certain way. There are companies that do VMware, the EVO:RAIL stuff. There are companies that like to do Microsoft, and you can go see the CPS platform. Nutanix. There are companies that want to have that model, and the idea of being able to provide the best value to the customer is what this is all about. Okay, so I guess from an application standpoint, the new analytics applications, where do you see those fitting into the solutions that you just mentioned? That's a really good question. I think it becomes really more about the business problem the customer is trying to solve. I think there are analytics pieces that are going to emerge, and you can almost look at them as sort of fitting the client's model. We have some business intelligence applications that we own as Dell now, and I just can't remember the name, but anyway, StatSoft is the company we're talking about. It came out of the process control arena, so it has the ability to do understanding and analytics of things that happen in the environment. StatSoft. StatSoft. Yeah, we had Matt Wolken on yesterday, I was looking at my notes. Oh yeah, that's good. That's good, I can't believe I couldn't remember that name. But yes, I think there are tools that we're going to provide in those environments. 
And then a lot of the stuff that goes around that is really going to deal with what the customer feels is the value he's looking for. So our intent is to try to provide the best value to the customer in the environment they're building. Yeah, so Jimmy, you're a Senior Fellow, so we'd love to ask about the research culture inside the company. Has the privatization of Dell over the last year had an impact down on the research side, or has that been going full steam? Yes, it has. I think it's easier for us to do things that we might not have been able to do in the past. We can do things that have a little longer time horizon on them. I think one of the things the financial community likes to see is things that make an impact on the next quarter's results, and that makes it extremely hard for you to do things that do not have an immediate return on investment. Some of the acquisitions that we made really fall in that category. You can look at some of the things that we've bought. I'll note SecureWorks is a success. Dell bought SecureWorks, and a lot of people in Dell probably didn't know what SecureWorks was and what value it was going to bring. That's one of our shining stars. Now, that occurred before we went private. But as far as being able to do investments like that, and being able to reach out and do things that might not necessarily have an immediate return on investment, it's much easier now. Jimmy, I wonder if you could talk to the notion of the secrecy, the cloak and dagger, of the HPC world historically, and compare that to what's going on in big data. You see some of that. What about the whole privacy aspect? How should people think about the notion of HPC coming to the mainstream along with big data? How does that affect privacy? Well, that's a good question. There are a lot of sociological changes that have happened. 
I have children, and their view of privacy is much different from my view. Now, as far as understanding where things are going with privacy, big data is going to happen. The collection of data is going to happen. One thing to remember is everything you put on the web is there, and it's going to last forever. So you really need to be cognizant of what you're putting out there for people to look at and understand, because you can't take it back. So Dell has a lot of security solutions, and I know it cuts across, it's one of your big software businesses. How do you as a technologist work to grab that technology, that IP, and get it into solutions? That's a really good question. I think that, and this is one of the other things that I think has happened since we have gone private, the relationship between my group, the Engineered Solutions Group, part of the Enterprise Solutions Group, and the Dell Software Group is much better, for whatever reason. I don't know if it happened because of this or evolved to that. So we have a close connection between John Swainson and his organization and the products that we're doing. You talked to Matt. We have these great products, SecureWorks, SonicWall, and being able to reach out and understand the direction those teams are going and how that relates to what we're doing, it's a much easier relationship for us to go understand that. So it really is incumbent on people like myself, the technologists, and John's teams to understand where we're going and where those intersections need to be, and make that happen. So how do you see this thing all coming together? The HPC world and the big data world, are they colliding? Are they on sort of parallel paths? Clearly we talked about the technologies coming together. 
Are the skill sets going to come together? There's a real lack of skill sets in the big data world. Are those two worlds going to come together, and are we going to see HPC, the whole technical computing thing, finally explode into the mainstream? So yes, I think we're going to see HPC explode into what you call the mainstream. But you really have to understand what segment you're talking about. Does HPC mean an astronomically large compute model that you're building, or is it one that is sized for the problem you're trying to solve? Here's a piece of data. In the US today, almost all of the real innovation comes from what we call tier-two and tier-three companies. They're small companies that go out and focus on solving a problem, and then they partner with the larger companies. Those companies really lack the compute infrastructure they could use. So one of the things we at Dell are trying to do is create models that not only fit the very, very large solutions, but are actually applicable at the departmental and divisional level, where the smaller companies can actually afford to go do them. And I actually did a talk on the impact of HPC on modeling and simulation and what it can save. If you really go understand what you can save if you model it and test it and then build it, as opposed to build it and test it, and build it and test it, I think the impact becomes clear. The electronics industry has done this for a long time. I started out doing ASICs a long, long time ago, a lot longer than I care to admit. And behavioral modeling before you actually build saved millions of dollars, because it avoided mask set after mask set after mask set. 
So if you look at the pace of innovation, the way the industry is changing, and the drive that people have for results, putting those compute resources in the hands of the smaller guys is one of the things we absolutely want to go do, and we think it's important. And so we don't necessarily need to think about multi-million-dollar or tens-of-millions-of-dollars supercomputers. You're talking about bringing this technology to the masses. Yeah, so 16-node clusters, 32-node clusters, 10,000-node clusters. They don't all have to be the same. Now, I'll give you a couple of examples of what we've done at Dell. We've been a partner with the University of Texas for a long time, their Texas Advanced Computing Center. There are two projects that we should talk about. One of them is called Stampede. Stampede is a cluster that we partnered with them to put together. It's about a petaflop cluster, and it's number seven on Top500.org. We have a similar situation. Oh, and there's actually another cluster at TACC called Wrangler, which is the data side. And there's a companion at the San Diego Supercomputer Center called Comet. I think there was some conversation about it earlier. So those are three examples. Now, you asked about the collision of big data and HPC. High performance computing has developed a lot of things that really have spurred it along. Probably one of the best-known file systems is Lustre, right? That's been there, and it's kind of the mainstay, you might say. The big data side, especially the MapReduce side, Hadoop, HDFS, the models that have emerged there, those are not the same. So I think there is going to be some crossover or convergence between the two, but I don't think it's a light-switch thing. I think people are going to start asking questions about how you do HDFS, or how you do Hadoop, on things like Lustre, and putting those together. 
And I think you'll see convergence from that perspective. Now, the big data guys are going to continue to do big data, because that's what they do well. The big compute guys are going to continue to do big compute, because that's what they do well. The big data guys would probably be angry at me for saying this, but maybe they don't do compute as well as the compute guys do. And the big compute guys, maybe they don't do data as well as the big data guys. So I think there is an opportunity for things to merge together and for us to create some synergy, but they are going to run on kind of independent paths. But they can learn from each other. They can learn from each other, and grab techniques and designs. One more comment. The most critical resource is people, especially in the high performance computing space, the guys who can create those algorithms. People who can understand the problem you're trying to solve and create the algorithms that you then go implement to work the problem. That is the most scarce asset, certainly in the HPC space today. So I encourage you, every time you talk to people, especially in academia: we really need to be creating models that allow those capabilities and those talents to develop. And we're way short of where we need to be. Jimmy Pike, we have to leave it there. Thanks so much for coming on. It was really great. Thank you. Pleasure to meet you. All right, keep it right there. Everybody, Stu and I will be back right after this word.