Announcer: Live from the San Jose McEnery Convention Center, it's theCUBE at Open Compute Project U.S. Summit 2015.

Jeff Frick: Hey, welcome back. I'm Jeff Frick here, and you're watching theCUBE. We're at Open Compute Project Summit 2015. This is the sixth OCP Summit. We were here last year; it's, I don't know, about 3,000 people. It's really been a small, intimate affair, and in fact Frank said they don't want it to get much bigger. So, really excited to have our next guest, Thomas Sohmers, CEO of Rex Computing. We had Thomas on last year. Welcome back.

Thomas Sohmers: Thanks for having me.

Jeff Frick: Last year when you were on, we talked a lot about your story: Thiel Fellow, probably the youngest CUBE alum that we have. So congratulations on that, and we decided to get you back on for an update. What's going on with Rex Computing? What's happened since we saw you about a year ago?

Thomas Sohmers: It was January of last year, at the last summit, so about 14 months, but a lot has happened. We've moved away from just focusing on the board level and integrating other people's chips, and instead we're actually building our own processor architecture. We're looking at being able to tape out those chips and have the world's most power-efficient processor.

Jeff Frick: So back up a step. What's your core focus? You're looking at big iron, big computing power.

Thomas Sohmers: When we started Rex Computing, the goal was to build the most power-efficient supercomputers in the world. We thought at that point it might be possible to use other people's processors, and the system we showed last year at Open Compute was getting there. But we realized that there are a lot of fundamental issues with how processors are currently designed and built, and we were crazy enough to say, well, maybe we could do a bit better.

Jeff Frick: Right. So it's interesting how the power discussion has evolved as these data centers have gotten bigger and bigger and power has become a really big issue. But you would think for a supercomputer that wouldn't necessarily be the case, because it's a dedicated unit for really big problems. Is power such a big deal? Why is power a big deal in supercomputing?

Thomas Sohmers: The biggest single user of supercomputing is the United States Department of Energy. The Department of Energy is tasked with maintaining our nuclear stockpile; most people don't realize that's the DOE's job. They do weapons testing through simulation, in addition to making sure that the current warheads are still safe. That's one of their big jobs, under the National Nuclear Security Administration, the NNSA portion of the DOE. They run a number of the national labs, and they have some of the world's most powerful computers. But for the entire Department of Energy, the power cap set by executive order from the president is 20 megawatts. So they have these huge tasks that need to be accomplished, and they have this 20-megawatt budget. Right now their biggest computer, which is number two in the world, is called Titan, and it delivers about 17 petaflops of sustained computing power. The Department of Energy's goal is to get that up to one exaflop, which is a thousand petaflops, still within that 20-megawatt budget. So while they're currently at about three to four gigaflops per watt in terms of energy efficiency, they need to be at 50 for exascale to even be feasible.
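A quick back-of-the-envelope check of those figures, sketched here in Python purely for illustration (the exaflop target, the 20-megawatt cap, and the 3-to-4-gigaflops-per-watt baseline all come from the interview; the code itself does not): one exaflop inside 20 megawatts works out to exactly 50 gigaflops per watt, better than an order of magnitude beyond today's efficiency.

```python
# Back-of-the-envelope exascale math, using the figures from the interview.
exaflop = 1e18          # 1 exaflop = 1,000 petaflops, in flops per second
power_cap_watts = 20e6  # the 20-megawatt executive-order budget

required_gflops_per_watt = exaflop / power_cap_watts / 1e9
print(required_gflops_per_watt)  # 50.0

current_gflops_per_watt = 3.5    # midpoint of the "three to four" cited above
print(required_gflops_per_watt / current_gflops_per_watt)  # ~14.3x gap
```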
Jeff Frick: So they need over an order of magnitude improvement in power efficiency to even think this is something they can reasonably put together. And there's actually a presidential directive limiting the amount of power they can consume?

Thomas Sohmers: Yes, and that's the 20-megawatt limit.

Jeff Frick: So talk a little bit about the evolution of supercomputing and high-performance computing, from what it used to be (remember the good old days, the pretty Cray with the little bench around the outside of the machine, really dedicated, purpose-specific devices) versus using cloud and x86 architecture at massive scale, especially from AWS or anywhere you can basically rent massive capacity. How is that changing the uses of supercomputing, and then, the thing you're really focused on, power consumption in supercomputing?

Thomas Sohmers: When most people think of supercomputers, they think of the Cray-1, the system you just described. Seymour Cray is thought of as the founder of supercomputing, but the Cray-1 wasn't actually his first system. What most people consider the first supercomputer, which held the performance record pretty much throughout the 1960s, before he went off and founded Cray Research, was the CDC 6600. That was from 1964, and in most people's opinion it was the first real supercomputer. It was a single machine with one processor, and it was aimed at doing things as fast as possible. When he moved to the Cray-1 and started his own company, the whole idea was vectorization: instead of taking each number and doing an operation on it individually, you take a very large dataset and put it into a vector, or an array. I'm not sure if you remember math class, but...

Jeff Frick: Yeah, I don't remember the vector part, but keep going. We've got a really smart audience, so they're hanging with you.

Thomas Sohmers: To explain the basics of it: you have an array of all the numbers you're operating on, and what a vector processor like the Cray-1 did was take multiple of these vectors and multiply a huge set of these numbers all at one time. It was a massive improvement over the scalar processors of the time. Back then, 1975-ish with the Cray-1, we were down to 10 to 15 micron transistors and still in the low megahertz for most things. So what most supercomputing companies, including Cray, did was keep cranking up the clock speed, and eventually they got the transistors smaller, which helped the clock speed problem. They kept doing that for a while, until they realized they weren't getting the same returns from just making things smaller. So then they started building more distributed systems. That was the Cray-2: you had multiple processors. They were still vector processors, but the idea was to share the computation and spread it across different pieces. Then Cray made the Cray-3 and Cray-4, where his idea, almost 25 or 30 years ago, was that we should move to something non-silicon. He went with gallium arsenide, which was a really crazy proposition at the time. It wasn't successful, and that was when Cray started to diminish.

Jeff Frick: And that was the end of the Cray run, yeah.
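To make the vectorization idea concrete, here is a minimal sketch in Python with NumPy (the language and library are my own choice for illustration; only the concept comes from the interview). The scalar loop handles one number per step, the way a scalar processor would, while the vector expression applies the same multiply across whole arrays at once, which is the style of work a vector machine like the Cray-1 was built for.

```python
import numpy as np

a = np.arange(100_000, dtype=np.float64)
b = np.arange(100_000, dtype=np.float64)

# Scalar style: one multiply per loop iteration, like a scalar processor.
out_scalar = np.empty_like(a)
for i in range(len(a)):
    out_scalar[i] = a[i] * b[i]

# Vector style: one expression over the whole array; NumPy dispatches to
# compiled loops (and SIMD hardware), echoing the vector-processor idea.
out_vector = a * b

assert np.allclose(out_scalar, out_vector)
```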
Thomas Sohmers: At that point, a bit later on, in the mid-1990s, the whole idea of cluster computing started coming about, where you could take a bunch of off-the-shelf processors that were typically reserved for personal computers, put them together, and distribute the work. IBM and Cray were calling that weak scaling. The whole IBM line, what's now known as FUD, fear, uncertainty, and doubt, was: we have our big iron systems over here, with a big, beefy processor. They were selling those systems for ridiculous amounts of money, while these new systems focused on having many processors and doing parallelization instead of vectorization. It turns out that in terms of cost per flop, the cost of doing the actual job, it made a lot more sense to go with parallelization, and that's what won out. But the really interesting thing, getting to today, is that we're sort of going back to vectorization, because today's GPUs are vector processors.

Jeff Frick: Oh, that's right, all the Nvidia GPUs and that type of stuff.

Thomas Sohmers: Exactly, and the only reason they're successful right now is that Nvidia was able to take the old idea of a vector processor and put it in a package that appealed to gamers, so they had huge volume and could drive the cost down. Now they're bringing that old technology back to big iron HPC systems and finding some success. But it's my theory that it won't last long.

Jeff Frick: Because of the processor, or just because it's not a purpose-built processor? I mean, where do people put their money? We just covered the IBM System z event earlier this year, and they're basically repositioning it as an integrated system around the modern workloads of cloud and mobile and social. Then you've got the massive rise of hordes of x86s in these huge data centers, and the two coexist. Are they workload-specific, and is that why one is better than the other? What do you think? And where does your piece fit?

Thomas Sohmers: Yeah, where does our piece fit. I'm not really on the business side of computing, but specifically for IBM, their mainframes (not workstations) are pretty different from supercomputers. They're really focused on getting as many IOPS as possible out of the system, not on raw computation. I think IBM is in its own little world over there: the big banks and others have been using IBM systems for going on 50 years.

Jeff Frick: Sure.

Thomas Sohmers: So there's a small market there, but it's not something I really focus on. When it comes to real high-performance computing, where you're focused on compute, the x86 world and these general-purpose processors all share the same problem. When I say general purpose, I don't mean able to do any computation; I mean the sense of what's currently in your laptop and in your phone. A lot of people try to claim that ARM is very different from x86, but they're basically the same. In my mind, coming from computer architecture, the way these x86 processors, ARM processors, and really anything used today, including GPUs, are built is for an old paradigm.
They're built around the constraints that existed when these things were initially designed, 30 or 40 years ago. Those constraints aren't the same today, and we have different problems, which I can talk about as well.

Jeff Frick: Right. Well, it begs the question: how do you define a supercomputing problem differently from a high-performance computing problem, or a mainframe, or a sea of Amazon servers?

Thomas Sohmers: I would say HPC is mostly defined by the fact that you have to do a specific operation very quickly. I think of HPC as actually very similar to embedded, in the sense that with an embedded system you have a lot of constraints on power and size, and it's really meant to do one specific task. HPC is basically the room-size or warehouse-size version of that. From the problem-solving side, both the software and the electrical and mechanical engineering, they face very similar problems and have similar solutions. The difference with the mainframe, like I was saying, is that it's focused on IO. Amazon, I think, is focused on running many different tasks and spreading the compute across different machines dynamically, while the things developed for big iron HPC eventually flow down into Amazon and those very large distributed systems, but that takes time. I think we're facing a huge problem in HPC, the top 1% of computing, right now; it will start affecting the Amazons more and more in the three-to-five-year timeframe, and then affect all of computing five years out.

Jeff Frick: So that's why you're focusing on HPC: the technology comes downstream, is what you're saying.

Thomas Sohmers: Exactly. In my mind, by focusing on the problems of the top 1% of computers right now, we're going to affect the design and development of all computers in the future.

Jeff Frick: Right. So could you describe some of the classic use cases for HPC? We already described the one at the Department of Energy, making sure the nuclear warheads are all safe and sound, which is very important. And we always hear about modeling the atmosphere, these big, massive models. What are some of the entry points where you see that computing horsepower coming downstream into applications where it doesn't exist today?

Thomas Sohmers: The thing that excites me the most ties into what I was just saying about how close embedded and supercomputing are in problem set. I think they're actually going to get closer, with embedded supercomputing: if we have self-driving cars, or put this in UAVs, you're very constrained on the actual power budget and cost, but you still need a lot of compute power on board. And while the cloud is developing, we're still pretty bound by latency and bandwidth.

Jeff Frick: The speed of light is fixed, right?

Thomas Sohmers: Yeah, exactly. If someone finds a new domain of physics and breaks that, then they win. But since I like obeying the laws of physics, I'm going to focus on trying to fix it at the compute level.

Jeff Frick: So you really think you'll be able to take (you, the industry) the lessons from HPC down into small pieces that can conceptually run a drone, run a self-driving car?

Thomas Sohmers: Yeah, because the root problems that affect HPC really do affect everything else.
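To put the speed-of-light point in rough numbers, here is a hypothetical sketch (the 1,500 km distance is an assumption of mine, not a figure from the interview): even over a perfect network, physics sets a floor on the round trip to a remote data center, which is why a vehicle's control loop can't simply live in the cloud.

```python
# Hypothetical illustration: the physics floor on cloud round-trip latency.
SPEED_OF_LIGHT_M_S = 299_792_458.0

def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round trip, ignoring routing, switching, and fiber slowdown."""
    return 2 * distance_km * 1_000 / SPEED_OF_LIGHT_M_S * 1_000

# Assumed example: a vehicle 1,500 km from its nearest cloud region.
print(min_round_trip_ms(1_500))  # ~10.0 ms before any real-world overhead
```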
Thomas Sohmers: What most people don't realize is that doing the actual computation isn't the power-intensive part. That was true 30 or 40 years ago, when these architectures were designed. To give an example: a 64-bit double-precision floating-point operation, which is about the most power-intensive computation itself, takes 100 picojoules of energy. That's a very small amount, though in a very large system it adds up. But actually moving those 64 bits of data from your DRAM to the registers of your CPU, so that it can operate on them, takes 4,200 picojoules. So it takes 42 times more energy to move the data to your processor than to actually operate on it. (A quick sketch of these figures follows at the end of this segment.)

Jeff Frick: It's the age-old problem, right? Do you bring the data to the processing or the processing to the data? But you're saying even within the chip you've still got that same basic issue.

Thomas Sohmers: Yeah, and we have to improve that by over a factor of 10 to make any of these very large-scale systems able to function. The problem with current architectures is that they don't accept that fact. All of these things were designed when it was acceptable to have a lot of data movement, but data movement today is actually what costs you the most.

Jeff Frick: Right, right, and there's so much more data in a useful application.

Thomas Sohmers: Exactly.

Jeff Frick: All right, well, Thomas, we're getting close on time, so I want to give you the last word. Give us a quick update on the company, where you guys are, and what we'll be talking about if we see you at OCP Summit 2016.

Thomas Sohmers: We're developing a new processor architecture, a new instruction set, and a new core design based on this concept that memory movement is expensive and the actual computation is much, much cheaper. It's not some crazy non-von Neumann design or anything like that; we're sticking to basic computing roots, so it's relatively simple to program. We're developing a 256-core version of this processor, and we're hoping to have the chip taped out in the next 12 to 18 months. So potentially we'll have something next year. I'm not guaranteeing actual silicon, but shortly afterwards, hopefully. The other big thing related to Open Compute is that I'm the co-chair of the Open Compute high-performance computing project, which we started up this past year.

Jeff Frick: Are you the one they talked about in the keynote, who asked why there was no HPC project and was told, go ahead and start it?

Thomas Sohmers: Yeah. I won't take all the credit for it; it's been a collaborative effort. But yeah, I did talk to Frank about that last year.

Jeff Frick: Yeah, good for you.

Thomas Sohmers: The part of the Open Compute HPC project I'm focusing on is open silicon. I really believe that closed, proprietary x86 processors and GPUs just aren't going to work in this community, because what OCP has done so far is open up everything above the processor. I think there's a good case, and a lot of reasons, for opening up things like the instruction set and the actual interconnects between chips, because if you can build systems with those things in mind, you get a much more efficient overall compute platform.

Jeff Frick: Right, right, interesting. Well, Thomas, again, thanks for stopping by. Thomas Sohmers, CEO of Rex Computing.
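As flagged above, here is a quick side-by-side of the data-movement figures quoted in the interview (only the two picojoule numbers come from the conversation; the framing in the comments is mine):

```python
# Energy per 64-bit operation, using the figures quoted in the interview.
FLOP_PJ = 100         # one 64-bit double-precision floating-point operation
DRAM_MOVE_PJ = 4_200  # moving those 64 bits from DRAM into CPU registers

print(DRAM_MOVE_PJ / FLOP_PJ)  # 42.0: movement costs 42x the math itself

# If every operand had to travel from DRAM, the arithmetic would be a
# rounding error in the energy budget; improving data locality, not building
# faster ALUs, is where the factor-of-10 efficiency gain has to come from.
total_pj = FLOP_PJ + DRAM_MOVE_PJ
print(FLOP_PJ / total_pj)  # ~0.023: under 3% of the energy does the math
```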
Jeff Frick: Check him out: Thiel Fellow, for sure the youngest CUBE alum, and I say that with great respect and admiration for what you're doing. I'm Jeff Frick, and we're at the Open Compute Project Summit; it's number six, it's 2015. We'll be here all day with wall-to-wall coverage. Open source meets hardware infrastructure, brought to you originally by Facebook. It's an interesting story, it's only going to get bigger, and we're excited to be here. I'm Jeff Frick, you're watching theCUBE. We'll be back with our next guest after this short break.