Welcome to another edition of RCE. Again, this is Brock Palen. You can find us online, with our complete back catalog, at rce-cast.com. You can also find links to Jeff's and my Twitter accounts, our blogs, and all that usual fun stuff. Also, I have again Jeff Squyres from Cisco Systems, one of the authors of Open MPI. Jeff, thanks a lot for your time.

Hey, Brock. How you doing?

Good. We're just post-Supercomputing and post the American Thanksgiving holiday, so we're back into the swing again, recording some more podcasts.

Yeah. And one more comment: I also need to apologize. I dropped into your Open MPI State of the Union at Supercomputing for like five minutes and then turned around and left to go to something else.

Yeah, I didn't want to say anything, but I noticed that. I was like, oh, there's Brock. Oh, there goes Brock.

Yeah. Well, you were nice enough to put your slide deck out on the mailing list, so that's available out there so everyone can get their Open MPI updates and what's coming in the next version.

Sweet.

Okay, so let's go to our guest for today. Our guest today is Thomas Sohmers, and he's going to speak to us about Open Compute. So, Thomas, why don't you take a moment to introduce yourself?

Yeah, I'm Thomas Sohmers, founder and CEO of REX Computing, and we're developing new HPC-focused processor solutions. As part of Open Compute, we started the HPC group back in July, and I'm project co-lead along with Devashish Paul from Integrated Device Technology. We're trying to push HPC to be more open, and kind of bring it into the more data-center- and server-focused Open Compute Project area.

Okay, can you give us a little bit of background on what Open Compute is? Most people, if they have heard of it, probably associate it with Facebook or somebody like that, not somebody you normally think of as being high-performance computing. So, can you give us a little bit of background?
Yeah, so Open Compute was originally started by Facebook, and the general premise is this: Facebook donated their internal custom server designs, which includes the mechanical side (the racks, the blades, how they're all connected together) as well as new motherboard designs and everything. They started the foundation, contributed those designs to it, and initially got a group together from Intel, Goldman Sachs, Rackspace, and a couple of other companies. They came together and said it's better for us, the consumers of technology, to actually define the things we need to have, and if we design our own stuff internally, to make it available so that everyone can use it as a base and contribute to making those things better. So the real goal, initially and up to now, has been to make more efficient data centers and server systems, and in the HPC world we think that's a pretty good idea, something that would be very beneficial for HPC users and HPC system manufacturers, and we want to bring that to the group.

So you just said how it started and what some of its initial goals are, and some of those you said are ongoing goals, but really, if you could put it in these terms: what is the ten-year goal? What is the five-year goal of OCP? I mean, what do you see down the road beyond just making hardware designs available?

Yeah, so up until today, I would say that has been the primary goal. The base principle for the long-term viability of it is that these open designs would cover enough of the different parts of the data center that if some complete new startup wants to build their own data center, they could take all of the pieces from Open Compute and go to some contract manufacturer. Or rather, the idea is that the contract manufacturers would already be making Open Compute-specific designs, and those would be cheaper than the proprietary HP or Dell or whatever systems out there.
And that startup would just be able to take all open standard items, put them together, and, based on the actual specifications, know exactly how everything goes together and be sure that it works. And then if they have some new innovation, if they've been able to improve something for whatever specific tasks they're doing, they can contribute that back to the community so that others can also benefit from it. Now, the longer-term goal, which is actually now starting to develop with the HPC group: most of what Open Compute has done up to today has been related to the mechanicals, with the lowest level being the Gerbers for the actual motherboards. What we're trying to do with HPC is bring that deeper, into new networking at the silicon level. Up to now, networking has meant taking whatever is off the shelf, 10 gig, 40 gig, whatever is out there, and throwing that on a board, while with the HPC group, because of HPC's demands for low latency and high bandwidth, we want to cover all of the areas that Open Compute does right now while focusing on the specific needs of HPC, and instead of just doing the mechanical layers, we want to go all the way down to silicon.

So you've kind of touched on them a little bit, because I have a little bit of familiarity with OCP, but what are the big components of Open Compute that make up the stack, and which ones are especially different from what, say, the commodity world uses right now?

Yeah, so with Open Compute, it's not just one set standard. The Facebook standard is called Open Rack, and Microsoft back in January donated their own version, which is very different; they updated it this past October. But there's really no one set standard. The idea is that the project allows anyone to contribute anything, and people can pick from that what they want to have.
So the major groups that are part of Open Compute are storage, networking, server design, Open Rack, certification, hardware management, and the data center groups, and each one is focused on accepting and reviewing submissions for its specific area.

Okay, so I actually got to see one vendor's implementation of an OCP server and rack, and one interesting thing was that the rack was a different size and it used bus bar DC power in the back. Is that a standard thing, or is that just what part of the community is looking at?

I would say that's the majority. That's the Open Rack design that Facebook initially contributed, and for what we're doing with the HPC group, we're taking it primarily as a base. The one thing I want to point out, though, is that it's not a restriction; there are other contributions of different designs for racks and blades and such. But yeah, the basics are that the Open Rack rack is 21 inches wide, compared to your standard 19 inch, and it has AC power dumped into it and three different bus bars in the back to distribute power to all of the blades.

So let me ask a dumb question. I'm sorry, I'm a software guy. What's a bus bar?

Yeah, so a bus bar is just a metal bar in the back of the rack; when you slide a blade in, its actual power connectors connect to the bar, and that's what distributes power. One of the neat designs with Open Rack is that for the full-size unit you have three of what are called power zones in the actual spec. And those three power zones are each, according to the original Facebook contribution... one sec, I have to actually pull up the spec. I believe it's 4,200 watts. Yeah, 4,200 watts for each power supply. So instead of each server blade having its own power supply, the idea is to have that distributed across the rack. And so every single blade is receiving DC instead of AC.
And so the actual conversion from AC to DC only happens three times in the entire rack.

And is there, again, software guy, very little knowledge about these things, is there a benefit to that? Limiting the AC-to-DC conversion, and how does that factor in with getting DC power into your server, into your data center, as opposed to AC? I mean, is that difficult? Is that cheaper? Is that better somehow?

Well, you still have the same AC in your data center, because the actual connections to the rack itself are AC, and then that AC power goes into the power supplies inside the rack, which do the conversion. The advantage of having DC throughout at the rack level is that the conversion only happens three times, in this case. Most power supplies are around 80 to 90% efficient, and when Facebook was designing this, they really wanted to have the best power supply design, so theirs are at almost 95% efficiency at the target load level for the actual units. But the basic idea is that instead of having that 15 to 20% power loss every time you go through a power supply, which in a normal system would be happening at every single blade, the OCP model has that only happen once per power zone, with the power then distributed to all of the blades in that zone. And there are three of those in the standard Open Rack model.

Okay, all right, so that makes sense. And another improvement, or at least specification, that you mentioned was that the rack is wider, 21 inches versus 19 inches. What was the rationale slash benefit of that?

I think it was really just to be able to fit more in one rack. Area is relatively cheap, so expanding the actual rack a couple of inches doesn't really hurt you in terms of rack density, but it does help a lot in terms of how many actual components you can fit in there.
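The per-rack savings being described can be sketched with a quick back-of-the-envelope calculation. The efficiency figures below are the ones mentioned in the conversation (roughly 85% for conventional per-blade supplies, roughly 95% for the shared Open Rack supplies); the blade count and per-blade load are purely illustrative assumptions, not numbers from the spec.

```python
# Back-of-the-envelope comparison of AC->DC conversion losses.
# Efficiencies are the approximate figures from the conversation;
# blade count and per-blade DC load are made-up illustrative numbers.

def grid_draw(dc_load_watts, psu_efficiency):
    """AC power drawn from the grid to deliver a given DC load."""
    return dc_load_watts / psu_efficiency

blades = 30           # hypothetical blades per rack
dc_per_blade = 400.0  # hypothetical DC load per blade, in watts

# Traditional model: one ~85%-efficient power supply in every blade.
per_blade_psus = blades * grid_draw(dc_per_blade, 0.85)

# Open Rack model: shared ~95%-efficient supplies feed DC bus bars,
# so the conversion happens only once per power zone (3 per rack).
shared_psus = grid_draw(blades * dc_per_blade, 0.95)

watts_saved = per_blade_psus - shared_psus

print(f"per-blade PSUs: {per_blade_psus:.0f} W from the grid")
print(f"shared rack PSUs: {shared_psus:.0f} W from the grid")
print(f"saved per rack: {watts_saved:.0f} W")
```

With these assumed numbers the shared-supply model cuts the rack's grid draw by roughly 10%, which is the motivation for doing the conversion once per power zone instead of once per blade.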
And it's also easily divisible: as part of that, Facebook also contributed their server designs, the first generation called Windmill, and now there's Winterfell, which divides the blade space up into three seven-inch-wide components.

So in a lot of cases in my 19-inch racks, we're putting two nodes in those 19 inches, and you're saying at 21 you're putting three. So you're actually saving space.

Correct.

Okay. So we've been talking about all these things, higher density, higher power efficiency, and these all sound like things we're concerned about when we design a new HPC system, because we tend to be space constrained, power constrained, things like that. So what else about OCP do you think the HPC community is specifically interested in?

Yeah. So I think that what has been contributed to the project so far is a good starting point when it comes to the mechanicals. I am a fan of the Open Rack design; I think that just having the bus bar is a nice improvement. And one of the things that we want to do in the HPC group with Open Rack specifically is integrating networking into the rack itself: basically having dedicated networking spaces within the rack so you can connect nodes, as we're assuming you're going to be running this as a giant MPI cluster, for instance. But as for the rest of the project, we are not trying to step on the toes of the other groups, and we really want to focus on both optimizing existing designs from Open Compute and accepting new ones that are focused on low-power, high-performance, fast systems.

Okay, so let's go into this a little bit more, because like we said at the beginning here, most people think Open Compute, they think Facebook and other data-centric applications and whatnot, and you're doing HPC-specific designs. Give us some of the differences between some of the other designs that have been submitted for OCP and what you are focusing on.
And you just touched on a couple of things, right? So fast networks, obviously, and assumedly fast processors, but you also said low power, which sometimes contradicts HPC requirements. So give us some of the high points of what your HPC-focused designs are looking at.

Yeah, so I think for most of the HPC community, as we always talk about, there's this huge, almost insurmountable problem of getting to exascale, and that is primarily a power constraint. In the top 10 systems, power is almost the number one constraint there. And I think that's going to be felt by more and more systems as we get closer to a petaflop machine being your average top 500 HPC platform, just as a teraflop and tens of teraflops are the norm now. But in terms of what we want to do specifically with the group, we have multiple phases set up as part of our charter. This first phase, which we're just starting now, is taking the existing OCP designs and reformatting them a bit to better suit the requirements of HPC. We're starting with that because it's the most minimal amount of work, and that's the plan for the next six months to a year, in which those designs would actually be finalized; in fact, we're planning on having those submissions in at the beginning of next year. But in the longer term, we think the big differentiating feature of the HPC group, in terms of completely new contributions, will be things at the silicon level, which none of the other groups are touching.

Ooh, at the silicon level. Okay, can you expand on that? What are you making?

Yeah, so part of it is, I guess, my own company; we're a fabless semiconductor company, so that's part of our goals. But at the group level, we believe that the existing semiconductor providers that are part of OCP don't necessarily target their silicon for HPC specifically.
And so what we want to do is have a set of designs that are usable by other semiconductor companies as well: specific IP specs for network-on-chips or chip-to-chip interconnects and things like that, to make it easier to actually merge all of these different components. And then from my own company's perspective, the crazy thing that we want to do is actually have an open instruction set architecture as part of the group that is targeted for HPC. And as the group continues, we hope to contribute things at the RTL level as well.

So are you talking about both new CPUs and a new type of networking? So not your traditional x86-based CPUs from large companies that rhyme with Schmintel and Schmay-MD?

Yes, so that's my own company doing that, and we have a couple of others that are interested on the processor side. But the main silicon contributions, or the bulk of them, will mostly be in the networking area: on-chip, chip-to-chip, and more targeted node-to-node silicon.

Fascinating. So what kind of network is it going to be? Is it going to be Ethernet? Is it going to be InfiniBand, or something new?

Well, we're open to contributions from anyone, but in terms of what we're looking at right now, it's primarily things from RapidIO. And then in the future, and this is three, four years off, part of our charter actually talks about the future of silicon photonics and how, as that's a new and emerging area, making sure that we have open specifications for it is an important piece for the future. I guess one other point with that is that the company that rhymes with Schmintel has contributed their own photonic interconnect stuff to the networking group, but that currently only relates to their specific products; it has to connect into their processors. And so with the HPC group, we would obviously love to have contributions from anyone.
We do want to focus on having these systems be interoperable, because in the future of HPC data centers we think that interoperability will be very important: you don't want to be stuck with one specific manufacturer as you upgrade your system over its lifetime.

So you keep saying, you know, contribution, participation, stuff like that. What is the actual process? I mean, your company is one of these contributors. What's the actual process for the working group, contributing stuff, and getting it into the actual repositories of the project?

Yeah, so anyone can become a member of Open Compute, either as an individual or as a company, and it's free. There are tiered paid levels; you can reach one of those tiers either by contributing time and a certain number of specifications or by paying for it, but without that you're only limited in not being able to become part of the Incubation Committee, which is the kind of overseer of all the project groups. To become a project lead such as myself, unlike in a lot of other standards bodies, you don't have to pay anything. As for the actual process: if you're a company that has a new specification, you would join Open Compute as an individual or a company, and you would sign the contributor license agreement, which is basically just saying that you're not going to sue other people for using your spec. You choose what license you want to submit under; for hardware, there are two, the reciprocal and the permissive license. Basically, you can think of the reciprocal as being more like the GPL v2/v3, and the permissive as being more like the BSD license. And you submit that to the project lead for whatever group you think is most relevant. So if it's HPC, you would send that to Devashish or myself.
We would review it, and if we think it's something that's good for the group (our basic checklist being that you're actually contributing a full specification for something; it doesn't need to be a full blade or anything, it can be one very specific component, but you have to be open about how things plug into it, so that others can look at the spec and make something that plugs nicely into it), most likely we'll consider it a good submission. Then we'll put it out on the mailing list where people can also see it, and we would submit it to the Incubation Committee, which reviews both our project lead comments and the initial submission; whoever submitted it presents it to the Incubation Committee, and they have the final approval.

So you were talking about silicon before, and you also mentioned making sure contributors are actually contributing a full spec. When you're looking at something like silicon, do they actually have to contribute the design of the chip, or can they contribute just the interfaces to a chip? Say I'm a network vendor and I say, okay, I'm going to contribute to OCP, and basically there's no cost to licensing the developer capability, the API-for-hardware kind of thing. But how do you deal with the proprietary insides of the chip, which people normally hold pretty closely, while having this interoperability around the outside?

Yeah, we're expecting proprietary insides to be the norm. The main idea is that we want them to be able to say, outside of the NDA world: here's our chip, and we have RapidIO, we have Ethernet, we have InfiniBand on this chip; here are the connections for that, here's, you know, all of our errata and other things, and here's how to place it on a board. That's kind of how we see initial submissions going.
Hopefully, as we get further into the future and there are more submissions on the component side, there will be Open Compute-specific network-on-chips or chip-to-chip interconnects or something along those lines.

Okay, so this is a fascinating blend of proprietary and openness. How do you envision this playing out, particularly when you said at the beginning that your goal was that you could go to a contract fab or manufacturer or something and say, hey, build me this, but parts of it are going to be proprietary? How would they be able to build those proprietary parts in conjunction with the open rest of the system, or did I misunderstand that?

Yeah, so when I was saying that previously, that was more in relation to the existing OCP designs for the mechanical and board-level systems, not including the processors or that silicon. With the HPC group and us trying to open up the silicon, we think that's more of a long-term thing, and fabrication costs are expensive. So even if someone wanted to do that, most wouldn't want to pay the upfront cost to go to a fab, and that really means going to TSMC or GlobalFoundries or one of those; even if the entire chip were open, I doubt that many people would go out and have it fabbed on a whim. So I still think that the actual semiconductor companies themselves are going to be very involved in making those in the future, and they'll be selling them. The point of it being open is that at the initial design level, an engineer can see those specs, see exactly how a part should be placed in the system, and design around it without the constraints of NDAs and having to go and talk to every single sales team from all these different companies; there's one place where you can go and see things that are all supposed to be interoperable.

So what has your personal experience been like working on open source hardware?
Yeah, so I started REX Computing a little over a year ago, and while we kind of had the goal of being a semiconductor company, initially the idea was just to be a box manufacturer taking off-the-shelf components. But then back in March of this year, when we were looking out there at the specific area that we want to target, we didn't see any chips that really met our needs. So we started designing our own processor architecture, and by August or so we had some stuff working on FPGAs, and we were trying to think: okay, what's our actual business plan? We have some cool stuff running on FPGAs with some nice demos, but it doesn't really mean much if you can't sell it or raise money or anything. That was when I was getting involved with Open Compute and had the kind of crazy idea of: well, if we're going up against Intel and these other giant behemoth companies that have the established install base, and we're coming in with this new architecture where you have to recompile your software, how could we ever break into the market? And what we came up with is: why don't we make it so that anyone can make chips compatible with ours? Because if we donate the instruction set architecture and we have, quote unquote, competitors making competing chips, that would actually be a good thing, because if there are other people evangelizing the base architecture that we all share and contributing to the software ecosystem, that is really just pushing the architecture forward. This is similar, I think, to how Linux picked up the Unix ideas from the 80s and, in the late 90s through the 2000s, became an actual real push against Microsoft. And you didn't see that when every single company had their own Unix.
Back in the 80s and 90s, Sun, HP, DEC, everyone had their own Unix, and they weren't really compatible with each other; none of them gained any significant market share, and they were just all fighting to the death. But as soon as Linux came about and became sufficiently stable, it became a kind of unifying place where everyone could go, use it, and share in the spoils, and each company would then differentiate in their specific areas, and that's how they would profit.

Okay, so you just described the process for getting a spec in there. Clarification question on that: is there discussion about these submissions? I mean, is there negotiation, or, well, you said you consider yourselves a standards body. So when somebody submits a spec, is there discussion about that spec, or is it more like, here's my spec, enjoy, you either publish it or not, accept it or reject it? Is there more to it than that?

Yeah, so the points I didn't emphasize previously: when it's initially submitted to the project leads, we talk with whoever submitted it, and if we have any questions, we hope to get answers from them, and once our questions are sufficiently satisfied, we put it out to the group mailing list. So the involvement place for everyone in the group is the mailing list, which anyone can join, and that's where we discuss submissions and our general plans. We also have bi-weekly calls where we go over those submissions.

So you're a fabless design company, and your company specifically focuses on certain pieces of the stack. What is the process you see for most OCP vendors? Like, if I wanted to go buy one of these things, do you see them building all of the pieces based on the specs? Are they sourcing certain parts from others? What do you see being the norm?
So I think the norm, if you're building a system, will be a mix. If you have some specific alteration that you need, you can talk with whoever the contract manufacturer is, download the actual mechanical drawings and such of whatever Open Rack component you want to modify, change that up, and then send it to a manufacturer, and that works pretty well at the mechanical, rack, and blade levels. And I'm also assuming that if you're a large enough company or project, you would be designing your own pieces, your own motherboards, for which there are examples contributed to the group that you can modify and get made at some contract place. But when it comes to the actual semiconductor area, I still think that's going to be pretty tightly controlled by the actual semiconductor companies, just because the costs of fabrication are pretty high; to do a run, just getting the masks made is $5 million, and I think you're still going to have to go to whoever you want the chips from, and they are going to be the ones selling them to you.

Okay, so you mentioned a lot of things about the OCP project here, from the submission to the design to the goals and things like that. How can somebody get involved? More specifically, what are the opportunities for different levels of involvement? By that I mean, there are hobbyists out there, there are vendors, there are academics and researchers, and things like that. What do you think the opportunities for involvement are for these or other types of groups?

Yeah, so for Open Compute in general, you can just go to the website, opencompute.org, and read it over; you can see a lot more about all of the other groups. Under the community tab, you can click on get involved, and it talks about exactly how to actually contribute and be involved in that way.
But in terms of the HPC group specifically, unlike many of the other groups, which don't really have academic or government involvement, we see that as being a particularly strong part of our group for HPC. So we're trying to make it as easy as possible for anyone to join, and the simplest way, without even becoming an official member of the project, is to just go and register for the mailing list; there are links to that on the Open Compute website, and I'll also provide a link for you guys so people can go join. You can read through the archives, join, send an email, ask questions, anything. We're trying to have it be a very simple process to get started and ask questions.

So for someone who's listening right now and finds this of interest, what would you say are your top one, two, or three needs right now, for someone who has a skill set, time, or resources that you could use, particularly in the HPC group?

Yeah, so right now we're very early; we have not had any submissions yet, and our charter is not yet approved; it's going to be approved at the next Incubation Committee meeting. But we currently have over 50 members on our mailing list from a wide variety of companies: networking, semiconductor, storage, everything, all part of our HPC group. What I would like to see more involvement in, and I'm biased as a semiconductor guy, is the silicon side; others involved in silicon would be great, but really anyone, as we want this group to be pushing the bleeding edge in all of the areas that affect HPC. The basic idea, when we're talking to other people within Open Compute who are part of the other groups, is that we're working on the specs for the systems all of you will be using in five years, since we think that the trickle-down from HPC to more general data centers is just how things have always been.
And we want to focus on that and have it be more open, so that in five years, when data center requirements have obviously continued to go up, people will be able to pull from those submissions and start integrating them into their more commodity systems.

Okay, Thomas, again, what's the website for OCP?

Yeah, it's opencompute.org.

Okay, and thanks a lot for your time.

Well, thank you.

Thanks, Thomas.