Good evening, CUBE fans, and welcome back to the Mile High City. We're here at Supercomputing 2023. My name's Savannah Peterson, joined by CUBE co-founder and co-host John Furrier. John, good evening. This feels like a nightly news broadcast. If it's going to go late, we're going to go late. Whatever it takes, it's just like election coverage. It's like a special broadcast. Who's running for office here? And I think Supercomputing's going to win the day. Is it the clouds? Is it the fabrics? I think AI is looking good on the ballot right now. AI is looking- They're certainly the most popular candidate. Definitely could use some intelligence in the office, but definitely, I'm talking about fabrics, not plastics, fabrics. Strong start to this evening segment here at theCUBE after dark. Let's go ahead and welcome our guest before things get too crazy. Sumit, thank you so much for being here from Liqid. It's a pleasure to have you. It's a pleasure to be here. Thank you for hosting us.

Just in case the audience is not aware of Liqid, though they certainly will be after this show, give us a little bit of background. Sure, Liqid is a composable infrastructure company, and for those that don't know what composable is, I'll take 30 seconds to explain it. Normally, what we do is we build servers by taking devices and plugging them into the sockets of the motherboard, and what that leads to is static infrastructure. With composability, what we do is we take pools of resources: pools of compute, pools of storage, and these days a lot of pools of GPUs. We connect them into a switch, a fabric, and then we come in with software and we dynamically build servers at the bare metal. Take this server, take 10 of those storage devices, take 20 of those GPUs, interconnect them at the bare metal, as we say, and build me a dynamic server. The only difference is when that server needs more resources, for example, I need another GPU for my AI workload, we don't send a human with a screwdriver and a cart to try to move things around. What we do is we reprogram that fabric, and we add or remove devices from the server depending on what the application layer requires.

And the benefit is you do the bare-metal work once, get that set up, and the fabric handles all the management: upgrades, configuration changes, right? That's right. We dynamically study the workload on the inbound and we spin up our t-shirt size, small, medium, or large, to perfectly match the physical infrastructure to the workload that we're looking to deploy.

And the problem that you guys solve is what? So, a couple of problems. One problem has to do with performance. If I need to build a very performant system, I can take a server and attach, for example, 20 GPUs to it and make it go really fast. Another problem we solve is utilization. These devices now are extremely expensive, 20, 30, $40,000 a widget, right? And so the days of putting a GPU in a server at $30,000 a pop and deploying many of them, we don't think that's how GPUs are eventually going to be consumed. We think of it as a pooled, shared resource, where we can apply the GPU to any server that needs it, whenever it needs it, so we can address that utilization problem and change the economics of how we deploy these things.

I think you just brought up a really good point. I'm glad that you brought up GPU cost. We're all excited about what's possible; nobody wants to pay for it, or at least nobody wants the cost to get too high.
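To make that composability flow concrete, here is a minimal sketch of the idea in Python. This is not Liqid's actual API; the `FabricController` class, its methods, and the device counts are all hypothetical, purely to illustrate "reprogram the fabric instead of sending a human with a screwdriver":

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    device_id: str
    kind: str  # "gpu", "ssd", "nic"

@dataclass
class FabricController:
    pool: list[Device]                                  # fabric-attached, unassigned devices
    servers: dict[str, list[Device]] = field(default_factory=dict)

    def compose(self, server_id: str, kind: str, count: int) -> None:
        """Bind `count` free devices of `kind` to a bare-metal server."""
        free = [d for d in self.pool if d.kind == kind][:count]
        if len(free) < count:
            raise RuntimeError(f"only {len(free)} free {kind}(s) in the pool")
        for d in free:
            self.pool.remove(d)
        self.servers.setdefault(server_id, []).extend(free)

    def release(self, server_id: str, kind: str, count: int) -> None:
        """Return devices to the shared pool when the workload shrinks."""
        giving_back = [d for d in self.servers.get(server_id, []) if d.kind == kind][:count]
        for d in giving_back:
            self.servers[server_id].remove(d)
            self.pool.append(d)

# "Take this server, take 20 of those GPUs" -- as a fabric reprogram, not a forklift:
fabric = FabricController(pool=[Device(f"gpu-{i}", "gpu") for i in range(64)])
fabric.compose("server-01", "gpu", 20)   # tonight's training job needs 20 GPUs
fabric.release("server-01", "gpu", 16)   # tomorrow's inference job needs only 4
```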
Our last guest, Johnny Dallas, has a website he's made called GPUCost.com, which is an awesome way to take a glance at just how much that power costs depending on what people are buying, and to figure out what GPUs are available. Very interesting tool. You're obviously talking about topics of big hype here, GPUs, AI, but you also have a very important series of announcements. I believe you have three announcements this week. Congratulations. Thank you. Let's hear them. We're hearing them first here on theCUBE. We're getting the scoop before the press release goes out tomorrow. I promised you guys I'd give you the early scoop, and that's what we're here to do.

And so the first announcement will be coming out tomorrow from us. It's a joint announcement between Liqid, Dell, and NVIDIA, actually. It's around an extremely dense, in fact the most dense, NVIDIA Dell GPU server on the market. NVIDIA has released a new device called the L40S, and we have partnered with NVIDIA and Dell to release a 16-way GPU server, where we can apply 16 of these L40Ses to a single Dell server with the power of composability. So we're very excited about that. We expect that to be the highest-performance, most dense GPU solution on the market today. That's impressive. Congratulations on that.

You have two other big announcements too, correct? We do, we do. And so we're actually at the booth today demonstrating something referred to as CXL. CXL is a new, emerging fabric technology, and that fabric is being invented for memory disaggregation: the ability to disaggregate DRAM from the server, which is a really important evolution inside the data center. What does that mean, just in case folks- Sure, so normally what you do is you take memory and you plug it directly into the motherboard of the server. What we enable with technologies like CXL is a top-of-rack memory tray, where we can dynamically compose memory from that tray to servers depending on, again, what the application workload requires. Flexibility-wise. That's a big deal. It's huge. It's flexibility, it's performance, it's utilization. There's a tremendous amount of benefit in being able to disaggregate your memory and give it to applications dynamically, on the fly. Hardware's back. Hardware is back, maybe. It never left. This hardware nerd's been waiting for everyone to say this. It's about time everyone remembered what's lovely in powering our entire lives.

I'm sorry, I believe you had one more announcement, if I'm counting correctly? That is correct. Which is impressive. The third announcement we're actually very excited about. This is again a three-way partnership, with Dell and a company called Amulet Hotkey. In this announcement, what we are doing is taking our composable technology and augmenting Dell's blade servers with GPUs. We are taking blade servers, which previously did not have the ability to add GPUs inside of them, and through the power of disaggregation, we're enabling that entire product line with GPU capability. Which blades? It is the MX7000 platform from Dell, which is their lead blade platform currently. So their customers that already have servers installed could get this as well, right? So both greenfield and brownfield. All of those customers that have invested in blade servers and now want to enable that with AI technology, we're going to partner with Dell to enable that.
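A rough way to picture that top-of-rack memory tray: the CXL protocol itself is far more involved (hardware, firmware, and OS support all play a part), but a hedged sketch of just the allocation bookkeeping might look like this. The `MemoryTray` class, the capacities, and the server names are all invented for illustration:

```python
# Illustrative model only -- this sketches the bookkeeping for a shared
# top-of-rack memory tray, not real CXL memory pooling.

class MemoryTray:
    def __init__(self, capacity_gib: int):
        self.capacity_gib = capacity_gib
        self.allocations: dict[str, int] = {}   # server_id -> GiB composed to it

    @property
    def free_gib(self) -> int:
        return self.capacity_gib - sum(self.allocations.values())

    def compose(self, server_id: str, gib: int) -> None:
        """Hand a slice of tray DRAM to a server over the fabric."""
        if gib > self.free_gib:
            raise RuntimeError(f"tray has only {self.free_gib} GiB free")
        self.allocations[server_id] = self.allocations.get(server_id, 0) + gib

    def release(self, server_id: str, gib: int) -> None:
        """Return memory to the tray when the workload no longer needs it."""
        self.allocations[server_id] = max(0, self.allocations.get(server_id, 0) - gib)

tray = MemoryTray(capacity_gib=4096)   # a hypothetical 4 TiB fabric-attached tray
tray.compose("db-server", 1024)        # burst a database server by 1 TiB
tray.release("db-server", 1024)        # give it back afterwards, no screwdriver
```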
I was just going to say, what does this mean for AI? I think AI is going to be, it's everywhere, right? So remember, GPUs are used for a variety of workloads. AI is obviously the big one that everyone's talking about today, but also virtualization. So we are enabling this blade server with virtualization capability, but it's also important for video rendering and CAD. So regardless of the GPU workload, we are enabling these servers to do more.

You know, Savannah brought a good point out, and I want to just double down on that for a second. Everyone loves their NVIDIAs. When they spend a lot of dough on it, they're going to want to use it, get their pound of flesh, get their utilization, whatever they want out of it. So what just pops into my head is the classic TCO, total cost of ownership, because at some level they're kind of ignoring that if they're going to go all in on, say, millions of dollars of GPUs. So how does someone zoom out? Because this to me is the next modern IT architectural question. I've got some platform engineering teams out there. We just came back from KubeCon. I've got some Kubernetes clusters. I love some bare metal over there. Maybe I use some cloud. I've got to put it all together. What does the TCO look like? Because if you're over-rotated on GPUs, over-provisioned, I mean, the CPUs are idle; you've got to get the most out of the GPUs. Is that what you optimize for? You want more GPUs? You want more network?

It's very difficult, and a lot of our customers tell us they suffer from the 90-10 problem, right? When they're deploying things like memory and things like GPUs, 90% of the servers have 10% of the resources utilized, and 10% of the servers have 90% of the resources utilized. It is an almost impossible thing to balance, right? So that's what this architecture, composable infrastructure, is for. We allow for the dynamic rebalancing of infrastructure, which, again, directly impacts that TCO we're talking about. Squeeze the most out of your infrastructure. You get your pound of flesh out of your GPUs. You make the compute work.

What about the network? Where's the innovation going there? We heard from Broadcom earlier; they're all excited and jacked up about the Ethernet Alliance, 400 gig. 400 going to 800, right? And so those technologies will enable the ability for us to do things faster, but they'll also enable the ability for us to disaggregate. I think a lot of your viewers need to get this concept of hardware disaggregation, because that is where a lot of the industry is at. Can you explain that, please? Because I'd like to get that on the record. What does hardware disaggregation mean, and why is it important? Sure. We like to say that the static data center is dead. We think the days of deploying static boxes inside the data center are not the future. We think the future is all about, number one, disaggregating your hardware into individual pools of resources, individual pools of compute, storage, GPU, networking, and then we come in with software-defined methods and we recompose those servers into any shape or form that we need. And that can only be done through the power of disaggregation and the fabric technology that ties it all together.

Sounds like cloud to me. Sounds like what hyperscalers have been doing for years. The hyperscalers have been thinking about the concept of disaggregation for a long time, but we think now it's finally time for enterprises. It's like cloud, but it's all in their data center. That's right. That's right. We want to bring that power to the enterprise.
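To put rough numbers on that 90-10 problem, here is a back-of-envelope sketch in Python. Everything except the "$30,000 a pop" figure from the conversation is an assumed, illustrative number:

```python
# The 90-10 problem, statically: every server owns its own GPUs, 90% of the
# fleet runs them ~10% busy while the hot 10% of servers run them ~90% busy.
gpu_price = 30_000                      # "$30,000 a pop", per the conversation
servers, gpus_per_server = 100, 4       # assumed fleet size
total_gpus = servers * gpus_per_server

static_util = 0.90 * 0.10 + 0.10 * 0.90          # blended fleet utilization: 18%
static_busy = total_gpus * static_util

# Pooled: the fabric recomposes idle GPUs to whichever server needs them.
# Assume (optimistically) that lifts blended utilization to ~70%.
pooled_util = 0.70
pooled_busy = total_gpus * pooled_util

print(f"static: {static_busy:.0f}/{total_gpus} GPUs effectively busy")
print(f"pooled: {pooled_busy:.0f}/{total_gpus} GPUs effectively busy")
print(f"capex no longer stranded: ${(pooled_busy - static_busy) * gpu_price:,.0f}")
```

With those assumptions, pooling takes the fleet from 72 to 280 effectively busy GPUs, roughly $6.2 million of hardware that was previously sitting idle. The exact figures don't matter; the one-to-one link between utilization and TCO that comes up next does.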
Yeah, and I think that changes where you can do things, which is very exciting, and that could have implications for edge computing and a lot of other things. We've talked about composability and the data center. We've talked about cost management. I'm curious about sustainability. How does that play into your value proposition? Listen, it's a big, important thing related to GPUs, right? Power is a big emerging part of this whole thing, and power and utilization go hand in hand. The idea here is, if I have a rack of infrastructure that's sitting 20% utilized, and I can raise the utilization of that rack to 40, 60, 70%, in theory I should be able to do with one rack of infrastructure what previously took two racks of infrastructure to do. And that's directly how we impact the power, floor space, cooling, TCO argument inside the data center. Utilization and TCO have a one-to-one correlation. That's dramatic.

Sumit, at the HPC community meeting we attended this afternoon that Dell had, there were a couple of topics I want to get your reaction to, if you don't mind. Density and cooling, and I'm sure you'll weigh in on density, we already talked a little bit about that. And then silicon diversity, mainly the CPU and accelerator relationship in the future: CPUs and accelerators will be on the same board, more interconnects involved. What are your thoughts on density and cooling, and on silicon diversity?

Sure, density and cooling first. We're starting to approach a point where power is now the limiting factor in how much infrastructure we can deploy inside the data center. When these GPUs begin to run 700 watts per GPU, we quickly get to a point where beyond 32 of them, or beyond 16 of them, in a rack, we can no longer power and cool that rack. One of the benefits of composability is we can actually offer a selection of different GPU types. So if we go into an environment where we're limited on power and cooling, we can switch to a 300-watt GPU, aggregate more of them together, and reach higher absolute performance points. I also think you have to keep an eye on things like immersion technologies and liquid cooling. I think those are going to become more and more important as we go forward.

Silicon diversity is also important for us. One of the hallmarks of our platform is that we support heterogeneous GPU types. So we love NVIDIA, we think they make a phenomenal product, and that's the majority of our deployments. But our platform is universal. We can support AMD, we can support Intel, we can support other accelerator technologies. And so we think the market is very much focused on a handful of players right now, but we think it's a platform play in the sense that we want to give our customers choice.
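Here is what that density trade-off looks like as arithmetic. The 700-watt and 300-watt GPU figures come from the conversation; the rack power budget, per-GPU overhead, and relative-performance numbers are assumptions for illustration:

```python
# Rough rack power math -- assumed budget and overhead, illustrative only.
rack_budget_w = 17_000           # assumed usable power envelope per rack
overhead_per_gpu_w = 150         # assumed host/fabric/cooling share per GPU

def gpus_per_rack(gpu_watts: int) -> int:
    return rack_budget_w // (gpu_watts + overhead_per_gpu_w)

for watts, rel_perf in [(700, 1.0), (300, 0.6)]:   # assumed relative throughput
    n = gpus_per_rack(watts)
    print(f"{watts} W GPUs: {n} per rack, rack throughput = {n * rel_perf:.1f}x")
```

Under these assumptions the rack fits 37 of the 300-watt parts against 20 of the 700-watt parts, so even at 60% of the per-GPU performance, the rack's absolute throughput comes out ahead, which is the point being made about selection.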
One more question that came up, and this is more about AI workloads, which introduce new sets of challenges: new workflows, iterations happening, figuring out what happened, how you repeat it, and how you measure it. The issue of AI and MLPerf came up, and that is specifically around the data. You're seeing more and more of this "let's get together for a week and start testing stuff," and you kind of lay it out. How much of that's really going on? MLPerf is important. How do you set up the AI benchmarking? You've got to get out there and figure it out and scope it. Can you share your thoughts on this new dynamic?

Yeah, so benchmarks are important. They give us a standardized way to measure how performance is actually playing out in real-world applications. MLPerf is the thing that people are focused on right now. What I'll tell you, though, for AI, in my opinion, is that it's all about the data, right? And so we make the assertion that if people are serious about AI, more than likely they're going to do it on-prem. And it's a very simple question: if you have petabytes of data and a trillion-parameter model, how are you going to train that? You have two options. Option number one is upload all of that data into the cloud and run your AI in the cloud. And a lot of people don't want to move petabytes of data into the cloud. There are tons of concerns about that. Seriously. And if that's not going to happen, the only other option is the GPUs come on-prem. And so we think that the ability to do this on-prem and deliver performance is going to be critically important. Things like MLPerf will actually show the world that we can be performant.

And by the way, that point is also going to lead to, we think, not repatriation, but net new opportunities. Because if you're running a cloud architecture, I mean, your operating model is cloud; your on-premises is just one piece of the cloud. You can still go to the public cloud, your workload can move, you can move your workload to the edge, the far edge, and with satellites coming online, you're going to have a completely connected infrastructure. It doesn't matter where it, I mean... It's a bimodal world, right? And I agree with you. There's going to be a certain amount that will live in the cloud and will always be in cloud, and there will be certain net new that we believe will be done on-prem, because data's got gravity, and they will bring the compute to the data.

And by the way, the movement of the data is one thing. The other thing with the models is intellectual property. Data leakage is a huge factor around training and inference. And inference has got more traction now, because inference is where the new real value is; more inference is going on where the data sits. That's right, that's right. Agreed, data hygiene. We've been talking about it all week, and last week. We were talking about flexibility, we were talking about data hygiene, we were talking about GPUs. I mean, the conversation is only heating up. It's all coming together again: AI, hardware. This is definitely the place to be.

What conversations are you most excited to have here on the show floor? Big community and important discussions for you this week, no doubt. Sure, I was asked this question earlier, and I think the three hottest things going on in our industry right now are, number one, everything AI, right? AI, GPU, that is at the center of what everyone is thinking. Number two, this concept of memory disaggregation and CXL is critically important for us, and it's getting a tremendous amount of market momentum. We see the hyperscalers jumping into this. And then the last thing I'll tell you guys to keep an eye on is optical interconnects. We think that's a new, hot, emerging space. We will be supporting optical interconnects as part of our disaggregation story. And so, I think that's a trend.

And why is that important over, say, 800-gig Ethernet, where the bottleneck might not be the network? Or is the network still the bottleneck? I think what's going to end up happening, the reason why we go to optical, number one, is power. As you guys mentioned earlier: power, power, power. Optical brings a better power profile.
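Circling back to the petabyte point: a quick sanity check shows why "upload it all to the cloud" is a hard sell. The dataset size, link speed, and efficiency below are assumed numbers, purely for illustration:

```python
# How long does it take to push petabytes over a WAN link? Assumed figures.
data_pb = 2          # assumed training dataset size, in petabytes
link_gbps = 10       # assumed sustained uplink to the cloud
efficiency = 0.8     # assumed protocol/real-world efficiency

bits = data_pb * 1e15 * 8                        # petabytes -> bits
seconds = bits / (link_gbps * 1e9 * efficiency)
print(f"{data_pb} PB over {link_gbps} Gb/s ≈ {seconds / 86_400:.0f} days")
# ≈ 23 days of continuous transfer, before retries and egress economics --
# data gravity in one line: it's cheaper to bring the GPUs to the data.
```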
For us, also, a lot of what we do has to do with reach: being able to go very, very far distances over low-latency interconnects. Latency is a critical, critical part of all things AI. That speed-of-light thing is real. That will help the multi-cloud and supercloud equations too, as we start getting into more of these pools and composability. We have this grand vision. Eventually, we're going to be composing across data centers, right? Once the latency is acceptable, once we get things fast enough, we're going to be taking a GPU from data center number one and composing it, potentially, to data center number two. If the latency profile- I think that's a good vision, because edge is going to force that. If you're going to have to work at the edge, whether it's the far edge or the hyper edge, you need to have these data strategies and latency strategies nailed down.

There are four kinds of markets that we chase with our solution. The first is HPC and research; they consume a lot of AI. The next is enterprise, for the reason that I mentioned: the data is bringing the GPUs to them. Media and entertainment consumes a lot of GPUs. And then edge. All of the data is created at the edge, and forklifting data and moving it to the core to make decisions is not the answer. So, inevitably, GPUs find their way to the edge. And what is the edge limited by? Power, floor space, cooling. Connectivity. Human access, right? And so those are the things we fix, right? With composability.

Wow, you have absolutely nailed the definitions. He's going to do wonders for theCUBE AI. TheCUBE AI just got fed some good content there. Between composable infrastructure and the disaggregation of GPUs in the hardware space, that was mind-blowing. I saw you fervently taking notes throughout this entire segment. Yeah, I made a couple of notes. What was that again? Tap, tap, tap, tap, tap. I thought that was really awesome. But thank you so much for being here. My pleasure. Thank you for having us, guys. We really appreciate the opportunity. And John, thank you for those fantastic questions.

I'm going to close this out here with a first on theCUBE, since I saw something very fun on the show floor. Like we mentioned, it's Supercomputing 2023. It's a hardware party for all the nerds. There were live bands, there's Skee-Ball, there's a ton of fun activations all around me. But I saw something I've never seen before on this show floor, and it is an entire piece of cheesecake on a stick. So I'm going to take this opportunity and do something that I haven't had the chance to do before. We're here in Denver on theCUBE, live from Supercomputing 2023, and I hope you have some sweet dreams.