Good morning nerds, and welcome back to Denver, Colorado. We're here on day four, our last segment of Supercomputing 2023. My name's Savannah Peterson, joined by my fabulous co-host, David Nicholson. David, I'm really excited for this next conversation. Kind of low key, I think we might have saved the best for last. What do you think?

I think we have. This is an opportunity to get a sanity check, a sort of summing up, a little historical perspective, and guidance on where we should head in the future.

Yeah. So, without further ado, please welcome Jazz and Ehab. We've got Broadcom and Dell here on the stage to close us out. You are both here at Supercomputing for the first time. I'm going to go out on a limb here and guess that AI might have had something to do with that. We're going to deep dive into the AI server here on the show, and I'm super pumped. Ehab, I want to start with you. Give us a little history on the AI server. It's not like it just popped up when ChatGPT became a phenomenon a few months ago.

Yeah, thank you, and thanks for having us. I've been looking forward to this all week.

Good.

We started working on our AI servers, including the XE9680 we just launched, about two years ago, because we knew that GPUs were evolving with a lot more packaged memory, and now you have the ability to put memory fabrics between them. The whole idea is that a multi-GPU server, with GPUs linked together by a memory fabric, gives you ten times the performance you would have gotten from a single GPU. So that gave birth to GPU-optimized servers, versus any compute server that you could just put a GPU in. It's been a two-year journey, with lots of joint innovation, working very closely with Broadcom to optimize the performance of the server for throughput and connectivity, as well as working with NVIDIA to optimize the design. And now it's one of the most successful, if not the most successful, products in Dell history. I'm really excited about it.
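To make that memory-fabric point concrete, here is a minimal sketch of how a multi-GPU training job exercises the GPU-to-GPU link: gradients are all-reduced across the GPUs every step, so fabric bandwidth directly gates how well the server scales. This is a generic PyTorch/NCCL example, not Dell's or NVIDIA's actual tooling; the tensor size and launch command are illustrative assumptions.

```python
# Minimal sketch: time one gradient all-reduce across the GPUs in a server.
# Assumes PyTorch with CUDA/NCCL; launched via torchrun, which sets the
# RANK / WORLD_SIZE / LOCAL_RANK environment variables.
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A 1 GiB float32 tensor standing in for one step's worth of gradients.
    grads = torch.ones(256 * 1024 * 1024, device="cuda")

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    dist.all_reduce(grads)      # this rides the GPU-to-GPU fabric
    end.record()
    torch.cuda.synchronize()

    if dist.get_rank() == 0:
        gib = grads.numel() * 4 / 2**30
        print(f"all-reduce of {gib:.1f} GiB took {start.elapsed_time(end):.1f} ms")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with something like `torchrun --nproc_per_node=8 allreduce_sketch.py`, the elapsed time drops sharply when the GPUs share a high-bandwidth fabric rather than hopping through the CPU, which is the design shift Ehab is describing.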
Congratulations. I'm not surprised. Way to look through the crystal ball and be on time with this. It is a thrilling time to see this level of application. Jazz, I'm curious for you, showing up at the show for the first time: what have the conversations been like? Is everyone wanting to go into the server and see what's going on?

So people want to be part of the AI server. They want to be part of the AI cluster. And if you go back to your first question on how this has evolved: we took decades to create a really open x86 compute server ecosystem, with abstraction of software, applications, OSs, and hardware. That's really how this started, where you take a compute server, add some GPGPUs, and call it an AI server. And as Ehab walked through, that's evolved quite a bit. If you walk around the show and look at some of these AI servers, they're works of art. They're extremely complex. The thermal aspects, the mechanical designs, having all these things come together is really complicated. One of the key things I've been doing at the show is meeting with companies innovating in the accelerator space that say, I want to be part of this AI ecosystem. Today the model for them is: I build a GPGPU or an accelerator and I expect to have a system wrapped around it. But that's not going to work. It's not going to scale. So part of my job, and Ehab and I both share this, is that we want to take a lot of the innovation that's on the show floor and enable it, put it in the hands of people.

And that's not an easy thing. What's the biggest barrier there?

Well, I think one of the barriers is that in the compute server world this evolved from, the CPU was at the center. The CPU did the connectivity, did the abstraction; it was the center. Now the center is not clear. A lot of people think it's the GPGPU. Or is it still the CPU? Or is it the fabric? Actually, it's all of them together. So it's more complicated to coordinate, and I think some of these smaller companies that want to compete and bring different innovations to the table in terms of accelerators are not quite sure how to go about it. Do I build a piece of silicon, go to Dell, and say, can I be part of your platform? I think that's the right answer. Or do I try to build my own mainframe-like, fully integrated solution? That might be good for some verticals, but it will be challenging. So we want to help the innovation at the show and put it into the AI server platform in an open way.

Now, the two of you, this is the first time you have physically been at the show, but let's be clear, Dell and Broadcom have been a part of this forever.

Very much.

And so when we talk about the evolution of the AI server, to Jazz's point, we've sort of been to this movie before. Along this journey of optimizing and creating standards, not that long ago, you can imagine, all of that diversity sort of consolidated around the x86 architecture and this beautiful ecosystem that we have in the compute server world. So where are we now, if you were to compare that evolution in the compute server world with where we are today in the AI server world? I know your position is that we're in a good place with what you're offering today, but what's your vision for the future? How does it get better, taking a page out of the compute server playbook?

Yeah, I think you described it well. We've spent years jointly working to optimize compute, and Broadcom is our number one partner for all connectivity on all our compute platforms. We match the speeds of the connectivity to the speed of the CPU, optimize the number of NICs and how you use them; we've automated all the support, and we do extensive testing and development automation, so that when somebody buys one of our compute servers, they don't have to think about any of that. It all takes care of itself. We're at the beginning of a long journey of optimizing the fabrics and the connectivity for the AI world. I think it will be years before it's as fully optimized as the CPU world, and that's not because people aren't working hard at it; it's because in the GPU world, innovation is so fast. We have new silicon almost every six months, and we have a significant explosion of silicon diversity. At the same time, networking and fabrics are evolving. But there are some really interesting facts here. One—

Let's hear them.

Most of the performance in gen AI depends on connectivity, not the GPU itself, when you build large clusters. Obviously it is the GPU at the end of the day that does the crunching of the model, but if you're trying to build a cluster for training, or trying to optimize your inferencing, the connectivity usually ends up being the bottleneck, and therefore our work with Broadcom is essential to solving those issues. And the second is that the speeds of networking and connectivity needed for gen AI are bigger than anything we have ever seen in networking, period. Hands down. By a lot.

Yeah.

And not everybody expected this. Everybody was saying, well, it's going to be a little more. It was only two or three months ago that the whole community started to wake up and say, this is a massive multiplier and it has a huge impact on performance. How do we get together to solve that? So I think we're at the early stages. We want to end up in a place where, again, nobody has to think about it. They're going to get maximum performance from the compute and the GPU because we have worked through all of that.
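To give a rough sense of why connectivity becomes the bottleneck at cluster scale, here is a hedged back-of-envelope sketch. The model size, gradient precision, and NIC speed below are illustrative assumptions, not Dell or Broadcom figures; the point is only that all-reduce traffic grows with model size and can easily exceed what the GPU's compute time hides.

```python
# Back-of-envelope sketch: when does the network gate a training step?
# All numbers here are illustrative assumptions, not vendor specs.

params = 70e9                  # hypothetical 70B-parameter model
bytes_per_grad = 2             # bf16/fp16 gradients
grad_bytes = params * bytes_per_grad

nic_gbps = 400                 # assume one 400 Gb/s NIC per GPU
nic_bytes_per_s = nic_gbps * 1e9 / 8

# A ring all-reduce moves roughly 2x the gradient payload per GPU.
comm_s = 2 * grad_bytes / nic_bytes_per_s
print(f"~{comm_s:.1f} s of wire time per step if nothing overlaps")
# If the GPUs finish their math faster than that (and with modern
# accelerators they often do), the fabric, not the GPU, sets step time.
```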
You talked about collaboration and ease of use, two things that tie very much into a conversation we were having earlier, Jazz, about standardization. How important are standards right now?

So if you look at an x86 compute server, it's governed by hundreds of standards. And there's a code of conduct in the data center space: customers need options, things need to be open, they need to plug together, they need to work. Over the past few decades, that's been created. This is much more critical in AI servers, because in the same box you're putting way more components, way more elements. We've talked a lot about Ethernet at the show, as have many people from Dell and Broadcom, so I'm going to focus on inside the AI server. You need a fabric to interconnect all of these elements, and what we believe is the right answer is PCIe with CXL extensions and so forth. So one of the things I'm doing, for example, is meeting with a lot of these emerging AI accelerator companies and saying: this is the fabric we have, you can plug into it, let us help you with that journey. And we've actually changed our business strategy around that. We decided this year to be completely open, non-NDA, with our roadmap, sharing the fabric roadmap with the entire industry so that they know what's coming.

That's a big pivot.

That's one big pivot. The other pivot is, as you said—

An imperative, really. That's great.

The speed bumps are coming at us fast and furious, and they're more and more complicated to implement. So we're providing a platform, from a fabric perspective, to help end devices solidify their solutions, offering them tools so they can test their solutions at the RTL level before they have silicon. We're doing all sorts of things like that. So it's an interesting time, but there's so much innovation, and you see it here at the show, that we need to move fast to consolidate.

Yeah, the velocity is real.

Yeah, it's interesting when you contrast where we are in the development of compute servers versus AI servers. Folks who have a lot of background with compute servers have told us that in the process of just standing up clusters and working with these systems, they don't come into it with decades of experience. They don't come into it with expectations properly aligned to the level of complexity they run into, because they're spoiled, in a way, by the standards that have been leveraged in the compute server world. So it's not just, oh well, if these two things weren't generationally produced at the same time, you won't get maximum advantage. It's: no, no, no, they won't work. It doesn't work if you don't have standards. So I think the actual practitioners, where the rubber meets the road, are finding that the lack of standardization is causing a serious bottleneck.
And as we've referred to this whole thing as the primordial soup of AI, there are many, many folks here who are going to evolve over time, and they'll either be delivering what they build to the market differently or not at all. Dell has always been in a position of, hey, bring us the best Lego blocks you have and we'll integrate them, and Broadcom builds the Lego pieces. But, you know, Savannah asked the question about the inhibitors. I would ask: what would the goal be if we talked to you a year from now? How far can we move this ball over the next year in terms of standardization?

Yeah, I think we're going to be able to move it significantly, but the goal will keep moving on us. I'll give you examples of standardization where we're diving in right now. If you walk the floor here, you'll find 20 different solutions to liquid cooling. You'll find 20 different configurations for how you could deploy the servers. As you said correctly, we're driving an open, diverse, scalable ecosystem. We feel that's going to be critical to scaling the industry and getting the best of the best out to our customers. And to do so, we need standardization, so you can mix silicon from multiple vendors, get the best SSD drives, get the best deployment options, liquid cooling, and all those different components, regardless of whether it's filtration or connectivity. So I think there's more mobilization in the industry on standardization than I have seen for years. Let me put it that way.

Oh, that's good to hear.

Because everybody realizes it's critical to everybody's success to move toward standardization. I think some things will be quick; we will see big moves in a year. But there will continue to be big things we need to solve that may take a few years.

You're giving us a lot of hope here on the show today. I love to hear that. Jazz, you mentioned something that I think is awesome, and I want to follow it up with a question about where we're at with the AI server. You mentioned you've gone into the government and actually shown them a server, opened it up, shown them how the components can be switched out as a result of standardization. Are the players in the game that are coming together around standardizing the AI server the same players we saw with traditional compute, or is this a new world, with different companies and organizations chipping into the ecosystem?

It's mostly the same players, but the speed at which it needs to get done is super compressed, because things are breaking at the seams. People want to run these workloads. There are power problems, cost problems, complexity problems. This AI transition is happening so fast that we need to accelerate innovation on one dimension and accelerate standardization and interop on the other. And it's not just drafting a standard; it's actually mechanizing and operationalizing that process. I'll give you an example. PCIe is the main protocol for interconnecting elements inside an AI server or a compute server. Going from gen three to gen four took eight years. The next generation took two years, and we want to bring that down to a one-year cadence going from gen six to gen seven. That's really difficult for the ecosystem, so you can draft a standard, but you need to help people make this stuff work.
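For a sense of what those compressed speed bumps deliver, here is a minimal sketch using the published PCI-SIG per-lane transfer rates (gen 7 was still in development at the time of the show). The x16 bandwidth figures are the usual rough approximations, ignoring encoding and protocol overhead.

```python
# Each PCIe generation roughly doubles the per-lane transfer rate.
# Transfer rates are public PCI-SIG figures; bandwidth math is approximate.
gens = {3: 8, 4: 16, 5: 32, 6: 64, 7: 128}   # GT/s per lane

for gen, gts in sorted(gens.items()):
    # ~1 bit per transfer per lane; divide by 8 for bytes, times 16 lanes.
    x16_gbs = gts / 8 * 16   # approx GB/s per direction on an x16 link
    print(f"PCIe gen {gen}: {gts:>3} GT/s/lane ≈ {x16_gbs:>3.0f} GB/s x16")
```

Each of those doublings has to land in silicon, retimers, connectors, and validation tooling across the whole ecosystem, which is why a one-year cadence is such a heavy lift.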
And Savannah just mentioned the reference to opening up a server and looking inside. I want to get each of your perspectives on this idea. You peel back the cover of a server, and a lot of people don't understand what's there, but you gaze down upon it, and it's the perfect example of co-opetition, isn't it? I mean, you have people who are competing with one another, who are partnering with one another. Jazz, you mentioned that this has not been some mandate from above that said you must all do this; it's been a collaborative effort. How do we use that model for everything else in society? I mean, just a casual how-do-we-save-the-world question. How do we save the world? NBD. It sounds like what you're saying is that this is actually a really good model, everything that's happened up to this point for the compute server, and that if we can get the AI server world there as quickly as possible, we all benefit from it. But I think people who are not familiar with what's happening on a motherboard, and everything that's plugged into it, should really look at it and ask the question: well, why can't we all get along this way? I mean, do you agree with that? It sounds like that's what you're saying, but I want to make sure it's 100% clear.

Yeah, absolutely. I think necessity is the mother of invention here. The reason we're all working even faster together is that we have these big super-applications, the gen AI models, that people really like. At the end of the day, the fact that you're going to get significant productivity benefits out of these gen AI models is what's uniting everybody. This is not a hypothetical goal. Usually where standardization breaks down—

I'm glad you brought that up.

—is when people think there's a hypothetical goal and have different opinions of the why. The fact that we actually have the end result — you were talking about how you use it every day. We have everybody using it, and the expectation is that 60% of the world's population will have some kind of gen AI content in the next few years. So think about the expansion—

Talk about scale.

Think about scale, and think about the enormous amount of new content showing up. So there is no other way but for all the key technology players, who have all worked very well together, as Jazz said, to come together to solve it. The fact that we have a problem to solve is what gets technical people working together quickly.

And speed has never been more of the essence. I also want to talk about cost, since some people might not be familiar with just how dramatically different the cost is for an AI server. Jazz, you gave me some examples earlier. What are the numbers compared to traditional compute servers?

So, and correct me if I'm wrong, but a compute server is in the $5,000 to $10,000 range, and again, there's a big range on that. An AI server: $200,000 to $250,000.

It's huge, huge.

10 to 20x, and the power scales with the cost. So those are real challenges for enterprises that have a fixed budget. And they want to be part of this. They need it.

They need it as a differentiator.

So it's challenging.

And how important — well, actually, how does the standardization conversation we're having impact cost? Is that going to bring it down?

Yeah, I can take that one. I think there will continue to be demand for these very high-powered servers, given the many models that are showing up. They have more and more parameters; they're exploding in size. So we need that performance. And if you look at the roadmaps from all the GPU providers — NVIDIA, AMD, Intel, and everyone else — they're going to continue to double almost every year. So these high-powered servers are not going away. They're getting bigger and bigger. But at the same time, where we're spending a lot of time on innovation is at the place where it gets consumed, which is inferencing. We're working very closely with Meta, with Hugging Face, and with other companies to create very small model deployment options that you can deploy on one GPU, or even on CPU only. So people can get the benefits of AI regardless of whether they're training a big model or implementing inferencing in a data center or at the edge. That's what we see happening to make it affordable on both ends.
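As a concrete illustration of the CPU-only end of that spectrum, here is a minimal sketch using the open-source Hugging Face transformers library. The model choice and settings are illustrative assumptions, not what Dell, Meta, or Hugging Face actually ship.

```python
# Minimal sketch: small-model inferencing with no GPU at all.
# distilgpt2 (~82M parameters) is an illustrative stand-in for any
# small open model; it runs comfortably on a laptop-class CPU.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="distilgpt2",
    device=-1,            # -1 pins the pipeline to CPU
)

result = generator("AI servers are", max_new_tokens=30, do_sample=False)
print(result[0]["generated_text"])
```

Swapping `device=-1` for a GPU index moves the same code to a single-GPU deployment, which is the affordability range being described here.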
I think you're accurate, and it's such an important point. I think sometimes people don't realize that if you have a smaller model or a smaller batch, you don't need a huge, crazy amount of power. Use what's available to you now and optimize your systems. Your first time here at Supercomputing, as we mentioned twice before: what are we going to be talking about at the next one? Because something tells me, given the AI revolution we're in the midst of right now, you'll probably both be back at the show. Y'all, do you want to take that one?

Mine is a hope. Is that okay? It's that we have standardization, even if not complete, figured out for the four or five big things we have to do.

Yes. There's so much stuff that we're working on, that's in the pipeline, that'll be real next year. I think we're going to have a lot of cool stuff to show. We're going to see these AI servers with more and more solutions in them, and stuff working. So I think it's going to be an exciting next 12 months. We've got a lot of work ahead of us.

Great. Well, we can't wait to see what comes out of this continued collaboration. Ehab and Jazz, thank you both so much for being here with us. I'm already looking forward to having you back on the show together to celebrate those things a year from now. And David, thank you for an absolutely fantastic week.

It's been quite a ride.

Thank you to our wonderful, beautiful production team. You are all baby angels, and I feel very grateful to work with you. Thank you for listening to my voice for the last four days. And most importantly, thank you, our fabulous community, for tuning in to our four days of live coverage here in Denver, Colorado. My name's Savannah Peterson. You're watching theCUBE, the leading source for emerging tech news.