Hello, and welcome to theCUBE Studio here in Palo Alto, California. I'm John Furrier, host of theCUBE. We're here for coverage of ISC High Performance 2023. We're covering all things HPC: machine learning, AI, high performance analytics, quantum, and more, plus one of the most important topics in the HPC community, sustainability. This segment is on HPC networking and composable computing. Welcome to our guests. Hemal Shah, Distinguished Engineer and architect for Ethernet NICs at Broadcom, and VP of Technology at the DMTF, still setting the standards, still doing great work. Hemal, thanks for joining us. Laurent Hendricks, Head of Product Management for Ethernet NICs at Broadcom, is here. Laurent, thanks for coming on. And Jeff Kirk, Engineering Technologist, Infrastructure Ecosystem at Dell Technologies. Gentlemen, thank you for coming on this power panel on networking and composable computing. First of all, what is composable computing?

I think I can answer that; I probably know more about that than the Broadcom gentlemen. In the old days, when you wanted to build a cluster and you wanted, say, GPUs in the cluster, you had to buy a specific server that had GPUs in it. Composable computing brings to bear a very high speed network that allows you to decide whether that server is a GPU server or a memory server. So it's this concept of an external, very high speed, low latency fabric that lets you essentially decide what the architecture of your server is.

Yeah, I love that latency angle; you were talking about that before we came on camera. We know the network is the critical piece, and that's always the last spot where everyone says "go faster" even though there's physics involved. It connects servers and storage and makes the hardware act as one HPC system instead of several independent systems. How is the HPC networking space evolving in this context?

I think the real issue here is, as you mentioned, that latency is extremely important. If you want to add memory operations to the list, in other words, you have some memory that sits in a separate enclosure connected over the fabric, then you have to have low latency, because the CPU will stall until the memory access completes. So this is definitely an area where latency matters. There are also new standards like CXL coming around to enable this and make it happen. Prior to this, my customers who wanted specific kinds of servers had to buy them ahead of time. Now they can decide what the server is going to look like after it's deployed. Very, very helpful for HPC.

Hemal and Laurent, the NIC is a big part of the HPC network equation. What's Broadcom doing here?

Yes, so one big change in the networking space when it comes to high performance computing is RoCE, RDMA over Converged Ethernet. It enables high bandwidth and low latency networking for HPC applications. RoCE itself has been around for quite some time, but the technology has matured significantly over the last few years, to the point that HPC users and operators now have access to a broad ecosystem of hardware and software solutions for their network. In addition, with RoCE, Ethernet has substantially closed the performance gap with InfiniBand. So whereas in the past InfiniBand might have been your go-to technology for a high performance network, right now you have the ability to deploy a network with similar performance and latency using Ethernet, taking all the benefits that come with Ethernet in terms of standards and the software and hardware ecosystem.
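To make that latency comparison concrete, here is a minimal sketch of the classic ping-pong microbenchmark used to measure one-way MPI latency over whichever interconnect the MPI stack runs on, RoCE or InfiniBand. It assumes Python with mpi4py and NumPy installed; the message size and iteration count are illustrative choices, not prescribed values.

```python
# pingpong.py: minimal one-way latency sketch (illustrative, not a vendor benchmark).
# Run with two ranks, e.g.: mpirun -np 2 python pingpong.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

NBYTES = 8        # tiny message: exposes wire latency rather than bandwidth
ITERS = 10_000
buf = np.zeros(NBYTES, dtype=np.uint8)

comm.Barrier()    # start both ranks together
t0 = MPI.Wtime()
for _ in range(ITERS):
    if rank == 0:
        comm.Send(buf, dest=1, tag=0)
        comm.Recv(buf, source=1, tag=0)
    else:
        comm.Recv(buf, source=0, tag=0)
        comm.Send(buf, dest=0, tag=0)
t1 = MPI.Wtime()

if rank == 0:
    # Each iteration is a round trip; half of it approximates one-way latency.
    print(f"one-way latency ~ {(t1 - t0) / ITERS / 2 * 1e6:.2f} us")
```

Run over RoCE and over InfiniBand on comparable hardware, the same script gives a like-for-like latency number, which is exactly the gap the panel says has substantially closed.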
Hemal, there are multiple players involved in this area. What's the exciting piece here? What's the disruptive angle? What's the new thing?

So there are several areas where there's a lot of excitement at large scale. One area is congestion control, where players are working together to standardize network-level congestion control at scale, where both endpoints and switches participate. Another exciting thing: we talk about hardware, but a lot of the high performance is driven from the application level down. Application frameworks and software infrastructures are being optimized, with things like MPI-layer optimizations and RDMA verbs-level libraries, and significant software-level architecture enhancements are going on to improve latency and achieve really high message rates for large-scale HPC and machine learning applications.

There's a nice connective tissue there; software and hardware go together. How do you see this keeping up with the scaling needs? One of the top conversations we're seeing is automation, having software in there. What's the equation for getting more power out of the processors, out of the chips? What are some of the server angles here? GPUs demand a lot of bandwidth. How are these technologies making the network more efficient? What's the scaling angle? How do we keep up?

I'll start with the scaling angle. With CPUs packing more and more cores into the same server, we have more application horsepower, and GPUs are taking it to the next level. Because of that, servers can process more data, which puts more demand on networking bandwidth. Not only that, the number of nodes in the fabric is getting so high that you really need high-radix Ethernet switches to build these large fabrics. You also need scalable transport protocols so you can support a large number of communication endpoints on a server. And then you have to tie all of this together, end to end, to reach the scale and the application requirements you want. So all these different levels, from physical-layer connectivity to the fabric to the software layer to the cores or compute power you have, all need to scale together.

It's interesting: when I hear fabric, I hear physical layer, I hear software, I feel like it's a platform. In systems work, you change one thing and it changes something else over there. Is that the way to look at it now? I mean, it's a system, it's technically connected, but what's different today on these larger-size problems, exascale for instance? These are serious workloads, serious challenges. Is that a platform?

Yeah, can I add a comment to that? I think the real issue is that over the last 20 years, workloads have changed, and how we operate clusters has changed. Clusters are now running like clouds, with multiple tenants. But the workload that's causing the most grief right now is AI/ML. These large language models require that the network operate at the highest levels of efficiency, maybe 95% or better.
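A rough back-of-envelope shows why that efficiency figure matters for large-model training. Assuming a ring all-reduce of the gradients each step, communication time scales with the data moved over each link divided by the bandwidth actually achieved. All of the numbers below (gradient size, link speed, node count) are hypothetical placeholders, not measurements.

```python
# Illustrative only: how network efficiency changes per-step all-reduce time.
GRAD_BYTES = 10e9      # assumed: ~10 GB of gradients per training step
LINK_GBPS = 400        # assumed: 400 Gb/s per node
NODES = 1024           # assumed cluster size

def ring_allreduce_seconds(size_bytes, link_gbps, nodes, efficiency):
    """A ring all-reduce moves about 2*(N-1)/N of the data across each link."""
    wire_bytes = 2 * (nodes - 1) / nodes * size_bytes
    achieved_bytes_per_sec = link_gbps * 1e9 / 8 * efficiency
    return wire_bytes / achieved_bytes_per_sec

for eff in (0.95, 0.70, 0.50):
    t = ring_allreduce_seconds(GRAD_BYTES, LINK_GBPS, NODES, eff)
    print(f"network efficiency {eff:.0%}: ~{t:.2f} s of communication per step")
```

Under these assumptions, dropping from 95% to 50% efficiency nearly doubles the communication time per step, time the GPUs spend idle, which is exactly the grief being described.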
And so what that means is that folks like Broadcom have to come up with a good end-to-end solution that delivers that high performance. And that's pretty challenging, in my book.

Broadcom, you've got the keys to the kingdom here: Laurent on the product management side, Hemal on the architecture side as a distinguished engineer. I hear these large language models are consuming more compute power than crypto, and it has to get more efficient. This is a huge engineering challenge. How are you looking at it? Not necessarily large language models specifically, but these new kinds of workloads are a telltale sign that you need the right network topology, the right system architecture, the right software. What's the engineering scope here? Share with us what you're thinking.

Yeah, so it's not just limited to HPC and AI/ML, right? Data centers in general are using a lot more power. I was looking at data from a German research firm, Climatiq, forecasting that data centers will use more power and produce more emissions than the aviation industry. That's obviously getting a lot of attention, not just from operators but also from governments and regulators. Now, if you look at HPC and AI/ML in particular, those are designed for high performance, so they use high performance components, whether it's CPUs, GPUs, and of course the network piece. And those high performance components typically require more power than your standard compute server. So there's very clearly a power challenge here. Now, if you look at the network in particular, which is our team's focus, the network is often regarded as overhead, in the sense that it doesn't produce any additional compute power. It's really a tax on the system: every watt used in the network is a watt that's not usable by your useful components, the compute elements, whether that's a CPU, a GPU, or an ASIC. So there's more and more attention on minimizing the power used by the network.

To us, that means a few things. One, we need to design network components that are low power; that's primarily the NIC and the switch. Two, you need to think about the network topology and go for efficient topologies, typically topologies with fewer nodes, and as Hemal mentioned earlier, that means you need switches with higher radix and higher bandwidth. And three, the cables, the interconnects themselves: at 200 and 400 gig they use a substantial amount of the total network power. An optical link at 200 gig uses the same order of magnitude of power as the NIC itself. So we and our customers are looking at technologies, some emerging and some old, that will let them reduce the power in the cabling itself. There's the good old DAC technology, passive copper cables, which is getting a new life: with the new generation of silicon you can use passive copper up to four meters, which practically means you can do the cabling inside the rack with passive components. Those have the added benefit of very high reliability; because they're passive, those cables are more reliable than active components like optical transceivers. And then there's been a lot of talk recently about a new technology called linear drive, which basically consists of removing some components from the optical transceiver, typically the retimer or the DSP, thereby significantly reducing the power of the optical link. So going forward, data center operators will have those two technologies, passive copper and linear drive, and together they should allow a significant reduction in the amount of power spent on cabling.
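The cabling power point lends itself to a back-of-envelope comparison. The per-end wattages below are illustrative assumptions for a 400-gig class link (real figures vary by vendor, speed, and reach), but they show why operators care about DAC and linear-drive optics.

```python
# Hypothetical fabric cabling power budget; every number here is an assumption.
LINKS = 4096                    # assumed number of links in the fabric

PER_END_WATTS = {
    "DSP-based optics": 12.0,   # assumed: fully retimed/DSP optical module
    "linear drive":      6.0,   # assumed: DSP/retimer removed from the module
    "passive DAC":       0.1,   # assumed: passive copper, near-zero active power
}

for name, watts in PER_END_WATTS.items():
    total_kw = LINKS * 2 * watts / 1000   # two module ends per link
    print(f"{name:>16}: ~{total_kw:.1f} kW for cabling alone")
```

Under these assumptions, in-rack DAC plus linear drive for the longer runs can remove tens of kilowatts of pure overhead from a single fabric, watts that go back to the CPUs and GPUs.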
And Hemal, what's your reaction? Obviously on the architecture side, squeezing out more power for the compute makes a lot of sense. What kinds of changes happen in the architecture? It sounds like an architecture game: squeezing more out of the racks, more out of the data centers, saving power, moving things over. What changes? What are the new guiding principles?

So the guiding principle that goes along with power is: process the packets, the networking traffic, where it's best suited, and that's where the NICs and switches come into the picture. CPUs and GPUs are good at application processing, so optimize the data path from there and have most of the communication handled by the NICs, which do it at low power, and not only that, they do it more efficiently. So designing not only very efficient silicon pipelines but also the optimized software data paths that go along with them gives you the full benefit: not only are you using the right components for the right things, but the overall power efficiency of getting to a specific application result is going to be the best.

Yeah, it sounds like a lot of good thinking around how to offload and how to build abstraction layers: software, and a combination of the two.
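One small way to see that "right component for the right job" principle from the application side: mpi4py distinguishes buffer-based transfers (uppercase Send/Recv), which hand a contiguous buffer straight to the MPI stack so an RDMA-capable NIC can move it, from generic object transfers (lowercase send/recv), which make the CPU serialize the data first. A sketch, assuming mpi4py and NumPy:

```python
# datapath.py: run with e.g.: mpirun -np 2 python datapath.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
data = np.ones(1 << 20, dtype=np.float64)   # ~8 MB payload

if rank == 0:
    # Buffer path: the array is handed to MPI as-is; an RDMA-capable
    # transport can move it with minimal CPU involvement.
    comm.Send(data, dest=1, tag=0)
    # Object path: the CPU pickles the object before anything hits the wire.
    comm.send(data, dest=1, tag=1)
elif rank == 1:
    buf = np.empty_like(data)
    comm.Recv(buf, source=0, tag=0)
    obj = comm.recv(source=0, tag=1)
```

Timing the two paths on real hardware typically shows the serialized path burning CPU cycles the buffer path avoids, which is the power-per-result argument in miniature.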
Now let's come back full circle, Jeff, on composable, with the network as the critical piece here, as all these things come together: offloading the processor, making the network more energy efficient.

I think there are a couple of principles we need for composable to really make sense. Number one, many customers tell me, "I don't want to use two fabrics." So please don't have a composable fabric and a separate peer-to-peer HPC fabric. That's principle number one: one fabric. And number two, if you're going to have a single fabric, you've got to make sure those two different types of traffic aren't interfering with each other and slowing down the system. And I think it's early; it could be that the composable traffic is the highest priority, or maybe the peer-to-peer. Until we get a chance to try all this out and see what's what, it's going to be difficult to know the right way to configure these systems.

You know, I was talking to Dave Vellante the other day, we were going back, talking about our age, about the days of Token Ring versus Ethernet. Ethernet was the winner; Ethernet is the world's choice for clusters, and we're seeing that everywhere. As we get to higher performance and bandwidth, networking benchmarks can change and grow. What's the state of the art for understanding performance in this new era? I see cloud players trying to replicate what Broadcom and other chip makers do; merchant silicon is a big game. New people doing things, and you can have copycats, but at the end of the day Ethernet is the standard, and there's a lot more going on under the covers. What benchmarks should people pay attention to? What's the key metric for success?

This is something I've had many discussions with Broadcom about; they know what I'm going to say right now. If you look at a lot of academic papers, they talk about performance versus load, right? But they inevitably run a single application on the cluster, and that's not load where I come from, because one of the biggest changes in the HPC market is that people operate their big systems as clouds, with multiple tenants. Load is when your next-door neighbor is flooding the network with traffic, and it affects you: your neighbor's traffic affects your performance. So I believe we need a change in how we benchmark. We need to benchmark the applications people are interested in, in the presence of conflicting traffic. I've had many discussions with Broadcom on this, and they're doing great work in this area.

So, Hemal, you have something to add to that? Yeah, sure, Jeff. If you look at the focus today, or a few years ago, people used to focus on micro-benchmarks, which give you the performance of some communication primitive, or you'd measure at the library level, the middleware layer, and then there were domain-specific application benchmarks. But that doesn't address today's needs. The way I look at it, we need a new benchmarking class that lets us measure performance under application interference, with multiple applications running and real-life network congestion scenarios, and also the ability to measure the communication patterns that get generated when specific datasets are used with a set of applications. That, to me, is the emerging class of benchmarks we need going forward.

It's a whole new era, a modern era of benchmarking, basically, not the old-school way. That's what I'm hearing, because the application workloads are running at different scales and levels. Is that right? Correct.

Laurent, what's your reaction to this benchmarking question, this reality check on how you measure what's great? Yeah, I second what Hemal said. It's not sufficient to measure point-to-point performance, because we're dealing with very large clusters here, with a lot of complex effects due to scale and multiple applications, as Jeff mentioned. So the trick is to build enough scale, even in the lab, to be able to run those high-level benchmarks Hemal was talking about, on real systems using real applications, and understand what really matters in your system.
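A sketch of what that emerging benchmark class can look like in miniature: two ranks measure ping-pong latency while the remaining ranks exchange large messages to play the noisy neighbor. Everything here (rank layout, message sizes, iteration counts) is an illustrative assumption, not a standardized methodology.

```python
# noisy_pingpong.py: run with e.g.: mpirun -np 8 python noisy_pingpong.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
ITERS = 2_000

comm.Barrier()   # start the victim pair and the aggressors together
if rank in (0, 1):
    # Victim pair: small-message ping-pong, as in the earlier sketch.
    msg = np.zeros(8, dtype=np.uint8)
    t0 = MPI.Wtime()
    for _ in range(ITERS):
        if rank == 0:
            comm.Send(msg, dest=1, tag=0)
            comm.Recv(msg, source=1, tag=0)
        else:
            comm.Recv(msg, source=0, tag=0)
            comm.Send(msg, dest=0, tag=0)
    if rank == 0:
        dt = MPI.Wtime() - t0
        print(f"latency under load ~ {dt / ITERS / 2 * 1e6:.2f} us")
else:
    # Aggressor pairs (2,3), (4,5), ... flood the fabric with 4 MB exchanges.
    partner = rank ^ 1
    if partner < size:
        sbuf = np.zeros(1 << 22, dtype=np.uint8)
        rbuf = np.empty_like(sbuf)
        for _ in range(ITERS):
            comm.Sendrecv(sbuf, dest=partner, recvbuf=rbuf, source=partner)
```

Comparing the printed latency with and without the aggressor ranks gives exactly the delta Jeff is asking benchmarks to report: how much a neighbor's traffic costs you.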
Yeah, I think you're hitting a great point here that's all coming together: the architectures are changing and the workloads are changing. The passive copper example we just talked about is now specific to the data center; more is coming, with better results. So clearly we're in a new, next-gen environment for computing. I call it the super stack: super computing and interactions, infrastructure, and then cloud and applications on top. It really is a whole other era. I guess the top question everyone wants answered: is InfiniBand dead or dying?

All right, if you want me to answer that, I have to end on a happy note here. It is not dying. But on the other hand, there are a lot of customers that have a distinct preference for Ethernet, and if somebody like Broadcom can deliver Ethernet that performs equivalently, yeah, they prefer Ethernet. So we're waiting on you, Broadcom.

Yeah, and I have worked from one gigabit Ethernet to, pretty soon now, one terabit Ethernet; we're already at 800 gig. Ethernet, with its simplicity, interop, and ecosystem, has survived for, we can now say, four decades plus, right? Think of Metcalfe and the other guys drawing a little network on paper versus what it has become. So our bet is that Ethernet is going to keep winning. Whether other network technologies stay around or not, they might find a niche, but Ethernet is going to be ubiquitous.

You can't bet against the open ecosystem and the innovation equation, especially as architectures change. I guess, Hemal, while I've got you here: we were talking about DMTF earlier, going back to the old days; you've seen this movie before, through many waves of innovation. What's exciting right now? HPC has always been cool as far as I'm concerned, and it seems to be at a whole other level with the performance: a lot more emphasis on going faster, on compute, and also on power, as you mentioned. Guys, final question: what's exciting in this new inflection point? You go back and look at the waves; this is a pretty big one. The idea that we're capped out on power and performance? Not really; we're seeing more action. What do you think about this next wave? What are your personal views?

I think it's end-to-end control and performance, and I think Broadcom probably does that as well as anyone. To handle these highly optimized scenarios, like AI/ML, you need end-to-end control. I don't need some cheesy NIC from Asia combined with my Cisco switches; I need one vendor end to end, and I need real end-to-end optimizations. And I think Broadcom has a very strong story there.

Laurent, Hemal, take us home, final word. What's different now? Integration, software intelligence, AI, something's in there. What's this big wave? What are your final thoughts?

Yeah, quickly summarizing three things. First, the adoption of higher speeds keeps accelerating: the move from 100 gig toward a terabit and beyond is happening faster than the move from 10 to 100 gig did, with more and more demand for networking performance. That creates a really exciting opportunity to innovate and build this network infrastructure. Second, we're taking a more application-level view, as Jeff was mentioning. Even as the complexity grows, if you provide an end-to-end, turnkey kind of solution, application developers will adopt these new technologies much faster, because they no longer have to worry about the low-level details; those are all taken care of for them. And third is the maturity of the software infrastructure. Compare where we were 30 years ago with where we are now: the software infrastructure for networking has really matured, and it can handle a lot of complexity in the networking stack, whether at the kernel level or in user mode. Those pieces are all coming together to take us into the next era of high-speed networking for HPC, machine learning, and other applications.
Laurent, take us home, last word.

Yeah, so I agree with what Hemal and Jeff just said. The key to success is going to be the overall system integration and the ability to optimize the system. In the Ethernet world, we used to worry about the link, the physical layer. Now we have to worry about the link, the drivers, the application, and the overall system, and be able to optimize everything at scale, end to end. That's the challenge in front of us.

Gentlemen, thank you very much for this power panel on HPC networking, really appreciate it. The action is obviously more compute: save some power, move it over, new architectures, new fabrics. Jeff Kirk, thank you so much. Laurent Hendricks and of course Hemal Shah, thank you so much, both from Broadcom. Thanks for joining us for theCUBE coverage; thanks for coming on. Thank you.

Okay, that's theCUBE coverage here at ISC High Performance 2023. I'm John Furrier, your host. Thanks for watching. Thank you.