Good afternoon, friends and family in the HPC and AI/ML world. Lisa Martin here with Dave Nicholson. We are live at Supercomputing 23 in Denver, Colorado. This is midway into our third day of four days of wall-to-wall CUBE coverage. We're going to be having a great conversation. If you've been watching this show this week, you've been hearing us talk a lot about Dell and Broadcom. We have two alumni back with us, Dave, and we're going to be digging in deep, with some show and tell. Please welcome back Hemal Shah, a distinguished engineer and architect at Broadcom. And Jim Wynia joins us as well, senior product manager at Dell. Guys, welcome back, great to have you. Thank you, happy to be here. Happy to be here. Good to see you. And you guys just did a presentation together, so you're all warmed up, ready to go. You're locked and loaded for this conversation. That's right, that's right. Awesome. There are a lot of players involved in AI/ML networking technologies. From Broadcom's perspective, what's new and exciting in the networking space? And then, Jim, we'll have you weigh in on the Dell partnership. Sure. At Broadcom, we have been doing Ethernet networking for decades now. What is exciting is that AI/ML is pushing the envelope, bringing really large-scale clustering, asking for high bandwidth. And what we have, in terms of both our NICs and our switches, is a solution today. And we are innovating with other things around Ultra Ethernet, where we are putting more intelligence in the network infrastructure and in the NIC, to make the AI/ML solution the best from a networking standpoint. It's very exciting. Yeah, very exciting. Jim, comment on that from Dell's perspective. It's a crowded space. It is a crowded space. Why the tight, better-together Dell and Broadcom partnership? Well, it's always best to work with the lead dog. That's where Broadcom comes in on the technology side. We're very excited.
We've been partnering with them for literally decades, and it's always exciting to see what they're brewing up. We've been supporting the Tomahawk line, the top of the food chain for hyperscaler networking solutions, and the Trident line, the top of the food chain for enterprise solutions. We have switches in those lines, and we continue to work with them, having the discussion about where do we go next? What did we just do? What worked, what didn't? All of those things, yeah. So, Hemal, I like you for more than just your NICs, just to be clear. Not, you know. That's a compliment. I know when Jim says top dog, he means it affectionately. So, we've been talking a lot about networking and connectivity, and this specifically is what you might term inter-server networking. Absolutely. When you use the term Ultra Ethernet, what's the difference between Ultra Ethernet and just plain old garden-variety Ethernet? Right, so what we have today is garden-variety Ethernet. On top of that, we have RoCE as an RDMA transport. People are using it; it's good, and it has served its purpose for the scale that is being deployed today. With Ultra Ethernet, what we are doing is keeping the same Ethernet ecosystem and infrastructure, but adding things like multipathing, adaptive routing, and congestion control mechanisms that scale to a large number of nodes. We're also addressing some of the RoCE transport inefficiencies, meaning RDMA goes to the next level by having selective retransmission. All of those together, with the physical infrastructure being Ethernet, that's Ultra Ethernet. Can we get right to what you brought with you? Because I'd hate to show it at the end and then have all sorts of questions we wouldn't be able to ask because of time. Hopefully we can get some tight shots on that, maybe from the camera over here. What did you bring with you, what do you have? On my left is Jericho 3-AI. This is for the AI fabric market, a 100-billion-plus-transistor, top-of-the-line switch chip.
On my right, I have Tomahawk 5, which is 64 ports of 800 gig. Again, a top-of-the-line switch. Both of these together allow you to build a multi-stage switching fabric for AI workloads. So, really glad to see this. These are both in production and being deployed. And just to be clear about the way these are actually implemented: they would be in an enclosure, rack-mounted as a switch with the ports in the back, just to familiarize people who are in data centers who maybe don't get down to that level of componentry. How does that partnership work with you? I was just going to jump in and brag for Hemal and Broadcom. This is the top of the industry right now. You can't get any faster, any better than what he's holding in his hands. I was telling him, I'm so impressed he was able to wrangle these out of the hands of the engineers to actually show them to everybody here. So, very cool stuff. We are very excited to be partnering again. We have Tomahawk 4 already shipping; Tomahawk 5, well, I can't release dates and that kind of stuff, but let's just say we're looking very closely at when we can do something there. So, having high capacity, a bunch of 800-gig ports on top of 400-gig ports, is critical. Absolutely critical for advancing AI/ML fabric solutions. So, I'd love to get your perspectives on differentiation. Obviously, what you just shared with us is incredibly powerful and potent, and I'm glad they were able to get it on camera. Talk about that leading-edge, top-of-the-market technology, especially with Dell and Broadcom together, and I'd love both of your answers: how does that differentiate you guys when you're in competitive customer situations? So, I'll start, and Jim, feel free to jump in.
What we bring is end-to-end networking components, right, from the silicon up through all the infrastructure software that is needed in order to make that end-to-end connectivity and networking work. Partnering with Dell, what we also do is move up the stack to the system level. That's where end-to-end monitoring and management come in: how do you do management at the cluster level? So together, we are very complementary in bringing the whole solution to an end customer, making it easy to deploy and easy to use. I agree with that completely. If you look at Ethernet: if you have a small one-gig switch, you run Ethernet on it; if you have an 800-gig switch, you run Ethernet on it. So you don't have to have one network for the high end and one network for the low end. It's the same network, and it scales perfectly. And that's where Ultra Ethernet comes in: how do we take the same building blocks at the super high end and really take it to the next level? We've had conversations with folks over the last couple of days who have made reference to clusters of half a million servers. So we're talking about potentially massive environments, and connectivity in that space is a non-trivial decision to make. Absolutely. Historically, when you say Ethernet, you're thinking of an open standard, a common denominator that people can arrive at for compatibility. Does that change with Ultra Ethernet? Is the ethos still there that this is a more open standard than something else that might be out there? Yeah, with Ultra Ethernet, nothing changes. All your open ecosystem and tools stay the same. What you will see is more enhancements to those, like at the infrastructure level. There'll be more end-to-end work making it more configuration-free, so that users don't have to worry about what is happening at the layer below. So, that's how I look at it.
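[Editor's note: Hemal mentions selective retransmission as one of the Ultra Ethernet transport enhancements over RoCE's recovery behavior. The toy simulation below is our own illustration of the general idea, not the actual Ultra Ethernet or RoCE wire protocol: after a single dropped packet, a go-back-N-style receiver forces the sender to resend everything from the loss onward, while selective retransmission resends only what was actually lost.]

```python
# Hypothetical illustration (not the real Ultra Ethernet or RoCE protocol):
# count how many packets get resent after one drop under each recovery style.

def gobackn_resend(window: list[int], lost: int) -> list[int]:
    """Go-back-N style: the lost packet and everything after it are resent."""
    return [seq for seq in window if seq >= lost]

def selective_resend(window: list[int], lost: int) -> list[int]:
    """Selective retransmission: only the lost packet is resent."""
    return [seq for seq in window if seq == lost]

window = list(range(100, 164))  # 64 packets in flight
lost = 110                      # packet 110 is dropped

print(len(gobackn_resend(window, lost)))    # -> 54 packets resent
print(len(selective_resend(window, lost)))  # -> 1 packet resent
```

At the bandwidths discussed here, resending 54 packets instead of 1 for every drop is wasted fabric capacity, which is why this is one of the headline transport changes.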
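[Editor's note: to put rough numbers on the scaling discussed above, here is a back-of-the-envelope sketch. The 64-port figure comes from the Tomahawk 5 description; the idealized non-blocking fat-tree assumptions are ours, not a Dell or Broadcom reference design.]

```python
# Back-of-the-envelope fabric sizing, assuming an idealized non-blocking
# fat-tree built from identical 64-port switches. The port count comes from
# the Tomahawk 5 discussion; everything else here is an assumption.

def max_endpoints(radix: int, tiers: int) -> int:
    """Endpoints supported by a non-blocking fat-tree of a given depth."""
    if tiers == 1:   # a single switch: every port faces an endpoint
        return radix
    if tiers == 2:   # leaf-spine: each leaf splits ports half down, half up
        return (radix // 2) * radix
    if tiers == 3:   # classic three-tier fat tree: radix**3 / 4 endpoints
        return radix ** 3 // 4
    raise ValueError("tiers must be 1, 2, or 3")

for tiers in (1, 2, 3):
    print(f"{tiers}-tier fabric of 64-port switches: "
          f"up to {max_endpoints(64, tiers):,} endpoints")
# -> 64, 2,048, and 65,536 endpoints respectively
```

The jump from 64 to 2,048 endpoints is the difference between one switch and a leaf-spine, which is the kind of single-stage versus two-stage trade-off the reference-architecture discussion later in the conversation is about.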
So, Jim, you mentioned the idea of one-gig Ethernet all the way through the fabric. Is that really a key differentiator, that you're not having to instantiate a separate networking technology for your cluster that is completely different from the rest of IT? Yeah, we talk a lot about inference and training and all the activities, moving data back and forth. Maintaining two separate kinds of networking sounds more complicated than having a single thing. Absolutely. I mean, if I were to say, well, all I have is the cluster and I'm not going to interact with it, I don't have any pre-existing networking, is it then less clear in terms of the value proposition, or how would you put it? I think it's just as clear. I mean, when you have a greenfield, a new build-out, you're going to use the top, the latest that is available. You still want to go with Ethernet, because today's top product, three years from now, is still interesting, but it's not the top product. Well, what do you do with those? You generally take those and repurpose them into another solution. With the Ethernet being the same, you don't have to go retrain and retool. You can keep pushing equipment down the chain as you add more top-end equipment. So that's a very powerful story. I ask this question a lot: I think of chasing bottlenecks in IT broadly as sort of a game of whack-a-mole. Once Jericho and Tomahawk have created bandwidth aplenty, something comes along that saturates that bandwidth. So maybe for a period of time it's not the network that is the bottleneck; it becomes something else. What are we seeing there, Hemal, in terms of where bottlenecks arise? Depending on the workload, what we have seen, especially for AI/ML, is that specific stages within the network get congested. And if you are an administrator, you would like to know where those congestion points are.
And today, most of that is manual, but we are adding more automated rerouting of the traffic, allowing multipathing and avoiding the congestion. The end user will really appreciate those things, and if you're an administrator, you'll really like them. And to go back to the previous question, the tools and everything are the same. So what they've built today as scripts, they will continue to use, like ethtool on Linux. No issues; it just got enhanced with more and more information, using the same set of standard tools. Let's give that thing a name: GPUs. Yeah, yeah, sure. All of a sudden, within the last nine months, GPUs are the discussion, right? And being able to pipe 400 gig to each GPU is critical. All of a sudden, the demands in the rack have just skyrocketed. And this has to be line rate, it has to be reliable, and that's where Ultra Ethernet really helps out, continuing that discussion of where we go from there. So aren't these GPUs like 20 bucks each? Who cares if you fully saturate them? That's a joke, folks. They are massively expensive, and to your point, if they're underutilized... Yeah, big time, big issue. Big no-no, absolutely, yeah. Interesting. Can we take a step up? I want to understand the power of what you're talking about, what you're helping organizations navigate in terms of the dynamics of AI/ML networking. What are some of the business outcomes or impacts that, together, Dell and Broadcom are helping customers achieve, whether it's a hospital or a financial services organization or a manufacturer? I'd love any examples of real-world use cases where the business impact is dramatic. Yeah, so, go ahead. Take it. So I'll take a few examples. Everybody loves ChatGPT and that kind of thing, right? But you can imagine similar things in other businesses, where people may want to build their own training model based on, say, patient data, right?
And then doctors want to ask questions about, say, some common symptoms based on that. So they may want to build their own dedicated, secure cluster, and they would like to keep their cost of managing it pretty much zero, right? And not require too much knowledge about how to deploy this. So that's where we come in, right? You provide them the tools, you make it easy for them, and let them deploy the application that is best for them. And let them focus on their core competencies. Exactly, exactly. And that's where these large language model solutions come in, where they learn, sending all this data through, like ChatGPT sending the whole internet of data through in a matter of days, being able to learn specifically for medical or traffic or air traffic control. You don't have to worry about learning the whole world; you're going to be the best at this area, whatever your area is, and be able to do that and be very price-competitive, yeah. So I've got a go-to-market strategy question for you, from Dell's perspective. Typically, if we think about this, we think about what Lisa was talking about: let's talk about the outcomes and the cool things. You get down to the infrastructure layer, and there's a saying: nobody cares. We care. But somebody has to. Somebody has to; okay, maybe you don't care, but somebody has to, otherwise none of it's going to work. But when you're working with an end-user environment, whether it's a service provider or an actual end user in their own data center, how does this conversation of networking come up? Is it part of a package? Is the typical engagement: we're going to stand up an environment with a certain quantity of capability, and it will include N number of Dell servers with whatever components are inside, and this is going to be the fabric that attaches all of them, and then the entire thing goes in?
Is that more the conversation? You're not going in specifically having networking questions all the time. So, it depends on what the customer's asking for. We do have plenty of very specific pure-play networking solutions, but that's not what you're talking about. You're talking about: hey, I have a problem to solve. Dell, help me. We come in and we specialize in compute, in storage, our PowerStore line, and in all the connectivity, all the cables and optics. We bring the whole thing to bear. We come in as specialists in that, because there are so many ways you can solve problems, right? Yeah, one of your peers at Broadcom was on earlier today, and she mentioned a company that actually works with Dell, Scalers AI, and she was saying that standing up the cluster took longer than the training of the model in this one instance. People take for granted that that process is going to be simple. We definitely have services that specialize in how to get that right, so that it's productive immediately. You don't want to have, well, just go figure it out, and three months later, I still don't get how to connect and wire this. No, you want to bring in the specialists. They know what they're doing, they get it up immediately, and then things come along. So, very important. Sorry, were you talking to me? Yeah, no, no. You kind of led me in a direction there. Really, one of the things that Dell specializes in is kind of the open flavor. Okay, so open networking, open AI. We are not only offering one GPU solution. We work closely with NVIDIA for their GPUs. We work closely with AMD for the MI300X. We work closely with Intel for the Gaudi line. We want to be able to have a full array, so that when customers come, let's face it, every supplier goes through, oh, I'm out of that.
Well, we've got several other options for you. If timeline is your number one criterion, we're there and we're ready. And the networking and that infrastructure stay the same; it's just the server and the GPU, maybe a tweak there. So that's something where we really excel as well. And you're cool with that, because I think those are all Broadcom customers that you just mentioned. Not only that, that's true, but I was also going to add, we follow the same spirit. What Jim mentioned is very important: for our networking, we don't tie ourselves to one specific GPU architecture. We can work with any accelerator, and that's why, with the Linux community, there's a whole infrastructure being created which allows NICs to work with any peer device and directly transfer data in and out of peer memory. That way, while the GPU computes on large sets of data, the NICs are moving the data. So that open ecosystem really works, and then you really have an end-to-end networking solution, without worrying about a specific architecture. Without worrying about a specific architecture, how do you help customers, and this is a marketing term that I always love to unpack and understand, future-proof? How do you actually deliver that? As AI and ML networking technologies and the landscape continue to evolve, how do Dell and Broadcom together help organizations future-proof their environments, so that they can continue to deliver at the speed they need to, and probably faster as the days go by? Go ahead. Go ahead. We're both... Tag, you're it. We're both excited. I want both of your answers. Jim, we'll start with you. All right, so this is where the discussion about Ethernet comes right back at us. Here's where having an infrastructure that is not tied to one specific vendor is critical. If you have networking that you can only get from one player, then you're locked in, right?
And so with Ethernet, and this is where Dell is super excited, you can buy Dell today, and if you need to buy from somebody else tomorrow to plug into ours, well, Ethernet is Ethernet. And we relish that fact. It also keeps us competitive, keeps us humble, because we know that we have to continue to excel at what we do and provide excellent service. So you can't get complacent. Go ahead. No, I think that's good. Tag, you're it. So one of the things Dell and Broadcom have been talking a lot about is building reference architectures for end customers. And depending on different customer needs, we can say: hey, if your model is not going to be bigger than this, a single-stage fabric is good for you; a two-stage fabric for your future, and this is how it will look. If you show them that path, that they can deploy something now and we can help them scale, that really helps them. And having that reference architecture, a proven architecture, really makes them confident in our solution. Confidence, key word. Sorry, go ahead. Yeah, no, no, I was just going to say, I know we're getting close to wrapping, but I've got one quick one. And you have as much time as you want. Of course. Hey, I'm in the driver's seat here. Exactly, exactly. So a year from now, we're back here, we're back together. What would you like to be talking about that hasn't reached maturity yet today? Where would you like us to be a year from now? Or a crazy prediction for something that we have no idea we'd be talking about a year from now. What do you think? I actually would like... One year? One year from now, I would like to talk about how far we have progressed on the enhancements I was talking about, and be more concrete about their benefits. Okay. What do you think? I'd piggyback on that. I mean, this year it's all about 400 gig. Next year it will be about 800 gig.
Really deployed, in vast numbers. That's what we'll be talking about, guaranteed. So I look forward to that discussion. All right, well, we appreciate your time. Thank you. An insightful, educational discussion with us today. The show and tell was awesome. Thank you. And thanks for capturing that, because we did forget to tell you that we were going to do that. But guys, you have to come back, because I think we're just scratching the surface here in this fast-moving environment. For what Dell and Broadcom are doing together, "better together" is not a strong enough statement about what I see from the ecosystem. It's not one plus one; it's one times ten, kind of exponential, yeah. New math. I like it. New math. Gentlemen, thank you again for the time. Thank you. Or maybe sooner. All right, for our guests and for Dave Nicholson, I'm Lisa Martin. You've been watching theCUBE's live coverage of Supercomputing 23. We're going to be back with our next guest after a short break. So we'll see you then. Walk the dog, get some coffee, get some water. We'll see you soon.