Good afternoon, hardware nerds, and welcome back to the Mile High City. We're here at Supercomputing 2023. My name's Savannah Peterson, joined by my beautiful co-host, Lisa Martin. Lisa, it is such a joy to share the show with you.

Isn't it a joy to share the show with you too? Yeah, I feel very lucky that we're getting to hear from the top minds in supercomputing together. It's awesome.

Yeah, and you know what? Your energy, the energy on the floor, it's all still up, just like our fabulous guests. Please welcome to the show Tommy from TACC, and John Furrier, the co-founder of theCUBE and SiliconANGLE Media.

I'm glad I could be here for this segment.

Yeah, we are too. Welcome to your show. Really appreciate you showing up. Tommy, this has got to be a very exciting show for you. You have one of the biggest, coolest booths on the floor. How's it been going so far?

Ah, the show's been great. You know, it's always entertaining to see the whole community face to face, chat with all of my colleagues, and get to see all the new faces, the new technologies, everything that's out there on the floor. It's pretty exciting. Every year there's something new, and I learn every time. And my understanding is we had the largest attendance ever for Supercomputing.

Wow.

They surpassed even the one from four years ago. That is what I heard.

Hey, that's actually news to me.

Yeah, it's over 10,000 attendees, so we're back to where we were several years ago. And yeah, it's been exciting.

It is exciting, and speaking of exciting, there are two very big announcements I want to make sure that we unpack. Let's talk about Stampede first.

Okay. Big new supercomputing news for y'all. So Stampede 3 will be a follow-on to the history we've had with Stampede 1 and Stampede 2. Our community still requires a lot of CPU computation. It needs to be able to leverage CPUs, and it hasn't been able to take advantage of some of the GPU technology as of yet. So part of our strategy on Stampede 3 is to deploy an Intel-based system; we actually have other systems to support other types of research activities. With Stampede 3, we're going to be deploying at least 560 new Intel Sapphire Rapids nodes. These are Dell servers, four nodes in a 2U chassis, direct liquid cooled. So all liquid cooling, everything, 60 kilowatts per rack. These are going to be a pretty dense solution. We're pretty excited, because the performance of Sapphire Rapids has been great for our application space, especially those codes and applications that aren't suited for GPUs or accelerators as of yet.

And that was all thanks to a $10 million award from the National Science Foundation, correct?

That's correct; it's a follow-on. In fact, Stampede 1 and Stampede 2 were both NSF awards, and Stampede 3 is the follow-on to those two awards. And we're going to continue to build on that plan.

I want millions of dollars in awards. I know. It says a lot about the work you're doing there. Does it all go to hardware?

Mostly it goes to hardware, but we do get some funding to operate, maintain, and support the system for the community and for the researchers that use it. So yeah, it's exciting. It's many generations now. We're going to operate the system for five years, so it's going to have a pretty long life, and we're excited about what it's going to offer. We will also have some of the new Intel Ponte Vecchio GPUs as kind of an experiment in that system. So we'll have 80 new Ponte Vecchio GPUs, similar to what's in Argonne's Aurora system. And again, we want to explore that space and see how well our users can leverage and utilize that technology.
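As a quick aside for readers, the density figures Tommy quotes can be sanity-checked with simple arithmetic. In this sketch, only the node count, the 2U/4-node chassis, and the 60 kW per rack come from the interview; the rack height and per-node power draw are illustrative assumptions.

```python
# Back-of-the-envelope rack-density check for a Stampede 3-class system.
# Only the 560-node count, 2U/4-node chassis, and 60 kW/rack figures come
# from the interview; rack height and per-node power are assumptions.

TOTAL_NODES = 560          # Sapphire Rapids nodes quoted in the interview
NODES_PER_CHASSIS = 4      # four nodes per 2U chassis
CHASSIS_U = 2
RACK_U = 42                # assumed standard rack height
RACK_POWER_KW = 60         # per-rack budget quoted in the interview
NODE_POWER_KW = 1.0        # assumed draw per dual-socket node (illustrative)

space_limit = (RACK_U // CHASSIS_U) * NODES_PER_CHASSIS   # 84 nodes fit physically
power_limit = int(RACK_POWER_KW // NODE_POWER_KW)         # 60 nodes fit the power budget
nodes_per_rack = min(space_limit, power_limit)
racks = -(-TOTAL_NODES // nodes_per_rack)                 # ceiling division

print(f"space-limited: {space_limit} nodes, power-limited: {power_limit} nodes")
print(f"-> ~{nodes_per_rack} nodes/rack, ~{racks} racks for {TOTAL_NODES} nodes")
```

Under these assumptions the racks are power-limited rather than space-limited, which is exactly why direct liquid cooling matters for a dense deployment like this.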
So you've given us a peek under the hood of some exciting announcements. TACC, for anyone that doesn't know, is the Texas Advanced Computing Center, based out of UT Austin, I believe.

Yeah, that's right. The University of Texas at Austin.

So all of the technological advancements under the hood that you just talked about, how do they really drive the mission and the vision of TACC forward?

Our mission really is to support scientific research at the University of Texas, in the state of Texas, and now, with NSF funding, across the entire country.

And in fairness, across the entire world. You were mentioning to our last guest that you have servers supporting researchers down in South Africa.

That's right. We have researchers that access our systems from pretty much around the world, because they're collaborating with US researchers, and they're allowed to work with them, get access to our systems, and utilize them to do their research. A perfect example is the Large Hadron Collider. They do some processing on our systems, like Frontera and Stampede 2, currently.

So cool.

So we're pretty excited about being able to do that in the future.

Tommy, we had you guys on theCUBE last year talking about the support you had for natural disasters, COVID, oil spills, weather, a lot of high-end, high-performance computing. This year we're seeing the AI wave. We didn't really drill into it hard enough last year, but this year, with the ecosystem growing, AI is the lift. You guys were at the Dell Community HPC event, the pre-event for this show, and you actually got a call-out from Armando from Dell. Good customer, good for that, good for Dell. But I want to call out something you guys were talking about, because we've been using it all week on the show. Dan from TACC said: AI vindicates the HPC way. AI hardware will dominate. AI needs interconnects. Hyperscalers will dominate the trend. Really terse, but right on point; those are the market dynamics going on right now. So you have chips and cloud coming together, and that's changing the hardware and software market, because the complexity of the workloads that AI is bringing is going to change the game. What's your reaction to that? Explain to our audience why this is important and why this is an inflection point for HPC.

Yeah, so with HPC over the years, the trend was that we used to build big-iron systems: big, giant, very complex, kind of single-purpose systems. And it's interesting, because then the x86 revolution came along, the gamers, all the desktops and everything, and HPC realized that, well, that's a much bigger market than us; we need to leverage that technology to be able to do our science and support our researchers. We see the same trend happening now with AI. In fact, we see them leveraging what we've already learned in HPC. They're like, oh, we've got to scale out, we've got to scale big, we've got to go to thousands and thousands of GPUs. And it's like, well, we need interconnects to be able to talk between all of these, and they'd better be high bandwidth. So yeah, these are things we've been doing for 30 years in HPC.

Do you feel like everyone is seeing you now? Is this almost a little bit of HPC's moment in the sun?

I hope so. It's not quite here yet, but I think it's coming.
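Tommy's point about needing high-bandwidth interconnects at GPU scale can be made concrete with a rough estimate. The sketch below uses the standard ring all-reduce communication volume, 2(N-1)/N times the gradient size; the model size, GPU count, and link speeds are illustrative assumptions, not figures from the interview.

```python
# Rough estimate of per-step gradient-sync time for data-parallel training,
# using the ring all-reduce volume formula: each GPU sends and receives
# about 2 * (N - 1) / N * S bytes, where S is the gradient size.
# All numbers below are illustrative assumptions.

def allreduce_seconds(gradient_bytes: float, n_gpus: int, link_gbps: float) -> float:
    """Idealized ring all-reduce time, ignoring latency and compute overlap."""
    volume = 2 * (n_gpus - 1) / n_gpus * gradient_bytes   # bytes moved per GPU
    link_bytes_per_s = link_gbps * 1e9 / 8                # Gbit/s -> bytes/s
    return volume / link_bytes_per_s

grad = 7e9 * 2           # e.g. a 7B-parameter model with 2-byte (fp16) gradients
for gbps in (100, 400):  # assumed per-GPU link speeds in Gbit/s
    t = allreduce_seconds(grad, n_gpus=1024, link_gbps=gbps)
    print(f"{gbps} Gb/s links: ~{t:.2f} s per synchronization step")
```

Since this synchronization happens every training step, a 4x faster link translates almost directly into 4x less time stalled on communication, which is the HPC lesson Tommy is describing.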
The thing that I love about this, Savannah, and we talked about it in our opening a couple of days ago, is that there's an exuberance and enthusiasm from the folks who have been doing it. In the AI world, and HPC in particular, there's been grinding going on, years of grinding. We saw this in other markets, like video; now it's exploding. The folks who have been doing the work for years are like, we've been doing HPC for years, we've been doing this. But now it's going mainstream, so now everyone wants in.

Well, you don't even realize how much HPC impacts your general daily life, how much is going on behind the scenes as companies simulate and develop their products in computational space. The consumer doesn't necessarily see all of that activity. So it is good to see it broadening out. The university, academic, and DOE research communities have been doing HPC for a long time, but now companies are really leveraging it and deploying it. And with GPUs and the acceleration you can get, and the potential to deliver more performance at lower power, that's really going to be a key aspect.

One thing I want to get your thoughts on, and I love this, because we've also been talking about how HPC has been about precision and scale, but now with LLMs and models you can get broad breadth, and then depth and precision, at the same time. That opens up personalization, productivity, and the access you mentioned earlier, all happening at once. So how do you see that opening up from a user perspective? What kind of new access and enablement will you provide? Because right now it's the high-end stuff: storms, natural disasters. What are some of the use cases that might pop out of the woodwork, things we might not see now, around the corner?

What's going to be interesting from our point of view is that a lot of what we've supported in the past has been simulation: being able to model through equations what we think the physics of the world is. What we're going to see with AI is surrogate models, smaller approaches where you can very quickly iterate and optimize toward a specific answer and solution. Then, when you need real precision, you drill down on that answer, and you've already gotten part of the way there. I think we'll see a fair bit of that. I have to admit that in the research and academic space there's a lot of interest in AI and ML, but our challenge is that we don't have the data sets that people doing image training and analysis have, finding wolves in pictures and things like that, or doing traffic patterns. Now with LLMs and a lot of these other...

And synthetic data, by the way, is hot.

Yeah, exactly, generating synthetic data. Because the challenge is that to really get AI to work well, you've got to have good, clean data. And getting good, clean data is a challenge. People aren't talking about that enough.

The hygiene.

Yeah, the hygiene. You've got to wash up a little bit on that data front. And then you've got to keep it up, too, because if you're using this data to train your models and the data changes, you need to retrain the models. It's a whole adaptive technology that's really going to change how people perceive and interact with computers.
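The surrogate-model workflow Tommy describes (iterate cheaply on an approximation, then spend the expensive solver only near the answer) can be sketched in a few lines. Everything here is a toy illustration: the "expensive simulation" is a stand-in function, and the quadratic surrogate is an assumed choice, not TACC's actual method.

```python
import numpy as np

# Toy stand-in for an expensive physics simulation (assumed for illustration).
def expensive_sim(x: float) -> float:
    return (x - 2.7) ** 2 + 0.1 * np.sin(8 * x)

# 1. Sample the expensive model sparsely.
xs = np.linspace(0, 5, 8)
ys = np.array([expensive_sim(x) for x in xs])

# 2. Fit a cheap surrogate (here, a simple quadratic) to those few samples.
coeffs = np.polyfit(xs, ys, deg=2)
surrogate = np.poly1d(coeffs)

# 3. Optimize on the surrogate: thousands of evaluations cost almost nothing.
grid = np.linspace(0, 5, 10_000)
x_guess = grid[np.argmin(surrogate(grid))]

# 4. Drill down with the expensive model only in a small neighborhood.
local = np.linspace(x_guess - 0.3, x_guess + 0.3, 25)
x_best = local[np.argmin([expensive_sim(x) for x in local])]

print(f"surrogate minimum near x={x_guess:.3f}, refined to x={x_best:.3f}")
```

The point of the pattern is the budget split: a handful of expensive runs to train the surrogate, then precision spent only where the surrogate says the answer lives.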
And it's going to change how we fundamentally ask questions around what's possible. You already see it now: all of the chat bots, ChatGPT and everything. It really is going to be a game changer, and we hope to be at that forefront and support that activity. But again, it's a bit challenging to keep up with the technology.

Well, you had an announcement about this this week too, right? We've got Vista. It's a departure from your x86-based architecture, your first system with an Arm processor. Tell us about it.

Yeah, so we like to have a whole ecosystem of platforms at TACC. We do have an Intel-based platform; we have an AMD platform.

You play nice with everybody.

Yeah, yeah. We like to joke that we're an equal-opportunity data center. We'll take whatever vendors will give us.

I like that.

And, you know, we do explore a lot of new technologies and try to evaluate what's coming. Because of that, we know our users; one type of system may not work well for them, so we want to offer options and variety. AMD's got some good platforms that work really well on certain things, Intel works well on some things, and NVIDIA GPUs are good for some. We have some AMD MI250s, or, yeah, MI200-series parts, installed at TACC that we've been tinkering with. But anyway, Vista follows on from all of that technology. It'll be our very first large-scale Arm system.

Yeah, which is cool.

Which is interesting. The Arm ecosystem software, we really see, is finally getting to the point where it can support scientific computing.

You can feel their presence being much bigger here on the show floor, too.

Well, they've got the vector pieces in the processors now; they've got a lot of the things that HPC needs. They've been low power for a long time, but now, with the Grace processor, the number of cores and the performance it can provide are quite good. And like I said, a lot of our applications are still CPU oriented. So for Vista, we plan a pretty decent-sized portion that will be primarily CPU: two-socket Grace nodes over an NDR interconnect from NVIDIA. But we're also going to have 300 of the Grace Hopper nodes, and this is really a potential game changer in terms of technology. You have a single memory space with your Grace processor and your Hopper right there, and the amount of performance you can get and the efficiency it can provide is really one of the big aspects for us. So we expect a lot of our AI and ML workloads will start moving over to the Grace Hopper nodes. But beyond that, we've got a lot of scientific applications that we're already starting to port onto that platform, and we've been working very closely with the NVIDIA developers and teams to help us port and tune those applications onto Grace.

And getting those GPUs is like getting that gift you wanted. It's the shiny toy you want.

It's a good toy to have. People want it.
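The single-memory-space advantage Tommy highlights is easiest to appreciate as a data-movement estimate. This sketch compares staging a working set over a conventional host link versus a much faster coherent CPU-GPU link on a Grace Hopper-style node; both bandwidth figures and the working-set size are rough, assumed values for illustration, not measurements.

```python
# Rough data-movement cost comparison for a CPU<->GPU working set.
# All bandwidths and sizes are assumed, illustrative values.

WORKING_SET_GB = 64            # hypothetical data set staged each iteration
PCIE_GBPS = 64                 # assumed PCIe-class host link, GB/s
COHERENT_GBPS = 900            # assumed Grace Hopper-style CPU-GPU link, GB/s

def transfer_ms(size_gb: float, bw_gb_per_s: float) -> float:
    """Time in milliseconds to move size_gb over a link of bw_gb_per_s."""
    return size_gb / bw_gb_per_s * 1000.0

for name, bw in (("PCIe-class link", PCIE_GBPS),
                 ("coherent CPU-GPU link", COHERENT_GBPS)):
    print(f"{name}: ~{transfer_ms(WORKING_SET_GB, bw):.0f} ms "
          f"to move {WORKING_SET_GB} GB")

# With a single shared memory space, much of this staging can disappear
# entirely: CPU and GPU touch the same pages instead of copying them.
```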
And that brings up the question that's been in the hallways here: open standards, open connectivity, silicon diversity, and AI silicon platforms. You see the chips. How important are open standards going to be for this? Because you're starting to see a lot of innovation, people building new stuff on top of the pre-existing scale, especially in inference, for example.

Yeah, we are huge fans of open standards. In fact, almost every piece of software that we develop and write at TACC, we make open. We provide it to the community and let them get the source code. We understand that that's the best way for us to help everybody else advance. They don't have to reinvent the wheel; they don't have to figure it all out. So yeah, we're supportive of that community, and we certainly encourage it. The other thing is, a lot of times when we work with other academics and other research organizations, we know we're leading the way; we realize we're forging into new technology. And they basically want to be able to reproduce what we did without having to go through all the struggles. It's like, well, we'll figure some of this stuff out and get it working, and then you can replicate it, take what we've done and do it at your own site, or use our software tools and what's available in the open source community. So, big fans of that.

I'd love to double-click a little bit on the security issue. How does TACC balance the need to provide that open access to computing resources with the security and privacy considerations associated with very likely sensitive research data?

Well, this is a huge challenge. You know, I can build the absolutely most secure system in the world, but it would not be usable by any researcher. And of course, if you have an insecure research system, it's going to get hacked into, and all the researchers then won't be able to do their work anyway. So there's a delicate balance we have to maintain. We certainly try to implement all the best security practices. All of our systems require multifactor authentication to get into. You're starting to see this pretty much at banks and everywhere you go; you've got to have two-factor or some kind of multifactor authentication. It's got to be something you know, like your password, and something you have, like a token or some other kind of device. That's how we've tried to address the security issue. But even so, we don't want it to get in the way of the research, so we don't put anything on our compute nodes that secures them at the expense of performance. We want users to get 100% of the best performance. So we have to weigh deploying the systems against ensuring that they're secure, making sure data doesn't get lost and doesn't get exfiltrated. We only have a small enclave where we support protected data and HIPAA-type things, and that is expanding. We see it actually growing; more and more data is becoming protected and personalized, so we're moving into that realm. But I have to say it's a bit challenging on the big HPC general-purpose systems. When you've got thousands of users on there, ensuring data is not getting taken or stolen is tough; it's a challenge. I would love to have nice open systems that users could easily log into and transfer data in and out of, but they'd immediately get hacked, and Bitcoin miners would be running everywhere.

Yeah, they would. It would be a free-for-all. So.
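The "something you know plus something you have" factor Tommy describes is commonly implemented as a time-based one-time password (TOTP). Here is a minimal sketch of the RFC 6238 algorithm behind typical authenticator tokens; it is a generic illustration with a made-up secret, not TACC's actual authentication stack.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    """Minimal RFC 6238 TOTP: HMAC-SHA1 over the current 30-second time step."""
    key = base64.b32decode(secret_b32)
    counter = int(time.time()) // period             # code changes every `period` seconds
    msg = struct.pack(">Q", counter)                 # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Example with a made-up shared secret; a real deployment provisions one
# per user and verifies the code server-side alongside the password.
print(totp("JBSWY3DPEHPK3PXP"))
```

The design matches the balance Tommy describes: the check happens once at login, on the login path, so the compute nodes themselves carry no performance-sapping security layer.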
How big of a magnet for recruitment is TACC for UT?

It's a great magnet; it certainly is. We've been able to attract faculty and researchers to the campus by having these systems easily available to them. But even so, we don't allocate most of the time on the systems. The National Science Foundation has a committee that allocates the time and judges the research that gets to run on the system.

Oh, I didn't realize that.

So we do not pick the users that run on our systems.

Oh, interesting.

There's a whole committee that does that.

Oh, cool. Oh, wow. They do the allocations. You really are equal opportunity.

Yeah, well, it's peer reviewed. It's peer-reviewed science. Again, we want to make sure that the community is telling us who we need to support and who needs to run on our systems, and then we figure out how to make sure their applications work well in our environment.

Tommy, can I ask an infrastructure setup question? Because when I hear GPUs, I think: get as many as you can, stockpile them, and then you've got to build around them. One of the themes here is that the hardware is changing. You're seeing dedicated AI clusters emerge, which is not a new concept, having clusters, but these are dedicated to AI with the GPUs. What are you building around them? What's your vision? How do you see that unfolding, and how does that change the tech stack, now that you have foundation models, inference scaling out, and training being focused specifically on getting things up and running? It sounds like you sandbox, and then you scale.

This is going to be one of our challenges moving forward, because we want to make sure that we can leverage the hardware that's doing AI and ML to also support HPC and other research activities. So we're going to try to build our clusters in such a way that we can still do a lot of different things on them, rather than having them be AI-specialized only. But we're going to have to have some dedicated resources for LLMs and some of these things, so that they persist and stay available. So we're going to have to explore that. It's a new technology.

The enthusiasm around AI is high. How are you seeing experimentation and production workloads migrating? How do you see that evolving? Is it going faster than you thought? Is it slower? Are people taking their time? You mentioned the iteration.

Unfortunately, it's going a little slower than it probably could. And again, it's because a lot of our academic researchers need to get their research done. They have grants, they have deliverables, they need to get their science done, they want to get their papers out.

They've got a lot of people watching.

Yeah, and funding. And so it's hard, because they can't just go back and change the code that they've been using for years and years and port it to a new technology or a new thing; they want to make sure it's going to persist for a long time. So we do see a lot of adoption of CUDA, and certainly that's been great. I would say that, to my mind, I'd like it to be a more open standard, so that it would work across other technologies, especially with a lot of these new accelerators coming along. It's going to be imperative. We're going to have to have something.

Yeah, we've got to settle on some kind of standard language, so that users don't have to worry about supporting all these different languages.

So true.
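The standard layer across accelerators that Tommy and John are wishing for is roughly what array-portability shims approximate today. Below is a minimal sketch of the idea, writing the science code once against a backend-agnostic interface; the NumPy/CuPy fallback pattern is an assumed, common convention, not something TACC described.

```python
# Minimal portability shim: the same kernel runs on CPU (NumPy) or
# GPU (CuPy) without changing the science code. Assumed pattern for
# illustration; real projects often target standards like the Array API.

try:
    import cupy as xp          # GPU backend, if an accelerator is present
    BACKEND = "cupy (GPU)"
except ImportError:
    import numpy as xp         # CPU fallback with the same interface
    BACKEND = "numpy (CPU)"

def kinetic_energy(velocities, masses):
    """Backend-agnostic kernel: 0.5 * m * |v|^2 summed over particles."""
    speed_sq = (velocities ** 2).sum(axis=1)
    return 0.5 * (masses * speed_sq).sum()

v = xp.asarray([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
m = xp.asarray([1.0, 0.5])
print(f"{BACKEND}: E = {float(kinetic_energy(v, m))}")
```

This is exactly the property researchers with decade-old codes want: the kernel is written once, and the accelerator underneath can change without a rewrite.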
All right, final question for you, Tommy. I'm going to ask you to take your hat off, your TACC hat off, which might be a little tricky. But I'm curious, for you personally, I mean, you get to see so many cool research projects. Do you have a favorite, or a clutch one? You don't have to say favorite, because that term can be a little demoralizing to anyone who's not your favorite. But one that's really exciting you right now.

So one of the projects that's really kind of close to my heart, which we support and run at TACC, is a hypersonic flow algorithm that one research group is working on. My original background is aerospace engineering; I'm actually an aerospace engineer.

I'm a total AV nerd, for the record, so we're good.

And I specialized in computational fluid dynamics. So CFD has been, yeah, it's my...

Okay, so this is all kind of connecting the dots here on how you got the role you have now. All right, keep going.

So anyway, because of that, on this one hypersonics project I've seen them go from running on only 2,000 nodes on Frontera to being able to scale up and do problems I dreamed about 25, 30 years ago when I was a graduate student. The way that you can solve and analyze the problems now is just amazing, and they can process and run models that we had never dreamed about way back when I was doing my degree. That was a long time ago, but still.

Just a couple of weeks ago.

So anyway, that project, I think, is great. But even so, we've got all kinds of projects: design work, natural hazard support, storm surge modeling, hurricane modeling. I love all those projects. They always have something interesting from my point of view. So I like supporting them all.

The work that you and the researchers around you do can quite possibly save our lives, or at least extend them, and we are extremely grateful for that. Tommy, you are an absolutely stellar guest. Thank you so much for being here and for all that you do at TACC. John Furrier, thank you so much for creating this company so we could be sitting on this stage having this discourse. Lisa, fabulous as always; thank you for your insights. And thank all of you at home for tuning in to theCUBE. Here in Denver, my name's Savannah Peterson. We'll see you for our final two segments next.