Welcome back everyone to theCUBE's coverage here in Las Vegas for AWS re:Invent. I'm John Furrier, your host, with Dave Vellante. We had full team coverage, articles hitting, an exclusive with Adam Selipsky before the show started. I think I nailed the keynote on that one, Dave. Amazing announcements, what a great show. It's kind of winding down, and it's really been an inflection point as Gen AI, infrastructure, and the new stack emerge. We're going to break that down with Bill Vass, Cube alumni and friend of theCUBE. Very knowledgeable, VP of engineering at AWS. He's got his hands in all the good stuff. Welcome back to theCUBE. Good to see you.

Always a pleasure. I'm happy to be here. I'm glad to see you again, John.

This is a tradition. We'll have to get you and Swami back to kind of close out theCUBE, because it's like leaning back after the game: hey, how'd that home run go? Ah, I ran around the bases, almost missed second, but we still touched all the bases. Congratulations. Let's dig into it. A lot to talk about. A couple of key things going on. The Nvidia relationship on stage with Adam and Jensen was a huge point. We were covering that earlier — AWS as a big customer, and also as a place where Amazon has everything. You don't need to go anywhere else. Where else can you go for whatever you need? You've got everything there.

Yeah, I don't know if you saw the announcement of the Qualcomm instances on AWS. We even have Apple instances on AWS. We have Graviton. We have AMD. We have Intel. For the last 13 years — we were the first to put Nvidia on the cloud, right? And we continue to be the leader in that, and certainly in volume of Nvidia on the cloud, by a long shot. So it's exciting to see that.
And now I'm really excited about the new L40S instances, because that's been a big area of mine — digital twins and a lot of the simulation, training, and autonomous vehicle components. What they're doing with Omniverse is going to be the future for that in a lot of ways. So we're integrating it with TwinMaker — we announced that today, so you'll see that coming as well. And we use it in our fulfillment centers for product training.

Explain how that manifests into customer value.

As far as Omniverse, or as far as...

Yeah, the TwinMaker.

The TwinMaker, yeah. So we've got thousands of customers building digital twins. And what we've been building up over the years is sort of this virtuous cycle — we always like flywheels at Amazon.

Everybody's in the flywheel right now.

Exactly. So from my perspective, we start with connect and collect. Everything I've done since I've been here has been focused first on connect and collect: Snowball, so you can send data in; ETL and data transmission; DataSync; Storage Gateway; Kinesis streams; Kafka streams — all of those things to get data in. And then IoT: connect and collect. And now Kuiper, Private 5G, all those things — connect and collect. Because to build these dense models, you need a lot, a lot, a lot of data. Then you've got to be able to manage that data. So you've got S3, you've got the new accelerated S3, which is very exciting. You've got FSx — we even have OpenZFS under FSx as well — all these ways to manage exabytes and, in the future, zettabytes of data. And then you're going to need to generate synthetic data.
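The "connect and collect" step he describes starts with getting records onto a stream. As a rough illustration only — the device schema here is hypothetical, and this stops short of any network call — here is a minimal Python sketch of shaping IoT readings into Kinesis `PutRecords`-style batches (500 records per call is Kinesis's documented batch cap):

```python
import json

MAX_RECORDS_PER_CALL = 500  # Kinesis PutRecords accepts at most 500 records per call

def to_kinesis_batches(readings, partition_key_field="device_id"):
    """Shape readings into PutRecords-style batches.

    Each entry carries Data (bytes) and a PartitionKey, matching the shape
    boto3's kinesis put_records call expects. The field names and reading
    schema are hypothetical stand-ins.
    """
    records = [
        {"Data": json.dumps(r).encode("utf-8"),
         "PartitionKey": str(r[partition_key_field])}
        for r in readings
    ]
    # Chunk into batches no larger than the per-call limit.
    return [records[i:i + MAX_RECORDS_PER_CALL]
            for i in range(0, len(records), MAX_RECORDS_PER_CALL)]

# 1200 hypothetical sensor readings from 8 devices.
readings = [{"device_id": i % 8, "temp_c": 20.0 + i * 0.01} for i in range(1200)]
batches = to_kinesis_batches(readings)
print(len(batches), [len(b) for b in batches])  # 3 batches: 500, 500, 200
```

A real producer would then loop over the batches, sending each with a boto3 Kinesis client; the point of the sketch is just the batching shape that "connect and collect" implies at scale.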
And that's in its early stages. For example, when you're training these models — if you were training a model to dynamically adapt your logistics systems to all conditions, you can't record your logistics system and break it over and over again. You want to synthetically generate breakages, right? Or if you're training it to not hit a person with a car, you don't want people running in front of cars to do that — but you need to do it hundreds of millions of times. So you generate those things synthetically, and those come together. All of that then allows you to build software-defined everything, and then you can build a digital twin of everything.

And now we're getting to the compute, and you see this with Omniverse, where for the first time in history we've got full fidelity in a digital twin. Omniverse, if you don't know much about it, does full 100% ray tracing, something no other rendering engine does, and it does that to give you full fidelity. That's why you need the L40S — to support that. And they're also doing, for the first time, full physics. Now, in the gaming technology we do, we cheat a lot — not cheating the gamer, but we cheat: we don't render things we don't see, we do a lot of rasterizing in addition to ray tracing, we do make-believe physics in a lot of cases. But when you're doing autonomous driving, you can't make any of that up. It's got to be real. So you've got to have your collected real data and your synthetic data in the simulation. And once you have all that, you can apply machine learning.

These new LLMs are going to be able to do things like generate a lot of that synthetic data. They're not quite there yet. In one area, we really need them to generate 3D data, and I must see a new paper every week getting closer and closer to that. 2D data — 2D images — they've got that down. You saw it in Titan, you saw it in Stable Diffusion, you see it in others, right?
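The synthetic-breakage idea above can be sketched very simply: take a recorded stream of nominal events and inject labeled failures you could never afford to cause in the real system. This is a toy Python illustration — the event schema and failure rate are hypothetical, not anything from AWS:

```python
import random

def inject_failures(events, failure_rate=0.02, seed=0):
    """Return a copy of a nominal event stream with synthetic breakages.

    You can't break a live logistics system millions of times, so we
    corrupt recorded nominal events to create labeled failure examples.
    A fixed seed keeps the synthetic dataset reproducible.
    """
    rng = random.Random(seed)
    synthetic = []
    for event in events:
        if rng.random() < failure_rate:
            synthetic.append(dict(event, status="failed", label=1))  # injected breakage
        else:
            synthetic.append(dict(event, label=0))                   # nominal
    return synthetic

# Nominal events "recorded" from the real system (hypothetical schema).
nominal = [{"package_id": i, "status": "ok"} for i in range(10_000)]
training_set = inject_failures(nominal, failure_rate=0.05)
failures = sum(e["label"] for e in training_set)
print(f"{failures} synthetic failures out of {len(training_set)} events")
```

In practice the "corruption" would be a physics-grade simulation (a conveyor jam, a pedestrian crossing), not a status flip, but the shape of the data pipeline — nominal stream in, labeled failure-rich stream out — is the same.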
But for 3D data, we've just started seeing the beginnings, and you'll need it, because the world is 3D. If you want to simulate the world in full fidelity, you need 3D. And then after you've done that, you push to production — whether that production is in the cloud or in your car or wherever. And that creates more data, and it all starts over again, and it just gets better and better. You optimize your enterprise better and better; you optimize everything better and better. So everything we've been doing for the last 10 years or so has been building up all of the parts. I wish we could have done it right out of the gate, but the technology wasn't there. Elastic Fabric Adapter wasn't there for fast enough interconnects for memory sharing at the time. We didn't have Graviton. We didn't have Trainium. We didn't have the level of GPUs we have today. So all of that is coming together. By the way, if you want more, go watch my emerging technology and innovation talk — that's what it was all about.

We'll definitely link to that. Let's get back to that flywheel, because that brings up a great conversation around the future operating environment. We've been kind of riffing and trying to bring metaphors in. Like, we interviewed Andy Bechtolsheim in 2018 — the Rembrandt of motherboards, as Pat Gelsinger would say.

Yeah, he really was. I worked with him at Sun. I don't know if you ever saw the heat sink on the Sun chip — he built the Sun logo into the fins. It was gorgeous.

He's a G. And by the way, a great fan of theCUBE as well. So we were riffing, and he said the constraint was the board — the size of the board. That was the real estate. Boards get bigger. Cloud's different; the constraints are power. If you're trying to put GPUs on the cloud, what are some of the constraints you've got to work with? You're talking about a system now.
You're talking about data — a lot of data moving: synthetic data, real data, harmonizing data, bringing it through. What's the pipeline? Is it auto-built? Is it dynamic? Is there policy? All of this is complicated. It sounds like you've got to lay it all out with components and interconnects, and you need software to run it.

Yeah. Well, from our foundation, we've been software-defined infrastructure in a lot of ways, right? Software-defined compute, software-defined storage, software-defined networking. That gives you the flexibility, and that concept of software-defined infrastructure is now moving into the enterprise and everywhere else — cars and everything else. So that's one part: the assumption that hardware is failing all the time and the software is routing around the failures.

The second thing, which I'm very involved in, is our transition to renewable energy. We are by far the largest purchaser of renewable energy — about 8x larger than the next in line. And that's not just for our data centers. That's for our fulfillment centers, our logistics, everything else you see from us — the Rivian vans that are electric, a bunch of other suppliers bringing in vans that are electric and hydrogen-powered, moving green hydrogen to our fulfillment centers and our data centers, and making large capital investments in wind farms and solar. So by 2025 we'll be 100% renewable energy for our operations, and by 2040 we'll be net zero. And that's Amazon and AWS combined.

Together, yeah. And that's a big goal.

It is. For a company like us, it's just massive. But if we can do it, anybody can do it. Nobody can say they're too big to do it.

Well — you've got to build the muscles to get going. You've got to volunteer and take on some things.
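That assumption — hardware fails constantly and software routes around it — is the heart of software-defined infrastructure. As a toy sketch (the replica names and health map are invented; real systems use heartbeats and health checks feeding a control plane), the routing decision reduces to something like:

```python
def route_request(replicas, health):
    """Pick the first healthy replica; software routes around failed hardware.

    'replicas' is an ordered preference list; 'health' maps replica -> bool,
    standing in for live heartbeat/health-check state.
    """
    for replica in replicas:
        if health.get(replica, False):
            return replica
    raise RuntimeError("no healthy replica available")

replicas = ["node-a", "node-b", "node-c"]
health = {"node-a": False, "node-b": True, "node-c": True}  # node-a has failed
print(route_request(replicas, health))  # skips the failed node-a -> node-b
```

The real machinery (consensus, partial failures, draining) is vastly more involved, but the design stance is exactly this: failure is an input to the router, not an exception.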
But back to the constraints. If you look at the cloud — now I'm thinking architecturally — the new system to support all this data, the Gen AI and the new innovations: what are the constraints you're working with to get that flywheel going, to build an enterprise into a full digital business?

Well, a lot of it for us is just how fast we can build out the infrastructure. We build a tremendous amount of infrastructure every day. So how fast can we build out the infrastructure — that's one piece. How can we get enough energy — that's another piece. And then, of course, our supply chains. Our supply chains are pretty robust, especially since we started building our own silicon seven years ago. You realize it's been seven years we've been building our own silicon?

We've been tracking that. I was at a re:Play party with Andy, and someone comes up to me: hey, you're the Cube guy — look at this. Holds his hand out. It was the first Annapurna chip. I said, can I take a picture? No, don't take a picture.

Yeah, yeah. You saw Andy's tweet — a lot of people were skeptical about what we were doing. Not us. All our internal services have moved to Graviton: huge savings in power, huge savings in cost, and massive increases in performance. Trainium is going to change the game in language model training and inference and execution. So those are really exciting things to see.

There's no compression algorithm for experience — somebody once said that. Does that apply to ARM-based chips as well? We've seen a lot of companies announcing that stuff.

It definitely does, because you've got to have the compilers, you've got to have the drivers, you've got to have all the different components there for ARM, and it takes time to build all that up. It doesn't happen overnight, right?
That's why Nvidia's got an advantage right now.

Yeah, it does as well. They've been in ARM for a very long time too. They've got their Grace Hopper chip, which is an amazing chip. Jensen's done a fantastic job focusing on accelerated computing, and their GPUs are very impressive. They're an impressive company.

He said 32 Grace Hoppers are connected as one unit, with Nitro making one giant virtual instance. I think they're using NVLink at one terabyte per second. This is what you were saying a little bit about the interconnects. They're becoming — I won't say clusters, but units. They're systems. Not just chips; it's what's around them.

Yeah, it's all together, right? You've got to have the network, you've got to have the interconnect, you've got to have the compute, you've got to have the storage — all those things. So it's really fantastic to see. And again, it's getting to the point where you can start doing full-fidelity digital twins, which is an amazing thing to say. These large language models are going to really accelerate everything in the 2D and 3D space, along with the textual space as well. And they're going to be so common — like spell check on your laptop. They're going to be part of everything you do every day.

And where's the impact going to be out of the gate? Take me through the initial impact of the full-fidelity digital twin.

So now, Swami might disagree with me — actually, I don't think he would. My feeling on a lot of this digital twin space and a lot of this generative AI space is that the biggest impacts first are going to be in anything that's visual: designing cars, buildings, bridges, movies, games, those things.
And the reason, John, is that if you say "make me a dragon flying over a castle," you know immediately whether that's what you want or not. If you say "read this 60-page legal document and tell me if I should sign the contract" — do you have to check that? Right.

You need a compiler for that.

Yeah, exactly. So I think there's interaction between engineers. Again, if you watch my innovation talk — one of the things we did, and I'm very excited about this: I asked Stable Diffusion to design me a luxury sedan, and it did. That's all I said — a luxury sedan. Then I said, okay, now show me that 360 degrees around in both directions, and image that. Then my team took that and ran it through a neural radiance field, which converts it to 3D. You can't yet do this well — again, this is future work we're working on. Then we converted that to a 3D point cloud. It's a little grainy still, but it's getting better; in the last week it got even better. This is changing all the time. And then we had the machine estimate the drag coefficient on that shape — without me doing a high-performance computing run — which it did instantaneously. Then I told it to change the design of the car to reduce my drag coefficient, and it did. So you can imagine this kind of action.

In the old days of HPC, that would be like 800,000 core-hours.

Yes. Basically, the cost of doing that would have been astronomical. Now, the caveat is that the ML estimate of the drag coefficient is about 98 to 99% as accurate as the HPC run. So don't fly a plane designed by a large language model without doing the HPC first.
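The "instantaneous estimate instead of an HPC run" pattern is a surrogate model: fit a cheap function to a handful of expensive, solver-labeled samples, then query it for free. A minimal Python sketch under toy assumptions — a single invented shape parameter, made-up solver outputs, and a closed-form linear fit standing in for what would really be a neural net over a 3D point cloud:

```python
def fit_linear(xs, ys):
    """Closed-form least-squares fit for y ~ a*x + b (one feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

# "HPC-labeled" training pairs: shape parameter -> drag coefficient.
# Both the feature and the values are hypothetical stand-ins for real
# solver runs that would each cost hours of core time.
shape_param = [0.1, 0.2, 0.3, 0.4, 0.5]
drag_cd     = [0.25, 0.27, 0.29, 0.31, 0.33]
a, b = fit_linear(shape_param, drag_cd)

estimate = a * 0.35 + b   # instantaneous surrogate answer for a new design
print(round(estimate, 3))
```

The transcript's caveat lives here too: the surrogate is only as trustworthy as its training data, which is why the high-fidelity run still has the final word on anything safety-critical.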
But when you talk about these design cycles you can go through now — like, do the structural engineering on that: the ML does it and says it looks good to me, and okay, change this, change this, okay, that's the way I want it. Then you go home at night, the HPC runs overnight, and it says yes, that's correct, or no, it wasn't. If it says no, you feed that back into the model, and the next time the model gets it right.

So what would Swami not agree with there?

Well, Swami's team is focused a lot more on chatbots and the textual side of things, and I focus a lot more on the visual side of things.

You roll right into his narrative. He's in the present tense; you're in the future.

Yeah, we'll get there. We're working with his team — his team's done some amazing things in this space. Everything I talked about was done with his team, so it isn't only our team's work.

Just the things you're saying — and to kind of zoom out as we wrap up — the advancements in performance: even six months ago, Stability AI image rendering wasn't really possible like this, even with artists. And now you're talking about 3D. Scope the order of magnitude of how much has changed in just six months.

It's just crazy, and that's another thing I talk about in my presentation. In the early days of the internet getting popular — which I know you guys were there for too — remember how excited we were when we'd see a TV ad with a URL in it? Remember that? That would seem quaint now. We thought that was going fast, but this is going five times faster than that.
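That day/night loop — ML answers instantly, HPC verifies overnight, disagreements flow back into the model — is essentially active learning against an expensive oracle. A toy Python sketch, with an invented stand-in function playing the HPC solver and a nearest-neighbor lookup playing the surrogate:

```python
def hpc_solver(x):
    # Stand-in for the overnight high-fidelity run (ground truth).
    # The formula is purely hypothetical.
    return 0.23 + 0.2 * x + 0.05 * x * x

class Surrogate:
    """Tiny nearest-neighbor surrogate: answers instantly from verified runs."""
    def __init__(self):
        self.memory = []                    # (design_param, verified_result) pairs
    def predict(self, x):
        if not self.memory:
            return 0.0                      # knows nothing yet
        nearest = min(self.memory, key=lambda p: abs(p[0] - x))
        return nearest[1]
    def absorb(self, x, y):
        self.memory.append((x, y))          # "feed that back into the model"

model = Surrogate()
for day in range(3):                        # each "day": design fast, verify overnight
    x = 0.1 * (day + 1)                     # today's candidate design
    guess = model.predict(x)                # instant ML answer during the day
    truth = hpc_solver(x)                   # overnight HPC says yes or no
    if abs(guess - truth) > 0.01:           # disagreement -> model learns the correction
        model.absorb(x, truth)
print(len(model.memory))                    # corrections absorbed so far
```

Real surrogates retrain on the corrected samples rather than memorizing them, but the control flow — cheap predict, expensive verify, feedback on mismatch — is the loop described above.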
Literally — and I can show you on my phone — from three weeks ago, when I pulled my presentation together, to today, the resolution of the 3D model I talked about has gone up by about 10x.

Wow. An order of magnitude in three weeks.

In three weeks. Just from the teams working on it.

So we're going to see a lot between now and next year.

Yeah, I think this area is going to really transform.

I know you've got to go — we're super happy to have you on theCUBE out of your busy schedule. For the entrepreneurs out there who want to start a company, and the young engineers coming in saying, hey, I'm going to jump on this wave — generative AI, or whatever we're calling it — and this inflection point we're seeing in performance: what areas would you recommend entrepreneurs solve problems in? What are good white spaces or territory to take?

Yeah. So as I mentioned when we met right before re:Invent, one is synthetic data generation that can be fed back into these models. That's an area that needs a lot of evolution and is going to really change the game, because you need billions of defects to train a model, and in the real world you don't have billions of defects. You need to drive 15 million miles of synthetic driving in the cloud — so you've got to generate 15 million miles of a 3D virtual world to have a car learn to do that. You have to do billions of grasp plans on a robot with components. Generating those, and training and guiding the generation of that, is a big deal.

The other areas that need focus are what we call explainability — why did the model do that? — and then attribution, which is another huge thing in this intellectual-property world.
So if I said "draw a dragon," and it took the dragon from your art, and then I go sell it, you're not going to be happy about that. What we would love to be able to do is say "draw a dragon" and then send John money — John gets some of that.

DRM. Remember that from the web days.

There are other things we've run into. This is one of my favorites — it's called catastrophic forgetting. That's when you overtrain a model in a specific domain and it starts to override its ability to be a more general, human-chatbot kind of interaction. It's almost like a professor you can't talk to because they're so heady in their subject, right?

It's like our language model — it only speaks Cube.

Yeah. And then hallucinations, and dealing with those. So models around all of those — chat models and synthetic data generation — those are the gaps everybody's got to fill. And then last but definitely not least: taking these models that are 200 billion parameters and making them do the same thing with, like, 10 billion parameters, so we can fit them on a much smaller computer.

Awesome.

Those would be big areas, and that's a huge mathematical problem. In the end, this is just math — remember that. Math and science and physics are all part of the curriculum. Math and data and compute.

theCUBE's got the flywheel going on here. We've got tons of content — a master class and a look at the future. Great to have you on, VP of engineering for emerging tech. The progress has been great; congratulations on all your success.

Well, we didn't talk about quantum at all — that's another thing we could spend a lot of time on.

We'll definitely get you back in the studio next month.
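The "200 billion parameters down to 10 billion" goal is usually attacked with knowledge distillation: train a small student to match the big teacher's softened output distribution rather than hard labels. A minimal Python sketch of the core loss — the logits are invented, and the temperature of 2.0 is just a common illustrative choice, not anything from the interview:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature softens them."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened distribution.

    A student minimizing this over many inputs learns to imitate the big
    model's behavior with far fewer parameters.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

teacher      = [4.0, 1.0, 0.2]  # hypothetical logits from the big model
good_student = [3.9, 1.1, 0.1]  # nearly matches the teacher
bad_student  = [0.2, 1.0, 4.0]  # disagrees strongly
print(distillation_loss(teacher, good_student) <
      distillation_loss(teacher, bad_student))
```

Real distillation runs this loss through gradient descent over a training corpus (often mixed with the ordinary hard-label loss); the sketch just shows why imitating the teacher scores better than contradicting it.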
We'll get you on the quantum — there's another topic we want to unpack.

Oh, great.

Well, thank you for your time.

All right, Cube coverage continues. Back to Palo Alto. We'll be back right after this short break.