Welcome back everyone to theCUBE's coverage here. On location at AWS re:Invent, AWS's annual user conference, our 11th year covering it. It's been a great journey. We are here up in the press area and the show's kicking off today. We've got Mindy Ferguson, Vice President of Streaming and Messaging. Data streaming, not video streaming. We do a lot of it. And of course, this is part of our SuperCloud 5 special edition out of Palo Alto. We've got guests coming on there. We have tons of content hitting. Check out SiliconANGLE.com and our special report for the battle of AI supremacy. Tons of research, opinion pieces and news. Mindy, thanks for coming on theCUBE. Great to see you and great chatting with you. Thank you so much for having me, John. I'm so excited to talk about all things data streaming, not video streaming, but data streaming and what we're hearing from customers this week. Well, it's a fun event. This is probably one of my favorite years because of all the buildup of generative AI. Last year, not a lot of generative AI discussions. Although Adam Selipsky did mention LLMs in my interview with him, but it wasn't part of the event. It really kind of kicked up with the whole generative AI craze. The generative AI movement is here. The consumers see it, they think it's magic, they see ChatGPT, they see the benefits. So when you see a user expectation experience change, it's a switch over. So it reminds me of the web and okay, I get it. It's nascent, it's early on. It's going to change, but one of the things that's coming out of our research and our coverage is, if you have data and you have data hygiene and you're handling your data properly, you're set up for this generative AI movement because the chips are getting faster. There's all kinds of technology in the new stack and the old stack. So if you have the data done right, it's a massive win for potentially changing the game for a company. This is kind of the big secret.
It's out in the public, but you got to get the data right. You do, John, you do. Data is absolutely the foundation for everything a business is sitting on. Like when you say we're riding on the shoulders of giants, we're really riding on the shoulders of data when we make decisions. And so companies have long understood that they need high quality data. What they're understanding right now though is that it's more than just accuracy. It's also the timeliness of that high quality data. And the timeliness then really adds into this generative AI space. I couldn't be more happy with how our data streaming services are set up to play in this gen AI world. Well, let's get into some news. You have some news that just shipped today starting to hit now. That's a big announcement coming out tonight. It's a big kickoff on the infrastructure side with Peter DeSantis. What news do you have for us? You know, customers have been talking to us about a few different themes. And we heard a lot about this at last year's re:Invent. We've shipped a ton of new things in our streaming services over the past few weeks. And today we're shipping out one that we're quite proud of because customers have asked us for Amazon MSK on Graviton 3. And today we're doing that. It gives us 24% greater compute over an M5 instance. So this is MSK on an M7g instance. 29% higher throughput. And get this, because I am so obsessed with our sustainability goals. I love this number. 60% better power efficiency over other similar EC2 instances. So I'm super excited about that. Teams worked really hard. We've had a lot of customers asking us to be able to run Amazon MSK on Graviton 3. And today's the day. On the numbers there. That's the Graviton. With the Graviton combo, that's the improvement on energy. That's correct. That's correct.
Well, this is what I was just talking about early on our opening today was all the hype aside at the end of the day when people see the cost of the bill that's going to come for the energy and then they see their performance. This is going to, we're back to speeds and feeds again. No more solutions, it's not the solution. It's cool to talk about the speeds and feeds now. Because it matters. Price performance is really what everyone's focused on right now. They're still on experimentation. But if you don't get it right, you don't get the data right and you don't get the cost equation, performance equation right, then you might be on the wrong side of history with AI. This is a big discussion here this week. It's very true. And it's why Graviton 3 on MSK really plays in nicely to the gen AI story because 24% more compute over an M5 instance, 29% greater throughput. Customers are telling us that they need a lot more data to feed into their gen AI models and they want to use that data as a real differentiator to just using foundation models out of the box. And so this is a tremendous day for us to be able to deliver for customers. So what was the customer conversation you had when you heard the feedback? You guys build a product, obviously you guys work back from the customer. Well known, Amazonian, tech tactic. What was some of the use cases? What was the trouble spots? Just pain, suffering around speed, latency? What was the core problems that you guys were solving and delivering on? Well, there are a couple of things. Cost has been top of mind for a couple of years here, right? I came into AWS as a customer. So even as recently as 2022, cost was top of mind for me as a customer. And we definitely heard it last year in the 2022 re:Invent. We're still hearing it again this year, although it's just getting started. So we'll see how the week plays out. But solving it from a cost point of view and just the sheer cost benefit of Graviton 3 is one thing.
People are looking for higher throughput, higher compute. They're looking for better performance from just without making a great deal of architectural changes. And so that's what customers were asking us for. I will say Adam has done a tremendous job of talking about Graviton 3. And so customers are very excited about it. And what we've heard for the better part of a year now is when can we get MSK on Graviton 3? Well, the thing about Graviton that's interesting is this third, the number three is in there, other companies. I think Microsoft announced their chip and it's only for internal use. You guys have a lead on the game here. And the AI conversation is interesting because it's not just the chips that are in it, it's what's around it. And so if you get the data, it's like the bloodstream. Moving data around has been a big discussion. If you're at the edge, you're going to have inference. You're going to have inference and compute are two big killer areas that we see in terms of interest. Because if you get inference right, then the applications can iterate in real time with the data. So again, the data is an ingredient into all AI and whether it's synthetic data or just other data. So pipelining it and having zero ETL, which was announced last year was interesting. I see the ET, but I can see that ETL vision there. I mean, are data pipelines going to be at some point run by AI? I mean, I can see, I mean, when I saw our 3D printer for the first time, I thought that was magical. I can imagine infrastructure could be provisioned similarly with AI like, okay, just provision me some data pipelines. I don't think we're, I don't think we really, truly know where we're going to go. I think customers will continue to push us on making each and every step forward. I do think we definitely hear from customers that data pipelines are not something that they want to replicate or do in parallel, specific for just generative AI use cases. 
They want to build a data pipeline once and they want to be able to use that across their organization. And they want to think about that pipeline as kind of the backbone across their data silos. And I think that that is, that's where they're pushing us today. They're going to push us tomorrow into even more new uncharted territory and we'll see where that goes. So I was talking to Adam Selipsky and I want to get your reaction to this. Great, he said an effective data strategy requires thought and understanding around what's available data and making sure it's harmonized and usable across applications. This is kind of a concept where data is now going to be the critical ingredient. It's not treated as a silo, it's got to be available. Low latency, available, real time, and multiple environments, core cloud, maybe the edge. So inference will be a big part of that. How should customers be thinking about the scale as you look at some of those announcements you guys are making? Data's going to be flying around everywhere, right? So you're in the streaming messaging area. This is your, you're like the connective tissue, you're moving data around. Yeah, what a fun place to be by the way. What a fun place. Talk about the scale that's coming in. How big, scope the scale? Like what are you seeing right now in terms of the scale of the customer environments around how they're handling their data and what it could be with generative AI. I can imagine that it will be huge. If data's going to be replicated at the edge and being inferred on, you move the workloads to the data, data is a critical architectural challenge now. Absolutely it is. Well first of all, let's go back in history for maybe five years and let's think about how companies brought data into their organization five years ago. They had maybe a handful of data sources, maybe 10, but not too many.
Today, data is coming out of absolutely everything and even when I walk the floor of re:Invent, my mind kind of explodes with thinking of all of these new pieces of technology and how they're transmitting data. Think of the IoT space as an example. It's an incredibly fascinating place. So companies have data coming at them from just magnitudes larger than what we saw even five years ago. So I think when we think about, as an organization, we have to be able to think about how do we handle pipelines that can scale to the throughput needed of this larger data set? Also thinking of the timeliness of that, like look at Kinesis Data Streams On-Demand. We just recently took that, the read and write up. We doubled it. So now I have two gigabytes per second of read throughput on Kinesis Data Streams On-Demand. That's just incredible, but customers keep pushing us for that. I think next year, if you and I get a chance to talk next year, we're going to be talking about a whole different number than that because customers will continue to push us for higher and higher throughputs because they are seeing so much data. You know, what's interesting, Dave and I always talk about the waves. We've seen that in our lifetime. PC, I was a PC generation when I was in college and then the web with two inflection points that I personally saw the same movie we're seeing now which is the performance and price got better every time and the applications moved to that next level of threshold of opportunity. So Windows got faster in the PC and the web, the web pages load faster, bandwidth happened, was growing, but the web was dial up, it was slow, but it just kept getting better. So I think we're in the same AI wave now where we're going to see it kind of embryonic and growing where, okay, you put a wrapper around it, put some data, but if you have data laying around like data exhaust as they used to call it, AI can make sense out of that.
So I'm seeing companies, we've reported on SiliconANGLE, they take data that's laying around and they turn it into gold. You turn the exhaust into gold, but that's kind of key for customers. So that's, I see that as low-hanging fruit use case. It is. What should companies do in your opinion? Because you've been on Amazon for a few years now but you've been building stuff for companies and teams. What's the enterprise should do? What should we be thinking about as they look at this next architectural setup? How to handle the exhaust data that's laying around, could be log files, or what net new data can they build? I can imagine people start thinking differently. I will start aggregating data. So you're going to start to think, I think the low-hanging fruit, what data do I have laying around and can use that AI can make value out of? And then the new data opportunities. How should companies and teams think about structuring their mindset, plans, architecture, product building? Yeah, I think it first starts with having a data strategy. I think that's super important. If you just start working towards a generative AI build and your goal is to put out a generative AI application, I'm not so sure that that's always going to be the most successful. But you are right about something which is almost every company has data that's lying around and they haven't been able to realize the value of that data. I think AI is going to be able to help us. But to be able to do that, we need to be able to actually capture and analyze that data. Being able to do it with timeliness, so in real time, will allow us to find value that we never even saw in the data that maybe we left on the floor, maybe we didn't consider. I've worked at a number of companies where we've actually found really interesting use cases and real business solutions out of data where we never expected it. We weren't looking for it, but we've been able to put that data together and find meaningful output. 
I think we're going to see a lot coming in the next couple of years from people finding things that they already had. Already had in their own closet. They just needed to open it up and say, oh my goodness, there it is. It's going to unleash some creativity too. Every employee, every person working could contribute big time to the mission of an organization just by looking at the data and seeing the opportunity for value. So there's like a whole opportunity recognition wave coming with data that we've never seen before. That is so true. That is so true. I don't think we've even begun to scratch the surface here and I'm excited about the days and weeks that come. I'm excited to think about what re:Invent will look like next year. What will people have discovered out of their data? I know even this morning, I've had five meetings already this morning. I've had a couple of breakfasts and lunches. I'm being well fed while I'm here, so. But I will say, you know, we continue to hear from customers telling us, hey, I've just realized that real time streaming data is so important for generative AI. I actually realize I don't have a true data strategy and I need to take a step back and think about that. So, you know, if you're a company that's just getting started, if you're a company that has data around, I do think it's important to think about what is your data strategy? How are you planning to build data pipelines that go across your organization? It will still allow you the flexibility and the architecture to branch out into new spaces like generative AI, or just into feeding your traditional ML models with fresh data, but I think the sky's the limit for where we can go. I mean, the blood will flow through the body, data needs to flow through the organization, similar concepts. I mean, if you look at even just ChatGPT, and you look at the user, it's streaming the results. It's generative, it's generating data.
So, I think the user experience is starting to stream. So, streaming and messaging become key architectural conversations. What are those conversations that happen? Like, okay, data warehousing, yeah, a couple of years ago, five years, I could see that. What's some of the conversations around pipelining and architecture? Is there platform engineering? Is that a platform engineering conversation? Or is it a, I mean, data engineering and platform engineering, to me, seems synonymous now. Data has to be- They are, they are synonymous and we are having those types of conversations where customers are saying to us, I have this data, I know I need to do something with it. How can I think about that across the entire life of my organization? And I think, you know, it's really important for customers to work back from an end-to-end solution in their organization, instead of just going off and trying to solve for one thing, making sure that they're able to bring data in once, use it across their entire business. I mean, I have to ask you this question because you have a great background. You've worked in a lot of big organizations in the past building all your life and career. What's your observation or advice with people watching this now, saying, hey, you know, what is the field of data going to look like as a career? Because, you know, computer science evolved, you got data scientists, you got platform engineering, DevSecOps now has gone mainstream with the cloud. Now you have a whole nother level. And like you said, next year, it could be next gen is here. So the next gen cloud is here. It's definitely next level in my mind. What's the career and data look like? I mean, we know data warehouses. That's old hat, old and gone. Data is in the cloud. Now they're going to have to be edge network set up, tons of networking. What is the data career? 
Well, you know, John, coming into AWS as a customer, and I've come from now three different companies who have made a transition into AWS. And so when I was coming into AWS, I chose this space. Like I chose wanting to come into data messaging and streaming. This was where I wanted to be. And I chose it because I knew how incredibly rich this area was going to be. I know that this is a space where customers still have not yet found the untapped potential. And I think that there's just so much opportunity. I'm not sure we're going to be done with this space in a year or two. I think we've just only begun to scratch the surface. But I do think that there's a tremendous amount of urgency for companies who are watching us today and they're wondering, okay, it feels like maybe I should get started. Yeah, you really need to get started because the world is passing you by and the technology is moving. This whole entire data space is moving at such a fast pace that I feel like we have to, there's a sense of urgency here. I totally agree. In fact, we've been saying on theCUBE for years that the SRE movement was all about large-scale server management, one person can handle a bunch of server clusters. The data is going that same direction where there's going to be a lot more data and not enough headcount. So you have to come, you have to think of it like a platform. That's right. Not the database. Because the database is going to be everywhere. So if you think of data as a platform, that's a different mindset. That's a systems kind of thinking, not just kind of just one thing. I think of it a little different than a platform. I think of it as a backbone. Think of it as kind of what we were told the internet was going to be many years ago. The internet was going to be this connectivity that brings us all together. And data is really today's connectivity, today's backbone.
And it's quite the platform that people will leverage, but I think it's more than just a traditional platform. And I think of it as a true living backbone. It's always going to be a living substance. And getting data into the right place at the right time was always talked about as a key thing. But now with gen AI, it's more important than ever because the better the data, and if it's available, that makes the AI better. I mean, that's all about feeding the AI with the data. Yeah, that's so true. We see it in two different places. So data is really helpful in fine-tuning. So in places where data doesn't need to change very much, it's perfect for fine-tuning use cases. And then retrieval-augmented generation is probably the one that I'm the most passionate about. I'll say I'm not an expert by any means, so don't ask me too many questions. I think from the how to use your data in a RAG use case, it's quite fascinating what customers are starting to do. But in the IoT space and the weather space, we're seeing so many different uses for data that changes at a very quick pace. I saw Bill Vass and we were talking to him all the way about synthetic data, and how I was kind of skeptical, but he kind of clarified it to me. I didn't think it was going to be that real. He's like, no, it's very relevant. So I'm kind of looking at the whole synthetic data at the edge for IoT is another big thing going on. So again, data is the lifeblood of an organization if done properly. Couldn't say it better myself. Data is the lifeblood. Mindy, thanks for coming on theCUBE. Appreciate you coming on. Thank you for sharing, and good luck with the rest of the show. Thanks for having me, John. Okay, we are on location here in Las Vegas for re:Invent 2023 CUBE coverage. We'll be back after this break.