Dave Vellante: Kevin Miller joins us. He's the vice president and general manager of Amazon S3, and we're going to discuss the evolution of data lakes. Hey, Kevin.

Kevin Miller: Hey Dave, great to be here.

Dave Vellante: Let's riff on this a little bit. Why is S3 so popular for data lakes? How have data lakes on S3 changed and evolved?

Kevin Miller: Well, I think a lot of the core benefits of S3 play directly into what customers are looking for when they're building a data lake. They're looking for low-cost storage, someplace they can put shared datasets and make it very easy for other teams and businesses to access that data, with all the management around it, knowing that the data is secure, durable, and protected. So all of the capability that S3 provides out of the box is just a really good fit for what customers need from a data lake storage provider.

Dave Vellante: I remember when schema-on-read hit, people were like, oh, great, we can just shove all our stuff into a data lake. And then of course the old bromide became "data swamp." But the industry has evolved, hasn't it? New tools, machine intelligence, AI, and machine learning have really helped a lot. Talk about how that's changed from the old days, if you will, where it was just kind of this mess and you really couldn't do much with it, and why today we're able to get so much more out of data lakes.

Kevin Miller: Yeah, I think the original use of data lakes centered a lot around analytics and Hadoop- or Spark-type applications, and that continues to be a big driver. But we're continuing to expand the kinds of applications, like you mentioned, machine learning or other kinds of intelligence. Those applications are increasing as things that customers want to do around these shared data sets, being able to pretty easily and dynamically combine data sets together and use that to drive more insight. And I think you're absolutely right.
You know, if data is left unstructured, or left without any kind of governance, you can quickly develop a lot of unusable data. So I think where we're seeing the evolution is in customers putting more of a governance structure in place around it, really trying to understand and catalog the data sets they have. And I think that's going to continue. That's something we're seeing pretty actively developed right now: knowing what data I have, and knowing the essential metadata around it, such as how frequently the data is updated, when it's updated, and what the rules are around when I can access it. It's also around data lake access control, making it very easy to grant a specific end user access to certain data sets, knowing that you can then audit and know exactly who has access to what data in that data lake. So you're seeing a lot of that governance-type structure come around, while not taking away the essence of having a simple, low-cost, scalable way to store and then access data from a number of applications. That's all now starting to really come together.

Dave Vellante: I see. I think this is a really important point you're making, because I see organizations rethinking their data architecture and their data organizations to really put data in the hands of the lines of business, those with domain expertise, and self-service is becoming really important. I see a lot of organizations say, hey, we're going to give the lines of business their own data lakes that they can spin up, but they have to be governed in a federated fashion. I know you guys use this term "lake house." How do these things fit together?

Kevin Miller: Well, Dave, I think you're absolutely right.
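[Editor's note: Kevin's description of cataloging data sets, tracking metadata like update frequency, granting specific users access, and auditing who accessed what can be sketched in a few lines of Python. This is a hypothetical, minimal in-memory model for illustration only; in a real AWS data lake this role is played by services such as the AWS Glue Data Catalog and AWS Lake Formation, not hand-rolled code. All names below are invented.]

```python
from dataclasses import dataclass, field

@dataclass
class DatasetEntry:
    """One cataloged data set in the lake (hypothetical schema)."""
    s3_prefix: str          # where the data lives, e.g. "s3://example-lake/sales/"
    update_frequency: str   # metadata: "hourly", "daily", ...
    readers: set = field(default_factory=set)  # principals granted read access

class Catalog:
    """Minimal sketch of a governed catalog: register, grant, check, audit."""
    def __init__(self):
        self._entries = {}
        self.audit_log = []  # (principal, dataset, allowed) tuples for auditing

    def register(self, name, entry):
        self._entries[name] = entry

    def grant_read(self, name, principal):
        self._entries[name].readers.add(principal)

    def can_read(self, name, principal):
        # Every access check is recorded, so governance can answer
        # "who has accessed what data in this lake?"
        allowed = principal in self._entries[name].readers
        self.audit_log.append((principal, name, allowed))
        return allowed

catalog = Catalog()
catalog.register("sales", DatasetEntry("s3://example-lake/sales/", "daily"))
catalog.grant_read("sales", "analyst@example.com")
print(catalog.can_read("sales", "analyst@example.com"))  # True
print(catalog.can_read("sales", "intern@example.com"))   # False
```

The point of the sketch is the shape, not the implementation: a catalog that pairs each data set with its metadata and an explicit reader list, plus an audit trail, is what lets governance coexist with the simple, low-cost storage Kevin describes.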
What I see a lot of organizations doing is evolving to a point where they want as few layers as possible between someone who owns a business outcome, whether it's a top-level revenue line or a bottom-level cost line, and the problem itself: they want to connect the people who are closest to the business problem with the applications and the technology they can use to solve it. And a big part of that is the data and the data sets that are available. So where it needs to come together, and where it is coming together, is around making it very easy to federate: to know what data sources I have, to know what the rules are around accessing them, and to remove as much friction as we can from the basics of provisioning access, knowing which people are allowed to access the data and how they access it. We want to remove as much of that as possible, so that it's not weeks between when I have an idea and when I can build an application to process that data. Ideally it's within an hour: I have an idea, I can spin up a notebook, pull in the data sets I need, train an ML algorithm or build some analytics function, and then start to see some results and see whether it's really working or not, and then scale it up from there in a seamless fashion. So a lot of the essence of what AWS has built over the years is really starting to come together, and where we're continuing to make it simpler for customers is all around that federation and the simplicity of provisioning access to the data.

Dave Vellante: And sharing that data across a massive global network. Kevin Miller, thanks so much for coming to theCUBE and talking about data lakes.

Kevin Miller: Yeah, thanks for having me, Dave.

Dave Vellante: You're welcome. And thank you for watching. This is Dave Vellante for theCUBE. Thank you.