Hello everyone and welcome back to fabulous Las Vegas, Nevada, where we are here on the show floor at AWS re:Invent. We are theCUBE. I am Savannah Peterson, joined by John Furrier. John, afternoon, day two. We are in full swing. What's got you most excited? Just got lunch, got the food coma kicking in. No, we don't have coffee. No, it's all, no. Way to bring the hype there, John. There are so many people here just in Amazon. We're back to 2019 levels of crowd. The interest levels are high. Next-gen cloud security was a big part of the keynote. This next segment, I am super excited about. CUBE alumni going back to 2013. 10 years ago, he was on theCUBE. Now 10 years later, we're at re:Invent. Looking forward to this guest. And it's about security, great topic. I don't want to delay us anymore. Please welcome Mark. Mark, thank you so much for being here with us. Massive day for you and the team. I know you oversee three different units at Amazon: Inspector, Detective, and the most recently announced, Security Lake. Tell us about Amazon Security Lake. Well, thanks, Savannah. Thanks, John, for having me. Well, Security Lake has been in the works for a little bit of time, and it got announced today at the keynote, as you heard from Adam. We're super excited because there are a couple of components that are really unique and valuable to our customers within Security Lake. First and foremost, the foundation of Security Lake is an open-source project we call OCSF, the Open Cybersecurity Schema Framework. And what that allows us to do is work with the vendor community at large in the security space and develop a language where we can all communicate around security data. And that's the language that we put into Security Data Lake. We have 60 vendors participating in developing that language and partnering within Security Lake.
But it's a communal lake where customers can bring all of their security data into one place, whether it's generated in AWS, on-prem, in SaaS offerings, or in other clouds, all in one location, in a language that allows analytics to take advantage of that data and give better outcomes for our customers. So in Adam Selipsky's big keynote, he spent the bulk of his time on data and security. Obviously, they go well together. We've talked about this in the past on theCUBE. Data is part of security, but this security is a little bit different in the sense that the global footprint of AWS makes it uniquely positioned to manage some security threats. EKS protection is a very interesting announcement: runtime layer, but looking inside and outside the containers, which probably gives extra telemetry on some of those supply chain vulnerabilities. This is actually a very nuanced point. You've got GuardDuty kind of taking its role. What does it mean for customers? Because there are a lot of things in this announcement that he didn't have time to go into in detail. Unpack all the specifics around what the security announcement means for customers. So we announced four items in Adam's keynote today within my team. I'll start with GuardDuty for EKS runtime. It's complementing our existing capabilities for EKS support. So today, Inspector does vulnerability assessment on EKS, or container images in general. GuardDuty does detections on EKS workloads based on log data. Detective does investigation and analysis based on that log data as well. With the announcement today, we go inside the container workloads. We have more telemetry, more fine-grained telemetry, and ultimately we can provide better detections for our customers to analyze risks within their container workloads. So we're super excited about that one. Additionally, we announced Inspector for Lambda. So Inspector, we released last year at re:Invent, and we focused mostly on EKS container workloads and EC2 workloads.
Single click, automatically assess your environment, start generating assessments around vulnerabilities. We've added Lambda to that capability for our customers. The third announcement we made was Macie sampling. So Macie has been around for a while, delivering a lot of value for customers by providing information about the sensitive data within their S3 buckets. What we found is many customers want to go and characterize all of the data in their buckets, but some just want to know, is there any sensitive data in my bucket? And the sampling feature allows the customer to find out whether there is sensitive data in the bucket, without us having to go through and do all of the analysis to tell you exactly what's in there. So I was gonna- Structured and unstructured data? Any data? Correct, yeah. Okay, and the fourth? The fourth, security data lake. Yeah. Okay, ocean theme, data lake. Very complementary to all of our services, but the unique value in the data lake is that we put the information in the customer's control. It's in their S3 bucket. They get to decide who gets access to it. We've heard from customers over the years that they really have two options for gathering large-scale data for security analysis. One is, we roll our own, and we're security engineers, we're not data engineers. It's really hard for them to build these distributed systems at scale. The second one is, we can pick a vendor or a partner, but we're locked in, and it's in their schema and their format, and we're there for a long period of time. With security data lake, they get the best of both worlds. We run the infrastructure at scale for them, put the data in their control, and they get to decide what use case, what partner, what tool gives them the most value on top of their data. Is it always a good thing to give the customers too much control? Because you know the old expression, you give them a knife to play with, and they can cut themselves.
But no, seriously, what are the provisions around that? Because control is a big part of governance. How do you manage the security? How does the customer worry about, if I have too much control, if someone makes a mistake? Well, what we're finding out today is that many customers have realized that some of their data has been replicated seven times, 10 times, not necessarily maliciously, but because they have multiple vendors that utilize that data to give them different use cases and outcomes. It becomes costly and unwieldy to figure out where all that data is. So by centralizing it, the control is really around who has access to the data. Now, ultimately, customers want to make those decisions, and we've made it simple to aggregate this data in a single place. They can develop a home region if they want, or all the data flows into one region. They can distribute it. They're in charge. They're in charge. But the controls are mostly in the hands of the data governance person in the company, not the security analyst. So I'm really curious. You mentioned there are 60 AWS partner companies that have collaborated on Security Lake. Can you tell us a little bit about the process? How long does it take? Are people self-selecting to contribute to these projects? Are you cherry-picking? What does that look like? It's a great question. There are three levels of collaboration. One is around the open-source project that we announced at Black Hat earlier this year, called OCSF. And that collaboration is, we've asked the vendor community to work with us to build a schema that is universally acceptable to security practitioners, not vendor-specific. And we've asked- I'm sorry to interrupt you, is this a first of its kind? There are multiple schemas out there developed by multiple parties. They've been around for multiple years, but they've been built by a single vendor. Yeah, that's what I'm drilling in on a little bit.
It sounds like the first time we've had this level of collaboration. There have been collaborations around them, but among a handful of companies. We've really gone to a broad set of collaborators to really get it right. And they're focused around areas of expertise that they have knowledge in. So the EDR vendors are focused on the schema around EDR. The firewall vendors are focused on that area. Certainly the cloud vendors are in their scope. So that's level one of collaboration. And that gets us the level playing field and the language in which we'll communicate within security. It's just so important, yeah. Super foundational. And then the second area is around producers and subscribers. So many companies generate valuable security data from the tools that they run. We call those producers, the publishers, and they publish the data into Security Lake within that OCS format, OCSF format. Some of it is in the form of findings. Much of it is in the form of raw telemetry. Then the second one is on the subscriber side. And those are usually analytics vendors, SIEM vendors, XDR vendors, that take advantage of the logs in one place and generate analytics-driven outcomes on top of that. Use cases, if you will, that highlight security risks or issues for customers. Yeah, cool. What's the big customer focus when you started looking at security lakes? How do you see that playing out? You said there's a collaboration, I love the open-source vibe on that piece. What data goes in there? What sharing? Because a big part of the keynote I heard today was, I heard clean rooms, I've got my antenna up, I love to hear that. That means there's an implied sharing aspect. The security industry has been sharing data for a while. What kind of data is in that lake? Give us an example. Well, there are a number of sources within AWS as customers run their workloads in AWS.
We've identified somewhere around 25 sources that will be natively single-click into Amazon Security Lake. We're announcing nine of them. They're just traditional network logs, VPC Flow Logs, CloudTrail logs, firewall logs, findings that are generated across AWS, EKS audit logs, RDS data logs. Anything that customers run workloads on will be available natively. But that's not limited to AWS. Customers run their environments hybridly, they have SaaS applications, they use other clouds in some instances. So it's open to bring all that data in. Customers can vector it all into this one single location if they decide. We make it pretty simple for them to do that, again, in the same format, where outcomes can be generated quickly and easily. Can you use the data lake on-premises, or does it have to be in S3 in the Amazon cloud? Today it's in S3 in Amazon, and, you know, if we hear customers looking to do something different, you know, as you guys know, we tend to focus on our customers and what they want us to do. So they've been pretty happy about what we've decided to do in this first iteration. So we've got a story up on SiliconANGLE. Obviously the ingestion is a big part of it, the reporters are jumping in, but the 50 third-party sources is a pretty big number. Is that coming from the OCSF, or is that just in general who's involved? Yeah, OCSF is the big part of that, and we have a list of probably 50 more that want to join and be part of this. You've got the big majors there: Cisco, CrowdStrike, Palo Alto Networks. I mean, all the big dogs are in there. All big partners of AWS anyway. So it was an easy conversation. And in most cases, when we started having the conversation, they were like, wow, this has really been needed for a long time. And given our breadth of partners and where we sit from our customers' perspective, in the center of their cloud journey, they've looked at us and said, you guys, we applaud you for driving this and...
So Mark, take us through the conversations you're having with customers at re:Inforce. We saw a lot of meetings happening. It was great to be face-to-face. You guys have been doing a lot of customer conversations. Security data lake came out of that. What was the driving force behind it? What were some of the key concerns? What were the challenges? And what's now the opportunity that's different? We heard from our customers in general. One, it's too hard for us to get all the data we need in a single place. Whether through AWS or the industry in general, it's just too hard. We don't have the resources to data wrangle that data. We don't know how to pick a schema. There are multiple ones out there. Tell us how you would do that. These three challenges came out front and center for every customer. And mostly what they said is, our resources are limited, and we want to focus those resources on security outcomes. And we have security engineers. We don't want to focus them on data wrangling in large-scale distributed systems. Can you help us solve that problem? And it came out loud and clear from almost every customer conversation we had. And that's where we took on the challenge. We said, okay, let's build this data layer. And then on top of that, we have services like Detective and GuardDuty that will take advantage of it as well. But we also have a myriad of third-party ISVs that will also sit on top of that data and render outcomes. What's interesting, I want to get your reaction. I know we don't have much time left, but I want to get your thoughts. When I see security data lake, which is awesome, by the way, love the focus. Love how you guys put that together. It makes me realize the big thing at re:Invent this year is this idea of specialized solutions. You've got instances for this and that, use cases that require a certain kind of performance. You've got the data pillars that Adam laid out.
Are we going to start seeing more specialized data lakes? I mean, we have a video data lake. Is there going to be a FinTech data lake? Is there going to be, I mean, you've got the Great Lakes kind of going on here. What is going on with these lakes? I mean, is that a trend that Amazon sees customers aligning to? Is that? Yeah, you know, we have a couple of lakes already. We have a healthcare lake and a financial lake. And now we have a security lake. Foundationally, we have Lake Formation, which is the tool with which anyone can build a lake. And most of our lakes run on top of Lake Formation, but specialized. And the specialization is in the data aggregation, normalization, and enrichment that is unique for those use cases. And I think you'll see more and more of that. So that's a feature, not a bug. It's a feature, it's a big feature that customers have asked for. So if they want to roll their own specialized, purpose-built data thing, lake, they can do it. You know, customers don't want to combine healthcare information with security information. They have different use cases and segmentation of the information that they care about. So I think you'll see more. Now, I also think that where there are adjacencies, you'll see those lakes expand into other use cases in some cases too. And that's where the right tools come in. As he was talking about, this ETL, zero-ETL feature. Yeah, and it'd be like an 80-20 rule, right? So if 80% of the data is shared for different use cases, you can see how those lakes would expand to fulfill multiple use cases. Okay. All right, you think he's ready for the challenge? All right. Look, we were on the same page. Okay, we have a new challenge, go ahead. So, think of it as an Instagram reel, sort of your hot take, your thought leadership moment. The clip we're going to come back to and reference your brilliance 10 years down the road. I mean, you've been a CUBE veteran, a CUBE alumni, for almost 10 years now. In just a few weeks.
It'll be that. What do you think is, and I suspect I might know your answer to this, so feel free to be robust in this. But what do you think is the biggest story, the key takeaway from the show this year? We're democratizing security data within security data lake, for sure. Well said. You are our shortest answer so far on theCUBE, and I absolutely love and respect that, Mark. It has been a pleasure chatting with you, and congratulations again on the huge announcement. This is such an exciting day for y'all. Thank you, Savannah. Thank you, John. Pleasure to be here. We look forward to 10 more years of having you here. Well, maybe we don't have to wait 10 years. Well, more years, not in another 10. I feel there will be a lot of security content this year. Yeah, pretty hot theme. Pretty hot theme for us. Of course, re:Inforce, we'll be there again this coming year, 2023. All the re's. Yep, all the re's. Love that. We look forward to seeing you there. All right, thanks a lot. Speaking of re's, you're the reason we're here. Thank you all for tuning in to today's live coverage from AWS re:Invent. We are in Las Vegas, Nevada, with John Furrier. My name is Savannah Peterson. We are theCUBE, and we are the leading source for high tech coverage.