Okay, we're back covering AWS Storage Day 2022 with Ashish Palekar, who's the general manager of AWS EBS Snapshot and Edge, and Kami Tavares, who's the head of product at Amazon EBS. Thanks for coming back to theCUBE, guys. Great to see you again. Great to see you as well, Dave. Great to see you, Dave. Ashish, we've been hearing a lot today about companies moving all kinds of applications to the cloud and AWS and using their data in new ways. Resiliency is always top of mind for companies when they think about their workloads generally and the cloud specifically. How should customers think about data resiliency? Yeah, when we think about data resiliency, it's all about making sure that the data your application needs is available when it needs it. It's really the ability of your workload to mitigate disruptions or recover from them. And to build that resilient architecture, you really need to understand what kinds of disruptions your application can experience, how broad the impact of those disruptions is, and then how quickly you need to recover. A lot of this is a function of what your application does and how critical it is. And the thing that we constantly tell customers is that this works differently in the cloud than it does in a traditional on-premises environment. What's different about the cloud versus on-prem? Can you explain how it's different? Yeah, let me start with the on-premises side. On premises, building resilient architectures is really the customer's responsibility, and it's very challenging. You have to start thinking about what your single points of failure are. To avoid those, you have to build in redundancy. You might build in replication for storage, as an example. And doing this means you have to provision more hardware.
And depending on what your availability requirements are, you may even have to start looking for multiple data centers, some in the same region, some in different geographic locations. And you have to ensure that you are fully automated so that your recovery processes can take place. As you can see, that's a lot of onus being placed on the customer. One other thing that we hear about is elasticity and how elasticity plays into resiliency for applications. As an example, if you experience a sudden spike in workloads in an on-premises environment, that can lead to resource saturation. And so really you have two choices. One is to throttle the workload and experience resiliency challenges, and your second option is buying additional hardware, securing more capacity, and keeping it fallow in case you experience such a spike. So your two propositions are either experiencing resiliency challenges or paying to have infrastructure that's lying around. And both of those are different when you start thinking about the cloud. There's a third option too, which is lose data, which is not an option. Go ahead. Yeah, as a storage person, that is not an option, and not a risk we think is reasonable for customers to take. The big contrast in the cloud really comes with how we think about capacity. Fundamentally, the cloud gives you access to capacity, so you're not managing that capacity yourself. The infrastructure complexity and the costs associated with it are also just a function of how infrastructure is built in the cloud. But all of that starts with the bedrock of how we design to avoid single points of failure. The best way to explain this is to start with our availability zones. Typically, these availability zones consist of multiple data centers located in the same regional area to enable high throughput and low latency for applications.
But the availability zones themselves are physically independent. They have independent connections to utility power, standalone backup power resources, independent mechanical services, and independent network connectivity. We take availability zone independence extremely seriously, so that when customers are building for the availability of their workload, they can architect using these multiple zones. And that is something that, when I'm talking to customers or Kami's talking to customers, we highly encourage them to keep in mind as they're building resiliency for their applications. Right, so within an availability zone, when you're doing a write, you've captured that data, and you can asynchronously move it outside of that zone in case there's, you know, the very low probability event, but it does happen, some disaster, minimizing that RPO. And I don't have to worry about that as a customer, or figure out how to run three-site data centers. That's right, and take that even further. Now imagine you're expanding globally. All those things we described about creating new footprint, standing up in a new region, and finding new data centers, as a customer in an on-premises environment you take that on yourself. Whereas with AWS, because of our global presence, you can expand to a region and bring those same operational characteristics to those environments. So again, bringing resiliency as you're thinking about expanding your workload, that's another benefit you get from the availability zone and region architecture that AWS has. And as Charles Phillips, former CEO of Infor, said, friends don't let friends build data centers. I don't have to worry about building the data center. Let's bring Kami into the discussion here. Kami, let's talk about Elastic Block Store, which gives customers persistent block storage for EC2 instances.
So it's foundational for any mission-critical or business-critical application that you're building on AWS. I always ask the question, what happens if something goes wrong? So how should we think about data resiliency in EBS specifically? Yeah, you're right, Dave. Block storage is a really foundational piece when we're talking to customers about building in the cloud or moving an application to the cloud, and data resiliency is something that comes up all the time. EBS is a very large distributed system with many components, and we put a lot of thought and effort into building resiliency into EBS. So we design those components to operate and fail independently. When customers create an EBS volume, for example, we'll automatically choose the best storage nodes to address the failure domain and the data protection strategy for each of our different volume types. Part of our resiliency strategy also includes separating what we call the volume lifecycle control plane, which handles things like creating a volume or attaching a volume to an EC2 instance, from the storage data plane, which includes all the components responsible for serving IO to your instance and then persisting it to durable media. What that means is that once a volume is created and attached to the instance, the operations on that volume are independent of the control plane functions. And even in the case of an infrastructure event, like a power issue, for example, you can recreate an EBS volume from a snapshot. And speaking of snapshots, that's the other core pillar of resiliency in EBS. Snapshots are point-in-time copies of EBS volumes that are stored in S3. And snapshots are actually a regional service.
And that means internally, we use multiple of the availability zones that Ashish was talking about to replicate your data, so that snapshots can survive the failure of an availability zone. And thanks to that availability zone independence and this built-in component independence, customers can use a snapshot to recreate an EBS volume in another AZ, or even in another region if they need to. Great. So you touched on some of the things EBS does to build resiliency into the service. Now, in that spirit of joie de vivre, what can organizations do to build more resilience into their applications on EBS so that they can enjoy life without anxiety? That is a great question, and also something that we love to talk to customers about. The core thing to think about here is that we don't believe in a one-size-fits-all approach. So what we do in EBS is give customers different tools so that they can design a resiliency strategy that is custom tailored to their data. To do this resiliency assessment, you have to think about the context of the specific workload and ask questions like: what critical services depend on this data, what will break if this data is not available, and how long can those systems withstand that? And the most important step, I'll mention it again, is snapshots. That is a very important step in a recovery plan. Make sure you have a backup of your data. We actually recommend that customers take snapshots at least daily, and we have features that make that easier for you. For example, Data Lifecycle Manager, which is an entirely free feature, allows you to create backup policies and then automate the process of creating the snapshots. So it's very low effort. And then when you want to use that backup to recreate a volume, we have a feature called Fast Snapshot Restore that can expedite the creation of the volume.
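To make the Data Lifecycle Manager recommendation above concrete, here's a minimal boto3-style sketch of a daily backup policy. The tag values, schedule time, and retention count are illustrative assumptions, not details from the interview, and the actual API call is left in a comment since it requires AWS credentials and an execution role.

```python
# Sketch of a Data Lifecycle Manager policy implementing the
# "snapshot at least daily" recommendation. Tag values, schedule
# time, and retention count below are illustrative assumptions.
daily_backup_policy = {
    "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
    # Target every EBS volume tagged Backup=Daily.
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key": "Backup", "Value": "Daily"}],
    "Schedules": [
        {
            "Name": "DailySnapshots",
            # Take a snapshot every 24 hours, starting at 03:00 UTC.
            "CreateRule": {
                "Interval": 24,
                "IntervalUnit": "HOURS",
                "Times": ["03:00"],
            },
            # Keep the seven most recent snapshots (an example choice;
            # pick retention to match your own RPO needs).
            "RetainRule": {"Count": 7},
            "CopyTags": True,
        }
    ],
}

# With credentials and an IAM role configured, this policy document
# would be passed to the DLM API, roughly:
#   boto3.client("dlm").create_lifecycle_policy(
#       ExecutionRoleArn=role_arn,
#       Description="Daily EBS backups",
#       State="ENABLED",
#       PolicyDetails=daily_backup_policy,
#   )
```

Compressing your RPO, as discussed below, comes down to shrinking the `Interval` in the `CreateRule`.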
So if you have a shorter recovery time objective, you can use that feature to expedite the recovery process. So that's backup. The other pillar we talk to customers about is data replication, another very important step when you're thinking about your resiliency and recovery plans. With EBS, you can use replication tools that work at the level of the operating system, so something like DRBD, for example, or you can use AWS Elastic Disaster Recovery, and that will replicate your data across availability zones or to nearby regions too. So we talked about backup and replication. The last topic we recommend customers think about is having a workload monitoring solution in place. You can do that with EBS using CloudWatch metrics, so you can monitor the health of your EBS volumes using those metrics. We have a lot of tips in our documentation on how to measure that performance. And then you can use those performance metrics as triggers for automated recovery workflows that you can build using tools like Auto Scaling groups, for example. Great, thank you for that advice. Just a quick follow-up. You mentioned your recommendation of at least daily snapshots. If I want to compress my RPO, can I go at a more granular level? Yes, you can go more granular, and you can use, again, Data Lifecycle Manager to define those policies. Great, thank you. Before we go, I want to quickly cover what's new with EBS. Ashish, I understand you've got something new today, you've got an announcement. Take us through that. Yeah, thanks for checking in, and I'm so glad you asked. We talked about how snapshots are a critical part of building resilient architectures, and customers like the simplicity of backing up their EC2 instances using multi-volume snapshots.
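For reference, a multi-volume snapshot of the kind Ashish describes maps to the EC2 `CreateSnapshots` API. Here's a minimal boto3-style sketch of the request; the instance ID and description are placeholders, and the call itself is commented out since it needs AWS credentials.

```python
# Request parameters for a crash-consistent, multi-volume snapshot of
# all EBS volumes attached to one EC2 instance. The instance ID and
# description are placeholders, not values from the interview.
multi_volume_snapshot_request = {
    "InstanceSpecification": {
        "InstanceId": "i-0123456789abcdef0",  # hypothetical instance
        "ExcludeBootVolume": False,           # include the root volume too
    },
    "Description": "Crash-consistent backup of all attached volumes",
    # Carry each source volume's tags onto its snapshot.
    "CopyTagsFromSource": "volume",
}

# With credentials configured, the call would be roughly:
#   boto3.client("ec2").create_snapshots(**multi_volume_snapshot_request)
```

One call covers every attached volume at a single point in time, which is what makes the backup crash-consistent across the instance.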
And what they're looking for is the ability to exclude specific volumes from the backup, especially those that don't need one. So think of applications that have cache data, or applications that have temporary data that really doesn't need backup. Today we are adding a new parameter to the CreateSnapshots API, which creates a crash-consistent set of snapshots for the volumes attached to an EC2 instance, where customers can now exclude specific volumes from an instance backup. And customers using the Data Lifecycle Manager that Kami touched on can automate their backups and also get to exclude these specific volumes. So the feature is not just about convenience, it also helps customers save on costs, as many of these customers are managing tens of thousands of snapshots, and we want to make sure they can take them at the granularity they need. Super happy to bring that into the hands of customers. That's a nice option. Okay, Ashish, Kami, thank you so much for coming back to theCUBE and helping us learn about what's new and what's cool in EBS. Appreciate your time. Thank you for having us, Dave. Thank you for having us, Dave. You're very welcome. Now, if you want to learn more about EBS resilience, stay right here, because coming up we've got a session that's a deep dive on protecting mission-critical workloads with Amazon EBS. Stay right there. You're watching theCUBE's coverage of AWS Storage Day 2022.
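For readers who want to try the announcement from the interview, here's a minimal boto3-style sketch. In boto3 terms the new capability surfaces as an exclusion list on `InstanceSpecification` (shown here as `ExcludeDataVolumeIds`; check the current EC2 API reference for the exact parameter name in your SDK version). The instance and volume IDs are placeholders, and the call is commented out since it needs AWS credentials.

```python
# Sketch: back up an instance while excluding volumes that hold cache
# or temporary data, per the Storage Day 2022 announcement. All IDs
# below are placeholders.
backup_request = {
    "InstanceSpecification": {
        "InstanceId": "i-0123456789abcdef0",
        "ExcludeBootVolume": False,
        # Skip the scratch/cache volumes that don't need a backup.
        # Parameter name as surfaced in boto3; verify against the
        # current EC2 CreateSnapshots API reference.
        "ExcludeDataVolumeIds": [
            "vol-0aaa1111bbbb2222c",  # hypothetical cache volume
            "vol-0ddd3333eeee4444f",  # hypothetical temp-data volume
        ],
    },
    "Description": "Instance backup excluding cache volumes",
    "CopyTagsFromSource": "volume",
}

# The set of volumes the backup will skip:
excluded = set(backup_request["InstanceSpecification"]["ExcludeDataVolumeIds"])

# With credentials configured:
#   boto3.client("ec2").create_snapshots(**backup_request)
```

Fewer snapshots per backup is where the cost savings Ashish mentions come from, especially at the scale of tens of thousands of snapshots.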