Welcome back to AWS Storage Day 2022. I'm Dave Vellante and we're pleased to have back on theCUBE Ed Naim, the GM of AWS File Storage. Ed, how you doing? Good to see you.

I'm good Dave, good to see you as well.

You know, we've been tracking AWS Storage for a lot of years, 16 years actually. We've seen the evolution of services. Of course, we started with S3 and object and saw that expand to block and file. And now the pace is actually accelerating. We're seeing AWS make more moves again today in block and object, but what about file? It's one of the most widely used formats in the world, and the day wouldn't really be complete without talking about file storage. So what are you seeing from customers? Let's start with data growth. How are they dealing with the challenges? What are those challenges? If you could address specifically some of the issues that they're having, that would be great. And then later we're going to get into the role that cloud file storage plays. Take it away.

Well, Dave, I'm definitely increasingly hearing customers talk about the challenges in managing ever-growing data sets, and they're especially challenged in doing that on-premises. When we look at the data that's stored on-premises, it's zettabytes of data. The fastest-growing data sets consist of unstructured data that's stored as files, and many companies have tens of petabytes, hundreds of petabytes, or even exabytes of file data. And this data is typically growing 20% or 30% a year. In reality, on-premises models really aren't designed to handle this amount of data and this type of growth. And I'm not just talking about keeping up with hardware purchases and floor space. A big part of the challenge is labor and talent. To keep up with the growth they're seeing, companies managing storage on-prem need an unprecedented number of skilled resources to manage that storage. And those skill sets are in really high demand and in short supply.
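To make that growth rate concrete, here is a quick back-of-the-envelope sketch. The 30% rate comes from the figures Ed cites; the 10 PB starting size is purely an illustrative assumption.

```python
# Compound data growth: a 10 PB file estate growing 30% a year.
# The starting size is an illustrative assumption, not an AWS figure.
start_pb = 10.0
growth = 0.30

size = start_pb
for year in range(1, 6):
    size *= 1 + growth
    print(f"Year {year}: {size:.1f} PB")
```

At that rate the estate nearly quadruples in five years, which is why "just buy more hardware" stops being a workable on-prem plan.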
And then another big part of the challenge that customers tell me about all the time is that operating at scale, dealing with these ever-growing data sets at scale, is really hard. It's not just hard in terms of the people and the skill sets you need; operating at scale presents net new challenges. So for example, it becomes increasingly hard to know what data you have and what storage media your data is stored on when you have a massive amount of data spanning hundreds or thousands of applications and users, and it's growing super fast each year. At scale, you start seeing edge-case technical issues get triggered more commonly, impacting your availability or your resiliency or your security. And you start seeing processes that used to work when you were at a much smaller scale no longer work. Scale is hard, it's really hard. And then finally, companies are wanting to do more with their fast-growing data sets to get insights from them. They look at the machine learning and the analytics and the processing services and the compute power that they have at their fingertips on the cloud, and having that data sit in silos on-prem can really limit how they get the most out of their data.

You know, I'm glad you brought up the skills gap. I've been covering that quite extensively with my colleagues at ETR, our survey partner. So that's a really important topic, and we're seeing it across the board. I mean, it's really acute in cybersecurity, but for sure just generally in IT. And frankly, CEOs don't want to invest in training people to manage storage. I mean, it wasn't that long ago that managing LUNs was a talent, and of course nobody does that anymore. Executives would much rather apply skills to get value from data. So my specific question is, what can be done? What is AWS doing to address this problem?
Well, with the growth of data that we're seeing, it's really hard for a lot of IT teams to keep up with just the infrastructure management part: things like deploying capacity, provisioning resources, patching, and conducting compliance reviews. And that stuff is just table stakes. The asks on these teams, to your point, are growing to be much bigger than those pieces. So we're really seeing fast uptake of our Amazon FSx service because it's such an easy path for helping customers with these scaling challenges. FSx enables customers to launch, run, and scale feature-rich, highly performant network-attached file systems on AWS. And it provides fully managed file storage, which means that we handle all of the infrastructure, all of that provisioning and patching and ensuring high availability, and customers simply make API calls to do things like scale up their storage, change their performance level at any point, or change a backup policy. And a big part of why FSx has been so appealing to customers is that it enables them to choose the file system technology that powers their storage. We provide four of the most popular file system technologies: Windows File Server, NetApp ONTAP, OpenZFS, and Lustre, so that storage and application admins can use what they're familiar with. They essentially get the full capabilities, and even the management CLIs, that they're used to and that they've built workflows and applications around on-premises. But along with that they get, of course, the benefits of fully managed, elastic cloud storage that can be spun up and spun down, scaled on demand, with performance changed on demand, et cetera. And what storage and application admins are seeing is that FSx not only helps them keep up with their scale and growth, but it gives them the bandwidth to do more of what they want to do.
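The "simply make API calls" point can be sketched as follows. The payload shape loosely follows the FSx UpdateFileSystem API, but treat the exact field names and values here as assumptions for illustration; a real call would go through an AWS SDK, for example `boto3.client("fsx").update_file_system(**request)`.

```python
# Sketch of the kind of single API call Ed describes: scaling storage
# and changing the backup policy on an FSx file system. Field names
# are modeled on the FSx UpdateFileSystem API but are illustrative.

def build_update_request(file_system_id, new_capacity_gib, backup_retention_days):
    """Assemble an UpdateFileSystem-style request payload."""
    return {
        "FileSystemId": file_system_id,
        "StorageCapacity": new_capacity_gib,   # scale up the storage
        "WindowsConfiguration": {              # change the backup policy
            "AutomaticBackupRetentionDays": backup_retention_days,
        },
    }

# Hypothetical file system ID and values, for illustration only.
request = build_update_request("fs-0123456789abcdef0", 2048, 14)
print(request)
```

The point is less the exact fields than the operational model: what used to be a hardware procurement and a maintenance window becomes one request.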
Supporting strategic decision making, helping their end customers figure out how they can get more value from their data, identifying opportunities to reduce cost. And what we realize is that for a number of storage and application admins, the cloud is a different environment from what they're used to, and we're making it a priority to help educate and train folks on cloud storage. Earlier today we talked about AWS Storage digital badges, and we announced a dedicated file badge that helps storage admins and professionals learn and demonstrate their AWS skills. You can think of our AWS Storage badges as credentials that represent cloud computing learning, which customers can add to their repertoire, add to their resume, as they're embarking on this cloud journey. And we'll be talking more in depth about this later today, especially around the file badge, which we're very excited about.

So a couple of things there that I wanted to comment on. I was there for the NetApp announcement; we've covered that quite extensively. It just shows that it's not necessarily a zero-sum game, right? It's a win-win-win for customers. You've got your specific AWS services, you've got partner services, and customers want choice. And then the managed service model, to me, is a no-brainer for most customers. We learned this in the Hadoop years. I mean, it just got so complicated. Then you saw what happened with the managed services around data lakes and lakehouses, and it really simplified things for customers. There are still some customers that want to do it themselves, but a managed service for file storage sounds like a really easy decision, especially for those IT teams that are overburdened, as we were talking about before. And I also like the education component; the badge is a nice touch too. So that's kind of cool.
So I'm hearing that the fully managed file storage service is a catalyst for cloud adoption. The question is, which workloads should people choose to move into the cloud? Where's the low-friction, low-risk sweet spot, Ed?

Well, that's one of the first questions that customers ask when they're about to embark on their cloud journey. And I wish I could give a single, simple answer, but the answer really is, it varies per customer. I'll give you an example. For some customers, the cloud journey begins with what we call extending on-premises workloads into the cloud. An example of that is compute bursting: customers have data on-premises and some compute on-premises, but they want to burst their processing of that data to the cloud because they really want to take advantage of the massive amount of compute they get on AWS. That's common with workloads like visual effects rendering, chip design simulation, and genomics analysis. So that's an example of extending to the cloud, really leveraging the cloud for bursting your workloads. Another example is disaster recovery, and that's a really common one. Customers will use the cloud for their secondary or failover site rather than maintaining a second on-prem location. So a lot of customers start with some of those workloads, extending to the cloud. And then there are a lot of other customers who've made the decision to migrate most or all of their workloads, and they're skipping the whole extending step. They aren't starting there. They're instead focused on going all in as fast as possible, because they really want to get the full benefits of the cloud as fast as possible. And for them, the migration journey is really a matter of sequencing which specific workloads to move and when.
And what's interesting is we're increasingly seeing customers prioritizing their most important, most mission-critical applications ahead of their other workloads in terms of timing. They're doing that to get those workloads the added resilience that comes from running on the cloud. So it really does depend, Dave.

Yeah, thank you. I mean, that's a pretty good description of the options there. And I want to comment on something. Bursting, obviously. I love those examples you gave around genomics, chip design, visual effects rendering, and the DR piece. Again, very common historical sweet spots for cloud. But the point about mission critical is interesting, because I hear a lot of customers, especially with the digital transformation push, wanting to change their operating model. On the one hand, lift and shift means not changing things, just putting it in the cloud: low friction. But then once they get there, they're like, wow, we can do a lot more with the cloud. So that was really helpful, those examples. Now, last year at Storage Day you released a new file service, and then you followed that up at re:Invent with another file service introduction. Sometimes I get lost in the array of services. So help us understand: when a customer comes to AWS with, like, an NFS or an SMB workload, how do you steer them to the right managed service, the right horse for the right course?

Yeah, well, I'll start by saying a big part of our focus has been on providing choice to customers. And what customers tell us is that the spectrum of options we provide really helps them in their cloud journey, because there really isn't a one-size-fits-all file system for all workloads. And so having these options actually really helps them move pretty easily to the cloud.
And so my answer to your question about where we steer a customer when they have a file workload is, it really depends on what the customer is trying to do, and in many cases, where they're coming from. So I'll walk you through a little bit of how we think about this with customers. For storage and application admins who are extending existing workloads to the cloud or migrating workloads to AWS, the easiest path generally is to move to an FSx file system that provides the same, or a really similar, underlying file system engine to the one they use on-premises. So for example, if you're running a NetApp appliance or a Windows file server on-premises, choosing that option within FSx means the least effort to lift your application and your dataset. They'll get the full set of capabilities and the performance profiles that they're used to, but of course they'll also get all of the benefits of the cloud I was talking about earlier, like spin up and spin down, fully managed operations, and elastic capacity. Then we also provide open source file systems within the FSx family. So if you're a customer who's used to those, or if you aren't really wedded to a particular file system technology, these are really good options. They're built on top of AWS's latest infrastructure innovations, which allows them to provide pretty significant price and performance benefits to customers. For example, the file servers for these offerings are powered by AWS's Graviton family of processors, and under the hood we use storage technology built on top of AWS's Scalable Reliable Datagram (SRD) transport protocol, which really optimizes for speed on the cloud. So for those two open source file systems: we have OpenZFS, which provides a really powerful, highly performant NFS v3, v4.0, v4.1, and v4.2 file system built on a fast and resilient open source Linux file system. It has a pretty rich set of capabilities.
It has things like point-in-time snapshots and in-place data cloning, and our customers are really using it because of those capabilities and because of its performance, for a pretty broad set of enterprise IT workloads and vertically focused workloads, like in the financial services and healthcare and life sciences spaces. And then Lustre is a scale-out file system built on the world's most popular high-performance file system, the open source Lustre file system. Customers are using it for compute-intensive workloads where they're throwing tons of compute at massive data sets and they need to drive tens or hundreds of gigabytes per second of throughput. It's really popular for things like machine learning training, high performance computing, big data analytics, and video rendering and transcoding, really those scale-out, compute-intensive workloads. And then we have a very different type of customer, a very different persona: the individual we call the AWS builder. These are folks who are running cloud-native workloads. They leverage a broad spectrum of AWS's compute and analytics services, and they have really no history of on-prem. Examples are data scientists who require a file share for training sets, research scientists who are performing analysis on lab data, developers who are building containerized or serverless workloads, and cloud practitioners who need a simple solution for storing assets for their cloud workflows. These folks are building and running a wide range of data-focused workloads, and they've grown up using services like Lambda and building containerized workloads. Most of these individuals generally are not storage experts, and they look for storage that just works. S3 and consumer file shares like Dropbox are their reference points for how cloud storage works, and they're indifferent to, or unaware of, file protocols like SMB or NFS.
And performing typical NAS administrative tasks is just not a natural experience for them. It's not something they do. We built Amazon EFS to meet the needs of that group. It's fully elastic, it's fully serverless, it spreads data across multiple Availability Zones by default, and it scales infinitely. It works very much like S3. So for example, you get the same durability and availability profile as S3, and you get intelligent tiering of colder data, just like you do on S3. So that service just clicks with cloud-native practitioners. It's intuitive and it just works.

It's mind-boggling, the number of use cases you just went through. And this is where, a lot of times, people roll their eyes: oh, here's Amazon talking about customer obsession again. But if you don't stay close to your customers, there's no way you could have predicted, when you were building these services, how they were going to be put to use. The only way you can understand it is to watch what customers do with it. I loved the conversation about Graviton; we've written about that a lot. I mean, Nitro, we've written about that, how you've completely rethought virtualization and the security components in there. The HPC Lustre piece, and EFS for data scientists. So really helpful there, thank you. I'm going to change topics a little bit, because there's been this theme that you've been banging on at Storage Day: putting data to work. And I tell you, it's a bit of a passion of mine, because frankly customers have been frustrated with the return on data initiatives. It's been historically complicated, very time consuming, and expensive to really get value from data, and often the business lines end up frustrated. So let's talk more about that concept. And I understand you have an announcement that fits with this theme. Can you tell us more about that?

Absolutely. Today we're announcing a new service called Amazon File Cache. It's a service on AWS that accelerates and simplifies hybrid workflows.
Specifically, Amazon File Cache provides a high-speed cache on AWS that makes it easier to process file data regardless of where the data is stored. It serves as a temporary, high-performance storage location for data that lives in on-premises file servers, or in file systems or object stores on AWS. What it does is enable enterprises to make these dispersed data sets available to file-based applications on AWS, with a unified view and at high speeds: think sub-millisecond latencies and tens or hundreds of gigabytes per second of throughput. A really common use case it supports is when you have data stored on-premises and you want to burst a processing workload to the cloud: you can set up this cache on AWS, and it allows the working set for your compute workload to be cached near your AWS compute. So what you do as a customer when you want to use this is you spin up the cache, you link it to one or more on-prem NFS file servers, and then you mount the cache to your compute instances on AWS. When you do this, all of your on-prem data appears automatically as folders and files on the cache. When your AWS compute instances access a file for the first time, the cache downloads the data that makes up that file in real time, and that data then resides on the cache as you work with it. While it's in the cache, your application has access to that data at those sub-millisecond latencies, and at up to hundreds of gigabytes per second of throughput. All of this data movement happens automatically in the background, completely transparent to the application running on your compute instances. And then when you're done with your data processing job, you can export the changes and all the new data back to your on-premises file servers, and then tear down the cache.
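The lazy-loading behavior Ed walks through, where data is fetched from the origin only on first access and served from the cache afterward, can be sketched with a toy read-through cache. Everything here (class, names, data) is illustrative and is not the File Cache API; it only mirrors the access pattern described above.

```python
# Toy read-through cache: the first read of a file pulls it from the
# (slow, remote) origin; later reads are served from the (fast) cache.
# Purely illustrative -- not the Amazon File Cache API.

class ReadThroughCache:
    def __init__(self, origin):
        self.origin = origin       # e.g. an on-prem NFS export or S3 bucket
        self.cache = {}            # file path -> bytes held in the cache
        self.origin_reads = 0      # how many times we had to go to the origin

    def read(self, path):
        if path not in self.cache:             # cache miss: lazy-load it
            self.cache[path] = self.origin[path]
            self.origin_reads += 1
        return self.cache[path]                # cache hit from here on

# Hypothetical on-prem dataset, for illustration.
origin = {"/genomics/sample1.bam": b"...reads..."}
cache = ReadThroughCache(origin)

cache.read("/genomics/sample1.bam")   # first access: fetched from origin
cache.read("/genomics/sample1.bam")   # second access: served from cache
print(cache.origin_reads)             # -> 1
```

The real service adds the parts a toy can't show, such as a POSIX namespace over multiple origins and exporting changed data back, but the miss-then-hit pattern is the core of why bursting works.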
Another common use case: if you have a compute-intensive, file-based application and you want to process a dataset that's in one or more S3 buckets, you can have this cache serve as a really high-speed layer that your compute instances mount as a network file system. You can also place this cache in front of a mix of on-prem file servers, S3 buckets, and even FSx file systems on AWS. All of the data from these locations will appear within a single namespace that clients mounting the cache have access to, and those clients get all the performance benefits of the cache as well as a unified view of their datasets. And to your point about listening to customers and really paying attention to customers, Dave, we built this service because customers asked us to, a lot of customers actually. It's a really helpful enabler for a pretty wide variety of cloud bursting workloads and hybrid workflows, ranging from media rendering and transcoding, to engineering design simulation, to big data analytics. And it really aligns with that theme of extend that we were talking about earlier.

You know, I often joke that AWS has the best people working on solving the speed-of-light problem. So this idea of bursting, as I said, has been a great cloud use case from the early days, and bringing it to file storage is very sound, and the approach with File Cache looks really practical. When is the service available? How can I get started bursting to AWS? Give us the details there.

Yeah, well, stay tuned. We announced it today at Storage Day, and it will be generally available later this year. Once it becomes available, you can create a cache via the AWS Management Console, or through the SDKs or the CLI. And within minutes of creating the cache, it'll be available to your Linux instances, and your instances will be able to access it using standard file system mount commands.
And the pricing model is going to be a pretty familiar one to cloud customers. Customers will only pay for the cache storage and the performance they need, and they can spin a cache up, use it for the duration of their compute burst workload, and then tear it down. So I'm really excited that Amazon File Cache will make it easier for customers to leverage the agility, the performance, and the cost efficiency of AWS for processing data, no matter where the data is stored.

Yeah, cool. Really interested to see how that gets adopted. Ed, always great to catch up with you. As I said, the pace is mind-boggling. It's accelerating in the cloud overall, but in storage specifically. So my ask is, can we take a little breather here? Can we just relax for a bit and chill out?

Not as long as customers are asking us for more things. So there's more to come, for sure.

All right, Ed, thanks again. Great to see you. I really appreciate your time.

Thanks Dave, great catching up.

Okay, and thanks for watching our coverage of AWS Storage Day 2022. Keep it right there for more in-depth conversations on theCUBE, your leader in enterprise and emerging tech coverage.