From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Dave Vellante.

Hello everyone and welcome to this CUBE conversation. You know, data protection, it used to be so easy. You'd have apps, they'd be running on a bunch of servers, you'd bolt on a little backup and boom, one size fits all, it was really easy peasy. Now, business disruptions at the time, they were certainly not desired, but they were definitely much more tolerated, and they were certainly fairly commonplace. Today, business disruptions are still a fairly common occurrence, but the situation is different. First of all, digital imperatives have created so much more pressure for IT organizations to deliver services that are always available with great consumer experiences. The risks of downtime are so much higher, but meeting expectations is far more complex. This idea of one size fits all really no longer cuts it. You've got physical, virtual, public cloud, on-prem, hybrid, edge, containers; add to this cyber threats, AI, competition from digital disruptors. The speed of change is accelerating, and it's stressing processes and taxing the people skills required to deliver business resilience. These and other factors are forcing organizations to rethink how they protect, manage, and secure data in the coming decade. And with me to talk about the state of data protection today, and beyond, is a thought leader from one of the leading companies in data protection. Arthur Lent is the Senior Vice President and CTO of the Data Protection Division at Dell EMC. Arthur, good to see you again, thanks for coming in.

Great to see you, Dave.

So I'm going to start right off. This is a hot space, and everybody wants a piece of your hide because you're the leader. How are you guys responding to that competitive threat?
Well, so the key thing that we're doing is we're taking our proven products and technologies, and we've recognized the need to transform and really modernize them and invest in a new set of capabilities and changing workloads. And a core part of that, with some changes in leadership, has been to shift our processes in terms of how we do stuff internally. And so we've moved from a very big-batch, waterfall-style approach, where things need to be planned one, two, three years out in advance, to a very small-batch, agile approach, where we're looking a couple of weeks, a couple of months in advance of what we're going to be delivering into product. And this is enabling us to be far more responsive to what we're learning in the market in very rapidly changing areas. And we're at the spot where we now have several successive releases that have taken place with our products in this new model.

So that's a major cultural shift that you're really driving. I mean, that allows you to attract younger people. You guys are a global organization. So how has that dynamic changed? People sometimes maybe think of you as a stodgy company that's been around for 20-plus years, but what's it like when you walk around the hallways? What's that dynamic like?

It's like we're the largest startup in the data protection industry. But we've got the backing of a Fortune 50 company.

Nice. All right, well, let's get into it. I talked in my narrative upfront about business disruptions, and I said they're still kind of a common occurrence today. Is that what you're seeing?

Absolutely. Our latest Data Protection Index research has 82% of the people we surveyed experiencing downtime or data loss within the last 12 months. And this survey was just completed within the last month or two. So this is still very much a real problem.

Why do you think it's still a problem today? What are the factors?
So I would say the problem's getting worse, and it's because complexity is only increasing in IT environments: complexity around multi-platform, between physical servers, virtual servers, cloud, various flavors of hybrid cloud; data distribution between the core, the edge, and the cloud; growing data volumes, where the amount of data that companies need to run their business is ever-increasing; and growing risk around compliance and security threats. And many customers have multi-vendor environments, and multi-vendor environments also increase their complexity and risk and challenge.

Let's talk about cloud. Because as we entered last decade, cloud was kind of this experimental, throw-some-dev-out-in-the-cloud thing. And now as we enter this decade, it's a fundamental part of IT strategies. Every CIO, he or she has a cloud strategy. But it's also becoming clear that it's a hybrid world. So in thinking about data protection, how does hybrid affect how your customers are thinking about protecting their data in the coming decade?

So it produces a bunch of changes in how you have to think about things. Today, we have over 1,000 customers protecting over 2.5 exabytes of data in the public cloud. And it goes across a variety of use cases: long-term retention in the cloud, backup to the cloud, disaster recovery to the cloud, a desire to leverage the cloud for analytics and dev/test, as well as production workloads in the cloud and the need to protect data that is born in the cloud. And we're in an environment where IT is spanning from the edge to the core to the cloud, and you need a cohesive approach to protect that data across its lifecycle: where it's born, where it's being operated on, and where value is being added to it.

Yeah, and people don't want to buy 1,000 products to do that, or even a dozen products to do that. They want a single platform.
I want to talk about containers, Kubernetes specifically, but containers generally, one of the hottest areas. It's funny, containers have been around forever, but now they're exploding; people are investing much more in containers. IT organizations and dev organizations see them as a way to drive some of the agility that you talked about earlier. But I'm hearing a lot about data protection for containers, and I'm thinking, well, wait a minute, containers come and go, they're ephemeral. Why do I need to protect them? Help me understand that.

So first, I want to say, yeah, we're seeing a lot of interest in enterprises deploying containers. Our latest survey says 57% of enterprises are planning on deploying them next year. And in terms of the ephemerality and the importance of protection, I have to admit, I started this job about a year ago, and I was thinking almost exactly the same thing you were. I came in, we had an advanced development project going on around how to protect Kubernetes environments, both to protect the data and the infrastructure. And I was like, yeah, I see this as an important advanced development priority, but why is this important to productize in the near future? And then I thought about it some more and was talking to folks, and with Kubernetes there's two key things. One, Kubernetes is a DevOps CI/CD environment. Well, if that environment's down, your business is down in terms of being able to develop. So you have to think about the loss of productivity and the loss of business value as you're trying to get your developer environment back up and running. But also, even though there might not be stateful applications running in the containers, there's generally production usage, in terms of delivering your service, that's coming out of that cluster.
So if your clusters go down or your Kubernetes environment goes down, you've got to be able to bring it back up and get it running again. And then the last thing is, in the last year or two, there's been a lot of investment in the Kubernetes community around enabling containers to be stateful and to have persistence. And that will enable databases and stateful applications to run in containers. And we see a lot of enterprises that are interested in doing that. Now they can have that persistence, but it turns out they can't go into production with the persistence, because they can't back it up. And so there's this chicken-and-egg problem: in order to go into production, you need both the state and the data protection. And the nice thing about the transformation that we've done is, as we saw this trend materializing, we were able to rapidly take this advanced development project and turn it into productization. We were able to get to a tech preview in the summer, and a joint announcement with Pat Gelsinger around our work together in the Kubernetes environment, and to get our first product release out to market a couple of weeks ago. And we're going to be able to really rapidly enhance its capabilities as we work with our customers on where they need features added most, and rapidly integrate with VMware's management ecosystem for container environments.

So you've got a couple of things going on there. You're describing the dynamic of the developer, and the developer is such a key strategic linchpin now, because of the time between developing function and getting it to market. I mean, it used to be weeks or months or sometimes even years. Today, it's like nanoseconds, right? Hey, we need this function today. Something's happening in the market, go push it.
And if you don't have your data, if the data in the containers is not protected, you're in trouble, right? Okay, so that's one aspect of it. The other is the technical piece. Help us understand how you do that. What's the secret sauce, conceptually, behind protecting containers?

So there's really two parts to what one needs to do to protect containers. There's the container infrastructure itself and the container configuration, knowing what's involved in the environment, so that if your Kubernetes cluster goes down, you can restart it and get your appropriate application environment up and running. The containers may not be stateful, but you've got to be able to get your CI/CD operating environment up and running again. And then the second part is, we're seeing people use stateful containers and put databases in containers in development, and they want to roll that into production. And so for there, we need to back up not just the container definitions, but the data that's inside the container, and be able to restore them. And those are some of the things that we're working on now.

One of the things that I've learned being around this industry for a while is that people who really understand technology will ask questions about what happens when something goes wrong. So it's all about the recovery; that's really the key. How does machine intelligence fit in? Stay on containers for a minute. Is machine learning and machine intelligence allowing you to recover more quickly? Does it fit in there?

So a key part of the container environment that's different from some of the environments of the past is just how dynamic it is, and just how frequently containers are going to come and go and workloads may expand and contract their usage of IT resources and footprint.
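The two-part approach described above, backing up the container definitions so the environment can be rebuilt, and snapshotting volume data only for stateful workloads, can be sketched in a few lines. This is an illustrative model only, not any real product's API; the `Workload`, `plan_backup`, and manifest names are invented for the example.

```python
# Hypothetical sketch of two-part container protection: (1) capture every
# workload's resource definitions so the cluster can be rebuilt, and
# (2) snapshot volume data only for stateful workloads. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Workload:
    name: str
    manifests: list                               # resource definitions (Deployments, Services, ...)
    volumes: list = field(default_factory=list)   # persistent volume claims, if stateful

    @property
    def stateful(self):
        return bool(self.volumes)

def plan_backup(workloads):
    """Return what to protect: every workload's config, plus data for stateful ones."""
    plan = {"configs": [], "snapshots": []}
    for w in workloads:
        plan["configs"].extend(w.manifests)       # part 1: infrastructure/configuration
        if w.stateful:
            plan["snapshots"].extend(w.volumes)   # part 2: data inside the container
    return plan

cluster = [
    Workload("ci-runner", manifests=["deploy/ci-runner", "svc/ci-runner"]),
    Workload("orders-db", manifests=["statefulset/orders-db"], volumes=["pvc/orders-data"]),
]
plan = plan_backup(cluster)
print(plan["snapshots"])  # only the stateful workload's volumes are snapshotted
```

Even the stateless CI/CD runner gets its definitions captured, since the cluster has to be rebuildable either way; only the database adds a data snapshot.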
And that really increases the need for automation, and for using AI and machine learning techniques, so that one can discover what an application is as it's containerized, and what resources it needs, so that in the event of an interruption of service, you know all of the pieces you need to bring together and can automate its recovery. In these environments, you can no longer be in a spot where people handcraft and tailor exactly what to protect and exactly how to bring it back. You need these things to protect themselves automatically and recover themselves automatically.

So I want to double-click on that. Again, it's 2020, so I'm always going back to the last decade and thinking about what's different. At the beginning of last decade, people were afraid of automation; they wanted knobs to turn. Even exiting the decade, and even now, people are afraid of losing jobs. But the reality is, things are happening so fast, and there's so much data, that humans just can't keep up. So maybe you could make some comments about automation generally, and specifically about applying it to data protection.

Okay, so with the increasing amounts of data to be protected and the increasing complexity of environments, more and more of the instances of downtime, or challenges in performing a recovery, tend to be because of the complexity of having deployed them, having the recovery procedures right, and ensuring that the SLAs that are needed are met. And it's just no longer realistic to expect people to do all of those things in excruciating detail. It's really just necessary, in order to meet the SLAs going forward, to have the environments be automatically discovered, automatically protected, and to have automated workflows for the recovery scenarios.
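The automatic discover-and-protect loop described above can be pictured as: score each discovered asset for criticality, then map the score to a protection tier instead of having someone handcraft each policy. The tier names, scoring rule, and SLA numbers below are invented for illustration; a real system would derive them from learned models rather than this toy heuristic.

```python
# Illustrative sketch of criticality-driven policy assignment (all tiers,
# scores, and SLA values are made up for the example).
POLICIES = {
    "gold":   {"rpo_minutes": 15,   "rto_minutes": 60},
    "silver": {"rpo_minutes": 240,  "rto_minutes": 480},
    "bronze": {"rpo_minutes": 1440, "rto_minutes": 2880},
}

def criticality(asset):
    """Toy scoring rule standing in for an ML-derived criticality model."""
    score = 0
    score += 2 if asset.get("env") == "production" else 0
    score += 2 if asset.get("kind") == "database" else 0
    score += 1 if asset.get("daily_users", 0) > 1000 else 0
    return score

def assign_policy(asset):
    """Map a discovered asset to a protection tier and its SLA, automatically."""
    score = criticality(asset)
    tier = "gold" if score >= 4 else "silver" if score >= 2 else "bronze"
    return tier, POLICIES[tier]

discovered = [
    {"name": "orders-db", "env": "production", "kind": "database", "daily_users": 5000},
    {"name": "team-wiki", "env": "internal", "kind": "web", "daily_users": 200},
]
for asset in discovered:
    tier, sla = assign_policy(asset)
    print(asset["name"], tier, f"RPO={sla['rpo_minutes']}min")
```

The point of the sketch is the shape of the human-machine split: the machine proposes a tier for every asset it discovers, and people only review or override the proposals rather than configuring each one by hand.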
And because of the complexity and pace of change, we need to reach the point of having AI and machine learning technologies help guide the people who own data protection on data criticality, on what's the right SLA for this and what's the right SLA for that, and really get a human-machine partnership. So it's not people or machines, but rather people and machines working together in tandem, with each doing what they do best, to get the best outcome.

That's great. You'd be helping people prioritize, the criticality of applications, things like that. I want to change the conversation and talk about the edge a little bit. You guys often sponsor IDC surveys on how big the market is in terms of just zettabytes, and it's really interesting, and thank you, from the industry standpoint, for doing that. I have no doubt edge is coming into play now, because so much data is going to be created at the edge. There's all this analog data that's going to be digitized, and it's just a big component of the digital future. In thinking about data at the edge, a lot of that data is going to stay at the edge; maybe it's got to be persisted at the edge, and obviously if it's persisted, it has to be protected. So how are you thinking about the evolution of the edge, specifically around data protection?

Okay, so I think you caught it in the beginning: there's going to be a huge amount of data in the edge. Our analysis has us seeing that there's going to be more data generated and stored in the edge than in all the public clouds combined. That's just a huge shift in that three- to five- to ten-year timeframe. That's a lot of data, and you're not going to be able to bring it all back; you're simply going to be limited by physics. So there's data that's going to need to be persisted there. Some of that data will be transitory.
Some of that data is going to be critical and need to be recovered. And a key part of the strategy around the edge is really, again, going back to that AI and machine learning intelligence: having centralized control and an understanding of what my data in the edge is, and having the right triggers, an understanding of what's going on, of when an event has occurred where I really need to protect this data. You can't afford to protect everything all the time. You've got to protect the right things at the right time, and then move them around appropriately. And so a key part of being successful with the edge is getting that distributed intelligence and distributed control, and recognizing that applications are going to span from core to edge to cloud, with specific features and functions and capabilities implemented in the various spots, and then having that intelligence to do the right thing at the right time with central policy control.

So this is a good discussion. We've spanned a lot of territory, but let's bring it back to the practical uses for the IT person today saying, okay, Arthur, look, yeah, I'm doing cloud. I'm playing around with AI. I've got my dev staff doing containers. Yeah, edge, I see that coming, but I've just got some problems today that I have to solve. So my question to you is, how do you address those really tactical day-to-day problems that your customers are facing today, and still help them plan for the future and make sure that they've got a platform that's going to be there for them, that they're not going to just have to rip and replace in three or four years?

Okay, so that's like the $100,000 question as we look at ourselves in this situation.
And the key is really taking our proven technologies, products, and solutions, and taking the agile approach for adding the most critical modern capabilities for new workloads and new deployment scenarios alongside them, as we modernize those solutions themselves, and really bringing our customers along on that journey, with a very smooth customer transition experience on the path to our powered-up portfolio.

I mean, that's key, because if you get that wrong, and your customers get that wrong, then maybe it's not a $100,000 problem, it's going to be a billions-of-dollars problem.

Billions, fair.

So I want to talk a little bit about alternative use cases for data protection. We've kind of changed the parlance; we used to call it backup. I've often said people want to get more out of their backup, they want to do other things with it, because they don't want to just pay for insurance; the CFO wants ROI. What are you seeing in terms of alternative use cases, and the sort of expanding TAM, if you will, of backup and data protection?

So a core part of our strategy is to recognize that there is all of this data that we have as part of the data protection solutions, and there's a desire on our customers' part to get additional business value and additional use cases out of it. And we've explored and are investing in a variety of ways of doing that. And the one that we see has really hit a key problem of the here and now is around security and malware. We have multiple customers that are under attack from a variety of threats, and it's hitting front-page news. And a very large fraction of enterprises are having some amount of downtime due to malware or cyber attacks. And a key focus that we've had is around our cyber recovery solutions: really enabling a protected, air-gapped solution, so that in the event of some hidden malware or an intrusion, you have a protected copy of that data to restore from.
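One idea behind a protected vault copy like the one described above is that you can record immutable fingerprints of the data when it enters the vault, and verify them before trusting a restore, so a copy silently encrypted by ransomware is caught rather than restored. The sketch below is only an illustration of that integrity-check concept, not how Dell's cyber recovery product actually works; the function names are invented.

```python
# Hypothetical illustration of vault-copy integrity checking: store a
# SHA-256 digest per protected object at vaulting time, then compare
# digests before restoring. Not a real product's mechanism.
import hashlib

def fingerprint(blobs):
    """Record a SHA-256 digest for each protected object when it is vaulted."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in blobs.items()}

def verify(blobs, manifest):
    """Return the names whose current contents no longer match the manifest."""
    return [name for name, data in blobs.items()
            if hashlib.sha256(data).hexdigest() != manifest.get(name)]

vaulted = {"orders.db": b"v1-contents", "configs.tar": b"cfg"}
manifest = fingerprint(vaulted)  # written when the copy enters the vault

# Simulate an attacker reaching and encrypting one object
tampered = dict(vaulted, **{"orders.db": b"encrypted-by-ransomware"})

print(verify(vaulted, manifest))   # clean copy: nothing flagged
print(verify(tampered, manifest))  # tampered object is flagged by name
```

The air gap keeps the vault copy out of the attacker's reach in the first place; the fingerprint check is the second line of defense that confirms the copy is still trustworthy before it is used for recovery.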
And we've got customers who otherwise would have been brought down, but were able to be brought back up very, very quickly by recovering out of our cyber vault.

Yeah, it's a huge problem. Cyber has become a board-level issue. People are scared to death of getting hit with ransomware and having their entire data corpus encrypted, so that air gap is obviously critical, and increasingly it's becoming a fundamental requirement from a compliance standpoint. All right, I'll give you the last word. Bring us home.

Okay, so the most important thing about the evolving and rapidly changing space of data protection at this point is that need for enterprises to have a coherent approach across their old and new workloads, across their emerging technologies, across their deployments in core, edge, and cloud: to be able to identify, protect, and manage that data throughout its lifecycle, and to have a single coherent way to do that, with a single set of policies and controls across the data in all of those places. And that's one key part of our strategy, bringing that coherence across all of those environments. And not just in the data protection domain; there's also a need for cross-domain coherence, getting your automation and simplification not just in the data protection domain but up into higher levels of your infrastructure. And so we've got automation taking place with our PowerOne converged infrastructure, and we're looking across our Dell Technologies portfolio at how we can, together with our partners in Dell Technologies, solve more of our customers' problems by doing things jointly. So, for example, doing data management that spans not just your protection storage but your primary storage as well; AI and ML techniques for full-stack automation; working with VMware around full end-to-end Kubernetes management for VMware environments.
And those are just a couple of examples of where we're looking both to be full across data protection and to expand into broader IT collaboration.

You're seeing this across the industry. I mean, Arthur, you mentioned PowerOne. You're talking about microservices, API-based platforms, and increasingly we're seeing infrastructure as code, which means more speed, more agility. And that's how the industry is dealing with all this complexity. Arthur, thank you so much for coming on theCUBE. Really appreciate it.

Thank you.

And thank you for watching, everybody. This is Dave Vellante, and we'll see you next time.