From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. Techniques to protect sensitive data have evolved over thousands of years, literally. The pace of modern data protection is rapidly accelerating and presents both opportunities and threats for organizations. In particular, the amount of data stored in the cloud, combined with hybrid work models, the clear and present threat of cyber crime, regulatory edicts, and the ever-expanding edge and associated use cases, should put CXOs on notice that the time is now to rethink your data protection strategies. Hello and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, we're going to explore the evolving world of data protection and share some information on how we see the market changing and the competitive landscape for some of the top players. Steve Kenniston, a.k.a. the Storage Alchemist, shared a story with me, and it was pretty clever. Way back in 4,000 B.C., the Sumerians invented the first system of writing. They used clay tokens to represent transactions at that time. To prevent anyone messing with these tokens, they sealed them in clay jars to ensure that the tokens, i.e., the data, would remain secure with an accurate record that was, let's call it, quasi-immutable and lived in a clay vault. Since that time, we've seen quite an evolution in data protection. Tape, of course, was the main means of protecting data and backing data up during most of the mainframe era, and that carried into client-server computing, which really accentuated and underscored the issues around backup windows and the challenges with RTO (recovery time objective) and RPO (recovery point objective), and just overall recovery nightmares. Then in the 2000s, data reduction made disk-based backup more popular and pushed tape into the role of an archive, last-resort medium.
Data Domain, then EMC, now Dell, still sells many purpose-built backup appliances, as do others, as a primary disk-based backup target. The rise of virtualization brought more changes in backup and recovery strategies, as a reduction in physical resources squeezed the one application that wasn't underutilizing compute, i.e., backup. And we saw the rise of Veeam, the cleverly named company that became synonymous with data protection for virtual machines. Now the cloud has created new challenges related to data sovereignty, governance, latency, copy creep, expense, et cetera, but more recently, cyber threats have elevated data protection to become a critical adjacency to information security. Cyber resilience, specifically to protect against ransomware attacks, is the new trend being pushed by the vendor community as organizations urgently look for help with this insidious threat. Okay, so there are two major disruptors that we're going to talk about today: the cloud and cyber crime, especially around ransoming your data. Every customer is using the cloud in some way, shape, or form. Around 76% are using multiple clouds, according to a recent study by HashiCorp. We've talked extensively about skill shortages on theCUBE, and data protection and security concerns are really key challenges to address given those skill shortages. There's a real talent gap in terms of being able to throw people at solving this problem. So what customers are doing is either building or buying, really mostly building, abstraction layers to hide the underlying cloud complexity. The good news is that this simplifies provisioning and management, but it creates problems around opacity. In other words, sometimes you can't see what's going on with the data. These challenges fundamentally become data problems in our view.
Things like fast, accurate and complete backup and recovery, compliance, data sovereignty, data sharing, the copy creep I mentioned, cyber resiliency, privacy protections: these are all challenges brought to the fore by the cloud, with its pros and its cons. Now remote workers are especially vulnerable, and as clouds expand rapidly, data protection technologies are struggling to keep pace. So let's talk briefly about the rapidly expanding public cloud. This chart shows worldwide revenue for the big four hyperscalers. As you can see, we project that they're going to surpass $115 billion in revenue in 2021. That's up from $86 billion last year. It's such a huge market, and it's growing in the 35% range. The interesting thing is that last year these firms did 80-plus billion dollars in revenue, but they spent $100 billion in CAPEX. So they're building out infrastructure for the industry. This is a gift to the balance of the industry. Now to date, legacy vendors and the surrounding community have been pretty defensive around the cloud. "Oh, not everything's going to move to the cloud." "It's not a zero-sum game," we hear. And while that's all true, the narrative was really kind of a defensive posture, and that's starting to change as large tech companies like Dell, IBM, Cisco, HPE and others see opportunities to build on top of this infrastructure. You certainly see that with Arvind Krishna's comments at IBM, Cisco obviously leaning in from a networking and security perspective, and HPE using language that is very much cloud-like with its GreenLake strategy. And of course, Dell is all over this. Let's listen to how Michael Dell is thinking about this opportunity when he was questioned on theCUBE by John Furrier about the cloud. Play the clip. Well, clouds are infrastructure, right? So you can have a public cloud, you can have an edge cloud, a private cloud, a telco cloud, a hybrid cloud, a multi cloud, here a cloud, there a cloud, everywhere a cloud. Yes, they'll all be there.
But it's basically infrastructure and how do you make that as easy to consume and create the flexibility that enables everything? Okay, so in my view, Michael nailed it. The cloud is everywhere. You have to make it easy, and you have to admire the scope of his comments. We know this guy, he thinks big, right? He said "enables everything." What he's basically saying is that technology is at the point where it has the potential to touch virtually every industry, every person, every problem, everything. So let's talk about how this informs the changing world of data protection. Now we all know, we've seen with the pandemic, there's an acceleration toward digital, and that has caused an escalation, if you will, in the data protection mandate. So essentially what we're talking about here is the application of Michael Dell's cloud-everywhere comments. You've got on-prem, private clouds, hybrid clouds. You've got public clouds across AWS, Azure, Google, Alibaba; really those are the big four hyperscalers. You've got mini clouds that are popping up all over the place, and multi-cloud, to that HashiCorp data point, is at 75, 76%. And now you see the cloud expanding out to the edge, with programmable infrastructure heading out to the edge. So the opportunity here to build the data protection cloud is to have the same experience across all these estates, with automation and orchestration in that cloud, that data protection cloud, if you will. So think of it as an abstraction layer that hides that underlying complexity. You log into that data protection cloud and it's the same experience. So you've got backup, you've got recovery, you can handle bare metal, you can do virtualized backups and recoveries, any cloud, any OS, out to the edge, Kubernetes and container use cases, which is an emerging data protection requirement. And you've got analytics, and perhaps you've got PII (personally identifiable information) protection in there.
So the attributes of this data protection cloud: again, it abstracts the underlying cloud primitives and takes care of that. It also exploits cloud native technologies. In other words, it takes advantage of things like machine learning, which all the big cloud players have expertise in, new processor models like Graviton, and other services that are in the cloud natively. It doesn't just wrap its on-prem stack in a container and shove it into the cloud. No, it actually rearchitects, or architects, around those cloud native services. And it's got distributed metadata to track files and volumes and any organizational data irrespective of location. And it enables a set of services to intelligently govern, in a federated governance manner, while ensuring data integrity. And all this is automated and orchestrated to help with the skills gap. Now, as it relates to cyber recovery, air gap solutions must be part of the portfolio but managed outside of that data protection cloud that we just briefly described. The orchestration and the management must also be gapped, if you will. Otherwise, you don't have an air gap. So all of this is really a cohort to your cybersecurity strategy and posture, but you have to be careful here because your data protection strategy could get lost in this mess. So you want to think about the data protection cloud as, again, an adjacency or maybe an overlay to your cybersecurity approach. Not a bolt-on; it's got to be fundamentally architected from the bottom up. And yes, this is going to maybe create some overheads and some integration challenges, but this is the way in which we think you should think about it. So you'll likely need a partner to do this. Again, we come back to the skills gap, but we're seeing the rise of MSPs, managed service providers, and specialist service providers, not public cloud providers. People are concerned about lock-in and that's really not their role.
They're not high-touch services companies. And it's probably not your technology arms dealers either; they're selling technology to these MSPs. The MSPs have intimate relationships with their customers. They understand their business and specialize in architecting solutions to handle these difficult challenges. So let's take a look at some of the risk factors here and dig a little bit into the cyber threat that organizations face. This is a slide that, again, the Storage Alchemist, Steve Kenniston, shared with me. It's based on a study that IBM funds with the Ponemon Institute, which is a firm that studies things like the cost of breaches and has for many, many years. The slide shows the total cost of a typical breach within each dot and on the Y axis, and the frequency in percentage terms on the horizontal axis. Now it's interesting: the top two are compromised credentials and phishing, which once again proves that bad user behavior trumps good security every time. But the point here is that the adversaries' attack vectors are many. And specific companies often specialize in solving these problems, often with point products, which is why the slide that we showed from Optiv earlier, that, you know, messy slide, looks so cluttered. So it's a huge challenge for companies. And that's why we've seen the emergence of cyber recovery solutions from virtually all the major players. Ransomware and the SolarWinds hack have made trust the number one issue for CIOs and CISOs and boards of directors. The shift in CISO spending patterns is clear. Catalyzed largely by work from home, spending is shifting outside of the moat to endpoint security, identity and access management, cloud security, and horizontal network security. So security priorities and spending are changing. And that's why you see the emergence of disruptors that we've covered extensively, like Okta, CrowdStrike, and Zscaler.
And cyber resilience is top of mind and robust solutions are required. And that's why companies are building cyber recovery solutions that are most often focused on the backup corpus, because that's a target for the bad guys. So there is an opportunity, however, to expand from just the backup corpus to all data and protect with this kind of 3-2-1, or maybe it's 3-2-1-1, approach: you know, three copies, two backups, a backup in the cloud and one that's air-gapped. So this can be extended to primary storage, copies, snaps, containers, data in motion, et cetera, to have a comprehensive data protection strategy. Customers, as I said earlier, are increasingly looking to managed service providers and specialists because of that skills gap. And that's a big reason why automation and orchestration are so important. And I'll emphasize that the automation and orchestration for the air-gap solutions should be separated physically and logically. All right, now let's take a look at some of the ETR data and some of the players. This is a chart that we like to show often. It's an XY graph: the Y axis is net score, which is a measure of spending momentum, and the horizontal axis is market share. Now market share is an indicator of pervasiveness in the survey. It's not spending market share, and it's not market share of the overall market. It's a term that ETR uses. It's essentially market share of the responses within the survey set. Think of it as mind share. Okay, you've got the pure plays here on this slide in the storage category. There is no data protection or backup category, so what we've done is isolate the pure plays, or close to pure plays, in backup and data protection. Now notice that red line. That red line is kind of our subjective view: anything that's over that 40% line is elevated. And you can see only Rubrik in the July survey is over that 40% line. I'll show you the Ns in a moment. Smaller Ns, but still, Rubrik is the only one.
Now look at Cohesity and Rubrik in the January 2020 survey. So that's last year, pre-pandemic. Cohesity and Rubrik have come well off their peaks for net score. Look at Veeam. Having studied this data for the last, say, 24-plus months, Veeam has been steady Eddie. It is really always in the mid to high 30s and always shows a large shared N. So it's coming up in the survey. Customers are mentioning Veeam and it's got a very solid net score. It's not above that 40% line, but it's hovering just below, consistently. That's very impressive. Commvault has steadily been moving up. You know, Sanjay Mirchandani has made some acquisitions. He did the Hedvig acquisition. They launched Metallic. That's driving cloud affinity within Commvault's large customer base. So it's a good example of a legacy player pivoting and evolving and transforming itself. Veritas continues to underperform in the ETR surveys relative to the other players. Now, for context, let's add IBM and Dell to the chart. Just note, this is IBM's and Dell's full storage portfolios. The category in the taxonomy at ETR is all storage. Okay, the previous slide isolated on the pure plays, but this now adds in IBM and Dell. It's probably representative of where they would be, with Dell larger on the horizontal axis than IBM, of course, and you can see the spending momentum accordingly. So you can see that in the data chart that we've inserted. There are some smaller Ns for Rubrik and Cohesity, but still enough to pay attention. It's not like one or two. When you're at 20-plus, 15-plus, 25-plus, you can start to pay attention to trends. Veeam, again, is very impressive. Its net score is solid. It's got a consistent presence in the data set. It's a clear leader here. SimpliVity's small, but it's improving relative to the last several surveys. And we talked about Commvault. Now, I want to emphasize something that we've been hitting on for quite some time now. And that's the renaissance that's coming in compute.
Now, we all know about Moore's law, the doubling of transistor density every two years, 18 to 24 months. And that leads to a doubling of performance in that timeframe. That x86 curve is in the blue, and this is expressed in trillions of operations per second. The orange line is representative of Apple's A series, culminating in the A15 most recently. The A series is the technology basis for what's inside M1, the chip in the new Apple laptops, which is replacing Intel. That's that orange line there; I'll come back to that. So go back to the blue line for a minute. If you do the math on doubling performance every 24 months, it comes out to roughly 40% annual improvement in processing power. That's now moderated. So Moore's law is waning in one sense. We wrote a piece, "Moore's law is not dead," so I'm sort of contradicting myself there. But the traditional Moore's law curve on x86 is waning. It's probably now down to around 30%, low 30s. But look at the orange line. Again, using the A series as an indicator, if you combine the CPU, the NPU, which is the neural processing unit, pick whatever XPU you want, the accelerators, the DSPs, that line is growing at 100% plus per year. It's probably more accurately around 110% a year. So there's a new industry curve occurring and it's being led by the ARM ecosystem. The other key factor there, and you're seeing this in a lot of use cases, a lot of consumer use cases, Apple is an example, but you're also seeing it in things like Tesla and Amazon with AWS Graviton, the Annapurna acquisition building out Graviton and Nitro. That's based on ARM. You can get from design to tape-out in less than two years, whereas the Intel cycles, we know, have been running at four to five years now. Maybe Pat Gelsinger's compressing those, but Intel is behind. So organizations that are on that orange curve are going to see faster acceleration, lower cost, lower power, et cetera.
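As an aside to the arithmetic above, the growth rates can be checked with a few lines of Python. This is only a sketch of the compounding math, not anything from the episode's charts: the function names are made up for illustration, and the 30% and 110% inputs are simply the rough figures quoted in the discussion, assumed to hold constant year over year.

```python
import math


def annual_rate_from_doubling(doubling_months: float) -> float:
    """Annual growth rate implied by doubling once every `doubling_months` months."""
    return 2 ** (12 / doubling_months) - 1


def months_to_double(annual_rate: float) -> float:
    """Months needed to double at a constant annual growth rate."""
    return 12 * math.log(2) / math.log(1 + annual_rate)


def growth_over_years(annual_rate: float, years: float) -> float:
    """Total multiple after compounding `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years


if __name__ == "__main__":
    # Classic Moore's law cadence: doubling every 24 months.
    print(f"24-month doubling: {annual_rate_from_doubling(24):.1%} per year")

    # Rough figures quoted in the discussion (assumed constant for illustration).
    for label, rate in [("x86 today, ~30%", 0.30), ("combined SoC, ~110%", 1.10)]:
        print(
            f"{label}: doubles in {months_to_double(rate):.1f} months, "
            f"{growth_over_years(rate, 5):.1f}x over five years"
        )
```

Doubling every 24 months works out to about 41% a year, which matches the "roughly 40%" figure. At 30% a year, doubling takes closer to 32 months, while at 110% a year it takes about 11 months. Compounded over five years, that is the difference between roughly 3.7x and roughly 41x, which is why the gap between the two curves widens so quickly.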
All right, so what's the tie to data protection? I'm going to leave you with this chart. ARM has introduced its Confidential Compute Architecture and is ushering in a new era of security and data protection. Zero trust is the new mandate. And what ARM has done with what they call realms is create physical separation of the vulnerable components by creating essentially physical buckets to put code in and to put data in, separate from the OS. Remember, the OS is the most valuable entry point for hackers, or one of them, because it contains privileged access, and it's a weak link because of things like memory leaks and vulnerabilities. And malicious code can be placed by bad guys within data in the OS and appear benign, even though it's anything but. So in this architecture, all the OS does is make API calls to the realm controller. That's the only interaction. So it makes it much harder for bad actors to get access to the code and the data. And importantly, very importantly, it's an end-to-end architecture. So there's protection throughout. If you're pulling data from the edge and bringing it back to on-prem or the cloud, you've got that end-to-end architecture and protection throughout. So the link to data protection is that backup software needs to be the most trusted of applications, because it's one of the most targeted areas in a cyber attack. Realms provide an end-to-end separation of data and code from the OS, and it's a better architectural construct to support zero trust and confidential computing in critical use cases like data protection, backup, and other digital business apps. So our call to action is: backup software vendors, you can lead the charge. ARM is several years ahead of Intel at the moment, in our view. So you've got to pay attention to that and research that. We're not saying over-rotate, but go investigate that.
And use your relationships with Intel to accelerate its version of this architecture, or ideally the industry should agree on common standards and solve this problem together. Pat Gelsinger told us on theCUBE that if it's the last thing he's going to do in his industry life, he's going to solve this security problem. That's when he was at VMware. Well, Pat, you're in an even better place to do it now. You don't have to solve it yourself. You can't, and you know that. So while you're going about your business saving Intel, look to partner with ARM, I know it sounds crazy, to use these published APIs and push to collaborate on an open source architecture that addresses the cyber problem. If anyone can do it, you can. Okay, that's it for today. Remember, these episodes are all available as podcasts; all you've got to do is search "Breaking Analysis podcast." I publish weekly on wikibon.com and siliconangle.com, or you can reach me @dvellante on Twitter, email me at david.vellante@siliconangle.com, and don't forget to check out etr.plus for all the survey and data action. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, everybody. Be well and we'll see you next time.