For decades, participants in the storage business have been rewarded for building or buying systems that were fast, cheap, and didn't fail. Now, the shift to software-led storage has accelerated innovation by creating an abstraction layer that can be leveraged to simplify management and bring cloud-like experiences to on-prem and hybrid estates, while also enabling value to be delivered more quickly and injected into storage systems to support greater business resilience and more flexible features. The future of storage will increasingly be characterized by unlocking data value with AI and providing a foundation for businesses to better anticipate and withstand market shifts.

Hello, and welcome to the IBM Storage Summit, live from theCUBE's Palo Alto Studios. I'm Dave Vellante, and along with my co-host Rob Strechay I'll be taking you through the day today. Joining us to kick off the summit are Scott Baker, IBM's VP and CMO of the Infrastructure Portfolio, and Sam Werner, IBM's VP of Storage Product Management. Gents, great to see you. Thanks for coming in and supporting this great event.

You got it, Dave. Rob, really good to see you both as well.

So Scott, let's start with you. A lot has changed in storage; it's gone from a bit bucket to an intelligent system, an enabler. How do you see things out there?

You know, I really like to think about storage as the true data custodian for any organization. And the responsibility a custodian has, whether it's a traditional custodian responsible for a building, doesn't change when it comes to data. Storage is becoming more and more a part of not just holding on to the data, but helping organizations assess the referential value of that data, and setting up the appropriate defenses to ensure that everything from an accidental breach of that data to an actual intrusion is safeguarded and defended against.
I think, although storage has presented itself mainly as a background service, it's becoming more and more forefront in the consideration of what an appropriate data strategy needs to look like for an organization. And that really hasn't been the case until recently, as people start thinking about how data affects the overall information supply chain they set up for the rest of the organization they support.

Yeah, so Sam, the value equation has shifted, hasn't it? It used to be, don't lose my data. Right. Now it's so much more. How do you see the market today?

It's interesting. I think there are three things storage administrators are struggling with. They're the same problems they've always struggled with, but they've transformed as technology has moved forward. Number one, as application development teams move to cloud-native architectures and leverage containers, storage teams are trying to figure out how to build an infrastructure that is elastic, that can support these new models of development and new types of middleware, and still provide the high levels of availability and data resilience that people take for granted when it comes to storage. Number two, unstructured data growth is very difficult to manage. It's very expensive and it's everywhere: at the edge, in the core data center, in the cloud. People are leveraging new lakehouse technologies, whether it's Snowflake, Databricks, or hopefully watsonx.data, the new solution from IBM, to manage their data, and we're trying to help storage administrators figure out how to bring all of that together. And the last piece, which Scott was getting into, is cyber resilience. The threats have gotten so much more significant.
Storage administrators used to protect against hardware failure, logical data corruption, and application developers making a mistake, but now you have targeted attacks coming in, and how you recover is completely different. So these are three very significant problems, and that's what we're working on in our storage portfolio: helping solve them and move them forward.

Thanks for that, Sam. So Scott, if we work backwards on those three: during the pandemic, a lot of CIOs told us that their business resilience strategies were way too DR-focused, and the forced march to digital made them realize they just weren't prepared. First of all, do you see that with your client base, and what role does storage play in advancing cyber resilience?

Yeah, we absolutely do. And in fact, Sam touched on one thing that a lot of people maybe haven't paid enough attention to, which is the impact of unstructured data: not just its growth, but the amount of risk unstructured data can pose to an organization if they don't know how it's been classified, categorized, and valued to the business. We spend a lot of time focused on block-based storage devices, where you have some inherent protection, where applications have direct access to the data and you build your strategy, whether it's cyber resiliency or data resiliency, around that one-to-one relationship, if you will. But when you think about the human interaction with that data, and the relevance of unstructured data and all the inherent value locked into it, which literally anyone can access depending on how you set your strategy up, it really expands the need to think about your data resiliency strategy beyond simple backup and recovery or disaster recovery.
But now, how do you respond very quickly to an incident? And not just respond with, here's the point I'm going to recover from, but do so knowing that the appropriate defenses led to the creation of that restore point, and that it has been validated and verified by the overall storage architecture, which is something we do that no one else in the industry, that I'm aware of, can actually do. It's not just understanding what the attack surface looks like for a company; it's setting up the defenses, monitoring in real time, and then finding restore points that have been validated to be application-consistent and devoid of threat signatures before they go back into production.

And those different data types. It's not just unstructured data. When you start to unpack unstructured data, and Sam, you mentioned Snowflake and Databricks, we love those two companies, we talk about them and write about them all the time, but most of the world's data isn't in their platforms. So to your point, you have to have all this other optionality. You mentioned watsonx; we have Vincent Hsu coming on later today to talk about some of the capabilities there and how you're handling different query types and different data types. There are all kinds of different formats, and you've got to translate them into things the machine can understand, but then map back to what the real world looks like: people, places, things. That's a technical challenge for an organization like IBM. So to your first point about making these storage systems flexible and elastic, it's not just cloud-style scale up and scale down; it's new data types, new types of applications, digital representations of the business. How do you approach that from a storage standpoint?
Yeah, you know, first of all, the whole world is suddenly excited about AI, which is great. The world's imagination has been opened up by ChatGPT, with people able to get on there and see the value in real time. The thing is, to get real value out of AI, you have to be able to combine your own data with these foundation models, right? You have to bring together the data you have as an organization, your knowledge of your industry and your customers, and this data exists everywhere. How do you bring that data together with a foundation model, whether it's a large language model or some other industry-specific model you've created? Like you said, the data's everywhere. I need a way to bring it together, combine it, and do training. Taking one database and combining it with a model isn't going to give you very valuable insights; you've got to combine multiple different sources. And if you look at efficiency and getting insights in real time, you have to be able to access that data from multiple sources. You can't rely on ingesting everything; it's too expensive and too complicated. So that's what we're doing. We're working on technology that virtualizes that, and I'm sure Vincent will get into it in a lot more detail.

Yeah, I think that's really one of the big keys: you're taking an approach, it seems, and correct me if I'm wrong, of going to where the data lives, supporting that storage there, and bringing the value out of the data to you. More of a data platform than a storage platform approach. Is that how you're looking at it?

Yeah, and there's another big piece I should have mentioned, which is probably top of mind for many enterprises, and that's data governance.
How do you ensure you're not breaking regulatory or privacy rules, that you're actually protecting the data and not giving it to the wrong people? There's a security angle to it. So data governance goes with that, and not creating risk by moving data around and making copies of it.

Right, and I think a big piece of it is, you have sovereign clouds out there, you have people who have to keep data in specific places, GDPR and CCPA, given we're in California. You start to look at where that data lives, but you still need to be able to bring it together. Like you said, the unstructured data growth is massive; you can't move it all around all the time, you have to be more selective about it.

Yeah, and if you think about AI as a workload, it's no different than any other workload that runs on storage, right? It just happens to be both performance-intensive and capacity-sensitive; you need a lot of access to data when you think about the corpuses you're going to build. For me, what's interesting is that we purposely use the term information supply chain. With AI, what we're really driving toward is the infrastructure foundation that provides the equivalent of a cloud architecture you can deploy on-prem and that runs consistently in the cloud, to help organizations operationalize their AI investment. And I think that's the key. Right now, what is AI to a business? Some businesses have the maturity; they know what they're going to do with it. I think IBM is creating some really cool foundational capabilities, and I use that term a little tongue-in-cheek since they're called foundation models, to help organizations do that. But at the end of the day, it can't just be a science project.
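The access-in-place idea discussed above, pulling only the records you need from each source where it lives instead of ingesting everything into one central copy, can be sketched in a few lines. This is purely illustrative: the function name, the callable-source interface, and the sample records are all invented for the example, not IBM's or watsonx.data's actual API.

```python
from typing import Any, Callable, Iterable, Iterator

def federated_query(
    sources: list[Callable[[], Iterable[dict]]],
    predicate: Callable[[dict], bool],
) -> Iterator[dict]:
    """Lazily stream matching records from each source in place,
    rather than copying every record into a central store first."""
    for fetch in sources:          # each source knows how to read its own data
        for record in fetch():     # records stream out one at a time
            if predicate(record):  # only matching records cross the wire
                yield record

# Hypothetical sources: one on-prem, one in a cloud bucket.
on_prem = lambda: [{"region": "EU", "sales": 120}, {"region": "US", "sales": 90}]
cloud = lambda: [{"region": "EU", "sales": 75}]

eu_sales = list(federated_query([on_prem, cloud], lambda r: r["region"] == "EU"))
```

The design point is the generator: nothing is materialized until a consumer asks for it, which is the opposite of the ingest-everything pattern the conversation calls too expensive and too complicated.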
It can't be some skunkworks activity in the back office that you've then got to drive into production. We have to get to a point where organizations think about a data strategy no differently than they would an AI strategy. We help organizations operationalize data; ergo, we should help them operationalize AI, so that they not only have something to work with, but have a strategy in place that takes them from data ingest to processing to governance and management.

So a couple of things on that. Last week we had SuperCloud 3 in this very studio, and prior to that I was talking to Jeff Jonas, a former IBMer. I asked him how he thinks about these foundation models and large language models. We were talking about guardrails, and whether they'll be able to apply them in a way that solves some of the problems we've just been talking about, and he said to me, Dave, the problem is it's generative; it gives you a different answer every time. When we talk about governance, you can't have a different answer every time. So there's a lot of excitement around it, but the conclusion out of SuperCloud 3, which was all about security and AI, was that it's still really unclear how we're going to apply large language models specifically to solve these problems, even though everybody's been doing AI for a long, long time, including us, machine learning, if we can stretch the definition. So I'd ask you guys: how have you been using AI, and how are you thinking about using it in the future to solve some of these issues?

Well, we use AI internally in our products, and we've been doing it for a while. A big part of it is AIOps, which is simplifying the management of your storage infrastructure. Anybody in the storage business knows a storage outage, a loss of access to storage, is bad.
I mean, if you lose a server, maybe the application moves to another server. If you lose access to your storage, everything goes wrong. So we're making it easier. We're putting in predictive capabilities that look for problems before they happen and try to detect issues coming up in your storage environment. We have this AI running now, and our support team has access to it. We run it across our installed systems, and we're able to take the collective knowledge we have from running all this storage to improve our detection; we're up to over 75% accuracy in detecting anomalies or potential problems within our storage. But now we've taken it even further. One of the things we do uniquely in our FlashSystem is build our own NVMe drives, and that gives us the ability to bring AI into those drives to look for anomalous I/O activity, which could be indicative of an attack. So we're able to move into near real-time detection of anomalies in your I/O, to potentially catch a ransomware attack before it spreads across your storage environment and creates a very painful situation.

We're going to be talking about that later today. Entropy, entropy is winning, is one of the other buzzwords, or phrases, we've been talking about. In other words, randomness, right? A nod to the second law of thermodynamics fans out there. But the point is, you can detect those anomalies, you can detect that randomness, which may represent somebody trying to encrypt data or doing something malicious, in real time. And that is the key. But Scott, listening to Sam, that essentially is operationalizing AI, kind of while you sleep. It's funny, it's earnings season, so you saw Microsoft and Google announce last night, and all the analysts want to know, well, how much revenue is coming from large language models? And it's like, none, really. And well, is it going to disrupt search?
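The entropy signal in that exchange can be made concrete. Encrypted or compressed blocks look statistically random, so their Shannon entropy approaches 8 bits per byte, while typical plaintext sits well below that. Here is a minimal sketch; the 7.5 bits/byte threshold is a hypothetical value chosen for illustration, and real drive firmware would combine this with other I/O signals rather than rely on a single measure (this is not IBM's actual FlashCore implementation).

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: near 8.0 for encrypted or
    compressed data, noticeably lower for typical plaintext."""
    if not data:
        return 0.0
    n = len(data)
    # Sum -p * log2(p) over the observed byte frequencies.
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def looks_encrypted(block: bytes, threshold: float = 7.5) -> bool:
    # A sudden jump in per-block write entropy across many blocks can
    # indicate ransomware encrypting data in place.
    return shannon_entropy(block) >= threshold
```

A single high-entropy block proves nothing on its own, which is why the conversation stresses correlation; but a fleet-wide spike in write entropy is exactly the kind of anomaly worth surfacing in near real time.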
And so to your point about operationalizing, this is likely how we're going to see AI adopted. It's just going to be embedded in systems; we're going to buy it as part of the systems. It's going to be your job to make it all work, isn't it?

Absolutely, and it's an expectation businesses are going to have, right? Just as we think of replication as a basic expectation of enterprise storage, so will inherent AI and machine learning be. And to Sam's point, it's not enough to just bake it into the system. The FlashCore Modules have been computationally based since they were first introduced, and the idea that you could offload data processing to the drive, doing the work closest to where the data is actually stored, makes complete sense. But by itself it's not enough. It's no different than a car alarm: when was the last time you heard a car alarm and reacted to it, right? The real value is when you think about what IBM is capable of doing from this point forward, where it's about correlation. It's beyond the moment of detection. It's then being able to ask the ITSM environment, what are you seeing? Can we correlate these two pieces of information to move away from false positives, and start looking for other activity in the network stream, maybe in the application, or in the host-to-host data stream? And how does that tie back to what we're seeing from inline corruption detection? That's when you begin to see true operationalization of AI, absolutely.

It's a big deal for customers to be talking about cyber resiliency; it's top of mind. What are you hearing from customers in terms of the concerns they have about cyber resiliency, and what are they specifically asking you for?

Okay, I mean, nobody likes to talk about the fact that they've been breached, right? So it's a little tough.
Getting a customer to step up and say, I was breached, can be a little tricky. But all in all, what we're beginning to see is that more often the security office is getting involved in the kinds of decisions being made. Sam and I deal with it internally at IBM: if we want to bring on a new application, it has to go through a pretty strict rule set and validation process through the office of the CISO. And we're seeing a lot of organizations deal with that as well, looking at how the introduction of new technology impacts the data strategy they've set forth. More so, though, what we're beginning to see is not so much forefront decision-making considerations, but how do we take the environment we have today and ensure it aligns with the data resiliency strategies we need to have in place? So they're looking for an engagement with their chosen vendor that is just as consultative as it is innovative. I mentioned the ability to understand the attack surface and work with that vendor to identify and shore up the areas that are risk- and gap-related; then setting up the defenses necessary to protect that data the moment it hits the actual storage array; then integrating with runbooks or ITSM environments to automate the response, those kinds of operations. I think what we're seeing more than anything else is that data resiliency isn't an afterthought; it's got to be the very first thing you think about before the application gets installed.

I was going to say it has to be baked in from the beginning, right? If you don't build it, if you don't build the AI into the platform, you can't just bolt it on. It just doesn't work that way.
There's a lot of bolting on going on right now. There it is, there it is. You're not going to buy a car without brakes and decide later that you might need them; you want the brakes on that car right away. Yeah, you've got to be adventurous. You don't need brakes. You've got those hand brakes.

I saw that nearly half of organizations paid a ransom last year. So of those of you watching, I'm sure half of you have paid a ransom. It's serious. When I talk to organizations, it depends on who you're talking to. If you're talking to a CISO, they're trying to figure out who to turn to within the organization to ensure they'll be able to recover their minimum viable company within a certain amount of time, right? In a lot of cases there are regulations that say you need to be able to recover, or show you can recover, certain applications within a time window. How do you achieve that when you don't have individual owners? In today's organization, that role doesn't really exist. So that's where we're trying to help.

I wonder what percent of the 50% actually got their data back. That's the other piece. It's probably, I think, 17 to 20%, somewhere in that range, that actually did. Wow, that low. The other thing is, it's possibly illegal to pay ransom. Not the act of paying ransom necessarily, but if you're paying a rogue state like North Korea, that is against federal law. So you really have to be careful out there.

Why IBM? Can you pitch us and the audience on why IBM? What's your unique differentiation that separates you from the competition, and the real core of your value proposition? Maybe you could both take a stab at that.

Maybe I'll start on the cyber resiliency point for a minute. I think we have some really unique capabilities. First of all, we bring together, I mentioned that there are multiple people in the organization responsible for resiliency but no single owner; we're bringing together all those different people. So think about it.
We have backup; we've been leaders in backup software for many years, and we have the capabilities to back up your applications to protect you from all different types of problems. We provide some of the most resilient primary storage arrays in the world, including for mainframe, where we can give you endless uptime; we have 100% availability guarantees on our primary storage. So we have that piece. We can do data replication. We can do what we call Safeguarded Copy, which is immutable copies to protect you from ransomware. Then we have capabilities from our AI team that can do things like detect anomalies in real time. And then we have IBM Security, which can help you with your upfront security and also your response. We can bring all of those pieces together into one single solution to reduce your detection window, reduce your recovery time, and bring all the skills you need to ensure, first, that you're prepared up front, and second, that you recover your company as quickly as possible. Nobody can bring all that together like we can: the security, the primary storage, the secondary storage, all of it. We're your best line of defense.

That was pretty comprehensive, Scott, but anything you'd add to it? We'll give you the last word.

Sure. What Sam outlined is not only accurate, it's exactly what we deliver. For those of you watching, I would simply say: understand that everybody in the IBM Storage business unit wakes up knowing it's our responsibility to make every bit of our customers' data available to them in the most insightful and safest way possible, so they can take informed action on it. That's why we go to work every day. And it stretches across the entirety of the portfolio: cyber resiliency, primary storage, secondary storage, distributed file and object, software-defined storage, watsonx integrations, you name it.
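To make the immutable-copies idea mentioned above concrete, here is a toy model of a safeguarded-copy vault: point-in-time copies can be taken and restored but never modified, and deletion is refused inside the retention window. All class and method names here are invented for illustration; this is a sketch of the concept, not the actual IBM Safeguarded Copy interface.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True makes the snapshot record immutable
class SafeguardedCopy:
    taken_at: float
    data: bytes
    retention_secs: int

class CopyVault:
    """Illustrative vault: copies can be created and read, never changed,
    and can only be expired after their retention window has elapsed."""

    def __init__(self) -> None:
        self._copies: list[SafeguardedCopy] = []

    def take_copy(self, data: bytes, retention_secs: int = 86400) -> SafeguardedCopy:
        copy = SafeguardedCopy(time.time(), bytes(data), retention_secs)
        self._copies.append(copy)
        return copy

    def restore_latest(self) -> bytes:
        if not self._copies:
            raise LookupError("no safeguarded copies available")
        return self._copies[-1].data

    def expire(self, copy: SafeguardedCopy) -> bool:
        # Deletion is refused inside the retention window, which is what
        # keeps a restore point available even if an attacker gains
        # administrative access to the host.
        if time.time() - copy.taken_at < copy.retention_secs:
            return False
        self._copies.remove(copy)
        return True
```

The key property is that the retention check lives in the vault, not in the caller: no code path exists that mutates or deletes an unexpired copy, which is the contract that makes such copies useful after a ransomware event.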
It's all about making not just data, but relevant data, available to the business, so that whenever they make a decision based on it, they know it's protected from harm, they know it's of the highest referential value possible, and they can access it from anywhere they need to.

Guys, thanks so much for coming into the studio and sharing your story. There's much more to this story: we're going to be talking about cyber resiliency, AI, data, and data value all day, and we're going to talk about ecosystem as well. So keep it right there. For Rob Strechay, this is Dave Vellante. We're live in theCUBE's Palo Alto Studios, and we'll be right back after this short break.