Live from Las Vegas, Nevada, it's theCUBE at IBM Edge 2014. Brought to you by IBM. Now here are your hosts, Dave Vellante and Stu Miniman.

Welcome back, everybody. We're here at IBM Edge in Las Vegas, and we're going to drill down into the software side. The Woj is here, Steve Woj. Welcome back to theCUBE.

Thank you, Dave.

Let's see, this is probably your, I don't know, fifth time on, maybe more.

Once it's past one hand, you stop counting. So let's say five.

I remember it all started at Edge; we first met three years ago. Steve's the vice president of storage and network management on the software development side of IBM.

That's right.

Inside of Tivoli, inside of Cloud and... Smarter Planet.

That's Smarter Infrastructure.

Right, Smarter Infrastructure. Smarter Planet, sorry, that's the ad campaign. Good ads. Good role. Again, welcome back.

Thank you.

The world is going software-defined. Marc Andreessen said software is eating the world; now software-defined is eating the world. What does that all mean?

Well, when you say the value is in the software, it's a matter of creating an abstraction layer between the things you're trying to separate: the hardware and the applications. VMware was software-defined compute, right? Abstract the hardware from the applications and the users. Software-defined networking: throw in a software layer. Software-defined storage is the same thing: throw in an abstraction layer between the hardware devices and the applications and users that use them, and now you've got software-defined. You can mix and match hardware. You can write the application to a single interface and take advantage of all the underlying physical resources. Create a software-defined abstraction layer between the physical things and the things that use them, the applications and the users, across network, compute, and storage. In IBM we have this thing called the software-defined environment, and it encapsulates computational resources: servers, networks, and storage.

So in order to have software-defined anything, you've got to virtualize the underlying infrastructure. Is that a premise?

That is a basic premise. You can talk about having software that mediates between applications and your physical resources, but unless you can do it in a heterogeneous way, abstracting all of the physical characteristics and the uniqueness of those physical devices, it's really not an abstraction layer per se. If your application has to write to a specific API to access a specific resource, to me that's not a level of abstraction. So write the application, the database, the solution to one piece of software, and let that communicate and interact with the various hardware devices underneath, from us or from any other vendor. As long as those devices expose their APIs so the resources can be managed through the abstraction, you have an abstraction layer and a software-defined environment: software-defined compute, network, storage, et cetera.
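To make the "single interface" idea concrete, here is a minimal sketch in Python. The class and method names are illustrative assumptions, not IBM's actual interfaces: the application codes to one read/write surface, and any vendor's device that implements the backend contract can sit underneath.

```python
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Contract any physical device (IBM or otherwise) must implement."""
    @abstractmethod
    def write(self, volume: str, data: bytes) -> None: ...
    @abstractmethod
    def read(self, volume: str) -> bytes: ...

class VirtualStorageLayer:
    """The single interface applications talk to. Devices can be
    added, retired, or swapped underneath with no application change."""
    def __init__(self) -> None:
        self._backends: dict[str, StorageBackend] = {}
        self._placement: dict[str, str] = {}  # volume name -> backend name

    def add_backend(self, name: str, backend: StorageBackend) -> None:
        self._backends[name] = backend

    def write(self, volume: str, data: bytes) -> None:
        # Placement lives in the layer, not in the application: pick
        # (or remember) the physical home of this volume, then write.
        home = self._placement.setdefault(volume, next(iter(self._backends)))
        self._backends[home].write(volume, data)

    def read(self, volume: str) -> bytes:
        return self._backends[self._placement[volume]].read(volume)
```

The design point matches what he's describing: because placement is decided inside the layer, the layer can migrate a volume between vendors' boxes and simply update its map, with zero change to the application above.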
So on storage, IBM, non-IBM, file, block, object: the virtualization layer lets you abstract all of that. Include management, include built-in analytics, include the ability to protect the data being created by those applications, and have the application interact with one interface, that software interface. Then you can move data around between systems, you can deprecate systems, you can pull a system off for maintenance, you can add systems, you can do whatever you want underneath, with zero downtime for the application. That's that software layer's job, right? To protect the applications from all the turmoil that could be happening underneath: volumes filling up, hardware going off lease or being deprecated or being full, tier one versus tier two versus flash, moving workloads from hot data to cold to long-term, from on-premise to off-premise. Create that abstraction layer so the applications just continue to run.

I mean, that's what it's all about, right? Insulating them from all that stuff.

Yeah, it's 100% availability for the things accessing those physical layers. And on virtualization of storage: IBM has the patent on storage virtualization, from 30 years ago. Twenty-five years passed before the real commercial implementation of a storage virtualization system, the SAN Volume Controller, SVC, which has been in the market for a decade and is now part of Virtual Storage Center, which incorporates virtualization, management, snapshot protection, et cetera. But still, a paper published just last week by ITG says less than 20% of all storage resources are virtualized. Less than 20%. We all know that more than 50% of all computational resources are: VMware, z/VM, pHyp on Power, Hyper-V, over 50%. But on storage, data's continuing to explode. Customers can't figure out why. They don't want to throw anything away, because we all know the minute you delete something is the minute you need it, for whatever reason. So the trick here is to get data centers, customers, lines of business, applications, databases comfortable with the fact that virtualization on storage has as much or more benefit than it does on the computational side with VMware or z/VM. It's been in the market for a decade, right? And as part of that report that was just released last week: if you take data on tier one and move it to tier two based on policies within that virtualization layer, you can actually save $13 million over five years. That's a lot of dough. And capacity utilization right now, in the systems of record, the workloads that are out there today, not the new-era workloads: 11 million bucks can be saved simply by improving capacity utilization from an average of 3% right now to 70 or 80% with a virtualized environment. Inside of IBM, we ran an experiment four years ago, first of its kind: built-in analytics, throw in virtualization, move data around, increase capacity utilization, move data between systems. And we actually cut our storage costs inside the IBM data center in half.
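A demotion policy like the tier-one-to-tier-two move he describes can be expressed in a few lines. This is a hedged sketch in Python with made-up cost figures; the per-terabyte costs and the 30-day idle threshold are illustrative assumptions, not numbers from the interview or from IBM pricing.

```python
import time

# Hypothetical per-terabyte monthly costs -- illustrative only.
TIER_COST = {"tier1-flash": 2000, "tier2-disk": 600}

COLD_AFTER_DAYS = 30  # policy: demote tier-1 volumes idle this long

def plan_demotions(volumes, now=None):
    """Return (volume, monthly_saving) pairs for every tier-1 volume
    whose last access is older than the policy threshold."""
    now = now if now is not None else time.time()
    moves = []
    for v in volumes:  # each v: {"name", "tier", "size_tb", "last_access"}
        idle_days = (now - v["last_access"]) / 86400
        if v["tier"] == "tier1-flash" and idle_days > COLD_AFTER_DAYS:
            saving = v["size_tb"] * (TIER_COST["tier1-flash"] - TIER_COST["tier2-disk"])
            moves.append((v["name"], saving))
    return moves

# Example: one 10 TB volume untouched for 90 days saves
# 10 * (2000 - 600) = $14,000 a month if demoted.
print(plan_demotions([{"name": "vol7", "tier": "tier1-flash",
                       "size_tb": 10, "last_access": time.time() - 90 * 86400}]))
```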
So we took all that stuff, analytics, virtualization, management, and commercialized it. We now call it Virtual Storage Center, VSC. It's available to anybody out there today, to take advantage of the savings that can be had on capacity, on tiering, and on what I call the indirect or hidden costs: support, software, administration. Just think about your storage admins. If you had storage from three vendors, you typically had to use three GUIs, one per vendor, to manage it. With a virtualization layer, you have one interface, one admin console, to manage them all. You manage Virtual Storage Center and it manages everything underneath it.

Yeah, well, can I poke at that a little bit? With server virtualization, we basically had a standard Intel chipset underneath it. With storage, one of the biggest challenges we've had is that if I buy storage from IBM, storage from HP, storage from NetApp, that's very different feature functionality. And there was that balance: do I dumb down what's underneath, or how do I abstract it out, how do I pass that information through? And there's also not necessarily the same consolidation play. The low-hanging fruit for server virtualization was that I was underutilizing what I had, so I consolidated it. Storage is very different: I'm growing, growing, growing, I always need more capacity. So why is it different now?

So there's a couple of points there. One is the pure virtualization, the pure abstraction of the underlying systems, and then there's another part around management. There are things that, from a management perspective, you may still have to go to the specific device to do. There are tools out there, around Tivoli Storage Productivity Center, that can dive deep into the characteristics of IBM and non-IBM devices: understand the difference between block and file, understand tier one versus tier two, understand the policies of the workloads coming in and where they need to be. So virtualization is just a piece of it. The technology of virtualizing, like you said, adds value, but unless you have built-in management with analytics, and the ability to automate what is learned from the dynamics of the underlying infrastructure, you're right: it'll take a lot of manual horsepower, or extreme knowledge of lessons learned and best practices, to do that kind of thing. If you can incorporate those analytics and that automation into the virtualization and management layer, you get there. The trick we find is that most storage administrators, while they know the technology can do it, are very apprehensive about letting the system do it for them. So what we typically do is say: look, storage device A is at 80% capacity, would you like me to move workload X, Y, Z to this other system? You look at it, do your own analysis, say yes, hit enter, and it dynamically moves. And over time, you know when you close a window and that pop-up comes up, "Are you sure?", how many times do you wish it would just never appear? Because you never say no: yes, yes, yes, yes. The same thing happens with storage. Capacity's at 80%: do you want to procure more boxes, do you want to move the workload? What do you want to do?
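The ask-first-then-automate pattern he describes might look like this in Python. It's a sketch under stated assumptions: the 80% threshold comes from the conversation, but the trust-streak rule, the callbacks, and the five-approval cutoff are invented for illustration.

```python
class GuardedAutomation:
    """Recommend a data move, ask the admin, and only start acting
    autonomously after the same recommendation has been approved
    enough times in a row. Thresholds are illustrative."""
    AUTO_AFTER = 5  # consecutive approvals before we stop asking

    def __init__(self, ask, move):
        self.ask = ask      # callback: str -> bool (admin's yes/no)
        self.move = move    # callback that performs the migration
        self.streak = {}    # rule name -> consecutive approvals

    def capacity_check(self, system, used_pct, workload, target):
        if used_pct < 80:   # policy threshold from the interview
            return
        rule = f"demote-from-{system}"
        if self.streak.get(rule, 0) >= self.AUTO_AFTER:
            self.move(workload, target)  # trusted: act without asking
            return
        if self.ask(f"{system} is at {used_pct}% capacity. "
                    f"Move {workload} to {target}?"):
            self.move(workload, target)
            self.streak[rule] = self.streak.get(rule, 0) + 1
        else:
            self.streak[rule] = 0        # a 'no' resets the trust streak
```

The point of the streak counter is exactly the apprehension he mentions: the system earns the right to automate by accumulating approvals, rather than being trusted on day one.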
All right, so I think at Wikibon we would agree that orchestration, which you own, is really the next battle. It's where the software-defined environment is created.

That's right.

So I'll give you a little bit of a loaded question. With the Lenovo deal, you're handing the hardware for compute and networking off to a good business partner, who's taking the people too. If the value's in the software, and that's what you own, does IBM still need to have the storage hardware? What's the value in owning that piece, and why is it kept while the other pieces are gone?

So you're right about the business dynamics around the commoditization of some of the computational and networking things, and we've got a great business partner in Lenovo. The reality is, whether we leverage our supply chain or a partner like Lenovo's supply chain, the fact of the matter is we got parts from everywhere and bundled them together. That may not have been a business we wanted to be in long-term, so a trusted partner now does the supply chain and the aggregation. You may have heard our CEO, Ginni Rometty: data is the world's next currency. He or she who owns the data, creates the data, manages the data, and protects the data is a very trusted advisor and confidant to anybody creating and storing data. So we know how to store the data, how to do it efficiently, and how to do it in a way that gives customers a choice of whether they use our things that generate heat, or someone else's, to store the bits. We want to abstract that layer so the users and the applications know their data is protected, well managed, optimized, moved around, and available when they need it to be available, maybe in milliseconds, maybe in multiple seconds, depending on how much they want to pay against an SLA. And we want it accessible to, I'll just call it the line of business, but it's the users and applications, the things generating the data, while abstracting the data from the underlying infrastructure. So data is a world currency. Data is the new currency of the IT industry, and we want to make sure we keep control over where it goes, how it's managed, and how we do analytics on it and protect it.

Yeah, so in the keynote, one of the things I heard was that CIOs are tired of being jealous of Google and the like. We track the hyperscale players, and many times people talk about how they use commodity gear and build their own stuff, but it's really the operational model that separates the typical enterprise from the hyperscale guys. If you talk about Facebook managing 20,000 servers with a single admin, the enterprise is lucky if it can do 300 to 700. How far can we go today with the operational solutions you deliver? How far can the enterprise close that gap and stop being jealous?

Look, on the storage side specifically, I know right now, right today, we've got many clients out there managing multiple petabytes with one admin.
So, being able to implement a virtual storage system and include the virtualization capabilities in addition to the management: the trick is exposing characteristics and analytics to those admins to make them very, very efficient, then providing a single user interface, a single console, to those devices, and taking advantage of automation. Once you realize that a best practice incorporates processes one, two, three, four, go ahead and automate it. Once I hit 80%, move the workload over, kick off a procurement process to buy more spinning disks or more flash. I don't think there's going to be a whole lot of slowdown in the growth of data anytime soon, so it's incumbent upon us as vendors in the industry to help customers get control of the infrastructure storing all these things. There's a large set of techniques right now that will let you understand what data's not being used and what data might be ripe for deletion: defensible disposal, secure erasure, whatever you want to call it. But customers are very apprehensive to do that. So it's incumbent upon us to help them manage that growth and understand the analytics of that data until they get to a point where they become comfortable saying: my data's growing at 30% a year, and I can actually get rid of the 15% of my legacy data that may not have any benefit to me as a business. It's getting over the hurdle that says, I can and I'm willing to get rid of that data, because I know the risk of keeping it outweighs the benefit of anything I might do with it in the future.

So what you were talking about, the human-intervention factor, "do you really want to do this? do you really want to do this?", calls to mind a new way of doing things. And I want to key on something one of our guests said last year on theCUBE: that this notion of policy-based management is going to go away, that it's just too hard for humans to define those policies; things change too fast. IBM and others, me included, are often fond of saying, hey, everything we've seen today we've seen before in the mainframe: virtualization, system-managed storage, all this automation, everything else. But one of the things SMS didn't do is, when the data characteristics changed, those changes didn't ripple through to the policy. Is that finally the nirvana? As the environment changes based on the data, are we going to be able to place the right data on the right device for the most optimal cost savings, to your earlier examples, so we can save tens of millions of dollars? Are we there? How far away is that?

So technically, if we're not there, we're really, really close. The ability to do it exists today, and 80% of the workloads are already there. It's always the other 20% we tend to worry about instead of the 80%. Today, the built-in analytics will say: look, we know the usage patterns of the infrastructure have these five characteristics, and based on those characteristics, I recommend as a system that you do this. We have that today. We do. You can move it from tier one to tier two. You can move it from this system to that system. You can actually move it from on-premise to off-premise, to a trusted public cloud provider, and have this hybrid approach, right?
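As a sketch of what a characteristics-driven recommendation might look like: the features and thresholds below are invented stand-ins, since the interview doesn't name the five characteristics, but the shape of the logic, observed usage in, placement recommendation out, is the idea being described.

```python
def recommend_placement(stats):
    """Toy rule set mapping observed usage characteristics to a
    placement recommendation. A real analytics engine would learn
    these rules; here they are hard-coded for illustration."""
    # stats keys: iops, read_ratio, days_since_access,
    #             growth_pct_month, latency_sensitive
    if stats["latency_sensitive"] and stats["iops"] > 10_000:
        return "tier1-flash (on-premise)"
    if stats["days_since_access"] > 90 and stats["growth_pct_month"] < 1:
        return "cold object store (off-premise cloud)"
    if stats["read_ratio"] > 0.9:
        return "tier2-disk with flash read cache"
    return "tier2-disk (on-premise)"

print(recommend_placement({"iops": 15_000, "read_ratio": 0.5,
                           "days_since_access": 0, "growth_pct_month": 5,
                           "latency_sensitive": True}))
# -> tier1-flash (on-premise)
```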
All within the technology that exists today. The policies exist today in the technology, the best practices are known, and the system can actually learn. That's the beauty of Watson, right? The environment changes all the time. So think of a mini Watson as part of this analytics engine and this virtualization platform we have. It says: we know, based on world events or the typical usage of this type of data, structured or unstructured, this application versus that application, this database versus that database, this e-commerce app versus that e-commerce app, what a typical environment will look like.

Time of year, time of month, time of day. All of that stuff, right?

Yeah, exactly. Black Friday for retail is a little bit different than, say, July the 15th. Access patterns spike twice a month around pay periods, or when people have to sign up for their healthcare benefits, or when 401(k) contributions come due, all of these things. You can only imagine what the systems at the IRS look like in the US the first two weeks of April. There's a bit of a spike. And you know when the next big spike is? It's in August.

Yeah. Why?

Because the extensions are due.

The extensions are due. How would I know that?

Yeah, exactly. But do you think the IRS has the same deployment model in January or September that they do the first two weeks of April and in August? No. They move stuff around. They have data. They know the pattern based on history, based on geographies. There's a large preponderance of extensions in the Northeast, so they know the data centers in the Northeast have to be more aware of extensions than, say, those in California. The Midwest, man, most of it comes in in April. But they know that, and it's smart and it's built in.
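A calendar-aware policy like the IRS example might be sketched like this; the months, scale factors, and function are all hypothetical, just to show a known seasonal pattern feeding a provisioning decision ahead of the spike rather than after it.

```python
import datetime

# Hypothetical demand calendar for the tax workload from the
# conversation: month -> scale-up factor over the baseline.
SEASONAL_FACTOR = {4: 3.0, 8: 2.0}  # April filing, August extensions

def provision_for(baseline_iops: int, when: datetime.date | None = None) -> int:
    """Size the tier-1 footprint for a known seasonal spike,
    instead of reacting after latency has already degraded."""
    when = when or datetime.date.today()
    factor = SEASONAL_FACTOR.get(when.month, 1.0)
    return int(baseline_iops * factor)

# In April, plan for 3x the baseline; in September, back to 1x.
print(provision_for(5_000, datetime.date(2014, 4, 10)))  # -> 15000
print(provision_for(5_000, datetime.date(2014, 9, 10)))  # -> 5000
```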
Awesome. All right, well, we've got to leave it there. Thanks so much for coming back on theCUBE.

Thank you for having me. It's an awesome show at Edge.

Appreciate it. What have you got next?

Well, we're in the process of releasing lots of good technologies for customers to eventually take advantage of. As with any good technology provider, we're always a bit ahead of what the customer can actually implement; it's just a matter of them getting comfortable adopting the latest and greatest technologies.

All right, we'll try to help you catch up. Keep it right there, everybody. We'll be right back to wrap. This is day one. This is theCUBE. We're live from IBM Edge. We'll be right back.