Live from Las Vegas, Nevada. Extracting the signal from the noise. It's theCUBE, covering IBM Edge 2015, brought to you by IBM. Welcome back to Las Vegas, everybody. This is theCUBE, this is IBM Edge 2015. I'm Dave Vellante with Stu Miniman. Jon Toigo is here, he's the managing partner of Toigo Partners International. Consultant extraordinaire, storage guru, and provocative dude. And also the Penn and Teller of storage. Jon, great to see you again. It's a pleasure to be here again. So this is your third Edge, I think, our third Edge. Yep, I actually have been here since the very first one in Orlando. Right, yeah, us too, and I've said frequently, as it relates to IBM's storage business, it's got to shrink to grow. Well, they have the shrink part down. They've done that. The systems business is rationalized. Jettisoned the x86 business, kind of out of the Gerstner playbook. And have we hit bottom? Has IBM hit bottom? Let's wait and see. I'm looking to see. I would say the big announcement that they made, other than tape, which went to 220 terabytes on a cartridge just about a month ago. Yep. I would say the other big announcement that they've made is the rebranding of a lot of their storage product line into Spectrum. So, you know, their software-defined storage is actually XIV software off the XIV array, which is now a Spectrum product. They have Spectrum Accelerate, which I believe incorporates their flash stuff. They have Spectrum Archive and Spectrum Protect. That's very interesting. And I made a little quip about it in a blog recently. I said, you know, there are a lot of ways to talk about integrated technology. And one way is just to list all the products on the same brochure, and it's integrated at the brochure level. I'm hoping that we're integrated beyond the brochure level. And I'm here to find that out. We were at EMC last week, and somebody joked the only time the products all come together is on the PO.
And so, that was a pretty funny line. That's true, and it also comes together when they're packaging the company for sale. But it's not just EMC, right? I mean, this is sort of a symptom of our entire business. So, okay, so IBM, from a storage perspective, is sort of redoing the portfolio. Jamie Thomas, now in charge. She's a software executive. You're sanguine about the software-defined strategy. Yeah, actually, I've been tracking software-defined for a long time. To me, there are a couple of different ways to look at it. One, obviously, is that it's a back-to-the-future moment, okay? I mean, when I first started in this business 30 years ago, it was a mainframe, and SMS, system-managed storage, was what we were working with. So we had these dumb, direct-attached devices called DASD, and all that chewy goodness of the value-add software was in a software layer that lived on the operating system of the mainframe. Now we're back to the future. We're going back to that model again. I'd say that the one thing that has me a little concerned is that for all of the, I think, intelligent engineering that's being done to migrate value-add software off the array controller and stick it in a layer of software on the server itself, they're ignoring capacity management in most cases; that functionality is staying on the array controller. That's a huge mistake. I mean, if you look at what DataCore Software is doing, they virtualize the capacity. You look at SAN Volume Controller from IBM together with the XIV software; they join those together and virtualize the storage capacity as well as all the services that go with it. So you talk about back to the future, and you use system-managed storage as an example, but part of the drawback of system-managed storage, as I recall, was that you would physically allocate a data set to a class of storage, and then if things changed, you'd have to manually reallocate. So there wasn't that sort of intelligence. Is that changing?
Actually, I think we're talking about a slightly different era, though. Maybe that was the 70s. In the 80s, we saw DFHSM, and we saw the migration of data based on whatever parameters you associated with that data, date last accessed, date last modified, whatever, and you'd move it from one tier of storage to another. The thing that really has me concerned right now is when I look at all the hoopla around hyper-converged infrastructure. You take a commodity server and commodity storage, and you glom them together with some software-defined storage, and you sell it as an appliance, right? Between that and maybe Hadoop, they're flattening out the storage infrastructure. There's just one speed of storage, one capacity of storage, and that's for everything. And how do you tier in an environment like that? I mean, I've never thought that tiering was a solution as good as archive, but we're going to have to get a lot smarter about how we do this. Now, you look at an object storage product like Caringo. Caringo has something they call Darkive, and Darkive takes the data that's gone to sleep, moves it to a set of spindles, and then spins those spindles down. I don't know that I love that, but it's better than nothing, and it's sort of a shelter-in-place idea, okay? We're going to have to start thinking in those terms for virtually all of our data if we go to a flat infrastructure model. Yeah, so Jon, actually, if I could respond to the hyper-converged piece, I think it gets boiled down a little bit too much to the appliance, and many customers, especially in the mid-range, will buy a couple of nodes and do that, but our vision for that space is really that you're creating a new platform for storage. The services can be layered on top of that. Many of them are flash and disk, some are doing all flash, but it really allows you to flexibly add and remove components and services and applications.
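The DFHSM-style policy described above, migrating data between tiers based on date last accessed, can be sketched in a few lines. This is a toy illustration only; the tier names and thresholds are assumptions, not any vendor's actual policy.

```python
import time

DAY = 86400  # seconds in a day

def pick_tier(last_access_ts, now, hot_days=30, warm_days=180):
    """Return a tier name based on how long ago the data was last accessed.
    Thresholds are illustrative, not taken from DFHSM or any product."""
    age_days = (now - last_access_ts) / DAY
    if age_days <= hot_days:
        return "flash"
    if age_days <= warm_days:
        return "disk"
    return "tape"

# Hypothetical data sets with made-up last-access timestamps.
now = time.time()
datasets = {
    "payroll.db": now - 2 * DAY,       # touched two days ago
    "q3-report.doc": now - 90 * DAY,   # three months idle
    "logs-2013.tar": now - 700 * DAY,  # long asleep, Darkive territory
}
placement = {name: pick_tier(ts, now) for name, ts in datasets.items()}
```

In a flat hyper-converged pool, the point in the conversation is that there is nowhere for the "disk" and "tape" placements to go; every data set lands on the same media regardless of age.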
Almost like, you probably don't like the term cloud, but it's trying to make it simple, as opposed to saying, okay, I've got some infrastructure and I need to upgrade it, I need to migrate to the new thing. We called it server SAN, which was taking the early benefits of networked storage and bringing them closer to the compute layer. A scalable, truly distributed architecture is what we say is needed; it happens to come in an appliance format today just because that's what makes sense, but it's things like storage services for Microsoft and others that are going to make this kind of new generation of storage devices. And don't get me wrong, I don't disagree with that concept at all. I think that anything we can do that will drive cost out of the storage infrastructure and drive efficiency up is a good thing. I think that there's still a debate that needs to be had about what services must still persist at the array controller level. I don't think we've really gotten to the end of that yet. It might be something as simple as whether RAID or erasure coding needs to be done close to the data. That's a question that has yet to be answered. I think it's a good subject for a paper. What I look for right now when I look at hyper-converged is two things. One, am I flexible in terms of what workload I can support? Right now, most of the hyper-converged products that are out there are isolated behind a specific hypervisor. So you're going to have a proprietary stack of technology behind VMware and a proprietary stack of technology behind Hyper-V, although I was at Microsoft Ignite last week, and a Microsoft guy said, oh no, no, no, we're open. And I said, really, you're open? And they said, yeah, we're open. You can take those VMDK files and migrate them right over into the Hyper-V software-defined storage pool, clustered Storage Spaces. We give you a little utility that converts them from VMDK into VHD. Yeah, absolutely.
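The "RAID or erasure coding close to the data" question above is easiest to see with the simplest possible redundancy primitive: XOR parity, the building block of RAID-4/5. This is a toy sketch, not a production erasure code.

```python
from functools import reduce

def xor_parity(blocks):
    """Compute a parity block as the byte-wise XOR of equal-length data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Recover one lost block: XOR the survivors with the parity block."""
    return xor_parity(surviving_blocks + [parity])

# Three data blocks striped across three devices, plus one parity device.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# Lose the middle device; reconstruct its contents from the rest.
lost = data[1]
recovered = rebuild([data[0], data[2]], parity)
```

The point of doing this near the data is that a rebuild reads every surviving block; pushing that traffic across a network from a remote software layer is exactly the cost the debate is about.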
Absolutely, and importing them into Azure is open. The door's always open for your application to come our way. But an interesting point you mentioned: Microsoft's been pretty open lately. They're doing a lot with Linux. Azure's pretty open. I mean, we're here at an IBM show. I still remember when IBM put a billion dollars down on Linux. Agreed. Microsoft's doing pretty well on open these days. I agree, I agree. And the other thing that I see as an encouraging sign is Nutanix, which used to be sort of an exclusive VMware play. I don't know if they had a falling out with Master Blaster or what, but they've now announced they're going to have their own little hypervisor sitting on top of their storage rig. So instead of migrating all the software-defined storage over to the server side, we migrate the server down to the storage. Well, EVO:RAIL, right? When that was announced, of course, all the guys from Nutanix and the other hyper-converged guys would say, hey, that's validation. Behind the scenes, they were like, hey, those are fighting words. Strategically, that is a concern of mine: can I support all workloads? Because we already know, if you believe Gartner and IDC, we're going to have 75% of applications virtualized by the end of the year, okay? And another 25% that are high-performance transaction processing stuff that nobody wants to virtualize. So you've got at least two different storage targets now. Two different work sets, two different data sets that are going to have to be protected, hosted, et cetera, et cetera, right? And then take the flip side of that: in the survey data that started coming out last year, we discovered that 80% of the companies surveyed said that they were going to diversify their hypervisors. So now they're going to have more than one hypervisor. Now we're going to have multiple stacks of data that we have to manage, and no common way to do it. That's ridiculous, okay?
So what I'm looking for are the SDS guys whose hyper-converged solutions will play across multiple workloads, and then I also want to see, on the Y-axis if you will, hardware flexibility. I don't want to be joined at the hip to proprietary hardware. You look at EVO:RAIL, which you mentioned: you've got a set of hardware selections, and you can have it in any color as long as it's black, okay, versus a StarWind or somebody else, any of the third-party guys who are pretty much open; they're trying to embrace the broadest range. Well, they kind of have to be, right? Why not? They don't have a proprietary install base. Well, now hyper-converged works for them, because all the server vendors who've been getting their butts kicked selling commodity product are now trying to join in alliances with anybody that'll join with them to create their little appliance for hyper-converged. So you've got Huawei making a deal with DataCore, and you've got X-Bite Technologies now doing a deal with StarWind. Everybody and his kid sister is lining up to get into the process of being certified for HP, for Cisco UCS, and for Lenovo. So everybody wants to be relevant. This may provide the glue that will allow them all to come back in again. If I can come back, you brought up a really interesting point that seems to be overlooked by most, which is that most hyper-converged solutions are tied to that virtualized workload. And there are plenty of workloads, not just certain legacy workloads, but some things like Hadoop, that we're not virtualizing today, and therefore I need bare metal. When IBM bought SoftLayer, that was one of the things that was a little bit compelling: you know, it's physical. I can do virtualization if I want. I can do something else. And that does tend to get glossed over a bit. I get it. And frankly, that's where all my research and my attention is going these days, because it seems to be the only area of momentum in storage right now.
I know Flash is still out there, but Flash has suddenly gotten sane. You know, I mean, you talk to Eric Herzog here from IBM. Eric isn't pushing Flash as the solution to everything, you know? Well, go ahead. No, I would say early on there was a lot of oversell. Flash was being posited as the reincarnation of the savior or something, which is totally false. If I'm going to do, for example, a write buffer, I'm not going to use Flash for that. I'm going to use DRAM for that, because it writes faster than Flash does. And we have to simply acknowledge that. Also, I would mention, just tying it back to hyper-converged and software-defined, that Flash is terribly misused by both VMware and Microsoft currently. They're not optimized to use Flash properly. In the case of Microsoft, when they do their deduplication process, they write it to Flash, and then they're doing all the small-block updates to the dataset, and they're hammering the Flash over and over and over again with small-block writes. And it's like, how do we burn out Flash, you know? Let's do it fast. Well, we'll take care of that with wear-leveling. Okay, but so let's talk about the economics of Flash, which are starting to get really interesting when you count not only deduplication and compression, which everybody talks about, but the data-sharing aspects of it. In other words, the copies that you no longer have to physically create on different storage devices, because you can serve them out of Flash. A lot of people think that Flash is going to be less expensive than virtually any spinning disk. It already is less expensive at the high end. So back to your point about tiering: why not have just a Flash layer and a bit-bucket one-way trip to tape land? I've suggested that, I think on our last program together. I don't want to be one of these disk-is-dead guys any more than I want to be a tape-is-dead guy or a Flash-is-dead guy. I think there's room for everybody in all these things.
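The economic argument above reduces to simple arithmetic: raw media cost divided by total data reduction, where copy sharing multiplies the reduction just like dedupe and compression do. All prices and ratios below are illustrative assumptions, not vendor quotes.

```python
def effective_cost_per_gb(raw_cost_per_gb, dedupe_ratio, compression_ratio,
                          logical_copies):
    """Back-of-the-envelope effective $/GB after data reduction.

    With copy sharing, N logical copies occupy one physical instance,
    so they multiply the effective reduction as well.
    """
    reduction = dedupe_ratio * compression_ratio * logical_copies
    return raw_cost_per_gb / reduction

# Flash with reduction and copy sharing (all figures hypothetical).
flash = effective_cost_per_gb(0.50, dedupe_ratio=4, compression_ratio=2,
                              logical_copies=3)   # 0.50 / 24

# Spinning disk serving the same data with no reduction and real copies.
disk = effective_cost_per_gb(0.05, dedupe_ratio=1, compression_ratio=1,
                             logical_copies=1)
```

Under these assumed ratios the flash figure lands around two cents per effective gigabyte, below the unreduced disk figure, which is the shape of the crossover being described.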
You know, what we do with storage technology in particular is use it as a mechanism to spoof, okay? That's what it's really all about. There are only two ways to speed things up: the real way, parallelization, or the phony way, spoofing, making it look like it got faster. NetApp's been doing it for years. They stick a PAM card, a memory chip, in front of the disk at the back, because the disk is slow. So you write your data and it says, okay, we got it. It actually just wrote it to a memory buffer, and it's waiting its turn. It really doesn't have it. It's got a bottleneck waiting to happen. Exactly, but what they basically have is a spoof going on, okay? We've been doing it in mainframes with channel extension for years. You know, you write to a device locally that pretends to be the device you're writing to. It's actually an emulation. It writes across a WAN. There's a device on the other end that pretends to be the mainframe, and it writes to the local devices that are attached. That's spoofing, okay? And there's nothing wrong with spoofing. It's a time-honored engineering tradition. And that's a role that Flash has, I think, been used for optimally in a lot of respects. Flash has really become a spoofing layer, just as disk was prior to that, in front of tape; we were writing to disk first because tape was a little slower to take the write. Not so much anymore. I mean, I'd say tape is probably faster than everything except Flash, right? Yeah, that's right. It's the time to last byte. Right. You know, it's going to be faster on tape if you fill the buffer properly. But you talk about back to the future: I think of MVS/XA and expanded storage. And the problem was it wasn't persistent. Now, with Flash as a memory extension, you actually can get a persistent version of expanded storage, if you will. Do you think that has potential? I do.
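The "spoof" pattern described above, acknowledging a write the instant it lands in a fast buffer and draining it to the slow backing store later, can be sketched as a toy. This is purely illustrative, not any vendor's design; the class and key names are made up.

```python
import queue
import threading

class BufferedStore:
    """Toy write-behind buffer: ack fast, persist slowly in the background."""

    def __init__(self, backing):
        self.backing = backing        # slow persistent layer (here: a dict)
        self.buffer = queue.Queue()   # fast staging area, the "spoof"
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, key, value):
        self.buffer.put((key, value))  # acknowledged before it's persistent
        return "ack"

    def _drain(self):
        while True:
            key, value = self.buffer.get()
            self.backing[key] = value   # the real (slow) write happens here
            self.buffer.task_done()

    def flush(self):
        self.buffer.join()  # wait until the spoof catches up with reality

backing = {}
store = BufferedStore(backing)
store.write("blockA", b"data")  # returns "ack" immediately
store.flush()                   # only now is blockA guaranteed in backing
```

The gap between the ack and the flush is exactly the "bottleneck waiting to happen" mentioned in the exchange: the data the host believes is written exists only in the buffer until the drain thread catches up.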
And frankly, I see Flash being inserted into a lot of roles where it simply provides that buffer, provides the additional space to allow other componentry to catch up. And it basically keeps your infrastructure synchronized with the latest speeds-and-feeds requirements of the workload. What I really look for in storage, though, is this: the economy may be showing signs of turning around, but I think people are still very nervous about what stuff costs. And to a certain extent, the Flash numbers look very appealing. They look enticing. But I think the bigger issue is software-defined storage. You know, when I look back at the original Data Domain deduplication rig, it was 30 one-terabyte hard drives that cost 79 bucks apiece on Newegg, and they were selling the rig for $410,000 because it had this wonderful, chewy-goodness, value-add deduplication software sitting on it, right? That's ridiculous, and that had to change. So if we can take all that chewy-goodness stuff that the vendors were using to artificially prop up the price of the hardware, extract it out, and throw it into a software layer, I think we're doing the Lord's work. And give me an API so I can programmatically, you know, manage it. Exactly. Well, I don't want just an API. I want REST. You want REST, yeah. Absolutely. I know that's formulated as an API to most users, but, you know, unfortunately, it hasn't really been jumped on with the vigor that one would expect, mainly because hardware vendors see no value in improving common management, right? All right, Jon, we're out of time, but I love having you on as a guest. You really connect the dots in the business. You pay attention to the little guys and where the innovation is actually happening and try to push that, you know, on the larger guys and help customers, you know, see the way through. So thanks very much for coming on. Thank you very much. I appreciate the opportunity.
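The arithmetic behind that Data Domain anecdote is worth spelling out, using only the figures quoted in the conversation, rounded for illustration.

```python
# Figures as quoted above: 30 one-terabyte drives at $79 each,
# appliance list price of $410,000.
drives = 30
price_per_drive = 79        # dollars per commodity 1 TB drive
list_price = 410_000        # dollars for the appliance

hardware_cost = drives * price_per_drive        # raw media cost
software_premium = list_price - hardware_cost   # the "chewy goodness" premium
markup = list_price / hardware_cost             # list price vs. media cost
```

Roughly $2,400 of commodity disk carrying a six-figure software premium, a markup well over a hundredfold, is the gap that software-defined storage is being argued to close.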
Keep right there, buddy, we'll be back with our next guest. This is theCUBE, we're live from Edge 2015. We'll be right back. theCUBE has been called the ESPN of tech, and really our vision is to cover every event that's out there. We really...