Live from Las Vegas, Nevada, it's theCUBE at IBM Edge 2014, brought to you by IBM.

Welcome back to Las Vegas, everybody. I'm Dave Vellante with Jeff Frick. Jon Toigo is here. He's an author, a consultant, a pundit, an IT practitioner; drunkendata.com is his blog, and if you don't know Jon, check that out. We're here at IBM Edge, day one of our wall-to-wall coverage. This is theCUBE. We go out to the events, we extract the signal from the noise, and we have great guests like Jon on. Jon, welcome back to theCUBE. It's good to see you again.

Thank you, Dave. I appreciate being invited back out again after our last encounter.

Well, you're always a great guest. Controversial, you don't pull any punches, you don't put up with vendor BS, and we love independent thinking on theCUBE. So what's new? We're going to talk about software defined and a tape renaissance. You've got a new organization called IT-SENSE.org. But let's start with Edge. What are you doing here?

Well, first of all, I came out here at the invitation of IBM. This is the third year of this conference and the third year that I've been both a presenter in TechEdge and a participant in their social media activities, because I like to tweet and be snarky and make observations about what's going on, what should be going on, what I'm hearing, and what I'm not hearing. And apparently I haven't offended enough people yet to get barred from this conference, so they keep making the mistake of inviting me back.

Well, let's see what we can do about that. So the big meme is software defined, and we hear all kinds of funky stuff about separating control planes and data planes, commodity storage, object storage. You're an advocate for IT practitioners; that's your brand, and you've always put that forth. What are you telling IT practitioners about this whole software-defined buzz?

Well, first of all, I think we're all in agreement on the general purpose of software defined. The concept is alluring: a Lego building-block approach to building infrastructure, allocating resources on the fly to a specific requirement, getting to atomic units of deployable technology and being able to shape those into the molecules we need in order to do our jobs. That would be a wonderful world of wonderfulness.

It also flies in the face of what is now nearly 60 or 70 years of vendor proprietariness. Vendors realized, in the storage realm for example, that everybody's selling a box of Seagate hard drives, but they don't want anybody to know that. So they deliberately raise obfuscating measures that they call value-add software to make sure their box will not interact with their competitor's box. Try moving data from an Isilon platform at EMC over to a NetApp; you get a real clear picture of what I'm talking about. Try moving data off a Centera. Thank God they're decommissioning that platform now. You end up with data roach motels, islands of automation. That's a huge problem, and it's an impediment to our ability to respond in an agile way to the needs of the business.

So yes, in concept and in theory, software defined is the way it should be. The problem with software defined is, number one, a semantic problem. I've been doing software defined for the 30 years I've been in this business.
I decided what part of the business process I was going to automate with application software, then I looked at that application software, deconstructed it, and figured out what kind of infrastructure I was going to need to support it. It's always been software defined for me. This is just another spin on something we've seen before: a spin on cloud, a spin on as-a-service, a spin on virtualization, all of which are just change-ups of the same nomenclature every six months. There's a leading vendor out there, I won't say which one, EMC, that likes to change the terminology every six months because it really isn't doing any innovating at all. So they just change the nomenclature, make it sound like the sexy new thing, and they've got a bunch of fanboys who will go along with them.

The real problem I have with this generation of usage of software defined is that we're told certain things about it that I think are unrealistic. One of them, and I'm covering this while I'm here at Edge, is that the need for disaster recovery goes away, because software defined is all about high-availability architecture: we're building in, rather than bolting on, the recoverability of the infrastructure. I read that in articles that have been placed, usually by vendors, in most of the major pubs right now. DR is dead, the death knell of DR. It's sort of like what you hear about tape: tape is dead, the death knell of tape. We've heard that from how many pundits, how many analysts, how many different times, right?

What they're really saying is that they've decided to put some high-availability elements in their internal infrastructure. That's great, assuming the high-availability architecture works, which, in the case of VMware, my experience with a site failover is that it works about 40% of the time that I try it, inside the same subnetwork. So I'm not really sold that it even works there. But we've always had high availability as part of the set of options available for doing disaster recovery. It was always one of the capabilities we could bring to bear. It was just so expensive to replicate everything somewhere else that only the companies with the deepest pockets could afford to do it. You had banks building redundant data centers in the '80s and failing over between two different sites. But if you asked a mom-and-pop shop whether they were going to pursue that method, they were going for hot sites, shared facilities, or they were doing laissez-faire, next-box-off-the-line recovery, because that was what they could afford.

Disaster recovery isn't just about perfection. It's also about budget. Your ability to protect your assets is directly proportional to how much money you've got in your budget to spend on it, and to whether management wants to spend money on a capability that, in their view, in the best possible circumstances, will never need to be used. So I'm concerned that the software-defined rhetoric is beginning to make it sound like DR and business continuity planning are yesterday's news, something we don't need to concern ourselves with anymore. That plays, of course, into the receptive ears of people who don't want to spend a lot of money on IT, but it isn't consistent with reality.
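A quick back-of-envelope illustration of that point, purely editorial: the 40% failover success rate is the figure Toigo cites above, while the event frequency and DR-plan success rate are assumptions made only for the sake of the sketch.

```python
# Back-of-envelope: HA failover alone vs. HA backed by a DR plan.
# The 40% failover success rate is the figure cited above; the event
# frequency and DR success rate are assumptions for illustration only.

p_failover_ok = 0.40        # observed site-failover success rate
site_events_per_year = 2.0  # assumed outage-causing events per year
p_dr_recovery = 0.95        # assumed success rate of a tested DR plan

# HA only: every failed failover becomes a full outage.
outages_ha_only = site_events_per_year * (1 - p_failover_ok)

# HA with a DR plan behind it: DR catches most failed failovers.
outages_ha_dr = site_events_per_year * (1 - p_failover_ok) * (1 - p_dr_recovery)

print(f"expected full outages/year, HA only:  {outages_ha_only:.2f}")  # 1.20
print(f"expected full outages/year, HA + DR:  {outages_ha_dr:.2f}")    # 0.06
```

Unreliable high availability in front of no plan at all is exactly the scenario the "DR is dead" rhetoric invites.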
So how will the whole notion of DR and business continuity change, or will it, as a result of this notion of software defined?

Right now, I would say that I would have to create a continuity plan simultaneously with whatever program I'm using to roll out any kind of software defined. Why? Because I already spoke about some of the deficits on the virtualization side of the house. The processor pooling and processor abstraction, the virtualization software, the VMwares, the Hyper-Vs, all of that brings in a new mix of potential problems that are a new challenge for disaster recovery. We've seen it before. Twenty years ago they were doing service bureau computing on mainframes, and they hadn't invented PR/SM yet: one application would fail in a multi-tenant mainframe and all the applications would fail, okay? It wasn't until PR/SM and logical partitioning were introduced that we were able to isolate workloads, so that one problematic workload could fall over and the rest of them would keep on humming. VMware is powered by Jenga: one app crashes, they all come down. Now you've got a major hit to an organization at a time when we're so dependent on automation for our businesses that even a small short-term outage is a major disaster. Times have changed, and I would not even venture into that realm if I didn't have a disaster recovery plan already in place. I think it does nothing but accentuate the need for continuity planning, not eliminate it in the least.

I want to follow up again on the software-defined piece. You took a couple of shots at EMC before, but EMC would put forth this notion of software defined, and I've interpreted it as a way to sort of rationalize the stovepipes. Is that goodness?

I don't think it's goodness at all. In fact, you've hit on one thing. When you get into software-defined storage, which is a subset of that category, I'm talking about software-defined data centers, and storage is a subset, there's a war going on right now between EMC-paid bloggers and other bloggers, who are saying on the one hand that, oh no, software-defined storage isn't storage virtualization. Storage virtualization aggregates capacity, surfaces virtual volumes, assigns services to them from a centralized service provider, and then associates them with workload. Software defined doesn't do that. Software defined allows the storage to remain stovepiped and just centralizes some of the services available from that storage. That's a very different thing, and what it really is, is protecting the core value of a company like an EMC that wants to protect its stranglehold on its customers for its hardware.

So differentiate that, because customers will buy into that strategy. Of course they will. Now differentiate that from IBM. IBM's obviously got SVC, and they've had it for years.

IBM has real potential to do better than that, and I think they're a mixed bag, okay? I love IBM, don't get me wrong, I used their gear in many data centers, but I think you've got competing interests at IBM. They have the capability: they went out and bought XIV, which was software designed to be used across all your storage. It was an infrastructure-level play. And then they joined it at the hip to a proprietary controller on one array, and when you run out of space on that array, you can't expand it; you've got to stand up another array with a separate instance of that software, okay? That's stovepiping. That's wrong, okay? If they had taken that and joined it at the hip to the SAN Volume Controller, that would have made it a set of services extensible to all storage.
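To make that distinction concrete, here is a minimal editorial sketch, not any vendor's actual API, of what SVC- or DataCore-style storage virtualization does that a stovepiped "software-defined" services layer does not: capacity from heterogeneous boxes goes into one pool, and virtual volumes are carved from the pool rather than from any single array.

```python
# Illustrative sketch of SVC/DataCore-style storage virtualization:
# heterogeneous arrays contribute capacity to one pool, and virtual
# volumes are carved from the pool, not from any single box.
# Hypothetical classes; no real product API is being modeled.

class Array:
    def __init__(self, name: str, capacity_gb: int):
        self.name = name
        self.free_gb = capacity_gb

class VirtualizationLayer:
    def __init__(self, arrays):
        self.arrays = arrays      # EMC, NetApp, IBM... all one pool
        self.volumes = {}

    def pool_free_gb(self) -> int:
        return sum(a.free_gb for a in self.arrays)

    def create_virtual_volume(self, name: str, size_gb: int):
        """Carve a volume from aggregate capacity, spanning boxes if needed."""
        if size_gb > self.pool_free_gb():
            raise RuntimeError("pool exhausted")
        extents, remaining = [], size_gb
        for a in self.arrays:               # greedy placement across arrays
            take = min(a.free_gb, remaining)
            if take:
                a.free_gb -= take
                extents.append((a.name, take))
                remaining -= take
            if remaining == 0:
                break
        self.volumes[name] = extents
        return extents

pool = VirtualizationLayer([Array("emc-vnx", 500), Array("netapp-fas", 300),
                            Array("ibm-v7000", 400)])
# One volume can span what used to be three stovepipes:
print(pool.create_virtual_volume("erp-data", 700))
# [('emc-vnx', 500), ('netapp-fas', 200)]
```

A services-only layer would leave each array's free space private and merely centralize functions like snapshots on top; the volume above could never span the boxes.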
Isn't that what IBM's software defined is all about?

That's what I would hope it becomes. However, it wasn't as of last year, and I haven't heard their announcements yet this year.

Okay. I think that's the direction they're going, from what I saw last year, and from what I saw in Boston last week as well. A lot of players aren't there yet, but they're trying. Then I also want to talk about open, OpenStack, open APIs, because you have a set of companies that are giving lip service to OpenStack, and maybe it's more than lip service, and you've got others who are actually contributing. What's your take on the importance of openness as it relates to software defined?

Well, again, we were talking at the very outset about atomic units of technology. If you can get to an OpenStack platform that is truly open, that any vendor can plug and play into, and that has a set of available APIs anybody can write to, you've just done the world a big favor. That means that when I need to roll out 50 more seats of an ERP package, I can pretty much predict in advance how many more servers I'm going to need, how much more network, how much more storage, and just roll it out without a problem [a sketch of such a provisioning call follows this exchange]. OpenStack is a critical enabler of that, if it passes muster with all the vendors who claim to be supporting it.

I look more at the software-defined network space for the example of what can go terribly wrong. We have OpenFlow, we have OpenDaylight, and now we have Cisco's own thing, and Cisco basically makes a legitimate point. They say: we understand the control layer being separated from the data layer and all that, but it creates a bunch of generic edge devices that are basically clones waiting for some personality to be dumped into their brains. The fact is, we make our money off of selling proprietary hardware. You're asking us to slit our own throats to join an open initiative? That's a race to the bottom. What company is going to develop the next quality-of-service guarantee standard if they're not going to be compensated for it, if it's not something they're going to earn real money off of? Cisco's done a lot of work. They've contributed a lot of their stuff to open-standards communities, the IETF. Some people would argue the IETF is a Cisco beast, okay? But the fact is that they take the stuff that's become kind of generic and tip it over into the public domain after a while, and the stuff that adds real value, that they feel is going to generate revenue for them, they hold on to. And I can't blame them.

And how is that different from EMC? Isn't EMC just protecting its money the same way?

I'm not disagreeing with you at all, except that I think we've reached a point, we already reached it with servers, where servers are so much the same that there isn't enough of a differentiator, and you're just gouging the consumer by trying to make this server seem better than that one. I think we've reached that point with storage too. Storage is a box of disk drives with the same adapter cards, the same plumbing, the same cabling. And when you get into the flash realm, you've got an overabundance of flash vendors; there's going to be an implosion in that market. They're already using cutthroat pricing to undercut each other to get the customer, and all of them are basically buying and selling logo art at this point. I see that whole market crashing very, very soon.
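As flagged above, here is a minimal sketch of the kind of open provisioning call being described, an editorial illustration using the OpenStack block-storage (Cinder) Python client of roughly that era. The credentials, endpoint, volume type, and QoS values are all hypothetical, and exact signatures vary between client releases.

```python
# Editorial sketch: provisioning block storage through an open API instead
# of a vendor-proprietary tool. Uses the python-cinderclient v2 interface
# (OpenStack block storage, circa 2014). Credentials, endpoint, and values
# are hypothetical; exact signatures vary between client releases.
from cinderclient.v2 import client

cinder = client.Client("admin", "secret", "erp-project",
                       "http://openstack.example.com:5000/v2.0")

# Define a QoS level once, centrally, rather than per-array.
# (Associating it with a "gold" volume type is omitted here for brevity.)
cinder.qos_specs.create("gold-qos", {"read_iops_sec": "5000",
                                     "write_iops_sec": "2500"})

# Fifty more ERP seats become a predictable bundle of volumes, satisfiable
# by any backend that plugs into the same API:
for seat in range(50):
    cinder.volumes.create(size=20,
                          name=f"erp-seat-{seat:02d}",
                          volume_type="gold")
```

The particular client doesn't matter; the point is that one uniform API makes those "atomic units" predictable regardless of whose hardware sits underneath.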
Jeff and I were talking about that earlier, the question of sustainable differentiation in storage products.

As it currently stands, sure. However, I've had great meetings with people who are working on next-generation flash RAM that doesn't have the wear problems. I had a meeting over in Germany with a guy who won the Nobel Prize for conductive polymers. He says pretty soon we're going to be making memory chips out of plastic that you can manufacture for two cents each. Okay, that will radically change everything. He says imagine your network cable going away, and all your data simply replicated from chip to chip to chip to its destination. It's a viable thing. They're already putting them in teddy bears. That was his big thing: we've got them in teddy bears.

And when they hit toys, you know it's commoditized.

How important is it from a customer standpoint to be able to make, for instance, an API call to provision, let's say, coming back to SDS, capacity and performance, and guarantee a level of quality of service, and to be able to do that through some kind of open API?

To me that's always been a goal for any customer. When I look at the model for a software-defined data center, which is essentially trying to create a set of open APIs that you can cobble together, write scripts to, and provision that way, and you look at the NIST model for infrastructure as a service, it looks a hell of a lot like a traditional data center. That's what it is. We can call things by different names, but we're still confronting the same problems. Did virtualization software give us some new tools to play with? Probably. Are they fully refined yet? No. If we could do things, say, the way DataCore does them in the storage virtualization space, we would have already solved most of the problems of storage. Okay, DataCore surfaces a virtual volume. If I'm VMware, I'm concerned that the volume needs to move with the virtual machine as it moves from platform to platform. I can already do that with a virtual volume created in DataCore SANsymphony, but nobody wants to use that. Why? Because it guts the value case of the storage vendor. He doesn't want it virtualized. He never wanted it virtualized. Why isn't IBM making more noise about SVC? Because it would eat their own profits.

A storage cartel. Absolutely.

So let's talk about tape, because it's interesting. You talk about the resurgence, and I hate tape, but we've done some research at Wikibon and I had to agree with you: there is a tape resurgence going on, and it's not only about cost. There's a nuance here. Large files, large objects on tape are actually going to perform better than disk. And this notion of flape, have you heard this?

Yes. Actually, look it up in Urban Dictionary. It has three meanings. A guy wearing a flag as a cape, that's meaning one. Aggressive flirting, pinching the waitress on the tokus, that's flaping, okay? And then there's flash-then-to-tape.

My feeling is that tape is undergoing a transition in how we look at it. For one thing, the generation that's currently in IT never dealt with tape. We've actually succeeded in creating an entire generation that's never touched it. I gave a talk over in London about two months ago where I was talking about tape libraries, and afterwards a very sincere young man came up to me and said, I need to ask you a question: why do we need to build a room with books in it to hold a tape system?
He had never heard the term tape library. He thought I was talking about a physical library that you were going to put a tape system in. Okay, at least he knew what a book was. He's never bought a CD or a DVD to watch a movie or listen to music.

Most of the dissing of tape is based either on ten-year-old technology, when tape was kind of flaky, it went through a period of hard times back in the '90s, for example, or on a conflation of the problems of backup with the problems of tape as the medium you were backing up to. Most tape backup issues come down to mismatches created by backup software, which is inherently brain-dead. Okay, I take a whole bunch of sources that I need to write data from to a tape library, and the backup software creates what's known as a superstream. Think of a braid, a braid of hair. If you've ever had long hair, which you haven't, you know that the shorter hairs fall out of a braid and the braid unravels. The same thing happens with a tape job: the shorter jobs finish first, the superstream unravels, and the job that was supposed to take four hours, the one you left running on Friday night when you went out to drink, says it's still got 20 hours left to go when you come back Monday morning. Why? Because you're backhitching and shoe-shining; you're not operating the tape at its rated streaming speed. It's not because tape sucks.

No, not at all.

It's always been because the users of tape suck. The carbon robots are the problem, okay? Not the robots.

So anyway, I always thought tape was taking it on the chin. But then they came out with LTFS, finally a viable file system for tape, and it integrates well with the GPFS stuff, which is now part of the SONAS concept from IBM. Now, all of a sudden, we have the capability of creating what I would call a hybrid NAS, okay? A company doing real work in this space is Crossroads Systems. They're preparing to make a big announcement, and I won't steal their thunder, on their StrongBox product, which has all kinds of new features and functions in it. But the point is that you've got basically a NAS head with some internal disk, where you can store some of your files or stubs of those files, and the back end is tape, which is limitless capacity.

One of my clients is a pharmaceutical house. After people started dropping dead from Vioxx, the FDA decided you need to keep your clinical trial data in a near-online state, so they can check whether they missed something if people start having adverse consequences, right? So the guy said, look, I've got more money than God, I own the patent on Viagra, okay? I can go out and buy whatever storage array I want. But he's in Connecticut, and he said, I can't get any more energy dropped into my data center; in his area, the grid is saturated. So tape with LTFS gave him an option: write all of his clinical trial data out to tape. It's hardly ever re-referenced, it's never modified, it's a perfect candidate for tape-based active storage. And he stored it all there. He said, I've got like a hundred petabytes of storage on a couple of raised-floor tiles, consuming less than three or four light bulbs' worth of electricity. Show me a disk platform that can match that.
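Some back-of-envelope math on that claim, with editorial 2014-era assumptions (LTO-6 tape, 4 TB nearline disk) rather than the client's actual configuration:

```python
# Rough power comparison for ~100 PB of rarely-read data, tape vs. disk.
# All figures are illustrative 2014-era assumptions, not measured values.

capacity_tb = 100 * 1000   # ~100 PB expressed in TB

# Disk farm: nearline drives draw power whether anyone reads them or not.
disk_tb, disk_watts = 4, 8             # assumed 4 TB drive at ~8 W spinning
n_disks = capacity_tb / disk_tb        # 25,000 spinning drives
disk_kw = n_disks * disk_watts / 1000  # ~200 kW before controllers and cooling
print(f"disk: {n_disks:,.0f} drives, ~{disk_kw:.0f} kW")

# Tape: cartridges on shelves draw nothing; only the library robotics and
# a handful of drives consume power, and mostly while actually working.
cart_tb = 6.25                         # assumed LTO-6 at ~2.5:1 compression
n_carts = capacity_tb / cart_tb        # 16,000 cartridges
library_watts = 300                    # assumed robotics + a few idle drives
print(f"tape: {n_carts:,.0f} cartridges, ~{library_watts} W")
```

A few hundred watts against a couple hundred kilowatts is the right order of magnitude, which is the entire argument for tape as an active archive tier when the power grid, not the budget, is the constraint.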
So we've got to leave it there, but you start thinking about extracting metadata off the tape up to that disk layer. And remember, disk heads aren't getting any faster.

Right, while tape heads will keep delivering bandwidth improvements. So for large files you're going to see better performance on tape than on disk, not time to first data, but time to last data. And most of the time you don't have to worry about time to first byte, because you're talking about data that's hardly ever accessed. It's the World Wide Wait, and we're all used to that.

Jon Toigo, thanks very much for setting the record straight. We really appreciate you coming on. Keep it right there; we'll be right back with our next guest. This is theCUBE. We're live from IBM Edge.