Hi everybody, we're back. This is Dave Vellante with Wikibon.org, and this is theCUBE, SiliconANGLE's production of IBM Pulse. We're here live at the MGM in Las Vegas. We've got two days of coverage. theCUBE goes out, we extract the signal from the noise. Good friend Jason Buffington is here. He's an analyst with ESG, a very well-known organization. His focus is on backup and strategies around backup. Jason, welcome back to theCUBE, it's good to see you again.
Thanks for having me.
Yeah, so Pulse, good show, 10,000-plus people. I don't know if you had a chance to catch the keynotes this morning: a lot of energy, a lot of cool stuff, great music. These tech shows are becoming Las Vegas shows.
Like a Broadway production, absolutely.
Broadway production, pretty amazing. But so what's your initial take on Pulse?
I like Pulse because there's enough data protection in it that really keeps it interesting for me. And the fact that you can't go anywhere without talking about cloud is nothing but goodness. It's a good show for us.
So what's new in your data protection world? What have you been tracking lately? What are some of the things that have been exciting you? Maybe some of the challenges that you see companies finally stepping up to address, and some of the practitioners, same.
Sure, so lately one of the things I've been really passionate about is helping people understand what I call the data protection spectrum. And what I mean by that is that data protection really ought to be thought of as kind of the umbrella, with things like backup and snapshots and replication and archive and availability. Those are all like colors of a rainbow, right? And so what you ought to be thinking about when you're thinking about a data protection strategy is: when was the last time you saw a rainbow that didn't have green or blue, right? In the same way, your data protection strategy really ought to include backup, plus snaps, plus replication, plus archive, plus availability.
It's that whole range of solutions that are there. And I think also one of the things that people are starting to finally realize is that that kind of a hybrid approach, those mechanisms for data protection, should also be thought of within the context of disk plus tape plus cloud. Today's data protection really ought to have all options on the table. Think about how you want to recover first, and then pick the color of the rainbow, that spectrum line, that makes the most sense for that.
Yeah, so the basic premise there is you've got some staples that you should have in your portfolio as a practitioner: red, green, and blue.
Yeah.
And you know, it's interesting. I mean, there's always been a big spectrum of backup, right? And there's been guys who've said it's all going to go disk-based, and guys at the other end of the spectrum. I mean, back in the day, right? I mean, you had guys like StorageTek pushing tape into new dimensions and new applications. And you've just always had this, you know, bevy of colors and flavors. It's been quite a rich palette. That's not changing. If anything, it's getting more diverse, isn't it?
Yeah, it really is. You know, some of the things, so we saw folks that said, you know, everything is going to go disk. It's not all going to go disk, right? So tape is still in use in a little over half of all environments today. And in fact, with some of the new innovations we see around tape durability and the flexibility of things like LTFS, I expect tape is actually going to get a little bit more of a bump from that. One of the pieces of research I recently looked at was around the convergence of backup and archive. And it used to be that people would say, well, you know, backup was to disk and archive, that was to tape, and we don't see that as much anymore. In fact, a couple of things I thought were really interesting when we asked folks, what are you actually archiving, right? That's one of the colors of the rainbow, right?
So what are you actually archiving? I would have expected to hear people say, well, I'm storing data for seven years, 10 years, 15 years. That's actually not true. So the average age of data coming back out of the archive is under 24 months. The average size of data coming back is typically a gig or less; 75% of the time it's less than 100 gigs. And by the way, the expectation for how quickly data should come out of the archive is typically measured now in seconds or minutes. When you start taking those kinds of characteristics together, okay, so data under 24 months, small data sizes, and really fast recovery and retrieval times, all of a sudden tape isn't really that obvious for that solution either. So all of a sudden disk and cloud also become just as viable for those kinds of solutions. So the palette is really rich, and the thing is that it should not be around, well, if I'm doing this solution, I should use this media. There are no rules for that. I mean, it really has to fit.
What is the domain of tape then? I mean, there's certain economic advantages to tape, if you can find the right use case. I guess there's some bandwidth advantages if you can find the right use case. What are those use cases? Is it deep archive? Is it retention? Is it mobility?
So I like the deep archive, but frankly, for today's traditional data protection in the ROBO-type environments, tape is still a more than adequate solution for some of those distributed environments as well, where for whatever reason they can't afford de-duplicated disk. And the de-dupe is actually a big part of that, right? So if you're just going to go with disk-based backup and you're not de-duplicating, all of a sudden that gets really expensive. But for those environments that can't do that, tape is an interesting scenario there. I'll tell you frankly, so I have an LTO-5, LTFS drive at home, and it's my T drive, right?
Like, I copy files to and from it, I drag and drop, and it literally is, it's a T drive, which totally blows my mind, right? So 15 years ago, we wanted better backup than what tape could give us. So we wanted to go to disk, but oh by the way, we didn't know how to write to disk, so we made disk look like tape, and that's how we got VTLs, right? Now fast forward 15 years, and tape actually has some new legs to stand on, but we don't know how to write to tape anymore. So now we put LTFS on it and make tape look like disk. Totally flips that equation around.
So you remember sort of the whole "tape sucks," right? And that was-
I've heard that rumor before. Tape's around still.
I think maybe the bumper sticker should have been "backup sucks," right? Which is great for guys like you, because it means you-
It's not solved yet, that's true.
It's not solved. I've been in this business a long time, and backup has always been one of the hardest problems to solve. So tape, you're saying, is still used in backup for small, mid-sized businesses and remote offices, and what, last-resort DR, or not necessarily, or anywhere else? Is that it? Is that the last domain of tape?
I don't, I wouldn't want to overly simplify and say that's the last domain, because I think there's some other portability use cases where tape has legs too. My point is just that the same way that disk is not the be-all, end-all silver bullet of data protection, tape is just as much not entirely dead. And oh by the way, cloud absolutely has a place. One of the things that we were looking at though from a cloud perspective is there's a lot of folks out there that think that cloud is the silver bullet, right? As long as I write a check, all my backup problems go away. That is not true.
There's a few cases where that happens to be, but really for your backup problems to go away, it's not around cloud as a backup service, it's around the expertise and the consultancy coming in there to actually take over the management of it. But for most cloud-based data protection solutions, you're still going to run it. You're still going to have the admin, you're still going to be pushing the agents out. You're still going to be invoking the restores. At that point, cloud-based data protection just becomes the mechanism by which you deploy it. Nothing else really changes.
So we were at re:Invent, we had theCUBE there, and listening to Andy Jassy, he said that Glacier was the fastest growing product in the history of AWS. Now, Redshift was one of the other fast growing products. I can't remember, one is revenue, one is capacity, but so anyway, fast growing product.
You're all good.
So they're all good, right. So Glacier, does, in your view, Glacier sort of confirm the need for a deep archive, low cost, not intense RTO type of medium?
Right, yeah, so if you think about the cloud, right? The cloud is just a deployment model that says that we believe that we can do things more efficiently because we're doing it at scale and we have deep expertise. In a nutshell, that's really what cloud means, right? And so if that's the case, and your use case is, I need to store data for an extremely long period of time and I want to have a long shelf life on that, then that's a pretty good equation for tape. And so the difference just becomes, instead of you managing your tape, they're managing their tapes instead, but the rest of the architecture kind of stays the same. Here's the thing that I think people need to understand about cloud when it comes to data protection.
I'm going to set Glacier aside for a second, but in the more common sense of how data protection works, we looked at what the SLA expectations were of people that were going to the cloud, and by the way, the SLAs are about the same as for people doing on-premise protection, right? So just because you go to cloud does not absolve you of that rapid recovery, the expectation of fast protection; all those other rules still apply. So because of that, we're seeing that really it's going to always be a disk-to-disk-to-cloud world for almost everybody. Now there's a few exceptions to that where you get some really aggressive WAN acceleration, but for the most part, you're always going to have an on-premise, intermediary appliance for fast recovery before you go to the cloud. I did a podcast a couple of weeks ago.
And that's a, sorry to interrupt, so that's a snapshot to a separate physical device.
Well, it could be a snapshot, it could be a backup.
Or a backup to a separate, but typically a separate physical device, or not necessarily, yeah.
So for best practice, separate physical device, just to cover your bases a little bit, and then get it off site as fast as you can.
Exactly.
So it's disk-to-disk-to-cloud with an intermediary appliance there first. But here's the thing, I did a podcast a few weeks ago and in that I talked about how basically disk-to-disk-to-cloud is like having one box of Legos and three or four different instruction manuals. So I have sons, right? And so we grew up on Legos, and I am convinced that the box and that little bag of Legos, depending on which manual they stick in the box, you could turn into a motorcycle or a jet airplane, right? And disk-to-disk-to-cloud starts to look very similar to that, because you could do a cloud-based data protection service, right, which is where it's all driven from the cloud and oh, by the way, they just put a caching appliance in there. Or you could just take your traditional on-premise backup that you do today, right?
So your dedicated backup server, your dedicated backup appliance, whatever, and then just stretch it to the cloud. Either way, it's disk-to-disk-to-cloud. Really, it's just a matter of where does the management experience come from and what's the OPEX model.
So what's IBM's plan in all this? What's going on with IBM? What's going on at Pulse? I mean, where are they in this conversation?
So I'm really excited about where they're going. Now, I'm a little blurry on what all was announced today versus what I happen to know is on the roadmap. So, but I am really excited.
Yeah, so we should stay away from that then.
Yeah, we're not supposed to talk about that. But here's the parts I think are really interesting. So from a TSM investment perspective, I think 2013 was a year when IBM really said that, you know, this is not your daddy's backup. Or arguably, in IBM's case, this is not your granddaddy's backup, right? I mean, they've been doing this for quite a while.
And they've still got a big tape business.
Absolutely, and they've got a disk solution and they're coming into cloud really strong. I mean, so they've got all those right pieces. They also, by the way, I think a lot of people forget, they actually fill out that whole data protection spectrum, right? I mean, they have snapshotting. They have backup. They have archiving. They have replication technologies. They do it across disk, tape, and cloud. And by the way, they do it across not just x86, but other platforms as well. I think a lot of people forget the breadth of that solution set. One of the things I'm probably most excited about from what we saw today was around Operations Center and that new UI. You know, if you really want a proof point that this is not your daddy's IBM, it's the new UI. One of the things that, particularly for things like virtualization, so easily one of the most important workloads to protect right now is that highly virtualized software-defined data center, private cloud, yada yada.
When we tested what are the problems that people have in that kind of environment, five of the top six problems in protecting virtualized environments are visibility. Right? So I mean, you and I have been in this business a long time, right? So in the old days, if you wanted to protect a server, you went over to the physical server, you put an agent on it, you could walk the copper across the room, you saw where the backup server was, and you were done, right? You could figure out what was going right or wrong. Today, if I want to figure out what's wrong with a VM backup, well, let's see: which cluster, off of which host, with which data store, across which software-defined network? Virtualization, the abstraction, is powerful for a lot of things, but better data protection is not always one of them. It abstracts a lot of those details. And so some of the things that we saw around Operations Center, and these investments actually started in 2013, where they really made some investments around performance because they were behind, right? And then some of the workload stuff around virtualization and the integration with VMware's vCenter stack, all of a sudden, now they're in this. So it ought to be an interesting 2014, particularly as they start to more aggressively put cloud into the mix.
Now how about mobile, Jason? We've been hearing at this conference, at Pulse, a lot of talk about mobile, mobile first, everybody wants to bring their own devices, Internet of Things even. What are you seeing on mobile? What are you seeing on just endpoint backup?
Fair enough. So people, for some strange reason, there is a group of people out there that have figured that because I bought the laptop, I'm supposed to be responsible for backing it up. And that's just foolish, right? The IT pros, they are the custodians of data, right?
And whether that data lives on a corporate-paid file server or whether it lives on a laptop I paid for myself, it is corporate data, right? How can you try to absolve yourself of data protection for it just because it happens to be on a drive that you didn't pay for? So I think that's just flat dumb. And yet it's really scary, because when we actually did some research around this, we found that IT pros were having a very different strategy on whether to protect endpoint devices at all, right? And the complaints there were, well, it's not strategic enough, and I'm concerned about storage sprawl, and I think it's going to be too much effort. It's corporate data, right? That doesn't go away. But then for those folks that got past that and said, okay, yeah, you're right, I should probably protect it, they had different strategies for protecting the corporate-owned devices than they did their BYOD devices. Which is really pretty goofy if you think about it. It's corporate data. And if you take that one step further, recognize that demographically, the folks most likely to have a BYOD device as opposed to a corporate-issued device are senior management, right? You know, it's the folks at the top that said, I want that new spanky laptop of product X, right? So if you're not protecting that data, not only are you selectively not protecting some of it, you're not protecting the executives' data, right? Can you imagine the CFO coming down to the IT pro and saying, I've lost my data, and the IT pro going, I'm sorry, that's not my problem, you bought the laptop, right? So yeah, it's a crazy world we live in.
What about the challenge of copy creep, I call it, right? I mean, you've got zillions of copies, you're emailing stuff around, you've got your backup of backup, you've got stuff on your mobile device, you've got stuff up in Google Drive, you've got stuff up in Evernote, maybe that's somebody else's problem because it's in the cloud.
But if you look at the anatomy of a file and the number of times it gets copied.
Oh yeah, 12, 16, 20 iterations.
Could even be more, right? Who knows?
Yeah, but definitely dozens, right?
We can agree on that. What do you see in the marketplace? It's clearly a problem. Do you see some guys attacking it, like Actifio obviously trying to go after that problem? How real is that sort of approach and what are others doing?
Yeah, so I like the Actifio approach. If for nothing else, they've really driven a new story, right? The story of copy data management. And it's kind of that ugly secret that so many folks in IT have just kind of ignored for so long, which is if I'm snapshotting and I'm backing up and I'm replicating and I'm archiving, all of a sudden I've got dozens of copies of a file. And all of the IT vendors that are in the space trying to solve this, they're trying to figure out, I've got this really expensive data protection infrastructure out there. How do I shrink that footprint down so I can reduce my costs, right? Because otherwise I'm paying for this behemoth of infrastructure that has no value until the backup happens. So we're seeing a couple of things out there. We're seeing some interesting approaches around reducing those copies. A lot of that comes from converging that data protection spectrum. Just because you have five or six colors' worth of data protection should not mean that you have five or six different storage silos, that you have five or six different management UIs, that you have five or six different data protection strategies. We should be seeing convergence there. And the more you converge the backend storage, that's goodness, right? Because at least that's reducing your major CapEx. But then we're also seeing what those vendors can do to actually make that data more useful. Can I represent it for test/dev, or for running reports, or for end-user access enablement?
So there's a lot of interesting momentum right now on how do I make the value of that infrastructure better by providing more agility and more solutions than just, if something goes bad, I have the ability to recover it.
So Jason, last question. You've got this spectrum, this rainbow.
Yep.
IT pros have got to deal with this stuff. Backup is always like the insurance. It's not the high priority, high ROI project. It's one of those, if something goes bad, then I'm screwed. So that's the snake bit and ROI. What's your advice for IT pros that are struggling with the spectrum? They've got virtualization visibility issues. They're interested in the cloud, but they're afraid of it. They're hanging onto some tape, and maybe for good reason, but they're getting pressure to get rid of the tape. They're worried about dedupe appliances getting out of control and being too expensive. They've got others telling them just use snapshots and create data protection as a service. What's your best advice to IT pros that are confused?
Yeah. So I guess the best advice I would give is that the data protection spectrum does not have to be a complicated set of autonomous components. There's a lot of great software solutions out there that do let you do replication plus snapshot management plus backup in a single UI, to a single data store, from a single administrator's point of view. The problem is a lot of people actually don't wake up to that. The storage guy wants to do his own backups and the compliance guy wants to do his own archives. And if they just woke up and realized that we have a software solution with a single UI that all three of them could use, to a storage silo they can all three benefit from, all of a sudden it doesn't have to be that hard. I would also say that every solution should be considering the cloud as part of it, just not pure cloud. It should always go first to disk on-premise and then go to the cloud as that tertiary tier.
And when you start to do that, all of a sudden new options start to come in. In fact, when ESG looked at what was going to be the primary use cases of cloud for the next couple of years, data protection was the number one cited planned use case, disaster recovery was number three, and test/dev was number two. So it's not the production workloads, it's the other stuff that you know you need to do, and cloud gives them a new way to do it cheaper.
So I said last question, but I have another one. Data protection as a service: reality in the next year or so, or is it going to take longer?
So I like data protection as a service. The one thing I would tell you is, why BaaS when you can DRaaS?
What do you mean by that?
So backup as a service just means the files go to that other site, right? DRaaS is where I'm taking whole VMs and I'm moving those VMs into the cloud, and then if I need to recover I can spin those VMs up, right? DR as a service. And interestingly enough, most of the plumbing that makes DR as a service happen is actually built on backup as a service functionality. So you get all the benefits of backup as a service and you get DR to a secondary location.
It really is transcending backup as a service into data protection overall.
Exactly.
Great. Any, let's see, blogs that people should go to? Where do they get more information?
So follow me on Twitter, JBuff, J-B-U-F-F, and my primary blog is technicaloptimist.com.
All right, Jason Buffington from ESG. Thanks very much. Always a great guest. Really appreciate you coming on.
Thanks for having me.
All right, everybody. Keep it right there. I'll be back with John Furrier right after this. We're live from the MGM in Las Vegas. This is Pulse, and this is theCUBE.