Live from New York, it's theCUBE. Covering Big Data New York City 2016. Brought to you by headline sponsors Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris. Welcome back to New York City, everybody. This is theCUBE, the worldwide leader in live tech coverage. David Richards is here. He's the CEO of WANdisco, a longtime CUBE alum. Great to see you again. Great to be back. It was good fun hanging out with you last night. Good surprise at the IBM event. There was good action across the street. Yeah, you're both looking surprisingly well, actually. Ha ha, yes. Well, we also heard about the WANdisco versus theCUBE golf tournament, where apparently theCUBE did really, really well and then WANdisco went running away with it. Well, I talked to Furrier last night. I said, David Richards was telling me they kicked your butt on the golf course. He goes, that's true, actually. I think I've got some video proof that he actually gave me 20 bucks live on that app, because of course his wallet was empty. It was blowing the dust off of it, alligator arms. So David, it's great to see you again. You guys have been in this business since day one and things are evolving. How are things changing for WANdisco? So when we first came into this market, back in 2006, 2007, and then we obviously made a bunch of acquisitions around 2011, 2012 that took us headlong into the big data marketplace, we pretty much had a completely different business model to our business model now. Then we had a product called Non-Stop NameNode. My God, can you imagine that? That was very focused on the Hadoop marketplace, because at that time we believed, like everybody else, that Hadoop was going to take over the world. People were going to move to commoditized servers and open source software and solve the huge storage problems that they were going to have, from both a cost and an efficiency perspective.
What I think has happened, or is happening right now, is this evolution, and it really is more of a revolution than an evolution that is taking place, where workloads, and we were discussing this last night, are moving at massive scale to cloud. People are really skipping that step where we thought they were going to have 5,000 or 10,000 server clusters on-premise. Now they have some clusters on-prem, but the bulk of the workloads are actually moving into cloud. I was just discussing with George off-camera a few minutes ago how widely that is happening, and there's a lot of applications that are very efficient. The cloud apps are up there, ready to use off the shelf, and it becomes very simplistic. And to be quite frank, do we really care anymore about all these different open source components? Is the CIO waking up in the middle of the night thinking, oh my God, am I going to use Ignite? Am I going to use Spark? Am I going to use Pig? Am I going to use Hive, et cetera, et cetera, et cetera? Of course they're not. I mean, let's inverse the question to ourselves. If you were going to start a competitor to Uber tomorrow, would you go and build a data center? Or would you just throw a thousand servers up in the cloud and have done with it, and use all the apps that are up there? Of course the answer is simple. So that's really what's happening. Well, I wrote a piece of research a million years ago in which I prognosticated, the dictionary word of the day, that the value of middleware was inversely proportional to the degree to which anybody knew anything about it. CIOs are waking up and asking those questions today, which is an indication that they're creating a problem. Infrastructure has to do no harm in the organization. I had a CIO friend for years who still asks his CTO, to what degree is infrastructure creating a problem for me today? And it's creating a problem. It's a problem.
You don't want to have to know about this stuff. And so to what degree are you helping companies mask some of that visibility, so that people can spend less time worrying about the infrastructure? So what we're focused on is, our business model has gone from direct, where we were hiring a very large direct enterprise sales force, the classic enterprise sales guys that would go knock on doors, knock deals down, go and sell to the Global 1000, to an indirect model. And we announced that OEM recently with IBM. IBM Big Replicate is, under the covers, WANdisco Fusion, which is a great deal for us. So our focus very much is on data movement: data movement between data centers for companies that want to stay on-prem, and between data centers and in and out of cloud, seamlessly. And the word there is seamlessly. So we worked very hard for the past 18 months on our product, such that anybody can go to, if you want, the AWS Marketplace, and in a few clicks begin to replicate at petabyte scale in and out of cloud. And we think, and we were discussing this last night, that the hybrid cloud model is really fascinating. So the ability to take data on-premise, query it in cloud, get complete consistency between on-prem and cloud, but also have all the efficiency and the cloud economics, the elasticity, all the applications that exist in cloud. And I think that model is really interesting. What's interesting is I'm not sure that the little guys can execute in that model other than, like we're doing, via an OEM, an indirect model. So I'm not sure whether or not, just to go back to the conversation, CIOs are as concerned as they used to be about which Hadoop distribution, for example, they're using. I never hear that question anymore. That question was a 2012, 2013 question. What CIOs are now concerned about is the economics of cloud, and how do I get that less than five bucks per terabyte of data economics that I get in a cloud environment?
But also increasingly they're talking about the use cases. They don't want to replicate the Linux or UNIX versus NT wars of the 1990s, which was made possible because they were focused on what accounting package am I going to run? Am I going to run on this or that? You know, it was known process, unknown technology. In today's universe, it's unknown process, and they don't want to know as much about the technology. So they're focused on how do I get my men and women focused on use cases that are delivering value for the business? Exactly, and the economics question is really simple. Am I going to build a massive, partially used elastic infrastructure on premise? Or am I just going to go and use the elastic infrastructure that already exists in cloud? That's a no-brainer, that's already happening. The good news for us, the good news for WANdisco, is it's precisely what we do. It's a data movement problem. Now, I'm bound to say that, but it is actually a data movement problem. And this idea that you have data that changes, active transactional data as we call it. So active transactional data movement is a really hard problem. You can't just take a snapshot, right? A file scan, then a snapshot, and then move the data. And that's the problem that all the other data replication guys have got. That's why the IBM OEM, that's why we've got strategic partnerships with companies like Oracle, like Amazon, and I'm sure we'll be announcing things in due course with the other cloud vendors like Google, for example, and Microsoft with their Azure products. They all have that problem. So data movement in and out of cloud, if it's batch, if it's static, if it's archival data, easy problem to solve. There's a million and one different replication products. You can use rsync if you really wanted to do that. But active transactional data, data that changes, data that moves, at petabyte scale, hard problem.
Because you've got speed of light problems, and you're exposing yourself to data loss if something goes wrong. And eventual consistency is a problem. An eventual consistency replication model doesn't work. You know, we've got a customer that's trying to look at cardiographs, right, in and out of cloud. I mean, would you really feel comfortable with your cardiograph eventually getting into the cloud and being analyzed? You've got to be absolutely crystal clear that the data is completely consistent between the stuff that I'm generating on-premise and the models that I'm building in cloud. It's vitally important. Well, I would imagine there are regulations in certain industries, anyway, that require that. Eventual consistency doesn't fit. Yeah, well, I mean, at the moment without us, that's all you've got, I'm afraid. Well, so I'm on a mission. I want to get your take on it. We always talk about elastic infrastructure, which is a given workload being able to scale up and scale down. I think it's time to start talking about plastic infrastructure, where it's not a given workload but a reconfiguration of how that workload is applied, because of the value of data, because of integration, because of the need to be able to move in response to business needs. So we talk about plastic infrastructure, where we are reconfiguring based on policy and rules and some other things. What do you think about that? I love it. The reason I love it is because, just to take a step back, the definition of hybrid cloud, right, you would imagine it would be relatively simple. To me, hybrid means, you know, it's a bit like a hybrid golf club. It's neither a driver nor an iron, it's somewhere in between. So you have the same workload that can exist both on-premise and in the cloud. I can use both the cloud and on-premise, you know, interchangeably.
What hybrid cloud actually means for all the vendors, and this is the dirty little secret, is that you have some workloads running against some data in the cloud and others that will run against some data on-premise. Now, why do they do that? Because they have to, because they can't guarantee complete consistency between on-premise and cloud. Our definition of hybrid cloud is exactly the same data, if you want, between on-premise and cloud. And I love this plastic phrase, the idea of repurposing all of those applications, and they can live anywhere. It doesn't matter, because it's the same data. Yeah, so we have two terms we have to copyright here. Plastic infrastructure was the other one we heard. And data portfolio. Data portfolio. Plastic infrastructure. Run the page back. Plastic infrastructure. I'm going to steal it. Please do, you know. But the key thing is, as these technologies get more deeply embedded within business and how the business runs, it's incumbent upon the technology leadership to be able to rapidly reconfigure the infrastructure in response to business needs. That's not elasticity. That's plasticity. I love it. Absolutely. And I think you're touching on something that's changing, and what we discussed earlier, which is that CIOs aren't waking up in the middle of the night thinking, am I going to use Pig or Hive or any of those other open source components? They're thinking about the applications that they're going to build. How am I actually going to start using this data? And I think the agenda's kind of moved on. And walking around the hall, there's still a little bit of confusion. You still have people talking about infrastructure like it really still matters. I'm not absolutely sure it does. Well, so let's talk about that. We've got a few minutes. It matters when it breaks. What's that? It matters when it breaks. It sure does matter when it breaks. Otherwise, nobody wants to think about it. No, yeah.
Because like I said earlier, it's the degree. We have time, but I want to explore the new distribution model as well. Yeah, go ahead. Let me do that. Tick that box if I can. Help me understand, David, how it all works. So the partnership with IBM and others, you mentioned Amazon, how does it work? You are in there, the IBM cloud offering. IBM is actually selling that offering. Is it an IBM-branded product? So it's in the big data analytics and cloud offerings. So at the moment, IBM are very focused, as you know, on owning the platform. IBM as a company have to own the platform. Absolutely. I'm delighted to say that we're embedded into their platform. Now, they had a big launch of some products last night. I know that they were talking about IBM Big Replicate, which is a 100% white-label OEM of WANdisco Fusion, to solve some very specific problems, primarily around data movement. So the hybrid cloud. How do I punch data out into cloud, run analytics against it, and be sure that I'm going to get the right results? That's what Big Replicate solves. And also, they're moving into mixed environments, whether they're NetApp, kind of Teradata environments, NFS-based environments, or whether, you know, a customer already has an existing distribution of, say, Cloudera or Hortonworks, so they can live alongside that. So we can replicate data between existing deployments where they may have already made a strategic decision to go with one of those distributions, and also be able to migrate, not just into IBM BigInsights, but also into their cloud offerings. So that's a great deal for us. We're not selling it, they're selling it themselves. I mean, obviously we've done a lot of field enablement, trained 5,000 or so IBM sales reps. And you know, for a small company like WANdisco, well, a small company like virtually any of the vendors in there that are not in the Global 1000 list, the go-to-market has to be indirect. Totally agree.
And so you're basically, if I understand it, moving what are conventional filers into the cloud. Customers are doing that. How fast is that happening, and why are they doing that? I mean, we have not announced this product yet, but we're in the middle of launching it. It's moving petabyte-scale data at scale, and this is transactional data. So it's a hard problem to solve, right? It's an active transactional data replication problem. So the dirty little secret in the cloud is that a lot of those NFS filers have not moved yet. And why haven't they moved? Because they can't. Because, you know, our customers are banks and travel companies, and they can't press pause in their organization, do a file scan that's going to take six months, and then turn it back on again and, hey presto, it's in the cloud. You can't do that. So every single migration of those filers, of any sort of data, is a hybrid model. You have to be able to run both on-prem and cloud while that migration is happening. And there are, I can tell you, a hell of a lot of NetApp filers that are going to move very soon. Okay, so that's the problem that you solve. Otherwise, you'd have to freeze everything, which would kill your business. You can't do it. Yeah, so when human beings imagine things, we always imagine small use cases, small sets, like moving a few files into Dropbox or something. And that's okay, that I can't edit those files for the few seconds it takes to move. I took a look at a deal the other day that was three billion files, right? Three billion. My brain can't even calculate that, right? That's a three to six month data movement over the internet. And Amazon, for example, has got this product called Snowball, and no techie ever believes the story. But of course, they FedEx a box of ruggedized hard drives to you, essentially a ruggedized server.
You pour your data into it and then you mail it back to them and they can put it there. That doesn't work, of course, for transactional data, for data that changes all the time. These are hard problems to solve. And our go-to-market, getting back to your question, is all about indirect. So AWS, strategic partnership there. Oracle, strategic partnership there. IBM. And as I said, I'm sure that we'll be doing things with Google and Microsoft soon. And they're the five partnerships that I really care about, to be quite fair. And this comes back to this notion of infrastructure, the value of infrastructure, just to touch upon it for a second. So many years ago, when we were doing client-server, we would test it on a local area network and deploy it on a WAN and then wonder why it blew up. The realities of the speed of light and the practical limitations have a real impact on design. And so where infrastructure still matters is we still have to worry about design. We still have to worry about legacy financial assets and how we're deploying those assets. And I want to come back to this, because we were talking earlier about data as an asset, the value of data within the business. And you don't want to be limited by the legacy as you try to find new ways of generating value out of your data. And what you guys are trying to allow is that the data can be moved in response to the use case, as opposed to the use case not being made possible because of the legacy decisions about what your data... That's precisely it. I don't think any CIO in their right mind wants to continue the huge maintenance costs, the maintenance payments they have to make to some of those vendors, some of those NFS-based vendors. They need to shut them down. They have to figure out a way to move them into cloud so you get cloud economics, and also be able to query the data in a massively efficient way. You simply cannot do that at the moment. They simply cannot do that at the moment.
So as I said, as we continue to launch these products in the marketplace, I'm sure you'll see, at scale, some pretty large companies. Surprisingly, the two that spring to my mind, the regulators in the US and in the UK, FINRA and the FCA, have both announced that they're moving all into cloud, 100% into cloud. And I would expect to see that trend continue. I mean, I don't want to talk about another show, we're here at Strata, but at AWS re:Invent I would expect to see several major financial services companies announcing cloud strategy. And FINRA is a big user of the AWS cloud. They talk about it pretty aggressively, and it's a really interesting use case there. So yeah, we've got to end. What's next for you guys? You've mentioned you're going to be at re:Invent, you're going to be at World of Watson. Where are we going to find you next? Both of those. Obviously, the white label with IBM is a really interesting deal for us. I can't talk about deal flow yet because we're around quarter-end at the moment, but I can tell you that they're doing a pretty damn good job of selling. So we're in execution mode at the moment, where we've already announced some key partnerships. There'll be more key partnerships to come, I'm sure. We're obviously chasing deals down with some of the other cloud vendors. And I'd expect to see us announcing some interesting new customer wins in the coming days and weeks. Great, well, congratulations on the momentum and sort of the renewed strategy. I love it and appreciate you coming to theCUBE. Always a pleasure. All right, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live at Big Data NYC, Strata and Hadoop World. We'll be right back.