San Jose, in the heart of Silicon Valley. It's theCUBE, covering Big Data SV 2016. Now your hosts, John Furrier and Peter Burris. Okay, welcome back everyone. We are here live in Silicon Valley for Big Data Week, Big Data SV, and Strata Hadoop. This is theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier. My co-host is Peter Burris. Our next guests are Ritika Gunnar, VP of Data Analytics at IBM, and David Richards, the CEO of WANdisco. Welcome to theCUBE, welcome back. Thank you. It's a pleasure to be here. So, okay, IBM and WANdisco, why are you guys here? What are you guys talking about? Obviously, partnership, what's the story? So, you know what WANdisco does, right? Data replication, active-active replication of data. And for the past 12 months, we've been realigning our products to a market that we could see rapidly evolving. So, if you asked me 12 months ago what we did, we were talking about replicating just Hadoop, but we think the market is gonna be a lot more than that. I think Mike Olson famously said that Hadoop was gonna disappear. And he was kind of right, because the ecosystem is evolving to be a much greater stack that involves applications, cloud, a completely heterogeneous storage environment. And as that happens, the partnerships that we would need have to move on from just being, you know, the sort of Hadoop-specific distribution vendors to actually something that can deliver a complete solution to the marketplace. And very clearly, IBM has a massive advantage in the number of people, services, ecosystem, infrastructure in order to deliver a complete solution to customers. And that's really why we're here. I think I'll talk about the stack comment because this is something that we're seeing. Mike Olson's kind of being political when he says making it invisible. But the reality is there's more to big data than Hadoop.
There's a lot of other stuff going on. Call it the stack, call it the ecosystem. A lot of great things are growing. We just had Gaurav on from SnapLogic, who said everyone's winning. I mean, I just love that. That's totally true, but it's not just Hadoop. It's about all data and it's about all insight on that data. So when you think about all data, all data is a very powerful thing. If you look at what clients have been trying to do thus far, they've actually been confined to the data that may be in their operational systems. With the advent of Hadoop, they're starting to bring in some structured and unstructured data, but with the advent of IoT systems, systems of engagement, systems of record, and trying to make sense of all of that, all data is a pretty powerful thing. When I think of all data, I think of three things. I think of data that is not only on-premises, which is where a lot of data resides today, but data that's in the cloud, where data is being generated today and where a majority of the growth is. When I think of all data, I think of structured data that is in your traditional operational systems, and the unstructured and semi-structured data from IoT systems, et cetera. And when I think of all data, I think of not just data that's on-premises for a lot of our clients, but actually external data, data that we can correlate with, for example, from an acquisition that we just did at IBM with The Weather Company, or augmenting with partnerships like Twitter, et cetera, to be able to extract insight from not just the data that resides within the walls of your organization, but external data as well. The old expression is, if you want to go fast, do it alone. If you want to go deeper and more comprehensive, do it as a team. That expression can be applied to data, and you look at the weather data, you think, hmm, that's the outlier type of acquisition, but when you think about the diversity of data, that becomes a really big deal.
And the question I want to ask you guys is, and Ritika, we'll start with you, there's always a few pressure points that you've seen in big data. When that pressure is relieved, you've seen growth. And one was big data analytics, which kind of stalled a little bit, the winds kind of shifted, eye of the storm, whatever you want to call it. Then cloud comes in, and cloud is kind of enabling that to go faster. Now, a new pressure point that we're seeing is, go faster with digital transformation, so all data kind of brings us to all digital. And I know IBM's all about digitizing everything, and that's kind of the vision. So you now have the pressure of, I want all digital, I need data-driven at the center of it, and I've got the cloud resource, kind of the perfect storm. What are your thoughts on that? Do you see that similar picture? And then does that put the pressure on, say, WANdisco, say, hey, I need replication, so now you're under the hood? Is that kind of where this is all coming together? Absolutely. When I think about it, it's about giving trusted data and insights to everyone within the organization at the speed at which they need it. So when you think about that last comment, at the speed at which they need it, that is the pressure point of what it means to have a digitally transformed business. That means being able to make insights and decisions immediately. And when we look at what our objective is from an IBM perspective, it's to enable our clients to generate those immediate insights, to transform their business models, and to provide the tooling and the skills necessary, whether we have it organically, inorganically, or through partnerships, like with WANdisco, to be able to do that. And so with WANdisco, we believe we really wanted to be able to activate data where that data resides.
So when I talk about all data and activation of that data, WANdisco provided us complementary capabilities to be able to activate that data where it resides, with a lot of the capabilities that they're providing through their Fusion product. So being able to have and enable our end users to have that digitally infused set of reactive type of applications is absolutely something. It's like, David, you know, we talk about it. Maybe I'm oversimplifying your value proposition, but I always look at WANdisco as kind of like the five nines of data, right? You guys make stuff work. And that's the theme here this year. People just want it to work, right? They don't want to have it down, right? Yeah, we're seeing certainly an uptick in understanding about what high availability, what continuous availability means in the context of Hadoop. And I'm sure we'll be announcing some pretty big deals moving forward. But we've only just got going with IBM, you know? The market should expect a number of announcements moving forward as we get going with this. But it's a very interesting question associated with cloud. And just to give you a couple of quick examples, we are seeing an increasing number of Global 1000 companies, Fortune 100 companies, move to cloud. And that's really important. If you'd have asked me 12 months ago, how is the market going to shape up? I'd have said, well, most CIOs want to move to cloud. It's already happening. So FINRA, the major financial regulator in the United States, is moving to cloud, publicly announced it. The FCA in the UK publicly announced they're moving 100% to cloud. So this creates kind of a microcosm of a problem that we solve, which is how do you move transactional data from on-premises to cloud and create a sort of hybrid environment? Because for the migration, you have to build a hybrid cloud in order to do that anyway. So if it's just archive systems, you can package it on a disk drive and post it.
But if we're talking about transactional data, i.e., stuff that you want to use, so for example, a big travel company can't stop booking flights while they move their data into the cloud. It would take six months, circa six months, to move petabyte-scale data into cloud. We solve that problem. We enable companies to move transactional data from on-premises into cloud without any interruption to service. So not six months. Not six months. Not six months. Six hours. And you can keep on using the data while it is in transit. So we've been looking for a really simplistic problem to explain this really complex algorithm that we've got that does active-active replication stuff. That's it, right? It's so simple, nobody else can do it. So no downtime, no disruption to their business. No. And you can use the cloud, or you can use the on-prem applications while the data is in transit. So when you say all cloud, now we're on a theme, all data, all digital, all cloud, there's a nuance there. Because for most enterprises, we had Gaurav from SnapLogic talk about it, there's always going to be an on-prem component. I mean, we're probably not going to see 100% of everyone move to the cloud, public cloud. But by cloud, you mean hybrid cloud, essentially. There's an on-prem component. I'm sure you guys see that with Bluemix as well, that you've got some dabbling in the public cloud. But ultimately, it's one resource pool. That's essentially what you're saying. And I think it's really important. One of the things that's very attractive about the WANdisco solution is that it does provide that hybridness from on-premises to cloud, and that ability to activate that data where it resides. But being able to do that in a heterogeneous fashion matters, because architectures are very different in the cloud than they are on-premises. When you look at it, your data lake may be as simple as a Swift object store or S3. And you may be using elements of Hadoop in there, but the architectures are changing.
So the notion of being able to handle hybrid solutions both on-premises and in the cloud, with the heterogeneous capability, in a non-invasive way that provides continuous data, is something that is not easily achieved, but it's something that every enterprise needs to take into account. So, Ritika, talk about why the WANdisco partnership. And specifically, what are some of the conversations that you have with customers? Because obviously, it sounds like the need is to go faster and have some of this active-active replication and kind of five nines, if you will, of making stuff not go down, or non-disruptive operations, whatever the buzzword is. But what's the motivation from your standpoint? Because IBM is very customer-centric. What are some of the conversations? And then how does WANdisco fit into those conversations? So when you look at the top three use cases that most clients have, even for Hadoop environments or just what's going on in the market today, the top three use cases are: can I build a logical data warehouse? Can I build areas for discovery, or analytical discovery? Can I build areas to be able to have data archiving? And for those top three solutions in a hybrid heterogeneous environment, you need to be able to have active-active access to the data where that data resides. And therefore, we believe from an IBM perspective that we want to be able to provide the best of breed regardless of where that resides. And so we believe that WANdisco has those capabilities that are very complementary to what we need for that broader skills and tooling ecosystem, and hence why we have formed this partnership. Unbelievably, in the market, and it feels like the Hadoop market's just got going, we're also seeing migrations from distributions like Cloudera into cloud.
So those sort of lab environments, the small clusters that were being set up, I know this is slightly controversial and I'll probably get darts thrown at me by Mike Olson, but we are seeing pretty large-scale migration from those sort of labs that were set up initially. And as they progress and as it becomes mission critical, they're gonna go to companies like IBM really, aren't they? In order to scale up their infrastructure, they're gonna move the data into cloud to get hyperscale for some of the use cases that Ritika was just talking about. So we are seeing, you know, a lot of those migrations. So basically with Hadoop, there are some siloed deployments of POCs that need to be integrated in, is that what you're referring to? I mean, why would someone do that? They would say, okay, probably integration costs, probably other solutions, data. If you do a roll-your-own approach where you go and get some open source software, you gotta go and buy servers, you gotta go and train staff. We've just seen one of our customers, a big bank, two years later, get servers. Two years to get servers, to get server infrastructure. That's a pretty big practical barrier to entry versus, you know, I can throw something up in Bluemix in 30 minutes. Dave, you bring up a good point. I want to just expand on that because you have a unique history. We know each other, we go way back. You were on theCUBE when I think we first started seven years ago at Hadoop World. You've seen the evolution and heck, you had your own distribution at one point. So, you know, you've successfully navigated the waters of this ecosystem, and you had great IP, and then you kind of found your swim lanes. You guys are doing great. But I want to get your perspective on this because you mentioned Cloudera. You've seen how it's evolving as it goes mainstream. As Peter says, the big guys are coming in, and with power. I mean, IBM's got a huge Spark investment and it's not just, you know, lip service.
They're actually donating a ton of code and actually building stuff. So, you've got an evolutionary change happening with the industry. What's your take on the upstarts like Cloudera and Hortonworks and the distro game? Because that now becomes an interesting dynamic, because it has to integrate well. Well, I think there will always be a market for the distribution of open source software. And at that layer in the stack, you know, certainly Cloudera, Hortonworks, et cetera, are doing a pretty decent job of providing a distribution. But the Hadoop marketplace, and I've articulated this with Ritika as well, is not Hadoop. Hadoop is a component of it. But, you know, in cloud, we talk about object store technology. We talk about Swift, we talk about S3. We talk about Spark, which can be run standalone. We don't necessarily need Hadoop underneath it. So the marketplace is being stretched to such a point that if you were to look at the percentage of the revenue that's generated from Hadoop, it's probably less than 1%. So the market's gonna, I talked 12 months ago with you about the whale season. The whales are coming. And they're here right now. They're mating out in the waters. Deals are getting done. I'm not gonna deal with that visual right now, but you're quite right. And, you know, I love the Peter Drucker quote, which is strategy is a commodity, execution is an art. We're now moving into the execution phase. You need a big company in order to do that. You can't be a 500,000 person. Is Cloudera holding on to dogma with Hadoop, or do they realize that the ecosystem is building around them? I think they do, because they're focused on the application layer. But there's a lot of competition in the application layer. There's a little company called IBM. There's a little company called Microsoft. And a little company called Amazon. They're kind of focused on that as well. So that's a pretty competitive environment.
And your ability to execute is really determined by the size of the organization. Awesome. To be quite frank. So we have Hadoop Summit coming up in Dublin. We're gonna be in Ireland next month for Hadoop Summit. We'll have more coverage there. Guys, thanks for the insight. Congratulations on the relationship. And again, WANdisco, we know you guys and know what you guys have done. It seems like prime time for you guys right now. And IBM, we just covered you guys at InterConnect. Great event. Love The Weather Company data, as a weather geek. But also the Apple announcement was really significant. Having Apple up on stage with IBM, I think that is really, really compelling. And that was not just a Barney deal. That was real. And the fact that Apple was on stage was a real testament to the direction you guys are going. So congratulations. This is theCUBE, bringing you all the action here, live in Silicon Valley for Big Data Week, Big Data SV, and Strata Hadoop. We'll be right back with more after this short break.