Live from New York, it's theCUBE. Covering theCUBE, New York City, 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. Okay, welcome back everyone. This is theCUBE, live in New York City for our CUBE NYC event, hashtag CUBE NYC. This is our ninth year covering the big data ecosystem, going back to the original Hadoop World. Now it's evolved to essentially the future of AI. Peter Burris, my co-host, gave a talk two nights ago on the future, presenting his research. It's all about data, it's all about the cloud, it's all about live action here in theCUBE. Our next guest is David Richards, who's been in the industry for a long time, seen the evolution of Hadoop, has been involved in it, has been a key enabler of the technology, certainly enabling cloud recovery, replication for cloud. Welcome back to theCUBE, good to see you. It's really good to be here. I've got to say, you've been on theCUBE pretty much every year. I think every year; we've done nine years now. You made some predictions and calls that actually happened, like five years ago. You said cloud's going to kill Hadoop. I think you said that off camera, but maybe you said it on camera. We were pontificating, but also speculating: okay, where does this go? You've been right on a lot of calls. You also were involved in the Hadoop distribution business back in the day. You got out of that quickly. You saw that early, good call. You guys have essentially a core enabler that's been just consistently performing well in the market, both on the Hadoop side and in cloud, and as data becomes the conversation, which has always been your perspective, you guys have been a key part of the infrastructure for a long time. What's going on? Still doing deals? Yeah, so the history of WANdisco's play in big data and Hadoop has been, as you know, because you've been with us for a long time, kind of an interesting one.
So back in 2013, 2014, 2015, we built a Hadoop-specific product called Non-Stop NameNode, and we had a Hadoop distribution. But we could see this transition, this change in the market happening, and the change wasn't driven necessarily by the advent of new technology. It was driven by the over-complexity associated with deploying and managing Hadoop clusters at scale, because lots of people, and we were talking about this off camera before, can deploy Hadoop in a fairly small way, but not many companies are equipped or built to deploy massive-scale Hadoop distributions. Sustain it. They can't sustain it, and so the call that I made, and actions speak louder than words, was that the company rebuilt the product and built a general-purpose data replication platform called WANdisco Fusion that, yes, supported Hadoop, but also supported object store and cloud technologies. And we're now seeing use cases in cloud certainly begin to overtake Hadoop for us for the first time. And you guys have a patent that's pretty critical in all this, right? So the real IP. Yes, but people often make the mistake of calling us a data replication business, which we are, but data replication happens post-consensus or post-agreement. So at the very heart of WANdisco, our 35 patents are all based around a Paxos-based consensus algorithm, which wasn't a very cool thing to talk about, but now, with the advent of blockchain and decentralized computing, consensus is at the core of pretty much that movement. So what WANdisco has is a consensus algorithm that enables things like hybrid cloud, multi-cloud, poly-cloud as Microsoft call it, as well as disaster recovery for Hadoop and others. Yeah, as you have more disparate parts working together, say multi-cloud. I mean, you're really perfectly positioned for multi-cloud. I mean, hybrid cloud is hybrid cloud, but there's also multi-cloud; they're two different things.
Peter has been on the record describing the difference between hybrid cloud and multi-cloud, but multi-cloud is essentially connecting clouds. So we're on a mission at the moment to define what those things actually are, because I can tell you what it isn't. A multi-cloud strategy doesn't mean that you have disparate data and processes running in two different clouds. That just means that you've got two different clouds. That's not a multi-cloud strategy. Two cloud silos. Yeah, correct. That's creating problems that are going to be really bad further down the road. And hybrid cloud doesn't mean that you run some operations and processes and data on premise and take a different, siloed approach to cloud. What this means is that you have a data layer that's clustered and stretched, the same data stretched across different clouds and different on-premise systems, whether it's Hadoop on premise, and maybe I want to build a huge data lake in cloud and start running complex AI and analytics processes over there because, let's face it, banks, et cetera, ain't going to be able to manage and run AI themselves. It's already been done by Amazon, Google, Microsoft, Alibaba and others in the cloud. So the ability to run this simultaneously in different locations is really important. That's what we do. All right, so let me just ask directly, since we're filming and we'll get a clip out of this. What is the definition of hybrid cloud and what is the definition of multi-cloud? Explain both of those. The ability to manage and run the same data set against different applications simultaneously and achieve exactly the same result. That's hybrid cloud or multi-cloud? Both. So they're the same? The same. You consider hybrid cloud multi-cloud? For us it's just a different endpoint. Hybrid kind of implies that you're running something on-premise. Multi-cloud or poly-cloud implies that you're running between different cloud vendors.
So hybrid is location, multi is source? Correct. And so, but let's... That's a good definition. Yeah, so let's unpack this a little bit, because at the end of the day what a business is going to want to do is apply their data to the best service. Correct. Increasingly that's what we're advising our clients to think about. Don't think about being an AWS customer per se. Think about being a customer of AWS services that serve your business or IBM services that serve your business. But you want to ensure that your dependency on that service is not absolute, and that's why you want to at least have the option of being able to run your data in all these different places. And I think the market now realizes that there is not going to be a single dominant vendor for cloud infrastructure. That's not going to happen. Yes, it happened before. Oracle dominated in relational data. SAP dominated for ERP systems. For cloud, it's democratized. That's not going to happen. So everybody knows that Amazon probably have the best serverless compute, Lambda functions, available. They've got millions of those things already written or in the process of being written. Everybody knows that Microsoft are going to extend the wonderful technology that they have on desktop and move that into cloud for analytics-based technologies and so on, and that Google have been working on artificial intelligence for an elongated period of time. So, vendors are going to arbitrage between different cloud vendors. They're going to choose the best of breed. They're going to go to Google for AI and scale. They're going to go to Amazon for robustness of services. They're going to go to Microsoft for this week. They're looking at the services. That's what they need to do. And the thing that people forget, that we don't at WANdisco, is that this requires guaranteed consistent data sets underneath the whole thing. So where does Fusion fit in here?
So, how is that getting traction? Give us an update. Have you worked with Microsoft? I know we've talked about Amazon. What about Microsoft? So, we've been working with Microsoft. We announced a strategic partnership with them in March where we became a tier-zero vendor, which basically means that we're partnered with them in lockstep in the field. We've executed extremely well since that point and we've done a number of fairly large, high-profile deals. A retailer, for example, that was based in Amazon didn't really like being based in Amazon, so it had to build a poly-cloud implementation to move petabyte-scale data from AWS into Azure. That went seamlessly. It was an overnight success. And they're using your technology? They're using our technology to do that. There's no other way to do that. I think the world is now realizing, and Microsoft and others have realized, that CDC technology, change data capture, where you batch up a bunch of changes and then ship them, log shipping or whatever, every 15 minutes or so, doesn't work at this kind of scale. We're talking about petabyte-scale ingest processes. We're talking about huge data lakes; that technology simply doesn't work at this kind of scale. We've got a couple of minutes left. I want to just make sure we get your views on blockchain. You mentioned consensus. I want to get your thoughts on that because, obviously, blockchain is certainly experimental. It's certainly powering money, Bitcoin and the international markets. It's certainly becoming a money backbone for countries to move billions of dollars out. It's certainly in the tank right now, about 600 million below its mark in January. But blockchain fundamentally is supply chain. You're seeing consensus. You're seeing some of these things that are in your realm. What's your view? So, first of all, at WANdisco, we separate the notions of cryptocurrency and blockchain. We see blockchain as something that's been around for a long time.
Basically, the world is moving to decentralization. We're seeing this with airlines, with supermarkets and so on. People actually want to decentralize rather than centralize now. And the same thing is going to happen in the financial industry, where we don't actually need a central transaction coordinator anymore. We don't need a clearinghouse, in other words. Now, how do you do that? At the very heart of blockchain is an incorrect assumption, right? Most people think that Satoshi's invention, whoever that may be, was based around the blockchain itself. Blockchain is pieced-together technologies that don't actually scale, right? It takes a game-theoretic approach to consensus. And we don't have enough time for me to delve into exactly what that means, but our consensus algorithm is really proven to scale, right? So, what does that mean? Well, it means if you want to go and buy a cup of coffee at the Starbucks next door and you want to use crypto, you want to use Bitcoin, you're going to be waiting maybe half an hour for that transaction to settle, right? Because the miners have got to create a block. You know, all that stuff has to go on. So, a game-theoretic approach, basically... Bitcoin's running 500,000 transactions a day. Yeah. That's two transactions per second, right? Between two and eight transactions per second. We've already proven that we can achieve hundreds of thousands, potentially millions of agreements per second. Now, the argument against using Paxos, which is what our technology is based on, is that it's too complicated. Well, no shit. Of course it's too complicated. We've solved that problem. That's what WANdisco does. So, we filed a patent. So, you abstract the complexity. That's your job. We've abstracted the complexity. So, you solve the complexity problem by being a complex solution, but you're making it abstract. We have an algorithmic, not a game-theoretic, approach. You're solving a scale problem. Correct.
Using Paxos in a way that allows real developers to be able to build consensus algorithm-based applications. Yeah, so 90% of the blockchain is consensus. We've solved the consensus problem. We'll be launching a product based around Hyperledger very soon. We're already in tests and we're already showing tens of thousands of transactions per second, not 2,000. The game theory side of it is still going to be important because when we talk about machines and humans working together, programs don't require incentives. Human beings do. And so, there will be very, very important applications for this stuff. But you're right. From the standpoint of machine to machine, when there is no need for incentive, you just want consensus. You want scale. Yeah, and there are two approaches to this world of blockchains out there. Public, which is where the Bitcoin guys are, and the anarchists who firmly believe that there should be no oversight or control. Then there's the real world, which is permissioned blockchains. And permissioned blockchains are where the banks, where the regulators, where NASDAQ will be. When we're trading shares in the future, that will be on a permissioned blockchain that will be overseen by a regulator like the SEC, NASDAQ or London Stock Exchange, et cetera. David, always great to chat with you. Thanks for coming on. Again, always on the cutting edge, always having a great vision while knocking down some good technology and moving your IP on the right waves every time. Congratulations. Thank you. Always on the next wave. David Richards here inside theCUBE. Every year he doesn't disappoint. theCUBE, bringing you all the action here. CUBE NYC, we'll be back with more coverage. Stay with us. A lot more action for the rest of the day. We'll be right back. Stay with us for more after the short break.