at Big Data SV 2014 is brought to you by headline sponsors WANdisco, "We make Hadoop invincible," and Actian, "Accelerating Big Data 2.0."
We're back. This is Dave Vellante of Wikibon, and this is SiliconANGLE's theCUBE. We go out to the events. We extract the signal from the noise. This is Big Data SV. We're running concurrently with the Strata Conference. We're here across the street from the Santa Clara Convention Center. We're at the Hilton, so stop by and see us. Our good friends are here: David Richards, who's the CEO of WANdisco, and Jagane Sundar, who is the CTO and Vice President of Engineering for the big data side of the business, which is the one that everybody's excited about. Gentlemen, welcome back. Good to see you again.
Thank you, Dave.
It's a pleasure to be here again, Dave.
So we saw you last October. We were at Big Data NYC. Give us the update. What's new for you guys?
Well, we continue to make very good progress. Today we announced a new product in our Hadoop stack, which is Non-Stop HBase. And we've got a fantastic live demo, which we're doing at the show this week. I think Brett Rudenstein will be on the show later this week doing a live demo for your viewers, showing a streaming Twitter feed, taking servers down, putting servers back on. So we're continuing to use our unique active-active replication technology that we've had now for the past six to 12 months in the Hadoop space for HDFS. And Jagane and his team have done an amazing job in applying that to HBase.
Yes, sir. Well, in 2013, Jeff Kelly just came out with his Big Data Report, the third Big Data Report.
And he referenced the WANdisco Non-Stop NameNode in there as one of the key milestones for the industry: the active-active technology that really puts forth the infrastructure CIOs need to be comfortable, to sleep at night, and to put their mission-critical applications on there. So that was a key milestone. Maybe, Jagane, you could talk a little bit about that and how it applies to HBase.
Certainly. So the Non-Stop HDFS product that we introduced last year is specifically for the file system. What we've done is continue to build on top of that by approaching the next layer up, which is HBase, the database layer. The little-known secret there is that the region server of the HBase system is a very bad single point of failure. We've solved that problem by applying our patented active-active replication and creating multiple active replicas of each region, so that should a region server go down, service is uninterrupted and you can continue to run your HBase applications.
So HBase is sort of the de facto NoSQL database for Hadoop, anyway. And we have some experience with HBase. John Furrier has talked about it a lot. We used HBase extensively and obviously had some challenges. So what are you seeing, David, from the customer base in terms of how they're using HBase? We were using it for sort of a little side project which has grown up. What are you seeing in terms of HBase adoption, and where do you guys fit?
So we're seeing a mix. I'm thinking of one very large company in the semiconductor space where about, what was it again, 30 to 40% of their deployments are HDFS and the rest are HBase for streaming-based applications. So it's everything from stock feeds through to the example we're doing with Twitter feed analysis. So it's those kinds of streaming applications.
I mean, HBase, of course, was modeled on Google's Bigtable, and that's really what it's designed to do. So as you were going to say, we've moved a layer above the file system; we're now into streaming, real-time applications.
Yeah, you're right. So Google used Bigtable, Facebook is a big HBase consumer, and we use it at CrowdChat and have used it in the past. And so we've expected it to go mainstream, but there's a lot of competition in that NoSQL space. So do you see you guys knocking down other NoSQL database types, or is it sort of coalescing around HBase? What can you tell us there?
Well, I'd like to go a step up, actually, and take a view of the whole market and what we're seeing. I was saying off-air before we came on that I was with the CIO of one of Europe's largest companies a couple of weeks ago, someone you would never expect to be talking about doing wholesale, large-scale replacements of traditional technologies within their data center. And I know that some of the other vendors in the space are partnered with some of those traditional vendors like Teradata, like Oracle, et cetera. We're not, so we can pretty much say what we want. And we're seeing a great desire from CIOs to do wholesale replacement of those traditional technologies and replace them with lower-cost, higher-functionality products like Hadoop, and specifically Hadoop. Lower cost is not enough; you need higher functionality as well. So the desire to have the NoSQL stuff with HBase and the HDFS products in one place from one vendor is certainly there.
Well, I want to ask you about that, because I made an observation. I saw Jeff's report again, and it's sort of littered with the big whales co-opting a lot of the space. And I say to myself, wow, are we already reaching an equilibrium where the guys disrupting the market are being subsumed by the larger guys as a distribution channel? There's no question, and you alluded to it. You didn't say it directly, but I'll say it: there's no question that it somewhat waters down the messaging, that sort of, hey, we can do something 10x better, 10x cheaper, and much, much faster. What you're starting to hear in the marketplace is much more temperate: well, we don't want to say that about Oracle or Teradata or IBM, whomever it is. You guys, it sounds like, see the world differently. In the early days of big data it was all about disruption, it was all about really changing the way in which organizations operate, making them data-centric. We still hear that, but you're hearing it tempered down. I wonder if you could comment on that and give us your point of view.
So analyzing the market, which is kind of your job, is difficult. It's really tough, right? So one of the measures: I was at the Barclays Big Data Conference, which is a public company conference, yesterday, and I was speaking at it, and one of the analysts said, well, one of the proxies for big data may be the disk manufacturers, the hard drive manufacturers, and they're up about 15%. So is data only growing at 15%? I don't think it is. I think what we're seeing is the early part of the market, and this is a classic adoption curve, right? So we're in the early part of the market. There's certainly a lot of divisional... to use the word POC, I think, is doing the world an injustice. I think these are trials in production. I mean, we're doing something with a hospital that's hoping to save people's lives.
And I wouldn't want to tell a patient in that hospital, this is a POC. Well, we're hoping to keep you alive.
Kicking the tires.
Yeah, just going to press it and see if it works. No, these are real production environments. Nothing's ever as fast as we all want it to be, but this is happening. It is going to happen. And those proxies are only really... I mean, the big whales that we're talking about in the marketplace, yeah, the market's dominated in the early phases by those companies, but we are going to see the likes of Hortonworks and Cloudera, et cetera, really come to the fore. And let's face it, Splunk overtook Teradata in market capitalization in the past six months.
I'm glad you brought that up, because you certainly see Splunk, QlikTech, and Tableau really disrupting, certainly on the visualization side for Tableau and QlikTech, and Splunk in the core IT space. And so we would expect that similar things would happen in big data infrastructure, unless either they get taken out, or the big guys are able to convince their customers, because they have such huge distribution channels and such inertia around their installed bases, that, oh yeah, we can get there too, and let's make incremental 10%-a-year improvements.
Yes, I mean, this isn't new, right? It's called the Innovator's Dilemma. Those traditional vendors have a humongous problem. They've got shareholders that they have to appease. They can't fundamentally change their business models overnight and suddenly support a product that's 10% of the price of theirs. That's gonna be pretty darn difficult. So I would expect, in the same way that mainframe moved to three-tier client-server... when I talk to customers, I go, mainframe, three-tier client-server, and then what? Well, the "then what" is big data. It's gonna happen, and it's gonna happen at the same pace that three-tier client-server overtook mainframe.
We agree with you.
So the premise that we've always used in this space is that the customers are gonna create more value than the vendors. And when that happens, what's gonna happen is huge disruption in the customer base, by industry, and then their competitors are gonna become less competitive and they're gonna have to come back to guys like you to hang on to that innovation or tap into that innovation. But we're in that space right now where you sort of have this equilibrium. As I said, you're seeing a lot of partnerships announced. I mean, you guys are making announcements. You've got a fundamental innovation. You're solving a hard problem. Frankly, we'd like to see more of that. We'd like to see this area of the world maintain its aggression. So how was the Barclays Big Data Conference? Was that up in San Francisco?
Yeah, it was. It was up in San Francisco. It was organized by the investment bank. They had, you know, everybody. Mike Olson from Cloudera presented. Herb from Hortonworks presented. John Schroeder from MapR was there. A whole bunch of those kinds of companies were there. I didn't actually see any of the traditional vendors presenting.
Oh good, okay. So it really was a pure-play big data conference.
It really was. And what we're seeing, by the way, and what I expect to see, without disclosing anything: I would expect to see companies that traditionally invest in public companies have to move down the stack and come to private companies to invest. I mean, amazing things are happening. You know, a bit like the Goldman Sachs deal with Facebook before they did the IPO. I would expect to see a whole bunch of deals like that, where quite simply those big investment funds can't get exposure in the public markets. I mean, we're about it, to be quite frank, in the Hadoop space as a publicly traded company. I would expect to see those types of companies do massive investments above venture but below public.
And you know, when you start to see those deals and some of the valuations that I expect to see coming out of those private investments in the non-public markets by public investors, I mean, the world's gone crazy, right? But I would expect to see those kinds of deals happen in the next few weeks.
I got a question from a journalist today asking me, are we in a bubble? And why or why not? And why does that matter? And I said, it's not a broad-based bubble, but there are pockets that are kind of bubble-licious. What's your take on that? I mean, you've certainly benefited from the excitement around big data. Investors are very interested and intrigued and excited about the big data side of your business. Are we in a bubble?
I think, as you said, in certain sectors, we are. But just to allude back to what I think is going to happen in the big data space, we ain't seen nothing yet. This is just the beginning of a fundamental shift, a tectonic shift in the marketplace. Is every company going to win that comes with the words big data alongside their name? No. Is there going to be a change in the order? As we moved from mainframe to three-tier client-server, we saw companies like Oracle, like EMC, like Intel for chips, be born. I would expect to see those fundamental shifts in the marketplace.
So I've got to ask you: UK-based company, you guys get to report finances once every decade, I think, right? No, I think you're coming up to another reporting season shortly, is that right?
Yeah.
You've announced when you're reporting?
Late March.
Okay. So I know your colleague, Richard Branson, at Davos was saying, no, it works in the UK. Once, twice a year is good. Silicon Valley, you know, Americans, four times a year, it's too often because it takes your eye off the strategic ball and it focuses too much on short-term results.
I presume you agree with that because you don't want to have to report more frequently. It's somewhat onerous. But let's talk about that reporting cycle a little bit, four times a year versus two times a year. Do you have an opinion on that?
So I like reporting quarterly, actually, believe it or not, and we do. We do an update to the City, which is the London Stock Exchange, every quarter, and that's voluntary, because normally we'd only have to report every six months, but we like doing quarterly reporting. We recently hired a new CFO, a guy called Paul Harrison, who was the CFO of Sage, one of the biggest software companies in the world, and who joined us because he wanted to be part of this big data wave. So we're not scared of suddenly doing quarterly reporting. We've certainly got our eyes on moving to different exchanges, moving up the charts, et cetera.
I know that, and you can get bits and pieces of information that you can't typically get from European companies, and I agree with you. I think it's actually helped US companies. I've heard for decades that the short-term focus is bad, but you look at the dominance that US tech companies have. How is that bad?
Well, you know, we are a quarterly driven company. My sales team are very quarterly driven. I just do not think that's a bad discipline to get into.
So Jagane, what are some of the things that you're tracking from the technology side? What are some of the things that excite you these days?
So HBase, as we've spoken about already, is front and center in our awareness. We're seeing good adoption of Non-Stop Hadoop, which is our HDFS product. In particular, the WAN edition for disaster recovery is very attractive to customers. There are no equivalent solutions out there, and we're easily able to show the flaws in existing hacks, such as DistCp, and win on the basis of that.
HBase is also very interesting because people make the mistaken assumption that it's the HBase master that's the single point of failure. It turns out that it is also a single point of failure, but it's the region server that really kills you. Applications will freeze, data is likely to be lost. So region server failure is a bigger problem than HBase master failure, and at the lowest level, that's the harder problem. There are various open source efforts that are active-standby. We don't believe in that. We always do active-active servers. That's what our product does today.
Let's talk about that a little bit. We've talked about this a bit on theCUBE, but just to refresh my memory: you're talking about the HBase master versus the region server, and we've discussed recovery. When something goes wrong, how do you get it back, how do you know that it's accurate, and is there a single point of control? So in that world of HBase master and region server, how does that all work? How does recovery work without WANdisco, and how do you change that dynamic?
So if you look at the region server, all current efforts are towards keeping standby servers, hot sometimes, cold sometimes, cold meaning they have to read the edit logs and come up to speed before they start serving content. That's a flawed approach in many ways. We have the ability to do coordination before data goes into the system. Therefore, we have exact replicas. This is not an eventually consistent system. These are exact replicas of the same region content on multiple servers, and some of those servers can be in different data centers spread across the world. That's our key value, and we've applied the same distributed coordination engine to solving this problem.
So how does the conversation go, David, with your customers? I presume you're going in at a fairly high level, talking about the need for organizational change and the potential of big data and the like.
But you guys sell some pretty geeky stuff, right? So you've got a big spectrum of audience that you have to sell to. I wonder if you could take us through how that whole motion works.
So in Q4, we announced two strategic partnerships that are very important to us, with the two big boys in the Hadoop space, which are Cloudera and Hortonworks. So we go in behind the Cloudera and Hortonworks distributions. Both CDH4 and CDH5 support our non-stop technology. We're certified against CDH4.4, and I expect us to be certified soon against CDH5. And HDP 2.0 and 2.1 were also modified to support our active-active replication technology. And the conversations typically happen at the CXO level, to be quite frank. So you'll see a CIO say, okay, we're gonna move to this Hadoop stuff. We're gonna be able to do things that we've never been able to do before with data. I'm gonna be able to query all of my data instantaneously across my entire organization. What happens if it goes down? And that's when we get a call. I'm very fond of saying, and I say it to my VP of sales all the time, we sell Bibles, not religion. My goal is not to do religious conversion of the world to convince them to use Hadoop. I think Hortonworks, Cloudera, and others are doing a fantastic job of that. I think that's gonna happen naturally. We wanna go in behind and sell continuous availability, and CIOs, by the way, already understand the importance of that. Now, as for the geeky stuff, that's interesting. I was in a meeting with one of the preeminent Silicon Valley companies last week, and Jagane and Dr. Konstantin Shvachko were doing a presentation. And you should have seen the eyes of these guys light up. They said, just a second, this is one virtual cluster that we can spread across the world? He said, yeah. Now, most people think that's impossible.
Yeah, how'd you do that, right?
It gave me goosebumps, just watching these guys go, really? That's one virtual cluster?
Yeah, you can do that.
That's interesting how you described it, because I wrote down as you were talking: you're not evangelizing Hadoop, you're evangelizing quality Hadoop. And that's sort of the discussion that you're having. I don't claim to fully understand all the innards, but my colleague David Floyer does, and we talk about this stuff all the time. We've been watching this market space for 20-plus years, 30 years even, and this is a hard problem that you're solving. And I can see why the technorati get excited about it. So, always a pleasure having you guys on. Thanks very much. Really appreciate the insights and the support. And we'll see you around. We'll be watching.
You know, you guys do a fantastic job. You really bring this conference to life. It really is great watching you. I love watching the rest of the conference on your SiliconANGLE TV shows.
Thank you. All right, appreciate it, guys. Keep it right there. We'll be right back with our next guest. This is Dave Vellante. We're live. This is theCUBE.