Okay, we're back live inside theCUBE here in San Francisco. I'm John Furrier, the founder of siliconangle.com, and this is our continuous coverage of the Hadoop ecosystem, going way back to the original days at Cloudera, when I was down hanging out with those guys, and then the original Hadoop World, Hadoop World one and two, and now the HBase conference. This ecosystem's exploding. It's fantastic. It's exciting to bring you our exclusive coverage of this amazing vertical now going mainstream. My next guest is Jonathan Gray with Continuity, formerly of Facebook. Last time you were on theCUBE, at Hadoop World, you were with Facebook. I was. That was my last talk as a Facebooker. They just went public, a lot of action there. They gave a keynote here. Jonathan, welcome back to theCUBE. Thanks for having me. So, Continuity, the startup. We've had Todd Papaioannou on theCUBE. You've both been on theCUBE. Well, you've been on theCUBE three times. Twice in stealth, where you talked about big data and nothing about the company. And then once when you were at the company, but you couldn't go into detail because GigaOM was breaking the story. So we're still not going into detail? Unfortunately, we're still not going into detail. Give us a teaser. Show a little leg. I mean, just talking about where I've been. I mean, at Facebook, we did a bunch of really great stuff that they talked about in the keynote this morning. But the point of my new company is to say, look, Facebook was super successful building that. They got 15 of the best and brightest engineers in the world working on it. And the question for us and our company is: how do you enable one or two developers to build the exact same types of applications? Maybe at a slightly lesser scale, but still a significant scale. And just make their lives easy, powerful, all that kind of stuff.
So when you were at Facebook, I noticed, watching you do your thing, you were very actively involved in the HBase community, opening up the doors, having meetups at Facebook, very collaborative hackathons. How has that grown within Facebook? What's happened since those hackathons? I know a lot of people were hired, they're growing their team. Do you have any visibility now on how big the teams are over there? Not a whole ton, they keep it close to the chest. But there's a whole bunch of guys I don't know anymore. So it's definitely still growing. And there's definitely all that stuff going on. They've actually moved some HBase guys into New York, I know that as well. So there's California and New York HBase guys. So what is happening at this conference? Why is this conference so important? And then I want you to share with folks what's impressed you about the HBase community in terms of people and functionality. So what's happening here? And then what's impressing you about HBase? What's here, and what's just completely awesome, is that this conference is a use case conference. It's all applications. You were at the first Hadoop Worlds, and you saw the first Hadoop Worlds were actually not really about applications. And Hadoop Summit especially, it's about architecture, it's about scalability, it was about underlying bits and nasty things about infrastructure. The Hadoop World was geeky, the first one was very geeky. It was super geeky. And there is a little bit of geek in this conference, but not that much. I mean, you have eBay and Gap. They're all practitioners here. They're all in the trenches. I mean, they said it, Michael Stack, the head of HBase, said it this morning in his keynote: they were going for applications in production. And it's amazing that they basically filled the entire schedule, and said no to a bunch of people, just with in-production applications. And that's super exciting. People are actually solving real problems in production today.
You go to Facebook, you go to eBay, you go to Gap. What you're doing is being stored in and retrieved out of HBase. And so for someone like me, it's super exciting that people are actually getting a lot of value out of this. In the early days of HBase, when you were involved in the community, what were some of the core issues that you've overcome and look back on and say, wow, we thought those were hard problems? Yeah, I mean, what's been really exciting is when I got into HBase around 2007, it was part of Hadoop. It was the same thing. It wasn't a separate project. And it was really just still a batch-oriented system. And what's been exciting, and the tagline for the conference today, is "real-time your Hadoop." In 2007, that wasn't actually what HBase was about. It wasn't about real-time access. It wasn't about serving data. It was really just about having a place to store your web crawl data. And since then, we've really moved it. My first company, Streamy, built our entire thing on top of HBase. And we were one of the first applications to say, look, we're going to write into HBase, and we're going to read, and we're going to serve all of our data out of it. And we had a hell of a time doing that. So it's interesting, today in the news there's two stories, one in the Wall Street Journal and one in the New York Times. The one in the Wall Street Journal is basically dissing Facebook because of all this public social data. And the one in the New York Times, actually more provocative, written by John Markoff, talks about the challenge of how social scientists really are having a hard time coming to grips with all the data. So one's about a data glut, and one's saying, there's just too much, I don't know how to deal with it all. So in a way, mainstream press, negative articles, but really, as I tweeted, a huge entrepreneurial opportunity. I mean, did Todd coin the term social exhaust? He definitely didn't. I think Google did, digital exhaust.
Digital exhaust is the term that's around. Social data, all that's out there. And by itself, it doesn't look very compelling, but with big data, you can actually make sense of it. And HBase has been a really good architecture for that. Can you explain to the folks why HBase is so good for this new class of data? Yeah, absolutely. I mean, there's two main core components of what makes HBase this really critical component of the Hadoop ecosystem. So Hadoop is great for cheap storage of your events and bulk processing of those events. But what it's not great at is any kind of real-time online thing. So people use Hadoop today to build targeting models, for example, right? But when I want to actually do targeting on my website, I can't serve that out of HDFS. I have to put that into some other system. And today, people are putting that into a relational database or something like that. With HBase, you say, I'm doing my analysis on a cluster, and I'm serving directly out of that cluster too. We have one system that allows you to do all the same batch kind of analysis we're doing, plus the serving. The other side, I would say, and they talked about this in the keynote, is around application development. What's happening now is application development. We're moving up the stack, and everyone's focused on actually building applications, delivering value to their business. Hadoop is a file-based thing that no one can understand. I have flat files. I can write to the end of them and I can read them. And that's not really very powerful. It is powerful, but it's a limiting set of things. HBase has tables with rows and columns. This is something that people understand. This is something that's been well understood for 30, 40 years. And so that table model of random access gives applicability to all of the existing kinds of applications that people have been building. So people in the database world, I know Mike Stonebraker, who started Vertica, Mike Olson knows very well.
Columnar stores and the different approaches. How do people make sense of that? How is HBase different from these columnar stores, database only? I know there's still a different row and column situation going on with HBase. But why is HBase so unique, as one example? And then you've got other things, Mongo and other approaches. How do you talk to people about that? It's a complicated issue, you know? Some politics. And there's absolutely politics, and there's dogma and all of the good stuff you have in computer science. But what I would say is that the main advantage of HBase, when you compare it to the other NoSQL stuff, is the tight coupling with Hadoop. That means if I want to run MapReduce jobs, I can run them on HBase; if I want to run Hive jobs, I can run them on HBase. And so they're in the same ecosystem and they speak the same APIs. Is it also timing too? I mean, a lot of times in the tech business, I mean, I've seen many innovation cycles, and sometimes the best product doesn't always win. Is it the fact that HBase was kind of hobbling along and just happened to be in a nice place, untouched, like a piece of clay that hasn't been shaped yet, as the world goes real time? Could that be a reason, you think? Yeah, I mean, HBase is a super low-level, very powerful platform. It gives you a bunch of great primitives and things like that, but it doesn't actually solve your problem. You have to build an entire thing on top of it. But what's happening now is there's companies like WibiData, which is one of the companies here that I'm really excited about, which is to say a developer doesn't want to speak in three-dimensional byte-array cubes, this very low-level, hard to understand, architectural kind of detail. What they want is more friendly APIs with some kind of schema and some kind of types of data. They want data types and things that make sense, and those have always been there in the relational world.
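[Editor's sketch] The contrast drawn above, between raw cells addressed by row, column, and version and a friendlier typed API layered on top, can be illustrated with a small in-memory model. This is only a toy under stated assumptions: `ToyTable`, `TypedUserTable`, the `info:` column prefix, and the `name`/`age` schema are all hypothetical names made up for illustration. It is not the HBase client API and not how WibiData or Continuity implement anything.

```python
import struct


class ToyTable:
    """Toy in-memory model: (row, column, timestamp) -> raw bytes."""

    def __init__(self):
        # row -> column -> list of (timestamp, value), newest first
        self._cells = {}

    def put(self, row: bytes, column: bytes, value: bytes, ts: int) -> None:
        versions = self._cells.setdefault(row, {}).setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)

    def get(self, row: bytes, column: bytes, max_ts=None):
        """Random access: newest value at or before max_ts, else None."""
        for ts, value in self._cells.get(row, {}).get(column, []):
            if max_ts is None or ts <= max_ts:
                return value
        return None

    def scan(self, start_row: bytes, stop_row: bytes):
        """Yield rows in sorted key order within [start_row, stop_row)."""
        for row in sorted(self._cells):
            if start_row <= row < stop_row:
                latest = {c: v[0][1] for c, v in self._cells[row].items()}
                yield row, latest


class TypedUserTable:
    """Hypothetical typed layer: a schema plus encoding over raw bytes."""

    SCHEMA = {"name": str, "age": int}  # assumed example schema

    def __init__(self, table: ToyTable):
        self._table = table

    def save(self, user_id: str, ts: int, **fields) -> None:
        for name, value in fields.items():
            kind = self.SCHEMA[name]
            if not isinstance(value, kind):
                raise TypeError(f"{name} must be {kind.__name__}")
            # Strings as UTF-8, integers as 8-byte big-endian
            raw = value.encode("utf-8") if kind is str else struct.pack(">q", value)
            self._table.put(user_id.encode(), b"info:" + name.encode(), raw, ts)

    def load(self, user_id: str, name: str):
        raw = self._table.get(user_id.encode(), b"info:" + name.encode())
        if raw is None:
            return None
        kind = self.SCHEMA[name]
        return raw.decode("utf-8") if kind is str else struct.unpack(">q", raw)[0]
```

With this sketch, a caller works with `users.save("u1", 1, name="Ada", age=36)` and `users.load("u1", "age")` rather than packing and unpacking bytes by hand, which is roughly the kind of convenience the typed, schema-aware APIs mentioned above aim for.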
Those are being built now on HBase, and so it's making it much more accessible. Do you think the demand on the developer side is to have these prefabricated kind of software environments right now? Is it the tools that are in the most need, or is it more? So there's two things. There's the APIs, and then there's another piece, and this is part of what Continuity is trying to do. It's to say, today, if I want to build something on HBase, I can go to the Apache website, or I can go to the Cloudera website or Hortonworks, and get a packaged version of it, and then I have to put the software somewhere, and I have to run it, and I have to scale it out, and then if it crashes I have to look at the logs and understand what happened. And so it's a really manually intensive, infrastructure-type operation just to build something on HBase. And so what Continuity wants to say, and I think where things are moving, is: why should an application developer be dealing with operations and management of infrastructure? How do we abstract away the infrastructure and instead provide a really good environment? A powerful but easy to use environment for developers, where they don't have to worry about their NameNode going down, and they don't have to worry about HBase versioning, and they don't have to do a bunch of that stuff. We take care of that. And so I think what you're gonna see, and you're gonna just see the application development take off, is APIs and hosted solutions, and those things are gonna combine to just enable this massive wave of application development. Yeah, and you can provide that framework that's available to come on board, because it's about complexity and simplicity, making it less complex and more simple to do development. Absolutely. So that's basically a good question. So when you're basically talking about it, software guys aren't normally infrastructure guys. Yeah, right? So DevOps has been a big conversation.
At Strata, Theo Schlossnagle said, it's not DevOps, it's OpsDev. So it depends on which side of the room you're looking from: if you're an Ops guy, it's OpsDev. If you're a Dev guy, it's DevOps. Developers make a mistake and can reboot a cluster. You can't reboot an operation. Some operations have five nines. I'm sorry, we're down, we just lost millions of dollars in business. But okay, I buy that argument. But in reality, you guys at Facebook built your own operations. You were developing, and DevOps was like the core job description. One, what is DevOps in your mind? And two, what is going on in that market right now? It's really interesting. I only graduated from college six, seven years ago, and so I haven't been out here long. But when I started, there was no DevOps, it was just Ops. And there was a wall between engineering and Ops. And having been in one environment where that existed, it sucked. Were you on the Ops side? Oh, never. Oh yeah. No, I was a software engineer, a computer engineer. And so I was always, even below the software sometimes. So you had shadow operations, you ran your own operations, right? So you do Ops, you do Ops. I mean, so at Streamy I did everything. Ops and development. We built HBase and then we also operated it. At Facebook, what we had was nested Ops, right? Well, they were DevOps, but they sat with the engineering team, and that was really hugely important. One thing I think has happened with DevOps, and why there is DevOps: at Continuity, we just hired our first DevOps guy. Sixth employee. Sixth employee, it's totally incredible. The sixth employee at the startup is a DevOps guy. Folks, DevOps is real from cutting in steps. Go ahead, continue. We're building a hosted platform, so we need DevOps. But what it really is, and I think this points to how hard and how immature the technologies are, is that, you know, you can have a MySQL Ops guy and he doesn't have to even know what a B-tree is. Yeah.
You can't have an HBase Ops guy that doesn't understand basically the architectural details of how the system actually is working. Well, eventually if HBase becomes popular, you will have those abstractions and instead of the seed separation. Exactly. And I think you'll see Ops, a true Ops guy, show up in the Hadoop ecosystem. Who's got a, quote, HBA, you know, not DBA, but you know, some sort of equivalent certification from Cloudera. Exactly. And so you see Cloudera, right? They make DBA tools now, basically. And potentially, there could be someone who doesn't understand how MapReduce works who runs a Hadoop cluster today. The same cannot be true for HBase. You've heard it here first on theCUBE: HBA, the new job title. Well, it's exciting. Tell Todd we said hello. We really appreciate you. Absolutely. You know, you've got an interesting perspective, right out of college only a few years ago. You didn't have to live that baggage of the old life of when we, it's to be all the way really. Hey, I started as a PostgreSQL contributor. So I've gotten my hands dirty in the relational world, but happened to be out of it. So when did you leave Facebook? So you just recently left Facebook. I just left at the end of November. So what do you think about the IPO? You gotta be pretty excited. You have friends there. What's the inside baseball on Facebook right now? What's the vibe? I was there yesterday, actually. I'm happy to say I had no idea that they just IPO'd. So they're not jumping around, popping champagne. One day, they gave themselves one day. Yeah, so they're back to work. They're back. They always say 1% done, and they've still got that going, and I have a lot of faith in them. Final question. This is a Facebook kind of question to tie into your current thing. Give the folks out there some insight into what you did at Facebook and how that translates into what's happening in the HBase future roadmap and the market in general.
Some of the innovations that you did at Facebook and how that's going to translate to a broader market. I mean, HBase at Facebook took HBase to the largest scale there is. They talked about it in the talk this morning. They're growing at 10 petabytes of storage a year on their HBase cluster. That's 10 petabytes a year of addressable random messaging data. Every chat message in Facebook right now goes into HBase. And so they have thousands of machines, and it's running in a real-time, 24/7, 365 environment. They were the first company to do that. They did it on a massive scale, and it's been wildly successful over there. And so I think with that use case they've shown that you can do real online serving applications at a massive, unprecedented scale on top of this infrastructure. The other thing they've done is made it operable. And so they have an HBase ops team. They're very DevOps-y, they're really smart guys who are basically engineers. And they've done all kinds of stuff to make this stuff usable. I mean, when you're operating thousands of machines, you just have to do stuff to make it better. And so they've made so many improvements around performance, operability, all this kind of stuff. It's nice to have a nice big name like that. They're now a big name, but back then it was a couple of years old, leading the market. What other use cases are you seeing out of this technical conference here that are popping out, where you're going, wow, I like that? Some good use, do some work there. Give some promotions and shout-outs to some use cases that you're hearing in the hallways here. Yeah, I haven't heard all of them yet, but what's exciting for me is I've always been on the consumer internet side. So we call it consumer intelligence, and that's kind of this first really emergent pattern in big data: taking consumer signals and doing ad targeting or deal targeting or whatever kind of stuff.
But what's exciting is to see there's medical stuff here. People are doing genome analysis. People are doing environmental type stuff. There's a bunch of location things. It's just the diversity of the types of applications. It's not just StumbleUpon and Facebook, who are kind of like two of the leaders in their applications, right? You've got government here too. You've got some financial services. You have Gap, you have a consumer retailer, you know? I mean, it's just really great to see this diversity of applications, diversity of companies. I think what you're still seeing is larger companies, and what's going to be really exciting is when you have a bunch of small companies who still have data and still want to do stuff with it, but they're not the eBays and Gaps of the world that can take 10 people to go build a product. Awesome. Jonathan Gray, last time on theCUBE, at Hadoop World, he was with Facebook. Now he's the co-founder of Continuity. I guess still a stealth startup, but we heard a little bit. Looks like they're providing a managed hosting solution. Hopefully with a lot of HBase software wrapped around it for some margin. Oh, there will be some HBase there for sure. Great insights, thanks so much. We'll see you later. We'll be right back with our next guest after this break.