Okay, we're back here live at Strata in New York City. This is SiliconANGLE.tv, SiliconANGLE.com's theCUBE, our flagship program. We go out to the events, we extract the signal from the noise, and our next segment is with the winner of the startup showcase last night, the big event where they featured all the top startups in big data, and the winner is here with us on theCUBE. I'm John Furrier, the founder of SiliconANGLE.com, and I'm joined by my co-host. I'm Dave Vellante of Wikibon.org, and as John said, we're here with Daniel Abadi, who's the chief scientist and co-founder of Hadapt. Daniel, welcome and congratulations on last night. Yeah, thanks very much guys, it's good to be here. So we had you on theCUBE last year when you came on, and no one had ever heard of Hadapt, but it was like, hey, I've heard of Yale, heard of databases, heard of column stores, I've heard of Vertica and these other companies doing a lot of work. But in that one year, you guys have come out with the product, had a big launch, a good successful launch of the product, great reception. You're the talk of Strata this year, a lot of people are throwing your name around with Cloudera, Hortonworks, we had the Tableau guy on, and so that's great. How do you feel about that? I mean, as a co-founder, you've got to be excited. It's been a huge year. As you said, we had our first product, 1.0, released earlier this year, we just announced 2.0 this week, and it's really so cool to see this technology that started off as just a research project four years ago at Yale University go from there to this real product that real people are using. Is it because of Yale University? Because now the Brown guys, as people know, Brown, Yale, Stanford, Cal Berkeley, Carnegie Mellon, are all jockeying for who's going to have the best startups.
I think it's more like an East Coast, West Coast thing, really. I think all the East Coast guys get along pretty well, like Stan Zdonik at Brown, and Mike and Sam at MIT. I mean, I came from MIT, so obviously I'm friends with those guys. So I think we've got the East Coast thing, we've got the West Coast people, there are some over there too, but there's a little bit less intermingling. All kidding aside, that was just an inside baseball joke for the big data geeks. The computer science programs are booming, but seriously, let's talk about the technology, because one of the big themes in the marketplace is the race to build out the middleware and the technology and the tools and at the same time simplify. The demand for the products and solutions is high, so talk about the dynamic between you building the company, the rocket ship, in terms of tech, and at the same time making it an easy-to-use product and delivering solutions. Sure, yeah. Essentially what we're trying to do is take these ideas from two different and very important communities and bring them together in one product, and obviously then make it very easy to use. The two basic communities we're looking at are, first, database products, the DBMS, traditional relational technology that has been around for 30 years. I come from the database community, so we've had three decades of research trying to make these things go really well on structured data, to be able to produce very fast query results when you're dealing with transactions or analysis over structured data.
So we want to combine that community with the Hadoop community, which came much later, originally from Google's MapReduce project. That's very good at processing unstructured data and being able to go through text or write generic MapReduce functions over any data that you like. But it doesn't do so well for structured data. I mean, you can do structured data in Hadoop, but you can't really get high performance, and certainly not interactive queries, on that structured data. So what we're trying to do is bring some of these ideas from the database community to Hadoop. These are ideas from my research group at Yale and many other research groups around the country; we know how to do fast queries on structured data very well. So we want to take that and bring that to Hadoop. That was originally what the HadoopDB project was four years ago, and that was commercialized as Hadapt more than two years ago now. And that's what we're all about. Why the success, why the rapid success? Obviously yesterday you guys were voted the hot startup by the crowd, not just by the event organizers. Why the success, in your opinion? Well, I think this is something that people need. I mean, I think being able to have structured and unstructured data in one platform without connectors, really natively in one real platform, to do any type of analysis you would like. This is more than just big data where you have many different systems to help you with it. It's one system for all big data. That's a message that, I mean, it's something that people need. It's a massive market. The big data market is now, what was I going to say, 30-something billion? 50 billion. 50 billion.
In one system without connectors. So Danny, your thesis essentially became Vertica. When you were doing that initial thinking and work, was there something in the back of your head that said, okay, this is good, but I've got to go back and fix this other problem that I know is going to emerge, or did you not see it at the time? No, I mean, as you said, it was my PhD thesis, right? The thing with a PhD thesis is you have to go into a lot of depth on one subject. Once you become a professor, you can have a lot more breadth and do multiple things at once. But for a PhD, if you want to graduate in a reasonable time, not take 13 years like some people at MIT do, if you want to get done, you have to find one thing and just go very deep on that thing. So my thing was column stores: figuring out how, when you have relational tables with rows and columns, rather than storing them row by row, to store them column by column. So that was the thing, and we went very deep on that. We built a query execution engine, a compression engine, an optimizer for columns, all the way up the entire column store stack. And that was, as you said, commercialized as Vertica. And that was great. But I think you're right. From the beginning, and this was in 2004, basically, when this all started, even by then it was clear that text had some value. There was important information in textual data that you can't really store in rows and columns. So even though we didn't really focus on that particular problem when we did the C-Store project, at this point eight years ago, wow. Yeah, so even though that wasn't what we focused on, it was a known problem in the database community that somehow we'd have to integrate text in some way, or at least more unstructured data.
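The row-by-row versus column-by-column distinction described here can be sketched in a few lines of Python. This is a minimal illustration, not the C-Store or Vertica implementation: the sample table, the `to_columns` and `run_length_encode` helpers, and the choice of run-length encoding as the compression scheme are all assumptions made for the example.

```python
# Hypothetical sample table; column names and values are illustrative only.
rows = [
    {"id": 1, "region": "east", "amount": 120},
    {"id": 2, "region": "east", "amount": 80},
    {"id": 3, "region": "west", "amount": 200},
]


def to_columns(row_list):
    """Pivot a list of row dicts into a dict of per-column lists."""
    columns = {}
    for row in row_list:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into [value, count] pairs (assumed scheme)."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


columns = to_columns(rows)

# Row layout: the aggregate has to walk every full record.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: the aggregate reads only the one column it needs.
total_from_columns = sum(columns["amount"])

# Repetitive columns compress well once their values are stored together.
encoded_region = run_length_encode(columns["region"])
```

Summing one attribute from the columnar layout reads a single list, while the row layout touches every record. Storing a column's values contiguously also makes repetitive columns such as `region` easy to compress, which is one reason a column store stack pairs an execution engine with a compression engine.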
So once Vertica was sold to HP, actually a little bit before it was sold to HP, I'd already moved on to Yale, and then we started this HadoopDB project to really focus on the next thing. The way I view the database market is that first we had transactions. In the 70s and 80s, we really focused on getting very good performance on transactions. You get to the 90s, and it's much more about analytics and being able to do fast analytics, but still it was all on relational data. And then in my view, where the market's going, and why we founded Hadapt two years ago and started working on this even before that, is that the next big wave of research in database systems is being able to open up to non-relational data, to be able to handle text data, image data, array data, especially scientific data. So in biology, they have these big genome sequences of three billion base pairs. That doesn't really fit in rows and columns. That's really sort of text data in some ways, because it's just A, C, T and G's times a billion. So a lot of these science data sets are also a big direction for the market to go. So I think the variety part of the three V's is a big area of research in database systems and an area that really interests me, and that's why we did Hadapt. So I've got to ask you, because Cloudera made a significant announcement today with their Impala product. For the folks out there, go to SiliconANGLE.com, we've got coverage there, a blog post. Essentially it's about real time. Cloudera's moving from being the Red Hat of Hadoop to essentially a big data platform, and their slogan is, you know, Cloudera, the big data platform. We have Mike Olson, the CEO, on after you, but I want to get your take on that because Hadapt is exactly the same concept. So you look at what you guys have as a product that you're shipping, and what Cloudera introduced. Concept-wise, it's the same thing.
There are some differences, obviously, but it's real time and simplicity. Talk about the difference between what Impala has and what Hadapt is striving for. So let me just start by saying I'm not a business guy. As far as the business side of it, I don't want to speak too much, but as far as the technology side goes, just as a tech guy, I don't know what it means for the business, but as a tech guy, I love the announcement. It's really great to hear people finally embracing this idea of bringing SQL ideas to Hadoop. In the past, if you look at Cloudera's history, I think the first MPP database to partner with them was actually Vertica, as we were talking about before. Vertica was the first one to partner with Cloudera and build a connector between them. I think it was at Hadoop World 2010. Vertica actually built that connector itself, I believe. Well, I think they actually had some people from Cloudera, actually. But Vertica certainly took the lead on it. So that was two years ago. Vertica was the first, but then we saw Teradata, Netezza, Aster, Greenplum, they all built these same connectors between themselves and Cloudera. And Cloudera had like five, six partnerships, maybe even more than that, with database companies, where you had a database, you had Hadoop, and you had a connector between them. And both of them, Cloudera and the database guys, managed to convince the market that this was the right way to go, to have two different systems, two different clusters with a connector between them, so you could try to combine them over this wire, basically this network connecting the two things. But finally now, this year, it's really great to see that Cloudera is saying, no, these connectors are not working. It's not the right way to build it.
The right way to build it is to directly build the SQL support, the ability to handle database technology, right in the Hadoop cluster. I mean, that's absolutely the right way to go, and that's what we've been saying for four years. So from that level, I hate to use the word validation because it's such a CEO thing to say, but in a way it's like a validation of our message. But we're talking about platforms now. This is what the market wants, because applications have not morphed as fast since last year, but analytics is booming, right? So analytics is the killer app. That's what people want, and that's the simplicity question. And when you use the word simplicity, the key thing is to have one cluster. You really just do not want to have Hadoop and the database on different clusters. You have to maintain them both. You have data silos, and you don't know where your data is at any point. It's just a really ugly way to build something. And technologically speaking, it's a really dumb way to go about it. You have two different clusters, both with a shared-nothing MPP architecture where data is spread across each node and you're doing some sort of parallel processing on it. They're so similar. There's just no reason why it shouldn't be one system. Well, we're going to have Mike Olson on shortly as our next guest. We'll ask him the same questions, but really good insight. Congratulations on your success. My final question for you guys is what's next? For Hadapt and your positioning in the market, technically speaking, from a product and solution standpoint, what's next for you guys? Yeah, I mean, I think we've got to stay ahead, right? With other people now finally agreeing with us and trying to bring SQL to Hadoop, we've been at this for longer than anybody else.
So we are ahead right now, but I think we have to keep on pushing the envelope, right? We have to keep on staying the thought leader in this space and build new product. So I guess the main thing we have next is this announcement. I think you had Scott on earlier today, and he talked about the interactive query thing. We've gotten rid of the Hadoop startup cost, so now we can have queries that run in less than a second. Bringing that out to market in the right way is the most immediate concern. But then we have to keep on going beyond that and continue to work on the performance improvements, on our HDK, and have some better analytics, and just keep on going from there. Well, we always like to have you on. I just want to add, the implications of this I think are profound, and David Floyer and Jeff Kelly on Wikibon just did some pretty detailed analysis on this, helping people understand the impacts on the existing data warehouse business, and David got into how it works. So check that out on wikibon.org. Daniel, it was great to have you on theCUBE when you were just a co-founder with the company you were building. You've had great success. Also the product positioning is working, and you were voted the hot startup here at Strata this year. Congratulations. Thanks so much. Happy to be here. We'll be back with our next guest. It's Mike Olson, the CEO of Cloudera, right after this short break. Oh wow, that's quite a follow-up. 18 months ago.