Okay, we're back here live in Silicon Valley at the San Jose Convention Center. This is day two coverage of Hadoop Summit. The hashtag is #HadoopSummit; tweet with that hashtag and we'll be following it and answering any questions. If you have any comments or commentary you want to share, use the hashtag. This is theCUBE, SiliconANGLE and Wikibon's exclusive coverage of Hadoop Summit. It's our flagship program: we go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, and I'm joined by my co-host.

Hi everybody, I'm Dave Vellante of Wikibon.org. Scott Howser is here. He's the vice president of marketing at Hadapt, a company that we've been following for quite some time now on theCUBE. Born out of Yale with Daniel Abadi and Justin Borgman, then moved up to Massachusetts. Scott, welcome to theCUBE.

Thanks for having me.

Yeah, good to see you again. I'm going to start right off. You guys are substantially different from anybody else in this community. You're very unique in the way you attack the problem of interactive queries on Hadoop. Talk about the problem that you're solving, how Hadoop solves it, and why you're different.

Sure, so fantastic question. If you look at what we've done since the beginning, I think we evangelized the beginning of this market trend, and what we talked about from the start was this notion of the database for Hadoop. What we've been working on for the last couple of years is really about giving you the value and benefits of a relational database engine alongside all the value and benefits of Hadoop. So we co-locate a real relational engine on every node where Hadoop exists, and give you the benefits of being able to work with all the multi-structured data that may exist in that query, or excuse me, in that cluster.
Okay, so contrast that with, let's say, a pure-play query engine, the plain-vanilla query engine, let's call it. How is that different from what you guys do?

So I think one of the key differentiators in that context is that the query engines look at HDFS as one source of data, or potentially HBase, but they're tied to looking at and materializing data from those files at query time. What we focus on is building a real relational engine that has all the benefits of a relational engine, things like indexes, materialized views, et cetera, and gives you all the benefits of SQL without being tied to those limitations. So our storage layer capabilities are really agnostic, in that we can work with data inside of HDFS, we obviously materialize some data into tables that we manage ourselves, and we also integrate natively with HBase and Solr. I mean, one of the things that was talked about yesterday during Merv's keynote was this notion of text. We've been doing text for over a year.

Well, search, right?

Yeah, absolutely. I mean, search is all of a sudden the hottest thing going, and you've seen that for a while, actually. I mean, Oracle's acquisition of Endeca kind of woke the world up and said, wow, search must be really important and should be a fundamentally embedded part of all applications. So you guys obviously have been doing search for quite some time; I think you announced it probably a year ago. What I want to understand is, from the standpoint of the Hadoop customer, everybody knows Google search, but what is it that's special about search in Hadoop, and why is it that you guys were able to come out a year before pretty much everybody else?

So I think one of the most unique things about what people want to do with search with respect to analytics is leveraging the unstructured, free-flowing text, and the context within that text, as an attribute in a query.
So when you think about analytics, you think about all of the traditional things that are available to query, things that are very structured, right? But the ability to use search as a filter on a SQL query is a very powerful thing. And so we see a bunch of different use cases. One of the largest media companies in the world leverages us to correlate events and behaviors in one channel and then effectively filter that based upon another channel of engagement with that same user. So what's the sentiment around things that the user may be commenting about in forums, in chat, et cetera, and how do I use that as a filter against behavior in a different channel?

In a, I won't say real-time, but a near-real-time, interactive...

Interactive, yeah.

I want to come back to that example in a second, but before I do I want to go back to really understanding the differences of Hadapt. So you said you bring a lot of the SQL-like capabilities to Hadoop, but there are some trade-offs; you don't bring everything. You can do things like, if I understand it, certainly search is an example, but you can also do user-defined functions, as an example.

That's right. So we have a complete framework that we call the HDK, or the Hadoop development kit, that enables users to write their own analytic functions, and we can call them as a procedure, if you will. But we also embrace the entire ecosystem of Hadoop, and there's a bunch of things going on in the machine learning world that we now make accessible to the common business analyst via a SQL function.

Okay, so when you guys go into an account and you're emphasizing your value proposition, what's the primary value proposition that you're selling? Is it performance? Is it simplicity? Is it integration? All of the above? And how does that translate into a business outcome?

So, you know, great question.
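The idea described above, using free-text search as a filter on an otherwise structured SQL query, can be sketched in miniature. This is purely an illustration with SQLite and a `LIKE` predicate standing in for a real search engine; the tables, data, and the "web vs. forum comments" scenario are hypothetical, not Hadapt's actual schema or syntax.

```python
import sqlite3

# Hypothetical schema: structured purchase events in one channel,
# free-text forum comments from the same users in another.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events(user_id INTEGER, channel TEXT, purchases INTEGER);
    CREATE TABLE comments(user_id INTEGER, body TEXT);
    INSERT INTO events VALUES (1, 'web', 3), (2, 'mobile', 1), (3, 'web', 5);
    INSERT INTO comments VALUES
        (1, 'love the new release, works great'),
        (2, 'shipping was slow and support unhelpful'),
        (3, 'great product, would recommend');
""")

# A structured query filtered by a text predicate on the other channel:
# which web-channel users left positive-sounding comments?
rows = conn.execute("""
    SELECT e.user_id, e.purchases
    FROM events e
    JOIN comments c ON c.user_id = e.user_id
    WHERE e.channel = 'web'
      AND (c.body LIKE '%great%' OR c.body LIKE '%love%')
    ORDER BY e.user_id
""").fetchall()
print(rows)  # [(1, 3), (3, 5)]
```

In a production system the `LIKE` clauses would be replaced by a real full-text search predicate (e.g. relevance-ranked matching against a Solr-style index), but the shape of the query, a text filter embedded in a structured SQL statement, is the point being made in the interview.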
I think what we see in most cases is, we talk about unification, and we talk about bringing all these tools and technologies together and making them accessible to an analyst via a common and well-understood interface, which is SQL. But that's just the beginning, right? When you start to look at what people want to analyze, they want to get their arms around all of the data that exists in the enterprise, and a lot of that doesn't fit into structured tables. So what we give people the ability to do is bring all of that into one unified platform and take advantage of it, irrespective of how it's being stored or what engine is being used to store it, and make it all accessible to an analyst via a well-understood common language like SQL, via tools that they know and love.

Okay, now let's go into maybe some customer examples. You mentioned a media company, and we talked about multiple channels. Let's unpack that a little bit. Talk a little bit more about what the typical customer is doing, how they're using Hadoop, what they're doing in big data, and what life was like before and after you guys.

Sure. So I think in many cases, what I've seen with a lot of the folks using Hadoop is that it's sort of been relegated to the ETL space, right? A lot of folks have focused on the two-vendor approach, where they say, I've got an MPP analytic database, and I've got Hadoop over here in the corner, and I'm only doing batch-based, ETL-type workloads on Hadoop; then if I find something interesting, I extract that and push it over to the MPP platform. What we advocate for with our customers, and what we talk about a lot in different use cases, is the merging of those worlds, and the fact that with Hadapt you don't have to have a two-platform approach and you don't need connectors. We bring all of that together into the one platform.

All right, so let me double-click on that. What's wrong with that two-platform approach?
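The earlier point about an HDK-style framework, exposing custom analytic functions to analysts as ordinary SQL calls, can also be sketched. SQLite's user-defined-function hook is used here purely as an analogy; Hadapt's HDK targets Hadoop, and the function, schema, and scoring logic below are all hypothetical.

```python
import sqlite3

def sentiment_score(text: str) -> int:
    """Toy scoring: +1 per positive word, -1 per negative word."""
    positive = {"great", "love", "recommend"}
    negative = {"slow", "unhelpful"}
    words = text.lower().split()
    return sum(w in positive for w in words) - sum(w in negative for w in words)

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it by name.
conn.create_function("SENTIMENT", 1, sentiment_score)
conn.executescript("""
    CREATE TABLE comments(user_id INTEGER, body TEXT);
    INSERT INTO comments VALUES
        (1, 'love the new release'),
        (2, 'shipping was slow and support unhelpful');
""")

# The analyst calls the custom function like any built-in SQL function,
# without knowing or caring how it is implemented underneath.
rows = conn.execute(
    "SELECT user_id, SENTIMENT(body) FROM comments ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 1), (2, -2)]
```

The design point is the one made in the interview: the machine-learning or text-analytics work lives behind a named SQL function, so the "common business analyst" never has to leave the SQL interface to use it.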
So there are a number of things that are wrong, right? You end up propagating this notion of the data silo, which has been a problem for data management for decades. You also complicate things around enterprise features: availability, disaster recovery, scale, performance. There are a tremendous number of attributes that...

It's complexity.

It's hugely difficult for the user. But it's also that if the analyst wants to be able to truly manage all the data, get access to and query all that data, and find the relationships that were otherwise previously undiscoverable, the only way you're going to do that is by giving them access to all that data. And if the analyst has to figure out which platform to put what in and how to move it around, you're never going to get mass-market adoption.

So one of the things I like about having you in theCUBE, Scott, is that you're the vice president of marketing, you're a marketing guy, but you're also a practitioner, a former practitioner. You cut your teeth as an IT user of this technology, so you have some chops there. So let's put you in the position of a customer. You've got this two-platform situation, and you talked about some of the complexities. Can you give us a little bit more color on the kinds of things, the tasks? What's a day in the life of that individual like with that two-platform situation, and how does their world change with Hadapt?

Well, imagine that you have to operationalize two disparate platforms, right? It's double the work. So when you think about...

So you need a bigger team.

A bigger team and diverse skill sets, right? Because you're going to have to have specialists in these different areas, and you're going to try to follow all the enterprise standards around availability, disaster recovery, accessibility, access control, and data management. And you're going to have to do that on two separate occasions.
And what happens is there's this feedback loop that occurs between the business and the tech guys, where the bigger those systems get, the more complex it becomes. If I'm an analyst and I want to get access to some of the data, what traditionally would happen is there's been an ETL job that's been written, some data's been pulled inside of Hadoop, and something's been pushed into my MPP database. Now the analyst has found something interesting and wants to double-click or drill down on it, but the data doesn't exist there. So now I have to go back to the IT guy: hey, can you re-run that job, and here's some other stuff I'm looking for; can you find that and give it to me? Contrast that with Hadapt: now all the data lives in the ecosystem of Hadoop and is accessible to that same analyst, so that as they make discoveries in these interactive analytic workloads, they can dig into the raw detail of the information, all without that communication loop, without having to go back and involve somebody in IT, write another job, and try to materialize the data yet again.

So let's talk about the business-process ripple effect of that. You've got multiple databases, and you're building business processes around each of those databases. You're essentially wiring your business process to the limitations of your IT infrastructure.

That's correct.

So, okay, now Hadapt comes along and unifies all that and simplifies it, so you get fewer people, more focused skill sets, great. Is it your vision that essentially Hadoop becomes the data... We just had Herb Cunitz from Hortonworks on, and their vision when they launched was that 50% of the world's data will be stored in Hadoop. They're couching that a little bit now, saying stored or processed in.
But okay, a lot of data is going to be stored in Hadoop. So is it essentially your vision that you can replace the traditional data warehouse system?

Yeah, I think, you know, obviously long term that's our vision and our belief. For us, Hadoop is like our operating system, and I believe that Hadoop is effectively the operating system for big data. I think what you'll see over time is that it will continue to eat away at the traditional infrastructures and change the way that people look at data management in the future. Obviously, as that occurs, giving analysts and business people access to the information and letting them have the discoveries and insights is what makes a tool like what we provide so critical to the industry: taking all that data that will reside inside of Hadoop and making it accessible and actionable to all the people who can change business outcomes based upon it.

So the obvious follow-up question is, people think of Hadoop, they think of batch. We've been hearing all this week about YARN and Hadoop 2.0, and we're certainly hearing a lot about interactivity around Hadoop. I like the way, when we talked to Merv in theCUBE yesterday, he basically said: you've got batch, you've got real time, and in between you've got interactive, and that's where a lot of the action is today. So in your view, will Hadoop, from a performance standpoint, be able to compete long term with the traditional data warehouse business?

Absolutely. And I think we see situations today, in customer accounts, where we're augmenting some of that workload now. And I think over time we'll continue to eat away at that.
I think there's been a lot of interesting research done in the financial services world, where they talk about some of the incumbents and the pressures they're feeling in the market based upon these emerging technologies and some of the sea change that's occurring in data management.

Yeah, well, we've certainly written in SiliconANGLE and Wikibon about some of Oracle's issues and have asked the question: is it tied to some of these ankle-biters like Hadoop? And it's reasonable to assume that there's some kind of pressure on the traditional business. At the same time, those guys will likely respond; they'll buy companies, they'll hold their ground, so to speak. But I want to dig into the economics a little bit, because to me, to the extent that you can perform and do the same work, essentially, which is what you're saying long term, then it comes down to the economics.

That's right.

And it seems like a lot of the economics in traditional data warehousing is around the infrastructure, pouring dollars into infrastructure.

Into proprietary systems.

That, maintenance, and also just making the stuff go faster because of the architecture, right? I mean, one of our clients at Wikibon calls it chasing the chips. It's like a snake swallowing a basketball; that's how they describe their traditional data warehousing environment. So what are the economics of this new world? Are we talking about cutting your infrastructure costs by a third, by half, by two-thirds?

I think there's a balance to that, because it's not just about the cost reduction. It's also about the revenue opportunities that you're missing otherwise, and I think that's fundamentally going to leapfrog the savings. There are absolutely, I believe, significant savings in the core infrastructure.
But I think by providing access to the data, and the insights that are discoverable based upon access to all the multi-structured information, the revenue curve dramatically changes for the customer.

So you're arguing essentially that you can do things in a timeframe that you couldn't do before.

It's not just about time, right? It's about not even being able to do it at all. It's physically impossible.

Like what? Give me an example.

So, looking at multi-structured data, right? Saying, I want to use search as a filter on a SQL predicate. That's not something you're going to implement inside of a legacy piece of infrastructure anytime soon, or in any easy way. Changing a processing paradigm that's been built into products over decades isn't going to happen overnight. And so for the customer, when you start to look at the disparity of the sources of data where they're finding value, and the relationships they're able to define based upon seeing these different dimensions, that's powerful. And you're not going to be able to do that in legacy pieces of infrastructure anytime in the near future.

Okay, so my last question. We've been watching you guys for a while now. We've seen you do a raise, build up your staff, get the product out, sign up customers. What should we be looking for going forward over the next, you know, six to 12 to 18 months from Hadapt? What should observers be focused on as measurements of your progress?

I think the most important one is seeing the customers themselves speak about the things that we're doing. One of the biggest challenges in this space right now is that some of the things people are able to do are not just mission-critical but also very differentiating for them in the market, so they're very tight-lipped about what they're able to accomplish.
But I think as we continue to push more mainstream, and we take over more of these traditional use cases, I think you'll see more and more in the public eye about the things that we are disrupting. And I think that will be very powerful in the next, you know, six to 12, 18 months.

Awesome. All right, Scott Howser from Hadapt. Really appreciate you coming by theCUBE, and good luck with everything. We'll see you around at all these events; keep plugging.

Thank you very much.

Right there, this is theCUBE. We'll be right back with our next guest. Abhi Mehta, I believe, is up next, and if you've never seen Abhi, you should definitely watch him. Keep it right there. This is theCUBE, SiliconANGLE's continuous production. We're live from San Jose at Hadoop Summit 2013. We'll be right back.