Okay, we're back here live at Strata Conference. Day two, this is SiliconANGLE's exclusive coverage of O'Reilly Media's Strata Conference in Silicon Valley. This is where all the action's happening in the big data business, the ecosystem, the industry. Big data is evolving from a few years ago, just a short few years ago, from some technology enablement with Hadoop and unstructured data, now exploding into a competitive landscape where the big money players like EMC, IBM, HP, and Intel are all coming in hard with their own distributions and really advancing the innovation. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Again, this is our fourth Strata Conference. Go to SiliconAngle.com for the reference point of innovation and wikibon.org for research, where they just put out their second annual market sizing report. Go to wikibon.org slash big data and you'll see all the resources that we are publishing for free. Go enjoy the content. I'm John Furrier, the founder of SiliconAngle.com, and I'm joined by my co-host today.
Hi everybody, I'm Dave Vellante at wikibon.org and I appreciate you watching. Thanks for being part of our audience. You can tweet us, I'm at Dave Vellante. He's at Furrier, so please do that, and we appreciate the questions. We're here right now with Josh Klahr, who's the vice president of products at Greenplum, which is an EMC company. Josh, welcome to theCUBE.
Thanks.
You guys had a big announcement on Monday. I remember a couple of years ago at EMC World, John and I were there with our team. We had theCUBE, and you guys actually made the first announcement of one of your earlier partnerships, with MapR, and really declared that you were going to get in the game. Since then, you've really started to step up the investments, and Monday's announcement really underscored that. So first of all, congratulations. I wasn't there, but John and Jeff were there.
We heard some great things, fantastic marketing. You got some good press out of it, so congratulations on that, and how do you feel?
Oh, really excited. As I said at the press launch, I think it's actually the most exciting product launch I've been a part of. What we've done with Pivotal HD is essentially take 10-plus years of parallel database development and bring that into the Hadoop ecosystem. So the launch for me was exciting, not because of the event and all of the press, but really to be able to go up there and demo a product that works at scale on really large amounts of data.
And essentially the theme is bringing SQL to Hadoop, which is something that we started to hear about last year. You guys waited a little bit for your announcement. You wanted to come out, presumably, with something that had a little bit more meat on the bone. Talk about that a little bit and how you position yourselves relative to some of the other guys out there.
Yeah, well, we've certainly been watching the evolution of the SQL-on-Hadoop marketplace. In the background, we've actually been working on this project for over a year and a half, bringing the parallel database technology into Hadoop. And like you said, we wanted to wait until we had something big and material to announce before we came out with it. But we're really excited about it for a few reasons. One is that it takes a long time to build a scalable query optimizer that can actually handle hundreds or thousands of data nodes, that you can talk SQL to, and that can actually handle the complexity of the interactions with the data. And so I think that's an area where we've really differentiated: we're bringing in 10-plus years of development on these scalable query processing systems, where the Hadoop ecosystem has started to bring database management technologies into Hadoop. But they're really just kind of beginning that journey.
So we think we bring to bear a lot of technological capabilities that just aren't there in the market today.
Josh, we've been covering the business marketplace since the beginning, going back to the original Hadoop World, now Hadoop Summit. And just recently at the last Hadoop Summit, which Hortonworks puts on with Yahoo, they talked about crossing the chasm. That was a big theme. Crossing the Chasm is a book that talks about the evolution of markets and how you cross that chasm and then hit the mainstream. So EMC's announcement really underscores, at least as you guys were presenting it, the evolution of EMC. Pivotal HD has Greenplum, it's got Pivotal software, Cetas, some other acquisitions you guys have done within VMware, and that's spinning out into its own company. So there's a future plan there. You guys are kind of light on details, but that speaks to the future market. So the market's crossing the chasm, and you guys have a direction, not yet announced, with this spin-out. The big news in the announcement was the aggressive stance against Hive and Impala, Cloudera. So I want to ask you, because everyone wants to know, why the aggressiveness? Obviously EMC is a big money player. It's not like you don't have any customers, you have a lot of customers in the enterprise. And two, this future direction, this new marketplace of big data that's crossing the chasm. Talk about that dynamic, and then talk about the aggressiveness of the announcement and why the stake in the ground is so hard.
Yeah, and I guess I kind of view it a little bit differently, which is, the intent was not to be aggressive. It was really to look at what is the state of the art in terms of SQL querying on Hadoop today. And I think Hive is that, especially for how you enable data workers and folks that are able to talk SQL to write large batch jobs that scale and run on Hadoop. And it's a great tool for that stuff.
So the intent was not to say we're going to replace Hive. It's that there are some things that Hive does well.
But you guys were pointing out some benchmarks, which is basically saying, hey, there's Hive and it's not performing well. And here's Impala and here are some of the benchmarks.
And what we see today is, Hive is really good at solving some problems. But I don't think anybody would dispute this, and I don't think the folks at Cloudera would dispute this: when you try to use Hive to do interactive analysis, it's just not good at it. So the intent was to show, here's a series of use cases where you need interactive analysis, and here's how it compares to what's in the market today. So it wasn't intended to be a knock, it was intended to be, here's how you can benchmark.
It was an exclamation point.
It certainly was an exclamation point.
Well, you're trying to differentiate. And this is sort of back to my original question: how do you guys differentiate? So you've got Hive, and that's really sort of Impala and Hortonworks as well. What about Hadapt? They're really sort of pushing closer to SQL. Are you saying that they're not as relevant to the interactive analytics piece, or what's the nuance there with regard to Hadapt, if I could ask it that way?
So I won't pretend to be an expert on the inner workings of Hadapt. I actually think the Hadapt architecture is probably the most similar to the approach that we've taken, which is, let's embed a database engine on each of the nodes that sits in your Hadoop cluster, and let's create a federated query environment where you can go and scale out those database engines in your Hadoop cluster. I just think we're farther down the line of solving the problems that you encounter when you need to start moving data back and forth between those query processing nodes.
Why, because your database is more mature?
Yes.
Okay, so is that a two-sided coin, right?
You've got the maturity of the database, but at the same time, Hadapt's been around for now a year, a year and a half, so they've got the modern mojo going. Is that a trade-off, or is 11 years just an infant in the database world?
I think 11 years is not an infant.
Or maybe I should say an adolescent.
Maybe an adolescent. And I think there are certainly some benefits from starting and trying to solve architectural problems from scratch. If you actually look at some of the challenges we've run into with the Greenplum database, they tend to have to do with scalability of storage. So we do well with hundreds of nodes, but when we get to thousands of nodes, with the storage scalability of the traditional Greenplum database, we started to have challenges. And so that's really where HDFS comes in and solves a big problem for us.
So, I mean, things like user-defined queries, which you didn't get out of the box with Impala, you do now. I mean, obviously, you get that with Greenplum, it's pretty fundamental. But people really don't use NoSQL because they want to. They kind of use it because they have to. Do you see that changing with announcements like yours?
I think NoSQL solves particular problems really well. Our approach is really to try to figure out what are the business problems out there that haven't been solved that customers are asking us for. So if you look at the NoSQL movement, if you want key-value stores, if you have programmers who are building scalable web applications that need access to those elements really quickly, I think NoSQL actually is a great solution. But the target for what we've launched with Pivotal HD and HAWQ is the data worker communities. So it's business intelligence, it's analysts, it's data scientists who are kind of struggling with data access on top of it.
So do you think that the guys, you came from a BI background, right?
You were inside of Yahoo, really providing a service to the organization with a lot of traditional BI tools, right?
Yes.
Do you think that this announcement, or announcements like yours, will get the traditional BI guys sort of off the mark? I feel like they've been sitting on their hands a little bit, or maybe they've crossed their arms, saying, all right, let's see what happens with this Hadoop movement. Is that first of all a fair characterization? And secondly, do you think this will sort of entice them to dive in?
So I think the BI vendors certainly have recognized the rapid adoption of Hadoop. And if you look at all of the players, they've built adapters, they've built connections to Hive, and they're trying to get into the game. But I think they've struggled because BI by its nature is interactive, and there really isn't a nice interactive platform on top of Hadoop. I think this enables you to start bringing in the requirements of the business intelligence ecosystem, which are interactive query and really robust SQL support. It allows you to bring that into the Hadoop platform.
So one of the things right now that's obvious, we talked to Bill Schmarzo, the dean of Big Data, one of our friends in theCUBE, yesterday, and we talked to-
He was a colleague of mine at Yahoo.
Oh yeah, we're big fans of Bill. But we had an interesting discussion around the data warehousing business. And again, back to crossing the chasm, the model is changing. People are going to move from specific use-case, purpose-built queries that may work better with SQL, and that's only one dimension of the market. So you're bolting Hadoop onto data warehousing to create kind of a lower-cost data warehouse and then provide really fast response time to queries. So that's great. That addresses a great marketplace you guys are attacking.
What specific strategies do you guys have to move into the market where you want to bring new use cases in, where you have to transform data across resources? Are you bolting HD on top of HDFS across all those resources? Because that's one thing Impala has that's interesting: as data moves across the network, you can get those new use cases. So for the new questions, the new answers that aren't yet evolving, that seems to be more of a data platform approach versus straight data warehousing.
And I think the way I view it is we're expanding the number of use cases that can be done on top of Hadoop. So with Pivotal HD, all of the use cases that I'd say Hadoop excels at and shines at today, when it comes to batch processing and certain types of analyses and scaling to write MapReduce programs, you can continue to do on Pivotal HD. But what we've seen with real customer implementations is that those big Hadoop clusters are 10 to 20% utilized. We actually saw this at Yahoo: we had a huge, valuable data asset, but the ability to bring computational services on top of Hadoop wasn't there. So we've just, I think, added a new use case that you can do in this environment in a seamless way. We're not saying use Hadoop for your data warehouse and not other stuff. It's really just expanding the use cases.
Got it. So we're getting tight on time here because we're behind schedule because of the keynotes. My final question for you is, I want you to talk to the folks out there, because obviously you guys have made a big splash in typical EMC fashion. You have a lot of customers, so it's not like you're making this stuff up as you go along. A lot of engineering went into this announcement. There are a lot of people in the community who want to know what the strategy is. There are a lot of naysayers out there, kind of hitting it hard around EMC making a lot of noise and claims, et cetera.
So what do you want to say to those folks out there about EMC's approach in the community and the business strategy around this announcement?
I'd say in general, I'm not sure what all of the naysayers are saying, but our approach really is that we're trying to solve customer problems. When we go into the enterprises that we deal with, we listen to what they have to say, and we're trying to bring software into their operations that solves their problems. We think Hadoop is a critical part of that and we want to participate in the Hadoop ecosystem. You foreshadowed a little bit some of the stuff around the Pivotal Initiative. Our plan is that we will be active contributors to the Hadoop ecosystem, but we also think there are some compelling strategic differentiators that we have to offer, and we want to bring that as a full package to our customers. So I think you'll see a continued investment in Hadoop, and a continued investment in our SQL services and other differentiating services on top of that platform. And at the end of the day, we actually want to give our customers something that solves their problems.
So, just quickly to follow up on that, address the point for the folks out there to help them understand. The word proprietary has been kicked around, and obviously this is some proprietary IP that you guys have. And where that sits, I mean, we're calling it open source plus. Some are saying it's proprietary, and open source will always win over proprietary. Just quickly comment on that view, and try to give us a tease of what the roadmap might look like for you guys as you extend out the functionality.
Yeah, so I personally don't believe proprietary is a good thing or a bad thing, or open source software is a good thing or a bad thing, to the customer, right? What they want are things that solve their problems.
And so I don't want to get caught up in proprietary. Everybody in this space that is a profit-oriented vendor is, I think, trying to create some proprietary offering. It may be software, it may be services, but what they're trying to do is land-grab, provide software that solves problems, and get paid a fair price for it.
So it's a competitive marketplace. As we pointed out in our news segment yesterday, it's very competitive right now. And Pat Gelsinger said on theCUBE three years ago, Dave, when he was at EMC, that they are entering the market to win. Now he's the CEO of VMware, and obviously it's notable that Paul Maritz is involved in this new initiative. So it's not like you guys are nipping at the heels of the marketplace. You have a real team and you have big customers. So winning is a definition. He also said there will be no Red Hat of Hadoop. We're going to hear from Hortonworks in the next segment or two segments from now. We're going to hear from Hadapt, another startup. But this is a really interesting opportunity for the marketplace, SQL on Hadoop. Is it Hadoop in SQL? Is it cheap data warehousing? Is it going to evolve into enabling business intelligence? That's still to be determined. We believe that the business intelligence market is underserved right now. I think that's a good approach for Greenplum, and where you guys go from here, we'll be watching. So you guys made some bold moves, a great product announcement, a lot of sizzle, and we're going to be chewing on the steak. As they say, Dave, the sizzle and the steak with the EMC announcement. So thanks for coming on theCUBE. We appreciate it. We'll be right back with our next guest after this short break. This is EMC Greenplum inside theCUBE, talking about their Pivotal HD announcement and all the future opportunities around it. And we'll be right back with our next guest.
I looked at all the programs out there and identified a gap in tech news coverage.
There are plenty of tech shows that present new gadgets and talk about the latest in gaming. But those shows are just the tip of the iceberg, and we're here for the deep dive. There's a difference between technology consumers and those who live the business day to day, and our viewers recognize that. The market begged for our program to fill that void. We're not just spouting off headlines. Our goal is to provide you with a story, but we also want to analyze the big picture and ask the questions that no one else is asking. Our guests aren't just here to provide commentary. We work with analysts who know the industry from the inside out. The tech business isn't new, but many networks treat it as if it is and really barely scratch the surface on technology coverage. We follow the expansion of the cloud and the evolution of big data. We're covering new enterprise from startup to IPO and every move in between.
So what do you think was the source of this misinformation and so you mentioned briefly there are several other.
If that's the case, then why does the world need another software-as-a-service player?
I like to think of us as a companion to theCUBE. We're here every morning trying to extract the signal from the noise. Where theCUBE excels in event coverage, we're working to bring that experience to you consistently every morning. We use the top stories of the day to provide you with breaking announcement.