Live from Boston, Massachusetts, it's theCUBE at the HP Vertica Big Data Conference 2014. Brought to you by HP with your hosts, John Furrier and Dave Vellante. Okay, welcome back everyone, here live in Boston, Massachusetts at the HP Vertica Big Data Conference. I'm John Furrier with Dave Vellante. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest is Colin Mahony, GM of HP Vertica. Welcome back to theCUBE. Thanks, John. Great to be here with you guys. Great keynote you guys had up there today. Great conference, packed house. Last year was the inaugural event. This year it's still packed, with great names. Give us your take on what the event's all about here. What's the vibe? So the vibe is great. The numbers are up again. This is our second show, as you mentioned. And for us, it's probably a slightly different show than what most people are accustomed to. We really want it to be about the people, specifically our customers and our partners, and about sharing information with one another about data. As you guys know, this industry's evolving very rapidly. It is a fairly complicated industry. There are all sorts of challenges that people are trying to solve. And so, in working out the agenda of the conference, we've really allotted a lot of time for folks to network and talk to one another. And you also don't hear a lot from a marketing perspective from us at all. This is really about our technical engineers and practitioners sharing their knowledge and how things work, and learning from our customers and partners. So it's intimate. It's grown a lot since last year, but it's still intimate. And we're back here in Boston. It's great timing to be here. It's great. We've got a lot of customers on theCUBE, and that seems to be the big theme.
And Jeff Kelley was just pointing out that he predicted three years ago that the big growth, the billions that he's sized up, was going to come not so much from the vendors but from the output of the ecosystem: the practitioners, the deployment of big data, the use of big data. Funny story: last night I was walking back from Legal Sea Foods here in Boston. Beautiful night. And I saw a guy from Raleigh, North Carolina, a customer. He was here for the event. We just bumped into him randomly. And he said, I'm here to learn more about the columnar store. He's coming from an ETL background. And I asked him, so you're shifting? He said, absolutely, we're shifting over. We see Hadoop. We see these new technologies as transformative. And I'm here to learn. Is that a consistent thing that you're hearing, that people are here to learn? And what percentage do you see, in terms of Vertica's success, of people here to learn versus "I'm driving the Ferrari of columnar stores, high-end performance"? What's the percentage of your base that's in production versus here to learn? There's a high number of production customers here. But I'd say even production customers still want to learn. Just because you're in production doesn't mean you're using all the features and capabilities. Our platform changes. We just came out with another major release. And so I think there's constant learning that goes on even when you're in production. Some people will start out with, let's speed up our traditional enterprise data warehouse, or let's speed up our traditional reporting. But then they'll start getting into A/B testing, or they'll start to get into deep data science work. And that's part of the power of a platform, especially the broader HAVEn platform. Whether it's Vertica or IDOL or HAVEn, you can start adding things on to what you already do. And so I think there's a good mix.
So you mentioned in your keynote that if you go back to, it must have been what, 2007, 2008, Vertica was essentially a feature company. Column store. And that's not necessarily a bad thing. 3PAR was a feature company. Look what happened to 3PAR. But then you laid out a picture of the platform today. And yet, per John's story, people are still buying that feature. But you've got some place to take them. So I wonder if you'd add some color to that roadmap that you laid out. Yeah, it's a great point, Dave. I still see certain pictures that prospects will draw, where they'll lay out a landscape and then have a column store off in the corner. The reality is that a lot of the workloads that people are running, a lot of the machine learning, a lot of the clickstream analytics, a lot of the things that go beyond just the type of product that's underneath, everyone has that need. And so we've certainly broadened. The columnar store was our flagship. It's what we were known for in the beginning, but we were operating, and still do, under the premise that one size does not fit all. And we knew that if we created the right foundation, the right architecture, and built it from the ground up, we could then add on the right capabilities, as opposed to, I think, a lot of competitors, which are taking decades-old technology that was never designed for the workloads that we have today. They have a lot of functionality, but they can't figure out the scale and they can't figure out the performance, at least not economically. We figured that out first, and now through our partners, through our community ecosystem, and through the work we're doing ourselves, more broadly as HP, we're adding a lot more capabilities to it. So the recent Wikibon big data survey, you probably saw some glimpses of it. One of the questions that struck me the most is: have you begun to take resources away from your traditional EDW toward Hadoop?
And a huge percentage, like 90-plus percent of the people, said yes, we either have or will by the end of the year. And you guys are in the middle of that transition. You're not traditional EDW and you're not pure-play Hadoop. So what do you have to do to preserve and expand your total available market? I wonder if you could talk about that a little bit. So there are a lot of tech terms floating around that I think confuse the whole issue. The early days of Hadoop were all about MapReduce. MapReduce is pretty much dead. It's gone. They're now focused on YARN, they're focused on workload management, and they're focused on databases. Focused on SQL. They're still running a lot of other types of processing on Hadoop. But Hadoop itself has evolved. I think Hadoop itself has expanded the same way that Vertica has expanded. And there is absolutely a convergence, which is why it's always been part of our HAVEn platform. We see it at a lot of customers. But what we try to do, Dave, is just focus on what the customer is trying to do. Is it a real-time interactive workload where you're going to want sub-second response times, or is it batch processing? Is it some simple SQL where you're just going to run a simple query, or are you going to do some complex work? Is it machine learning? If we understand that, then we guide our customers down the right path. And again, one size does not fit all. I think the reality of the market today is you need a little bit of this, a little bit of that, throw in some of that, and that's the solution. Let's talk about that reality about MapReduce. You mentioned MapReduce, kind of saying it's dead, but when I hear that something's dead, I always go, whoa, something's dead. There's a blogger in the room, because something's always dead when you're writing a blog post. MapReduce is evolving; it's either dead or being abstracted away. Are you saying the industry's expanding its focus or moving away from MapReduce?
Well, I think part of what happened with MapReduce is that you need to be a decent programmer to do it. And what we're hearing about at all these shows is that there's such a shortage of people who know how to interact with data. And while programming is great, not many of us are able to do the programming. SQL, while it's not perfect, is at least a more common construct where you can use a declarative language to get at the data. I think that's certainly why you've seen the Hadoop community shift towards Hive, shift towards some of these other SQL-based solutions. So maybe "MapReduce is dead" is an overstatement, but if you look at those Spark forums, it's a shift. It's not the focal point anymore, that's for sure. And now you hear about Spark and you hear about all these other things in and around the Hadoop world. And if you strip it back, it actually looks very similar to the platform that we've been building here as well. So I think there's a convergence. I think there's an opportunity for everyone, and you hit it on the head. I think the real thing that everyone's frustrated with is the traditional enterprise data warehouse. They're sick of paying ridiculous amounts of money, trying to lock everything down into one monolithic platform that takes months and months to load and process data. People want that instant access, controlled access, but instant access, to be able to monetize. Yeah, the traditional EDW is not working anymore. It's not fulfilling the vision. So should we think of the core Hadoop infrastructure, the Apache Hadoop, the MapReduce underneath, as sort of the substrate, with the value getting built on top in analytics? You call that analytics 3.0. You said log everything, analyze everything, infinite sample size, like minimal samples are dead, essentially. There's another blog post for you, John. Segmentation of one, and then you said embed data smarts. So, a two-part question.
One, is that where all the value is, and two, what do you mean by embed smarts in the data? Yeah, so on embedding smarts in the data: look at a lot of the software that's being designed now, and not just software, but automobiles, airplanes, you name it, even servers. IT has actually been doing this for the longest time, but now we hear about the internet of things. What's happening now is that products and services are being designed not only as the product and service that's being delivered, but in a way that they can actually manage information about that product and service to make it better. So think about a phone-home capability. Think about getting feedback about everything that's going on in a car, so that when you go in to service the automobile, you've got all that sensor data there. Well, cars are being designed now knowing that engineers should put sensors everywhere and collect that information and send it back. That's what I mean by the embedded side. And to answer the other part of your question, I absolutely think the value is all in the data above it. There's a substrate, there are platforms, there's a lot of value there, but what folks ultimately want is the tangible patterns and value that can be extracted from that and that they can use. The other stat I liked in your keynote: you said that for every two orders of magnitude increase in the amount of data, there's essentially a four orders of magnitude increase in the number of patterns that you can analyze, assuming you've got the brain power to analyze it. Your reference to the Wall Street Journal article this week in the Weekend Journal was tweeted out. What is HP Vertica doing to close that gap? Well, a lot of what we focus on, and we do a lot of this work with HP Labs as well, is how do you not require a PhD to go in, look at the information, and actually derive the value out of it.
And I think there's a reason why machine learning is all the rage right now. We need to augment, and to a certain extent automate, some of what we do through the processing power of the computers to be able to deliver those insights. Now, it will never replace what humans can do from a human intelligence standpoint, but unfortunately we as managers, we as people, as practitioners, are getting inundated with information. So even if we can just help focus on a certain area of information and alleviate some of that work, then it makes everyone a lot more productive. Okay, we have a question from the crowd chat, which, by the way, go to crowdchat.net slash HP Big Data 2014. We have a live crowd chat. This is our new innovation called the engagement container. We're containing all the conversation, recording it all, and voting on it. But the question for Colin: what's the reasoning behind saying MapReduce is dead? Sounds like a fluff CEO statement. Well, he used to be the CEO of Vertica, and now he's the GM, which is like a CEO at HP. So that's the first thing he says. And he says, especially for non-RDBMS data, I've always thought it insane that for decades IT always attempted to shove everything into an RDBMS. Yeah, so two things: the fluff statement, and jamming everything into a relational database. Yeah, this is a good statement. So while I made the statement about MapReduce, I absolutely believe that not all data should be in an RDBMS, right? There's a distinction. There's a lot of value to Hadoop. There's a lot of value to putting the information in there. There's value to the processing that you can do there, but the way information is being processed on Hadoop today is dramatically different from the way it was being processed on Hadoop three, four, five years ago. So it's not an insult in any way to Hadoop. We absolutely embrace it. But as for the core foundation of what Hadoop started as, Google moved away from MapReduce five years ago.
So they're long since gone from it, and they're the ones that started it. So it's not a fluffy statement. If you do some digging, you'll find it out there as well. But I also don't believe that you have to force everything into an RDBMS, because it's just impractical. People don't want to structure information before they derive value from it, right? Nobody wants to be forced to do that. And so there's a lot of value to these alternative platforms. Final question. I know we've got a break and we're getting the hook here, but I want to ask you a true-or-false statement. Vertica is the Ferrari of big databases because of its performance. Is that a myth? Is that positioning? I mean, we're seeing a lot of the high-end folks who have huge needs. Obviously, the Facebooks of the world, the high-end customers, are all endorsing your products. So that's a good testament to the product. But is it the Ferrari? And do people know how to drive this thing? Is that true, or do you see it differently? Yeah, no, I definitely see it. I don't want to be accused of being the fluffy GM making the statement. But it's a Ferrari. It's very, very fast, but it's more than speed. I think the one message that I convey to people all the time is, yes, Vertica's fast. Yes, we have speeds and feeds and all that other stuff. But so much of our value is about the functionality too. It's about the analytics that you can do on the information combined with that speed. That's what gets us to where most of our customers really want to go. Do you worry people don't know how to drive the Ferrari? Yes. Well, not only do I worry that they don't know how to drive the Ferrari, I'm not sure that people have encountered, in the same context, an automobile quite like this. And this is actually more than a Ferrari. This is a massive freight truck, train, whatever you want to call it, that happens to go and handle as fast as a Ferrari.
Just be careful you don't run over your competition. That's been a trend we've been seeing lately. That wouldn't be too bad. Okay, this is theCUBE, and we're here live in Boston. We'll be right back with our next guest after this short break. Thank you.