Hi everybody, we're back. This is Dave Vellante from wikibon.org. We're live at Strata in Santa Clara, California. This is SiliconANGLE TV's continuous coverage of the Strata conference, O'Reilly Media. O'Reilly Media's a great partner of ours and thanks to them for allowing us to be here. We've been going all week, this is day three for us. I'm here with Jeff Kelly, Wikibon's lead big data analyst, and we're here with Jack Norris, who's the VP of marketing at MapR. Jack, welcome to theCUBE. Thank you, Dave. Thanks very much for coming on and we've been going all week. You guys are a great sponsor of ours. Thank you for the support, we really appreciate it. How's the show going for you? The show's been great. A lot of attention, a lot of focus, a lot of discussion about Hadoop and big data. Yeah, so you guys getting a lot of traffic? I mean, I hear there's 2,500 people here up from 1,400 last year, so that's good. Yeah, we've had like five, six people deep in the booth, so I think there's a lot of interest. So it's interesting, when we were here last year, when you looked at the infrastructure and the competitive landscape, there wasn't a lot going on and in just a very short time, that's completely changed. And you guys have, you know, had your hand in that. So that's good, competition is a good thing. And obviously customers want choice, but so we want to talk about that a little bit. We want to talk about MapR, the kind of problems you're solving. So why don't we start there. What is MapR all about? You've got your own distribution of enterprise Hadoop, you're making Hadoop enterprise ready. Let's start there. Okay. Yeah, I mean, we invested heavily in creating an alternative distribution, one that took the best of the open source community with the best of the MapR innovations. And really it's about making Hadoop more applicable, broader use cases, more mission critical support, being able to sit in and work in a lights out data center environment.
Okay, so what was the problem that you set out to solve? Why do we need another distribution of Hadoop? Let me ask it that way. Get nice and close so the folks can hear you. So there are some just big issues. Hadoop's not perfect? With Hadoop. What are those issues? Let's talk about those. There's some ease of use issues. There's some dependability issues. There's some performance. So let's take those in order. Right now, if you look at some of the distributions, Apache Hadoop, great technology, but it requires a programmer to get access to the data through the Hadoop API. You can't really see the data. So there's a lot of focus of, what do I do once the data's in there? Opening that up, providing a full file-based access. So I can look at it and treat it like enterprise storage, see the data, use my standard tools, standard commands, drag and drop from a file browser. You can do that with MapR. You can't do that with other distributions. So first you talk about mounting HDFS as NFS. Correct. As an example. Correct. And then just the underlying storage services, the fact that it's append-only instead of full random read-write, causes some issues. So that's some of the ease of use features. There's a whole lot we can discuss there. Big picture for reliability dependability is there's a single point of failure, multiple single points of failure within Hadoop, so you risk data loss. So people have looked at Hadoop traditionally as this batch-oriented scratch pad. We were out to solve that. We want to make sure that you can use it for mission-critical data, that you don't have a risk of a data loss, that you've got full high availability, you've got the full data protection in terms of snapshots and mirroring that you would expect with enterprise products. So take us back to when you guys were thinking about doing this, I'm not even sure you were at the company at the time, but the DNA was there and you're familiar with it. 
So you guys saw this big data movement, you saw this Hadoop movement and said, okay, this is cool. It's going to be big. And it's going to take a long time for the community to fix all these problems. We can fix them now, let's go do that. Is that the general discussion? Yeah, I think what's different about this is it's the first open source package, the first open source project that's created a market. If you look at the other open source, Linux, MySQL, et cetera, it was really late in the life cycle of a product. Everyone knew what the features were. It was about giving an alternative choice. The better Unix. Here the focus is on innovation. And our founders have deep enterprise background. Our CTO was at Google in charge of BigTable, understands MapReduce at scale, spent time as chief software architect at Spinnaker, which was kind of the fastest clustered NAS on the planet. So you recognize that the underlying layers of Hadoop needed some rearchitecture and needed some deep investment and to do that effectively and do that quickly required a whole lot of focus and we thought that was the best way to go to market. And talk about the early validation from customers. Obviously, you guys didn't just do this in a vacuum, I presume, so you went out and talked to some customers. We had hundreds of conversations with customers while we were in stealth mode. We were probably the loudest stealth mode company. And the heads were nodding, and I mean, what were they telling you at the time? Yeah, please go do this. What we addressed weren't secrets. There have been JIRAs open for four or five years on these issues. Yeah, but at the same time, Jack, you've got this purist community out there that says, I don't want to rip out HDFS. I want it to be pure. What'd you say to those guys? You just say, okay, thank you. We understand you're not a prospect and just move on. I think that Hadoop has a huge amount of momentum.
And I think a lot of that momentum is that there isn't any risk to adopting Hadoop, right? It's not like the fractured NoSQL market where there's 122 different entrants, which one's going to win? Hadoop's got the ecosystem. So when you say pure, it's about the APIs. It's about making sure that if I create a MapReduce job, it's going to run on Apache. It's going to run on MapR. It's going to run on the other distributions. That's where I think the heat and the focus is. Now, to do that, you also have to have innovation occurring up and down the stack that provides choice and alternatives for customers. So when I'm talking about purists, I agree with you, the whole lock-in thing, which is the elephant in the room here. People worry about lock-in. Is that a pun intended, the elephant in the room? No, but good one, good catch. But you're basically saying, hey, we're no more locked in than Cloudera, right? I mean, they've got their own version of- Actually, I think we're less because it's so easy to get data in and out with our NFS that it's probably less locked in. So, and I want to come back to that, but for instance, when I say purists, I mean, some users and ISVs, some guys we've had on here, we had Abhi Mehta from Tresata on the other day, and for instance, he's going to say, I just don't have time to mess with that stuff and figure out all that API integration. I mean, there are people out there that just don't want to go that route, okay? But you're saying, I'm inferring there's plenty who do. Talk about that one. And by the API route, I want to make sure I understand what you're saying. Yeah, so you talked about, hey, it's all about the API integration. It's not about- It's about the APIs being consistent, 100% compatible, right? So, if I write a program that's going after HDFS and the HDFS API, I want to make sure that'll run on other distributions. Okay, so that's our- And that's your promise? Yeah.
Okay, all right, so now where I was going with this was that again, there are some purists who say, I just don't want to mess with all that. Now, let's talk about what that means to mess with all that. So comScore was a big high-profile case study for you guys. Yes, yep. They were a Cloudera customer. They basically, in my understanding, in a couple of days migrated from Cloudera to MapR. And the impetus was, well, let's talk about that. Why'd they do that? Performance, data protection, ease of use. License issues, there were some license issues there as well, right? Your maintenance pricing was more attractive? Is that true? I think it was mainly about price performance and reliability. And they tested our stuff. It worked real well in a test environment. They put it in a production environment. Didn't actually tell all their users. They had one of the guys debug the software for a half a day, because he thought something was wrong, it finished so quickly. So it took them a couple of days to migrate, and then boom, they were done. Boom, and they handle about 30 billion objects a day. So the use of that, really high performance, support for streaming data flows, they're talking about, they're doing forecasts and insights into web behavior, and the earlier they can do that, the better off they are, so. Go ahead, Jeff. Well, so talk about the implications of your approach in terms of the customer base. So I'm imagining that your customers are more, perhaps, advanced than a lot of your typical Hadoop users who are just getting started, tinkering with Hadoop. Is it fair to say, your customers know what they want, and they want performance, and they want it now? And they're a little more advanced than perhaps some of the typical early adopters? We've got people who go to our website and download the free version, and some of them are just starting off and getting used to Hadoop.
But we did specifically target those very experienced Hadoop users that were kind of stubbing their toes on the issues, and so they're very receptive to the message of we've made it faster, we've made it more reliable, we've added a lot of ease of use to Hadoop. So I found this, let me interrupt, go back to what I was saying before, is I found this comment that I found online from Mike Brown, comScore's CTO, I presume you might. He said, MapR's Direct Access NFS feature, which exposes Hadoop Distributed File System data as NFS files that can then be easily mounted, modified, or overwritten, so that's data access. Yeah. You know, simplifies Hadoop. He also said, we could capitalize on the purchase of MapR with an annual maintenance charge versus a yearly cost per node. NFS allowed our enterprise systems to easily access the data in the cluster. So does that make sense to you, that annual maintenance charge versus yearly cost per node? I didn't get that. Oh, I think you're talking about, yeah, if MapR doesn't understand. Some organizations prefer to do a perpetual license versus a subscription model. That's basically what he's referring to. Oh, oh, okay. The traditional way of licensing software. And that basically reinforces the fact that we've really invested and have kind of a product orientation rather than just services on top of some open source. Yeah, okay, so you go in, you license it, and then perpetual license. And then you can also start with the free edition. That does all the performance, NFS support. Kick the tires before you buy. Sorry, Jeff, sorry to interrupt. No problem at all. So another topic of a lot of interest is security. Making Hadoop enterprise ready, one of the pillars there is security. Making sure access controls, for instance, making sure. Talk about how you guys approach that and maybe how you differentiate from some of the other vendors out there or the other distributions.
So we've got full Kerberos support. We link into enterprise standards for access, LDAP, et cetera. We leverage the Linux PAM security. And we also provide volume control. So right now in Hadoop, in Apache Hadoop, other distributions, you put policies at the file level or the entire cluster. And we see many organizations having separate physical clusters because of that limitation. And we provide volumes, so you can define a volume and, in that volume, control access, administrative privileges, data protection class, and in a sense, kind of segregate that content. And that provides a lot of control and a lot more security and protection and separation of data. Jack, is that scenario, the comScore scenario, common, where somebody's moving off an existing distribution onto MapR, or are you more seeing demand from new customers that are saying, hey, what's this big data thing? I really want to get into it. How's it shake out there? Right now, there's just huge pent up demand for these features. And we're seeing a lot of people that have run on other distributions switch to MapR. A little bit of everything. How about, can you talk a little bit about your channel, your go-to-market strategy, maybe even some of your ecosystem and partnerships in the little time that we have there. So EMC is a big partner. The EMC Greenplum MR edition is basically MapR. You can start with any of our editions and upgrade to that Greenplum with just a license key. That gives us worldwide service and support. It's been a great partnership. Going well. We hear a lot of proof of concepts out there for that. Yeah, and then it just hit the news today about EMC's distribution, our distribution being available with UCS, Cisco's UCS gear. So now that's further expanded the footprint that we have for MapR. Okay, so you have the EMC relationship, anything else that you can share with us? We have other announcements coming out. Nothing you want to pre-announce on theCUBE.
Oops, did I let that slip? And you're right, it's live, so be careful. And so in terms of your channel strategy, you guys mostly selling a direct-indirect combination? It's kind of an indirect model through these large partners with a direct assist. Yeah, okay, so you guys come in and help evangelize. Yep. Excellent. All right, Jeff, anything else? Have we got to roll here? Sure, just wondering if you could talk a little bit about you mentioned the EMC Greenplum. So there's a lot of talk about the data warehouse market, the MPP data warehouses versus Hadoop. Based on that relationship, I'm assuming MapR thinks, well, they're certainly complementary. Can you just touch on that? And as opposed to some who think, well, Hadoop's going to be the platform where we can put all the data? Well, there's just, I mean, if you look at the typical organization, they're just really trying to get their, excuse me, their arms around a lot of this machine-generated content, this unstructured data that's just growing like wildfire. So there's a lot of Hadoop-specific use cases that are being rolled out. They're also kind of data lakes, data oceans, whatever you want to call it, large pools where that information is then being extracted and loaded into data warehouses for further analysis. And I think the big pivot there is if it's well understood what the issue is, you define the schema, then there's a whole host of data warehouse applications out there that can be deployed. But there are many things where you don't really understand that yet, and having it in Hadoop where you don't need to define the schema is a big value. All right, Jack, I'm sorry we have to go. We're running a couple of minutes behind. Thank you very much for coming on theCUBE. Great story, good luck with everything. And it sounds like things are really going well and market's heating up and you're in the right place at the right time. So thank you again. Thank you to Jeff.
And we'll be right back, everybody, to the Strata conference live in Santa Clara, California right after this word from our sponsors.
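
Jack's closing point, that data can sit in Hadoop without a schema being defined up front and only gets a schema once the question is understood, can be sketched roughly as follows. This is a minimal illustration in plain Python, not MapR or Hadoop code: the sample events, the field names, and the helper `read_with_schema` are all made up for the example.

```python
import json

# Made-up, machine-generated events stored exactly as they arrived.
# Records do not share a fixed set of fields, and one line is not even JSON.
RAW_EVENTS = [
    '{"ts": "10:00:00", "url": "/home", "ms": 120}',
    '{"ts": "10:00:01", "url": "/cart", "user": "u42"}',
    'not json at all',
]


def read_with_schema(lines, fields):
    """Hypothetical helper: apply a schema at read time.

    Each parseable record is projected onto `fields`; missing fields
    become None, and lines that do not parse are skipped.
    """
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # raw data stays in storage; it is just ignored by this read
        if not isinstance(record, dict):
            continue
        yield {name: record.get(name) for name in fields}


# Two different questions, two different schemas, same raw data.
page_views = list(read_with_schema(RAW_EVENTS, ["ts", "url"]))
latencies = [
    row["ms"]
    for row in read_with_schema(RAW_EVENTS, ["ms"])
    if row["ms"] is not None
]

print(page_views)
print(latencies)
```

The contrast with the data warehouse path he describes is that there the schema is fixed before loading, so a record like the third line would have to be rejected or cleaned at load time rather than at query time.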