Okay, we're back here live in New York City for Big Data Week. This is SiliconANGLE.tv's exclusive coverage of Strata + Hadoop World, the big event of Big Data Week. We just wrote a blog post on SiliconANGLE.com calling this the South by Southwest for data geeks, and it's my prediction that this is going to turn into quite the geek fest. Obviously the crowd here is enormous, it's packed, it's an amazing event, and we're excited. This is SiliconANGLE.com; I'm the founder, John Furrier. I'm joined by my co-host Dave Vellante of Wikibon.org, where people go for free research and peers collaborate to solve problems. And we're here with Jack Norris, who's the vice president of marketing at MapR, a company that we've been tracking for quite some time. Jack, welcome back to the Cube.

Thank you, Dave.

I've got to hand it to you. You know, we met quite a while ago now, well over a year ago, and we were pushing at you guys, saying, well, you know, open source. And you said: look, we're solving problems for customers, we've got the right model, we think this is our strategy, we're sticking to it, and watch what happens. And like I said, I have to hand it to you: you guys really have some great traction in the market. You're doing what you said, so congratulations on that. I know you've got a lot more work to do, but...

Yeah, yeah. And actually the topic of openness is a pretty interesting one. If you look at the different options out there, all of them are combining open source with some proprietary components. Now, in the case of some distributions it's very small, like an ODBC driver with a proprietary driver. But I think it shows that any solution combining the two to make it more open is important. So what we've done is make innovations, but while we've made those innovations we've opened up and provided APIs: NFS for standard access, REST, ODBC drivers, etc. So it's a spectrum.
I mean, actually, we were at Oracle OpenWorld a few weeks ago, and you listen to Larry Ellison talk about the Oracle public cloud, and he actually makes a very strong case: it's open, you can move data, it's all Java. So it's all about the standards.

Yeah, and, yeah, but it's really all about the business value. That's what the bottom line is.

So we had your CEO, John Schroeder, on yesterday. John and I were both very impressed with essentially what he described as your philosophy: we have customers when we announce a product. That's impressive.

And he was also giving some good feedback to startup entrepreneurs out there; obviously there's a lot of action going on with the startup community. And he basically said the same thing: get customers. And that's it, that's all. Use your tech, but don't be so locked into the tech; get the customers, understand the needs, and then deliver on that. So you guys have done great. And I want to talk about the show here, because you guys have a big booth and a big presence here at the show. What are you guys learning? How's the positioning? How's the new M7 news hitting? Give us the quick update.

So, a lot of news. It first started on Tuesday, when we announced the M7 edition. And yeah, I brought a demo here for you all, because the big thing about M7 is what we don't have. We're not demoing region servers. We're not demoing compactions. We're not demoing a lot of manual administrative tasks. So what that really means is that we took the stack, and if you look at HBase, HBase today has about half of new users adopting HBase.
So there's a lot of momentum in the market, and it's used for everything from real-time analytics to kind of lightweight OLTP processing. But it's an infrastructure that sits on top of a JVM, that stores its data in the Hadoop Distributed File System, that sits on a JVM, that stores its data in a Linux file system, that writes to disk. And so a lot of the complexity is in that stack. As an administrator you have to worry about how data gets written across that, and you've got region servers to keep up. When you're doing writes you have things called compactions, which increase response time. So it's a complex environment, and we've spent quite a bit of time collapsing that infrastructure. With the M7 edition you've got files and tables together in the same layer, writing directly to disk. So there are no region servers, there are no compactions to deal with, there's no pre-splitting of tables and trying to do manual merges. It just makes it much, much simpler.

Let's talk about some of your customers in terms of the profile of these guys. I'm assuming, and correct me if I'm wrong, that you're not selling to the tire kickers; you're selling to the guys who actually have some experience with Hadoop and have run into some of the limitations, and you come in and say, hey, we can solve some of those problems. Is that right? Can you talk about that?
That's a fair characterization. I think part of it is, when you're in the evaluation process and you first hear about Hadoop, it's kind of like the Gartner hype curve, right? This stuff does everything. Of course you've got data protection, because you've got things replicated across the cluster, and of course you've got scalability, because you can just add nodes, and so forth. Well, once you start using it, you realize that, yes, I've got data replicated across the cluster, but if I accidentally delete something, or if I've got some corruption, that's replicated across the cluster too. So things like snapshots are really important, so you can return to what it was five minutes before. Performance, where you can get the most out of your hardware. Ease of administration, where I can cut this up into logical volumes and have policies at that whole level instead of on an individual file. So there's a bunch of features that really resonate with users after they've had some experience, and those tend to be our key customers. There's another phase too: when you're testing Hadoop you're looking at what's possible with this platform, what type of analytics can I do? When you go into production, all of a sudden you're looking at: how does this fit in with my SLAs? How does this fit in with my data protection policies? How do I integrate with my different data sources, and can I leverage existing code? We had one customer, a large systems integrator for the federal government. They have a million lines of code that they were told to rewrite to run with other distributions, and that they could use just out of the box with MapR.

So let's talk about some of those customers. Can you name some names?

Sure, sure. So,
Actually, I'll talk about the... we had a keynote today, and we had this beautiful customer video that we had to cut because of time. So it's running in our booth, and it's streaming on our website.

I think we've actually got some of it in the bumper here; we kind of inserted it.

But I want to give a shout-out to those, because they ended up on the cutting room floor.

That's good. We've actually been running it here.

Yeah. So one was Rubicon Project, and they're an interesting company. They're a real-time advertising platform and auction network. They recently passed Google in terms of number one ad reach, as measured by comScore, and there was a lot of press on that. I particularly liked the headline that mentioned those three companies, because it was measured by comScore, and Google's a key partner, and yesterday we announced a world record for the Hadoop TeraSort running on Google. So M7 for Rubicon allows them to address and replace different point solutions that were running alongside Hadoop, and it potentially simplifies their architecture, because now they have more things done with a single platform. It increases performance and simplifies administration. Another customer is Ancestry.com. Maybe you've seen their ads or heard some of their radio spots. They do a tremendous amount of data processing to help with family services and genealogy and figure out family backgrounds. One of the things they do is DNA testing. So for an internet service to do that advanced technology is pretty impressive. It's 99 dollars, I believe, and they'll send you a DNA kit; you spit in the tube, you send it back, and then they process that and match and give you insights into your family background.
So for them, simplifying HBase meant additional performance, so they could do matches faster, and really simplified administration. So, in Malina Graham's words, it's simpler because they're just not there, those components.

Jack, I want to ask you about enterprise-grade Hadoop, and then Ted Dunning, because he was mentioned by Tim Estes in his keynote speech. So you have some rock stars in the company and on the management team. We had your CEO on, and we've interviewed M.C. Srivas; at Google I/O we were on a panel together. So I happen to know your team: solid team. So we'll talk about Ted in a minute, but I want to ask you about the enterprise-grade Hadoop conversation. What does that mean now? I mean, obviously you guys were very successful at first. Again, we were skeptics at first, but now your traction and your performance have proven this is a market for that kind of platform. What does that mean now, at this event today, as this is evolving, as the Hadoop ecosystem is not just Hadoop anymore? It's other things.

Yeah, there are three dimensions to enterprise grade. The first is ease of use, and ease of use from an administrator standpoint: how easily does it integrate into an existing environment? How easily does it fit into my IT policies? Do you run in a lights-out data center, and does the Hadoop distribution fit into that? So that's one whole dimension. A key to that is complete NFS support, so it functions like standard storage. A second dimension is dependability, reliability. So it's not just, do you have a checkbox HA feature? It's: do you have automated stateful failover? Do you have self-healing? Can you handle multiple
failures and automated recovery? So in a lights-out data center, can you actually go there once a week and then just replace drives? And a great example of that is one of our customers had a test cluster with MapR. It was a POC, and they went on to do other things. They had a power failure. They came back a week later and the cluster was up and running, and they hadn't done any manual tasks there. And they were just blown away, because the recovery process for the other distributions was long.

So I've got to ask you: the third one. What's the third one?

The third one is performance.

Okay.

And performance is kind of raw speed. It's also how you leverage the infrastructure. Can you take advantage of the network infrastructure, multiple NICs? Can you take advantage of heterogeneous hardware? Can you mix and match for different workloads? And it's really about sharing a cluster for different use cases and different users, and there are a lot of features there. It's not just raw speed. So: ease of use, fitting into the existing IT infrastructure and policies; the whole what-happens-when-something-goes-wrong, can you automate that; and then performance. Easy, dependable, fast. And it's the same thing, yeah: making HBase easy, dependable, fast.

So the talk of the show right now, you had the keynote this morning, is that MapR marketing has dropped the big data term and is going with Datacosm. Is that true? So Joe Hellerstein just had a tweet. Joe is a Cal Berkeley computer science professor, now CEO of a startup. What's the name of the startup? Trifacta. He's had a couple of epic tweets this week, so shout-out to Joe Hellerstein. But Joe Hellerstein's tweet just says MapR marketing has decided to drop the term big data and go with Datacosm, with a shout-out to George Gilder. So it's kind of a little intellectual humor. So what is that? What's your response? Is it true? What's happening?
What is that? You're the VP of marketing.

Yeah. Well, if you look at the big data term, I think there's a lot of big data washing going on, where architectures that have been out there for 30 years are suddenly all about big data. So I think there's a need for a more descriptive term. The purpose of Datacosm was not to try to coin something or to change the big data label. It was just to get people to take a step back and think, and to realize that we are in a massive paradigm shift. And with a shout-out to George Gilder, acknowledging that he recognized what the impact of making compute available meant, and he recognized with Telecosm what bandwidth would mean. If you look at the combination, we've got all this compute efficiency and bandwidth; now Datacosm is basically taking those resources and unleashing them and changing the way we do things. And I think one of the ways to look at that is the new things that will be possible. There's been a lot of focus on SQL interfaces on top of Hadoop, which are important. But I think some of the more interesting use cases are taking this machine-generated data that's being produced very, very rapidly and having automated operational analytics that can respond very quickly to change how you do business: how you're communicating with customers, how you're responding to different risk factors in the environment, for fraud, etc., or just improving your response time to kind of cost events.
Someone earlier called it actionable insight: any sort of sign of intent, and being able to respond.

Well, it's interesting that you talk about George Gilder, because we like to kind of riff and get into abstract concepts. But he also was very big in supply-side economics. And so if you look at the business value conversation, one of the things we pointed out yesterday and in this morning's opening review was that the top conversations are insight and analytics as a killer app right now. The app market has not developed, and that's why we like companies like Continuuity and what you guys are doing. Under the hood is being worked on right now at many levels, performance and those three things. But analytics is a no-brainer, insight; and the other one's business value. So when you look at that kind of Datacosm, I can see where you're going with that, and that's kind of what people want. And it's not so much like "I'm Republican because he's Republican"; George Gilder bought The American Spectator, everyone knows that, so obviously he's a Republican. But politics aside, the business side of what big data is implementing is massive. I guess that's a Republican concept, but not really; I mean, business is all parties. So, relative to Datacosm: no one talks about e-business anymore. We were talking to IBM at the IBM conference, and they were saying, hey, that was a great marketing campaign, but no one says, "Hey, are you in e-business?" today. So we think that big data is going to have the same effect, which is, "Hey, do you have big data?" No, it's just assumed. So that's what you're basically trying to establish: that it's not just about big data.
Yeah. Let me give you one small example from a business value standpoint. Ted Dunning, you mentioned Ted earlier, is our chief application architect and one of the co-authors of the book Mahout in Action, which deals with machine learning. He worked with one of our large financial services companies. One of the techniques on Hadoop is clustering, k-nearest neighbors, different algorithms, and they looked at a particular process and they sped up that process by 30,000 times. So there's a blog post on our website where you can find additional information on that. But I think, to your point about business value and what Datacosm really means: that's an incredible speedup in terms of performance, and it changes how companies can react in real time. It changes how they can do pattern recognition. And Google did a really interesting paper called "The Unreasonable Effectiveness of Data," and in there they say simple algorithms on big data, on massive amounts of data, beat a complex model every time. And so I think what we'll see is a movement away from data sampling and trying to do an 80/20, to looking at all your data and identifying where the exceptions are that we want to increase, because they're revenue exceptions, or that we want to address, because it's a cost or a fraud issue.

That's why I would give a shout-out to the guys at Digital Reasoning. Tim Estes, for one, plugged Ted; he idolized him in terms of his work. Obviously his work is awesome. But he also brought up this concept of the understanding gap, and he showed an interesting chart in his keynote, which was the data explosion: it's up, you know, straight up, right?
It's a massive amount of data, 64% unstructured by his calculation. Then he showed a flat line called attention. So as data has been exploding over time, going up, attention, meaning user attention, is flat, with some uptick maybe. So users and humans can't expand their minds fast enough, and machine learning technologies have to bridge that gap. That's analytics; that's insight. And there's a big conversation now going on about more data or better models. People are trying to squint through some of the comments that Google made and say, all right, does that mean we just throw out the models? Data trumps algorithms, data trumps algorithms. But the question I have is: do you think, and are your customers talking about, okay, well, now that they have more data, they can actually develop better algorithms that are simpler? Is it a virtuous cycle?

Yeah, I think... I mean, there's a lot of debate here, a lot of information. But I think one of the interesting things is, given that compute efficiency that we have, and given the bandwidth, you can take a model and then iterate very quickly on it and arrive at insight. In the past, with that amount of data and that amount of time to process, it could take you 40 days to get to the point you can now reach in hours, right?

Right. So, I mean, great example: fraud detection.

Yep.

Right?
So we used to sample, and six months later: hey, your credit card might have been hacked. And now you get a phone call, or you can't use your credit card, or whatever it is. But there are still a lot of use cases, weather is an example, where modeling and better modeling would be very helpful.

Excellent. So, Datacosm: are you planning other marketing initiatives around that, or is this sort of tongue-in-cheek fun, throw it out there, a little red meat, chum in the waters?

You know, the Cube's here, talking for the whole day. What could we possibly do to help give them a topic of conversation?

Okay, Datacosm. Now, of course, we found that on our proprietary HBase. Jack Norris, thanks for coming in. We appreciate your support; you guys have been great. We've been following you and will continue to follow you. You've been a great supporter of the Cube, and I want to thank you personally while we're here. MapR has been a generous underwriter and supporter of our great independent editorial, and we want to recognize you guys. Thanks for your support, and we look forward to watching you guys grow and kick ass. So thanks for all your support, and we'll be right back with our next guest after this short break.

Thank you.

Ten years ago, the video news business believed the internet was a fad. The science has settled: we all know the internet is here to stay. Bubbles and busts come and go, but the industry deserves a news team that goes the distance. Coming up on Social Angle are some interesting new metrics for measuring the worth of a customer on the web. Every morning we're on the air to bring you the most up-to-date information on the tech industry, with scrutiny on releases of the day and news of industry-wide trends. We're here daily with breaking analysis from the best minds in the business. Join me, Kristin Feledy, daily at the news desk on SiliconANGLE TV, your reference point for tech innovation.

18 months ago