Live from New York City, it's theCUBE at Big Data NYC 2014. Brought to you by headline sponsor WANdisco, with support from EMC, MarkLogic, and Teradata. Now, here is your host, Dave Vellante. Welcome back to Big Data NYC, everybody. Jeff Frick and I are really pleased to have Kumar Sreekanti here, my friend, my longtime friend, Kumar, CEO of BlueData. Thanks for coming on. Thank you, Dave. So, congratulations. Thank you. You won the startup award. We were very honored and humbled to have been the winner of the Startup Showcase at Strata. We could not have asked for a better introduction in terms of the company, and I'm very happy. Yeah, that's awesome. So how did that all come down, you guys? For those who don't know the Startup Showcase, the startups get up, they give, it's kind of like speed dating. You don't really have much time to present, right? Yeah, I think they select 10, 12 finalists, and we were one of the 12 finalists, and they give you an opportunity to talk, to make a presentation, and then there is a panel of judges, and then the next morning, they give you the winners. And there were some really incredible companies there, and we were very honored to be, again, to be placed as a winner. And I think it's a testament to our teamwork, and what we are solving is a problem, and I'm sure that's one of the things that we have. Yeah, we're going to talk about that. Well, let's get into it. At our event yesterday, Jeff Kelly got up and talked about how difficult Hadoop is. Yes. You know, and your goal is to make Hadoop easy. So let's talk about the problem and how you guys are solving it. Fundamentally, as you probably know, I was a VP of R&D at VMware, and I was responsible for all of the storage, networking, and clustering products. It hit me back in 2012. I mean, big data, there's a lot of innovation, but let's be honest. It's incredibly hard, right?
I heard a story today, somebody told me, "I have a 100-node cluster, and I have 20 people managing it." So the mission that my co-founder Tom and I got on is we want to build what I would call VMware for big data. I know that everybody wants to be the VMware of something, but at least it conveys the message. So we want to build a platform that allows people to, you don't have to worry about, it's like when you go for a haircut, you're not worried about what scissors he is using, you just want a haircut. So the scissors, to me, is like virtualization. A lot of people ask me, are you virtualizing? Like, yeah, virtualization is there, but really, I want you to be able to do five mouse clicks and you get your cluster up and running, and we have demonstrated that. And in fact, I wanted to calculate how many mouse clicks it takes to do a hundred-node physical cluster, and I'm guessing it'll be some 50,000 mouse clicks, and I like to say 50,000 to five. I'd like to do that benchmark, I just don't have time. Yes. Somebody asked me, are you sure it's 50,000? I said, okay, let's pick 20,000. Yeah, pick a number, right. And how many mouse clicks with BlueData? We have five mouse clicks. So you pick the name of your cluster, and you pick, today we support Hadoop, Spark, and Hive. We plan to support other big data applications. Today we support Cloudera and Hortonworks, and we plan to support other distributions, and then you pick whether you want YARN or you want the 1.0, then you pick the number of slaves and the number of master nodes, and you are done. I've seen the demo. Yes, thank you. And I watched it and said, okay, this is, I could do this. Yes, thank you. Literally. Yes, yes. No, I mean it. Yeah, yeah, yeah. I'm happy you said that. I could spin up a Hadoop cluster. Yes. I mean, I couldn't do much with it, but I could get it up and running for somebody who knew how to do it. Right, I mean, really it's that simple.
Yes. Big data does not mean it has to be big hard, it doesn't have to be big moves, it doesn't have to be big costs, and that, I think, is our goal, and you can actually get the utilization as high as 80 to 90%. So everybody is talking about speeds and feeds, but nobody's talking about how hard it is to make this work. So our goal is to build that platform that allows the people to, in addition to that, I think that you saw this in the demo, you don't have to move the data. You can leave the data where it is. Right. You can use HDFS, you can have NFS, you can have Gluster; we support all the platforms. Well, I think that's key, because we can't move the data, the data mover is dead. Yes. Right, high-speed data mover is an oxymoron. Yes. And so, you know, and it's funny, Kumar, it's funny how few people have paid attention to this. Yes, exactly. And now that it's coming into production, it's really becoming a problem. And I think the reason is, they would maybe spend a dollar on their enterprise data warehouse, and they say, wow, I can offload that and do some new stuff, and it only costs me 30 cents. Now, yeah, it's really complicated, but it's cool, and that's fine. And then, after a while, when they start to scale, they go, wow, this doesn't scale so well. Yes. It's too complicated, I'm going to spend all this money here, and it's not just going to be a one-time hit, and it's going to be a nightmare. So, your timing is very good. Thank you. Is that luck, or is that talent, or a little bit of both? I think it's probably both. I don't think I can totally say it's luck. I do have to admit that I wrote a blog on the day we launched. I was actually flying back in 2012, I think it was from New York to San Francisco, on a late-night flight on a Friday. It hit me: big data is here, and it is going to help all the enterprises, but it is incredibly hard. So, how do I make it very easy?
So, my fundamental goal is, how do you change the consumption model of the data? So, you, as a data scientist, you sit at a desk, you should be able to pick a cluster, and you should be able to pick where the data is, and you should be able to just go. That's really what the goal of BlueData is. And I think we have demonstrated that. I think it's a beginning, a 1.0 product, but we are very confident that we can go to the next level in building this. Yeah, I mean, conceptually, you nailed it. Thank you. Now, of course, the execution. But you know something about execution. Yes, yes, thank you. Coming out of VMware. I started the vSAN project, and I was able to get it up and running, so thank you. Now, I want to step back and ask you as a technologist to get your opinion on something. We've seen the storage scale out, the compute is scaling out. The networking is still very structured and hierarchical, and I know VMware, with NSX, is trying to change that, but is the network really going to flatten? Is it going to scale out? Is it inevitable, and what are you projecting? So actually, I've referred to this a couple of times. There is a paper by Microsoft, and you know, Steve Austin, if I remember correctly, Microsoft Research. It's called Flat Data Center Network. I think one of the key findings of that, it's a couple-of-years-old paper, is that the network is not the bottleneck in a distributed data workload environment. I do think the network is still a little ways away from it, but I think all the indications point to it getting better and better and better. So trying to load all the data and moving the compute closer to the data is not necessarily a panacea. So there are certain workloads where you actually don't need to do that, and you can still get your results done. That's a different- That's a question though. Where is the bottleneck going to be? Because those are the big three, right? It's compute, storage, and network. Correct.
So disk bandwidth is probably about six gigabits, and then network bandwidth has quadrupled in the last four years; now you're talking about 40-gigabit networks. So you now have a very interesting dynamic that is coming. It's not just the networking, it's the software of the networking too, which is the word I think you pointed out, the SDN part of it. One of the inventions in BlueData is we actually try to minimize the thrashing of the virtual network when you run the big data workloads. So we are making sure that the virtual networking path is optimized as much as possible so that you can get the best out of the compute workloads. I see. Now, are there announcements at the show? Yes, we have announced a GA product, EPIC Enterprise. So our product is called EPIC. We have announced EPIC Community Edition, which is a single-node edition that anybody, DevOps, can download. It's full function, on a single physical server, and BlueData is not in the path; anybody can download it and run it. And there's a community website where they can post and we respond. We have announced the GA product, which is called EPIC Enterprise. And we will be shipping that next month. And we have announced a five-node enterprise license, free for anybody. It is fully supported by BlueData, and in perpetuity. Five-node enterprise license? It's free. For free? Free. Okay, so you've got a community edition. Yes, single node, anybody can download. And actually, on a 24-core system, you can run a pretty decent workload. But the idea there is really to give people the BlueData functionality. The point you just made a few minutes ago, I could do this, right? So that's the goal of Community Edition. It's to give people the opportunity. Enterprise edition is, you're serious, you're ready to go, but you just want to be up, and we will support it. Think of it as a $0 PO. Yeah, but it's a license. It's a license. It's a $0 PO, right?
So there's some kind of commitment there, right? Yes, yes, that's it. It's a $0 PO. We are so confident in the platform that we think that once you use the platform, you will realize the advantage of the platform. For example, you can run Cloudera 4.4, 4.6, 5.0, all simultaneously on the same hardware. You can run Hortonworks. You can run Spark. You can run Spark native. You can run Spark inside of YARN. We are certified on Spark with Databricks. We're certified with Hortonworks. We're working with Cloudera on the Cloudera certification. One of the other challenges you hear when you talk to practitioners is, when they do, you know, the point releases of a distro, no big deal, but when they do a major release, a lot of times they have to incur some downtime. Some serious downtime. It can be days sometimes. Maybe most of the applications will run, but there are always problems. Yes, exactly. So you're seeing the same problem. How is the industry addressing that generally, and are you doing it? Yeah, so I think that with us, this actually reminds me of VMware back in 2000, when Windows came out with new revisions. You first spun up the new VM and you ran your application on it. With BlueData, you can do exactly that. You can keep running your 4.6, and then you can run your new application on 5.0, and you can spin up as many clusters as you want with the new versions without having to disrupt your existing ones. And your data stays? Data stays the same. Multiple compute clusters can access the data. Remember, our compute clusters are stateless; when the compute cluster goes away, the data still stays there. That is not true with any Hadoop service. With Hadoop as a service, the data goes away with it. So because we virtualize the data, your data stays wherever you want your data to be.
So how the industry is addressing it today is either you spin up a new cluster, or there's a rolling upgrade, or you take downtime, as you pointed out. There's no other alternative. So the solution that we are putting in place is the first of its kind in the big data space, but it is nothing new when you actually follow the footsteps of VMware. And rolling upgrades are dangerous? Yes. You know, they really are. I've seen some disasters, literal disasters, occur in rolling upgrades. Come on, come back. And I also heard, and I'm not an expert on this, but I've heard that version-to-version compatibility is also a big problem when they go from 4.6 to 5.0 or 1.2 to 2.0. So I think we help customers solve that problem, because they can spin up the 2.0 cluster, they can run, and they can come back, and then they say, okay, now it's all running, and I can switch over. And I'm bringing a lot of that facility of VMware to the big data world. I want to come back to licensing. So you've got the enterprise edition, free, you don't even have to talk to BlueData, and then there's the $0 PO, I love that concept. Every startup has to go through, okay, how do we price this stuff? How are you pricing when you actually start to think? Yeah, today we are pricing with a per-core license, and we have not announced the actual dollar cost, but we will be announcing it in the next couple of weeks. Yeah, okay. But per core, in the sense. Yes, it's a per-core license. So we actually went back and forth between pricing per server and per core. I think if you price per server, you're penalizing the people who have a low core count. And then if you're taking advantage of, so we are seeing between 16 and 24 cores. So I think that it's probably a uniform per-core licensing price. And will you do perpetual licensing? Yes, no, I'm sorry, it's a three-year licensing model. It's three-year. Three-year licensing. So term licensing? Yes. And so you won't do a perpetual license, is that right? No.
Is anybody in software, enterprise software, doing a perpetual license? MapR. Oh really? They are, because some customers want to do a perpetual license or not. They are. But many won't. Cloudera won't do it. That's what my sources say, anyway. Yeah, Hortonworks, of course, doesn't do any licensing. There's a subscription baked in. We think that, as we were thinking about this, obviously we are at a very early stage of the conversations. I'm sure we will listen to the customers and what they say, but I think it's the right model, given that you come back and you add value to the software and you give them more. Well, that's what the VCs want. There's a lot of pressure to do that. But it's a good model. This is Mrs. Maul, it's fantastic. But the enterprise is funny. So every customer is different, it seems, you know. That's true. I actually have not followed MapR on it. I didn't know that, but. Yeah, I mean, I think it's an outlier. But there are still some customers that want to do that. I want to just own it and you go away. Perpetual, fully paid-up, worldwide. The lawyers love that stuff, some lawyers, you know. Well, that's great, Kumar. Congratulations on all the success. Any last words that you want to share with us? No, again, I think I'll be repeating myself, but it's worth repeating. I think we are really excited. I think we are bringing a very innovative platform. We want to make big data not big hard, and make it big easy. We're going to New Orleans next year. Well, yeah, we'll be just there. You know, I just love it. You've always been a problem solver. Yeah, thank you. And you're very thoughtful and you've got a great team. I'm really excited for you guys. Congratulations on all the success and good luck. All right, keep it right there. Everybody, Jeff and I will be right back right after this word. This is theCUBE.