I'm John Furrier. We are at Google I/O 2012 in San Francisco, where Google is announcing a lot of great new innovations. They're jumping out of blimps and zeppelins with virtual reality, a lot of tech, a lot of science here. But one of the panels that we were paying close attention to is obviously the big data stuff, as well as Google Compute Engine, and I'm here with the co-founder and CTO of MapR, M.C. Srivas. Welcome to chatting with SiliconANGLE. So, you were up there on stage doing the demo, and people were kind of freaked out, like, you know, they're staring at the future. When you look at the kind of power that the cloud offers, I think I wrote in a tweet, you know, hello, CIOs, start to smell the coffee brewing. It's a whole new day coming tomorrow. There's a notion of massive scale at such a low cost, and fast provisioning of, in fact, a bare metal data center. So, tell us, as the CTO of MapR, how did you guys get involved in the demo with Google Compute Engine?
So, you know, MapR, we were a big data company three years ago, four years ago, when we started. When John and I founded the company, we knew that big data was becoming very big. I mean, big data really means, you know, it's the old thing that says, I'm an expert, but my data knows better than me, right? So big data means I can ask my data, and it'll tell me, as long as I have enough data, right? So Hadoop is there to process large amounts of data very rapidly, and it's really taken off, and MapR is right at the forefront of it. And so it was a natural fit for Google to invite us. And we were very honored, because this thing originated at Google, so it's kind of a validation for us that, hey, we actually did something good as a startup.
And you're ex-Google, right?
Yeah, I was at Google.
What did you do at Google?
So, at Google, I worked on search, and search, of course, relies on MapReduce a lot to get the very high quality of results that Google is able to provide. So, I mean, think about it this way: Google was maybe the 20th search engine on the planet, and before that you had Lycos, AltaVista, all those guys.
The portals. They turned themselves into portals.
Well, because the quality was very poor, right? I mean, the quality was very poor, and when you first used Google, you said, wow, this stuff really works well. Well, this MapReduce stuff is the secret sauce behind Google. That is, it let Google analyze data better than anybody else could and give you deep insights from your data. And with MapR, what we brought was extreme reliability and extreme performance to Hadoop. And so it was a great fit with Google, because they are a company that boasts about extreme reliability, extreme performance, and extremely high quality. And MapR is founded on exactly the same principle. I mean, after all, it's...
And at Google, they need performance, which is why they try to build their own ISPs, because the faster the page loads, the better advertising they get, as you know. But more importantly, they're an ease-of-use company. I mean, they've always been simple to use and elegant, you know, quick results, not a lot of clutter on the page. So they don't like clutter. They like the simplest of things. What did you guys see with this that caught your eye relative to both performance and ease of use? Because the cloud done properly should be like interfacing with a web page. What about this project? Does it have the ease of use and the performance? Can you share more about that?
Yeah, absolutely. I mean, look, the demo that we did here was about 1,250 nodes. And we did that in under a week.
You know, within a week, we had to build that. To show it here, we had to build this from scratch, which means every day we were spinning up 1,000 nodes, maybe even 2,000 nodes, and spinning them down. And we had to do that within an hour every time we rolled out something new. That requires tremendous infrastructure investment if you're going to do it on your own.
Incredible.
I mean, the budget required for that would run literally into the several tens of millions of dollars. So to be able to do that at a moment's notice, and only pay for what you use, that is like electricity or water, right? You open the tap, draw as much as you want, close it, and you're done with it. You get a bill. That's really the model I think the world is going to, because that's how big data is viewed. You're not analyzing big data every day. I mean, you are, but you aren't, in the sense that you want to have big data, use it only when you want, and use it only as much as you want. And so you have this combination: cloud storage, which says storage is very cheap, and then you need to marry that with, okay, I have all the storage, how can I get my insights from it? And that's where Google Compute Engine is needed. But once you have something like Google Compute, the next question is, well, how do I analyze it on Google Compute? That's where you need something like MapR to run your Hadoop jobs, which let you do the analytics. And with MapR on a cloud, really, that's the only way you can... I mean, see, what happens in the cloud is it's a virtual environment. In a virtual environment, the virtual machines can move around, and when some hardware goes bad, Google Compute will detect that and migrate the machine to a good piece of hardware, and so on. But what about your data?
It's been left behind on that hardware that failed. And MapR is the thing that makes the whole thing stitch together really well. That is, it offers a completely resilient fabric that runs on top of the cloud. And really, that's what makes the combination very unique. And there's nobody else today in the Hadoop world that can provide that to you. Nobody. It's only from MapR.
A lot of people talk about infrastructure as a service. They think of Amazon, obviously, you know, instances, spinning up instances, spinning them down. In the early days, you know, if you lost an instance, good luck with that. Now you have tools out there to save them. But think about data: you don't want to spin down your data. As you mentioned, you want to access your data by the drink. So you want to spin up compute, but you don't want to spin down your data. How does that all factor in? How do you explain that to someone who's not in the know in the big data world? Because that's a fear: I want to spin down an instance, but I don't want to lose my data.
So it's all about cost, actually. You're absolutely right. It's about cost, right? If it's cheap, I keep it. If it's expensive, I filter it, keep only what I consider valuable, and throw the rest away. So it's totally a cost question. If it was very cheap, I would keep everything, because I don't know what I might need later. What if I made a mistake and threw something away? It's frightening to think, man, I should have just held on to that, I would have made so much money.
I mean, people are doing that now. Well, look at garage sales. They're holding on to everything, right? You know, I mean, out of a fear of compliance.
Well, that too. But take the classic garage sale, right, where you go and find this painting that's worth so much, because somebody threw it away, right? They said, I don't have space for this.
And then they discovered it's a great painting, right? Oh my God. It's the same angle with big data, right? You try to hold on to everything. So if the cost can be really low, you would hold on to it. Now, the problem with big data, however, is: okay, so I held on to it. Now what? How do I analyze it? Now I need big compute. I need enormous compute power to go through that. And that's where Google Compute really fits, right? And then I still want to do it at low cost.
So, for the folks out there with MapR, let me ask specifically about the storage side: okay, I want to have all the cloud benefits, which is great. Can I still have the storage? How does the storage piece work? Can I store everything?
You can store everything. You know, it's a cost-based thing. It's about what you call everything, right? I mean, what is interesting, what's not so interesting, what's worthless, right? Of course, storing everything, really everything, is prohibitively costly. But what MapR specifically does is this: you have a cluster of 100,000, whatever number of machines, because this is big data and you cannot do it on one machine. It won't fit, right? So you do need many machines. The days when I could do this all in one giant machine are over. Well, the moment you introduce two, three, four machines, even ten machines, you have failures. And then the portion of the data that was on the machine that failed is lost. And the analysis is lost, and you have to restart it. So firstly, what Google gives you is, when that machine instance fails, they discover that and migrate it to a good piece of hardware, and they'll worry about unplugging the drives, replacing them, whatever, whenever they get around to it.
What MapR gives you on top is that when that machine has moved, the data is restored and recovered at the new spot where the failed machine went. So that is a real cost saving for you, because you don't have to redo the compute and redo the analysis that you would normally have lost because the machine had failed.
So, there are a lot of religious conversations, religious meaning, you know, tech religion, about where to put the data and where to put the compute. Some say put the compute near the data; some say move the data to the compute across the network. So how does this product, Google Compute Engine, fit in? Most people think it's generally accepted that you move the compute to the data, and that with MapReduce and Hadoop, it's a good architecture. Do you agree, and how does this fit into that?
So, that's a good question. You see, I'm not religious that way at all, right? I mean, you have to be pragmatic. That is, you do what's best for that particular use case. So if you need to move the compute to the data, you would do that because the data is very big and you don't want to have to suck that data over the wire. If the data is very tiny, like you're making an airline reservation or buying a ticket for an event, where the data is really a small purchase, then you move the data to the compute, because the data being moved is very tiny, right? That's what databases are about.
It's dynamic, it's situational.
Yes, dynamic, and sometimes you do a mix of both. So in the Hadoop ecosystem you see both HBase and MapReduce together, because there are needs for both. And so that's really the truth. I mean, you have to do what's pragmatic for you, without any religion.
The whole trinity of Hadoop, which is HBase, MapReduce, and HDFS tied together, really works well.
My final question for you is, what do you think about what's going on here? You guys gave a great presentation, you got great applause. What do you think about Google I/O this year? I mean, honestly, there's a lot of tech, a lot of science. Sergey's got guys jumping out of the blimp with glasses on, showing it off like that. Although jury-rigged with RF antennas, but, you know, that was a cool demo. What do you think about the vibe here at Google right now?
I think it's a very young audience who are seeing what the future looks like, you know, and so it's a little bit gadget-oriented. I mean, I do see that, but that's the young audience.
That's the sex appeal.
There's a lot of sex appeal. There's a lot of coolness, but there's also a lot of deep technology here. I mean, some of the things that Google Compute and Google I/O are demonstrating are very, very difficult to pull off technically. They've solved extremely difficult technical problems, and they show it so simply and make it look so effortless that it's incredible to think they're doing things we thought weren't possible before. It's just amazing, and they're doing it at very high quality, at very large scale. I mean, the Google Compute Engine experience for me was like going downhill skiing, but with, you know, jet skis attached to your feet, and going, whoa, this goes fast, right? I mean, that's how it feels. It's exciting.
What's next for you guys? Obviously, you're the technical co-founder, and you and John are a great team. You've got to be excited.
Obviously, the market's exciting right now for anyone in the space that's innovating. What's next for you guys? What's next in your vision? What do you see happening next? I'm going to say MapR, but also what's around MapR.
So, yeah, in and around, I think Hadoop is here to stay, firstly. I mean, there are a lot of questions about whether Hadoop is the right technology or not. Like I said, there's no religion here. You should do what is correct for your particular business. It's business driven. But I think two things have been accepted by many people. A, Hadoop has been accepted. B, big data has been accepted. That is, the so-called expert who used to know about your business has been replaced by data scientists who know about your data, and they can let you visualize it from different angles you never thought possible. So that's here to stay. What's coming up next is the kind of, you know, big brother watching you with forecasting. I think that's true, but it's not really big brother in a bad sense. It's big brother in a good sense, because you can find diseases and disease cures earlier. Insurance rates on all of your things can go down, for example. You'll get more accurate determinations of things. For example, last month I had to appeal my house valuation because the guy from the county came up with some random valuation. Well, if he had big data, he would get a much more accurate assessment of how much my house is actually worth. And that's what it really brings. It's not big brother that way; it's a good, accurate assessment of how things really are, so that you have proper demographic information on which you can make rational business decisions, without having to guess or work from a very small sample. I mean, that's really what I think big data gets you.
I think the insurance industry has been doing this for a while, but it's been very hidden behind the scenes, and now it's democratized. You're starting to see the fast-follower verticals come on board. You had financial, web scale, you had government, you had the big guys in there, and now the others, healthcare, insurance, they're all kind of coming on board, so they're now getting the big data scene.
I think they always got it; I think it wasn't possible before. I mean, people are very smart, I think they always got it. I think it was a cost decision, and now the costs have gone down so tremendously that it is like a utility, right? I plug in and draw as much electricity as I want; now I draw as much compute as I want or as much storage as I want, without having to go and build out a data center every time I want to do something small.
Thank you very much. We'll be right back with more right after this.
Thank you.