Okay, we're back. This is Dave Vellante, and we're live here at SiliconANGLE's theCUBE coverage of Hadoop World 2012 and the Strata Conference. They've merged those two together; O'Reilly Media and Cloudera have come to a partnership. This is our third year having theCUBE at Hadoop World, and the conference is expanding. Our first year was 800, on top of the previous year, which was 500. Then it went to 1,400, and it's up closer to 3,000 now; I think the official number's 2,500. We're here covering all the action. This is theCUBE, where we try to extract the signal from the noise and bring you the smartest guests that we can find. And we're here with a very interesting company called RainStor. The CEO is John Bantleman. John, welcome to theCUBE. Pleasure, thank you. Great to see you, it's been a while since we've talked. We were just talking off camera; I think we met in California a while back, right? It was in Boston. It was one of the first BD events that Greg DuPlessis ran. So you guys have come a long way since then, and we've come a long way. So why don't you give us the update? I'm here with my co-host Jeff Kelly, but give us the update on RainStor. Well, probably our most recent news is we raised $12 million in the last couple of weeks, so we just went through a big fundraising. Interestingly, our investors are basically a major bank and a major telco, and that represents our market. So we deliver big data solutions to the enterprise. And what's interesting is we're deployed in companies that deal with carrier-class products, i.e. products that can't break. So for our software to go into production in organizations like, you know, the biggest telco in North America and one of the biggest banks on Wall Street, it means that we're delivering a very robust and capable solution to big data.
I noticed on your website you have a blog called Hadoop the Tape Killer. We were at IBM's IOD conference this week, and somebody called Hadoop the new tape. Well, I think it's a really interesting issue. So we're dealing with a client today. They have a very large, complex data warehousing infrastructure; it's 500 terabytes. The price is in the tens and tens of millions of dollars, and they have two to three petabytes on tape. Wow. And for them to manage and get access to their history, they need a much more capable, economical and scalable solution. And I think there's an interesting proposition where companies have been forced to take very large amounts of valuable production data and basically kill it, because tape kills information. Well, once it's there, you're never going to go get it, right? Well, recovering two petabytes from tape is not viable, right? Haha, they say backup is one thing, recovery is everything. But, you know, it's interesting, John. I'd say four or five years ago, if you said you were in the database world, people would say, oh yeah, that's nice, and they'd move on to somebody else at the cocktail party. Database has become one of the hottest topics going. It's like a renaissance. What's going on there, and where do you guys fit? Well, I think big data has just completely changed databases. The databases are well established. Most of the technology, relational technology, is more than 40 years old, and with the requirements to scale, the requirements to move from millions of records a day to tens or hundreds of billions of records a day, moving from tens of terabytes to petabytes, every dimension of data has changed. And new innovation is absolutely required to meet customer requirements. So talk about that innovation generally, and specifically RainStor innovation; we can get into it a little bit. So the cornerstone of most platforms is scalability. So if you think about Hadoop, Hadoop is about distributed scale-out.
And generally, even though Hadoop gets a lot of press, Hadoop is generally about brute force: you have a big enough data problem, and you throw enough machines at it so that you can solve the issue. What RainStor brings is some, I don't know, intelligence and efficiency to that. So the first part that we bring is the ability to store data in a very unique and very efficient form. And our data reduction is 20, 30, 40, 50X. So that means that we're reducing the data 90 to 99%. So if you take a petabyte of data, you put it in RainStor, and you're physically taking 35 terabytes of disk, you've changed the problem. You've fundamentally changed the problem. You've made it much easier, much more manageable. And you've cut the cost, obviously. And we've cut the cost. And then most people believe that, because you're storing the data efficiently, it's going to be slow to access. And in fact it's not. I was going to ask you, what's the performance impact? I would think, intuitively, it would speed it up. It speeds it up. I mean, have you seen that consistently in your customers? We've seen that consistently. So we can run both MapReduce and full SQL, full ANSI SQL. So you can connect with your Tableau tool, you can connect with BusinessObjects. You can run an Oracle SQL statement that's a meter long against RainStor, and it works. You can also, if you want to get at your data differently, use MapReduce and Pig, and you can do more complex, kind of multi-structured analytics of information. So RainStor gives you both dimensions of access, but we're very smart at how we read the information. Yeah, so you're seeing this as a major trend here at this event and others, this sort of unification, bringing together the SQL and the NoSQL worlds. You saw Hadapt win the startup award, Cloudera announced Impala. You saw what Hortonworks announced, Jeff, with Microsoft.
And there are others; MapR announced M7. So you're seeing this trend. What do you make of that, where do you see it going, and where does RainStor fit? So I think, you know, this is where we have lived for the last five years, right? It's about dealing with big data and the enterprise. And I think Hadoop is suddenly trying to bridge between the Wild West of Silicon Valley, where startups are willing to rewrite everything in Java, and distributing the software into banks, telcos and the broader enterprise, where the skills, knowledge and tools already exist. So if you're going to sell to an enterprise, the robustness and the ability to leverage those existing tools and techniques are key. So you've got to do something on top of SQL. You can't do something instead of SQL. RainStor delivers that and has been delivering it for years. Other companies have announced products that aren't working yet, and we'll see whether they work. So, you know, another, you know. You want to name names there, John? No. How about you, Jeff? Well, so why don't I ask you about how you're helping customers make a transition from that old world of data management, you know, where you've just got your relational database and very structured, rigid data models, to this new world. You've got platforms like Hadoop, and what RainStor does is allow you to do both MapReduce and SQL in the same environment. But still, there's got to be a transition period. How are you helping your customers deal with that legacy technology and make the transition? You know, you've got to minimize the risks and maximize your return on the new technologies like RainStor and Hadoop and other things. Well, so RainStor runs across a variety of scale-out platforms, and Hadoop is one. But you could equally run us on a cloud storage platform with a scalable set of virtual machines.
We have deployments which run on scale-out NAS and a bunch of Red Hat Linux servers. So we operate across any low-cost scalable architecture. And what we're finding in the industry is there are some people who love Hadoop and there are some people who don't. Sometimes in the same organization. Well, I think the most important thing to understand is that big data is a problem definition. So it depends on the organization. Sometimes we go into an organization and their issue is like the data warehouse example: the data is structured, it's already sitting in a data warehouse, and the problem is scalability and efficiency. So we can directly take the information, we can understand the DDL, we can take their existing SQL, we can plug it in, they're not changing a line of code, and it works. That transition has allowed them to move to massive scale at very low cost. We have other examples where we have one partner, for example, dealing with Homeland Security. The requirement is two million records a second, and querying across 10 trillion records in a few seconds. So RainStor can manage and deliver massive scale, and it gives the flexibility to the customer to decide. We are very focused on working with standards. So Pig is a standard, we'll support Pig; we'll support it directly, you don't change a line of code. SQL's a standard, we support SQL. Talk a little bit more about your secret sauce. You talked earlier about your data reduction. Are you doing deduplication, compression, a combination of those? It's a combination, but the core capability is this: most traditional databases store rows. So if you've got a trillion records, you've got a trillion rows, and then you have to put two indexes on top, so you kind of have two trillion rows of data that you're trying to manage. RainStor stores patterns. So if you give us a set of records, we will store the unique values and patterns of values that make up that data.
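To make the idea of storing "unique values and patterns of values" more concrete, here is a minimal Python sketch of value-level deduplication on one batch of records. It is an illustration only, not RainStor's actual algorithm, which is not described in this interview. The function names `encode_batch` and `decode_batch` and the sample telco records are hypothetical. Each distinct field value is stored once, each distinct combination of values (a "pattern") is stored once, and every record becomes a single pattern id.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, ...]
Pattern = Tuple[int, ...]


def encode_batch(records: List[Record]) -> Tuple[List[str], List[Pattern], List[int]]:
    """Dedupe a batch: unique values, unique patterns of value ids, one pattern id per record."""
    values: List[str] = []              # each distinct field value, stored once
    value_ids: Dict[str, int] = {}      # value -> index into `values`
    patterns: List[Pattern] = []        # each distinct combination of value ids, stored once
    pattern_ids: Dict[Pattern, int] = {}
    rows: List[int] = []                # one pattern id per input record, in input order

    for record in records:
        ids = []
        for field in record:
            if field not in value_ids:
                value_ids[field] = len(values)
                values.append(field)
            ids.append(value_ids[field])
        key = tuple(ids)
        if key not in pattern_ids:
            pattern_ids[key] = len(patterns)
            patterns.append(key)
        rows.append(pattern_ids[key])
    return values, patterns, rows


def decode_batch(values: List[str], patterns: List[Pattern], rows: List[int]) -> List[Record]:
    """Rebuild the original records from the deduplicated form."""
    return [tuple(values[i] for i in patterns[p]) for p in rows]


# Hypothetical network events: (date, cell, event type)
batch = [
    ("2012-10-24", "cell-17", "dropped_call"),
    ("2012-10-24", "cell-17", "dropped_call"),
    ("2012-10-24", "cell-42", "handover"),
    ("2012-10-24", "cell-17", "handover"),
]
values, patterns, rows = encode_batch(batch)
print(len(values), "unique values,", len(patterns), "unique patterns,", len(rows), "records")
```

In this toy batch, twelve field values collapse to five unique values and three patterns. Real event streams with highly repetitive fields are the kind of data where this style of encoding pays off; the 20 to 50X figures quoted in the interview are the speaker's claims and are not reproduced by this sketch.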
We do it purely algorithmically. So we are deduplicating the information at a very granular level. And we're also doing it very quickly, because we can ingest millions of records a second. So we store everything without repeating anything. And we do it completely scalably. So it's not a batch job where you're going back, obviously. As records are streaming in, we'll take a group of records, dedupe them and store a file. That gives us a huge advantage over a conventional database, because the resulting dataset is 30 times smaller, so the I/O pipe appears to be 30 times bigger. Most databases, even Hadoop, get clogged trying to get information in. RainStor is almost always CPU-bound, because we're making the data so small and we're able to read it without having to re-inflate it, so we can move data around at a pace 20 or 30 times faster than most conventional databases. So, as we described before, the way in which we store information drives performance. And you talked about some of your larger customers before; you mentioned telco and some others. Talk about how they're using your product specifically. So in telcos, one of the common use cases is: I want to capture every network event. So everything that's happening in my network, every dropped call, every call tried, every time somebody fades out because they move between antennas, I want to understand my network behavior. Typically those are tens of billions of events per day. As we move to 4G and LTE, they'll become hundreds of billions of events per day. That raw data is actually telling the carrier or the ISP what's happening, who's doing what, how their bandwidth is being consumed, how many people are going to video. So that's a key requirement. If you're going to manage customer experience, you've got to understand the usage of the phone. The data's phenomenal. Data's just amazing.
And the ability then to both capture that at speed and analyze it at pace is a key requirement. In banks, the biggest requirement for us is: how do you deal with 40% data growth on a shrinking IT budget when you have ever-increasing requirements to keep information around forever to meet regulatory requirements? So if you have a 100-petabyte data estate growing by 40% per year, and the goal in most Wall Street banks is to reduce budget, you've got to drive efficiency. And the experience that we have in those markets is that we're literally making a 10x cost difference. We're literally moving the cost of that data management by a decimal point. So it's business-driven, commercially driven. Exactly, it's got to be business-driven. So talk a little bit about that. You know, we were at the IBM IOD conference a couple, well, yesterday, and it feels like a few days ago, Dave, but... Weeks ago. Yesterday and the day before. Somebody mentioned that, you know, if you treat big data as a science project, you're going to fail. You need to have a business problem you're trying to solve, a discrete business problem, and attack it from the business problem down rather than from the technology up. Do you agree with that sentiment? And if so, or if not, how do you work with your customers to best leverage your technology, especially new customers who are maybe new to, quote-unquote, big data? So big data is often described as a technology stack. To me, big data is a problem, right? So if you've got to ingest a million records a second, that's a problem. If you've got to store a petabyte of data and it's going to grow by 40 or 50% per year, that's a problem. So we focus on customers who identify what their data management issues are, and, you know, we very much have taken an industry-focused solutions route to market, right?
So you're dealing with compliance and trading, credit card analysis, smart grid data. So while there are commonalities, you've got to focus specifically on the business issue, down to the vertical and the specific company. Yes, as opposed to saying, go play with my software and see what's interesting. We're talking to customers about solving focused and discrete business requirements. So what are you guys talking about at the event here? What's going on? What's the conversation like? What are you telling prospects and customers? I think a lot of our conversations are focused upon addressing enterprise-scale requirements around big data. It's that simple. It's that simple? And so what are those requirements? What are people telling you? We need it to be cost-effective. We've checked that box. Yeah, we need to be cost-effective. Reliability. Well, we need to operate with standards. I mean, as you get into larger-scale organizations, telcos and banks, I need to have, you know, multiple copies of my information. I've got to be able to do that scalably and effectively. I've got to replicate data across a wide-area network in order to address, you know, data security issues. So there is a large set of requirements which are really driven by enterprise needs. And that's where RainStor, we believe, brings something to the big data party. Excellent. My last question, John. This is, like I say, our third year at Hadoop World. It started out as a lot of tire kickers, a lot of sandbox activity. It's evolving, with companies like yours now talking about solving real business problems. Where do you see this going? Is the traditional world of analytics and, let's call them, data warehouses going to give way to this world of Hadoop? Right now you've sort of got the Hadoop tail wagging the dog. Do you see that equation flipping? Do you see those two worlds coming together? What's your vision for the future?
I think the dynamics of this industry are changing completely. So two years ago, probably 10% of the people who we were speaking to were interested in Hadoop. Today, 80% of the POCs we're running are on Hadoop. Now, I wouldn't say 80% of the deployments are on Hadoop. Many people will test out the product on Hadoop and say, well, I want to start off more conservatively, I'm going to run this on a NAS box. But Hadoop is absolutely the center of the conversation. I think Hadoop will challenge the data warehousing technologies. I mean, at the scale that data's now being managed, paying, you know, 20, 30, 40, $50,000 a terabyte when you're thinking petabytes is hard. And I think with the technologies that we have, and the technologies which are coming out of Cloudera with Impala, although still early, people are going to say, do I really need to have two platforms and shuffle the information from my very inexpensive, very scalable Hadoop cluster into this very expensive appliance because I want to query it? And the answer is probably not. Hadoop will do more and more of what the business requires. Well, we were at Oracle OpenWorld a couple of weeks ago. In Larry Ellison's keynote, he said, big data, meet big iron. And the implication was you're going to need, you know, a million-and-a-half-dollar infrastructure to succeed in big data. The world that you're describing is different. Well, it's not the world I live in. I mean, if I can compress my data and run a petabyte of infrastructure on 20 nodes, 30 nodes, that's interesting. That's really interesting. And if those nodes cost me 5K, it's really interesting. Yeah, I think that's Larry's world. And I think... Larry bought Sun, and he's got to find somebody who's going to buy it, right?
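The cost comparison in this exchange can be worked through with simple arithmetic. The Python sketch below uses only the figures quoted above: $20,000 to $50,000 per terabyte for a warehouse appliance, and 20 to 30 nodes at $5,000 each for a compressed petabyte. It assumes decimal units (1 PB = 1,000 TB) and counts node hardware only, so it ignores software licenses, power, staff and everything else that goes into a real total cost. The function names are hypothetical.

```python
TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB


def appliance_cost(terabytes: int, dollars_per_tb: int) -> int:
    """Total cost when capacity is priced per terabyte."""
    return terabytes * dollars_per_tb


def cluster_cost(nodes: int, dollars_per_node: int) -> int:
    """Hardware-only cost of a scale-out cluster priced per node."""
    return nodes * dollars_per_node


# One petabyte at the per-terabyte prices quoted in the interview
for price in (20_000, 50_000):
    total = appliance_cost(TB_PER_PB, price)
    print(f"1 PB at ${price:,}/TB: ${total:,}")

# The same petabyte, compressed, on commodity nodes at $5,000 each
for nodes in (20, 30):
    total = cluster_cost(nodes, 5_000)
    print(f"{nodes} nodes at $5,000 each: ${total:,}")
```

On these quoted numbers, a petabyte priced per terabyte lands between $20 million and $50 million, while the node hardware comes to $100,000 to $150,000. The gap is far larger than the 10x saving mentioned earlier in the interview, which is one reason to treat the node figure as hardware only.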
Well, he could buy his way into the big data world, but hopefully the innovation will continue without him for a while. I think, as you say, this is the most innovative time in data management. Yeah, no doubt. We believe we are part of that, but obviously there's a whole host of other companies coming up with brilliant ideas and great new products. There's so much out there. And the industry is going to be transformed. Yeah, excellent. Well, thank you very much, John, for coming on. RainStor, keep an eye on these guys. They just raised a big round; congratulations on that. And congratulations on all your success, and good luck. We'll be watching. Cheers, man. Thanks very much. Appreciate it. All right, keep it right there. We'll be back with our next guest live from Strata and Hadoop World. This is theCUBE, and I'm Dave Vellante. Keep it right there.