...at Big Data SV 2014 is brought to you by headline sponsors WANdisco ("We make Hadoop invincible") and Actian ("Accelerating Big Data 2.0").
Okay, welcome back, everyone. We're powering through day three at the Strata Conference and the Big Data SV event, here live in Silicon Valley. This is theCUBE, our flagship program. We go out to the events, extract the signal from the noise, and talk to those tech athletes, the ones making it happen: disruptive, game-changing technologies, entrepreneurs, executives, VCs, wherever we can get our hands on someone who has that signal. We will share that with you. I'm John Furrier, the founder of SiliconANGLE. Jeff Kelly is with me on this segment, and we have the entrepreneur and CTO of Aerospike, Brian Bulkowski, on theCUBE. CUBE alumni, welcome back to theCUBE.
Thank you, John.
So, you know, we had you on theCUBE a couple of years ago, and we've been talking ever since. We love having you on, sharing that knowledge. But I've got to say, you're in one of the hottest areas. Your company is in-memory, and there is so much action going on. The funding is robust. Spark and in-memory are really highlighting some real gaps that were in the Hadoop space, and are now kind of filling in the holes. So you've seen the ecosystem grow up around Hadoop and big data, and people are realizing the value of in-memory is really amazing. So you're in that space. So, one, give us the update on what's going on with in-memory, and why is the world so hot on in-memory now versus, say, a year and a half ago?
Sure, John. So, you know, in-memory has been great to us, and I think it's great for the industry as well. The big power of in-memory is that, as a developer, you don't have to spend as much time optimizing for all of those rotational-disk problems. You don't have the slowness associated with it, and you have the agility in your processes to be able to keep changing your app over and over again and adapting. And that's why companies like Apple and Facebook and Google have placed such a value on in-memory in general.
What we've seen at Aerospike that's pretty amazing is the growth of flash in that area. A lot of folks are still saying, well, in-memory is going to be RAM, you know. I saw a Forrester analyst this week who said a terabyte of RAM costs a million dollars to actually host and have in your data center. When you take into account power and everything else, a terabyte is a million bucks, and flash doesn't cost that, right? Flash costs $4,000 or $5,000, or $10,000 with all the power associated with it. You know, flash is the way that in-memory is going, and I think that's really what's going to push this a level further.
So software is eating the world, a great quote by Marc Andreessen in the Wall Street Journal, a seminal post to put out there. You know, it's something we've talked about before, so it's not a new topic for us. But the outside world is now seeing it. We saw some news around the founders of Fusion-io: David Flynn and Rick White announced, or it leaked out, that they just got a Series B of $63 million for their new company, Primary Data. So people are hot on data, on data-driven software, okay? In-memory is key for that. So I've got to ask you, how has the world changed for you? You guys have done a lot of work, and it was on the high, high end. You have great use cases. I think the case studies are second to none, in my opinion. For your size company, you guys have knocked down some pretty awesome reference accounts, which are on your site and well documented. I don't want to talk about that too much, but more of this: okay, you knocked down the early adopters, the people with the need. Now, as you go mainstream, what are some of the things you guys are doing to take Aerospike and in-memory into the mainstream? Does that involve an ecosystem? Does that involve developers, different tooling? What are you guys working on? Tell me, be specific.
Yeah, so we're releasing, just today, our Node.js client.
Node is one of the most popular platforms in high performance because you can scale out laterally, and you can also keep the language the same between any browser-based application and your server side, right? So Node's a great thing. A lot of developers love it. It scales out very well laterally.
DevOps developers.
DevOps, but also just someone trying to get something done. And what you need underneath Node in order to make it really sing is a fast database that's highly reliable. So we think we'll play great in that architecture, and in-memory does in general, but Aerospike with Node.js is a perfect fit for doing great operational stuff. So we're building on some of our successes in the .NET community, where we have a large number of customers, and with folks in the Java world switching over to Jetty, which is an asynchronous platform. We have full support for that, and we've open sourced a lot of those clients already. And a couple of weeks from now, we're going to be open sourcing yet another tool, which is a recommendation engine based on non-contextual systems. So, using some very common sort of cosine-distance vector approaches, we'll open source all of that. You need a very fast database. You need a lot of IOPS and a lot of random access in order to be able to make those kinds of algorithms work in real time. So this isn't just recommendations on Hadoop the next day. This is, you know, I'm at the store right now, I looked at this thing, what should I do right now? Right, so we're open sourcing a bunch of those tools as well.
Is that new news, or is that something you're going to be doing next week? I guess we're breaking the news here. We're breaking the news on theCUBE. Absolutely. It's already done. We just got the press release scheduled. It's kind of a big thing. We love theCUBE to get the data out early.
More of a big-picture question. So you mentioned real time. Obviously that's where Aerospike excels.
You know, we're here at Strata, where the focus at the start of the show a few years ago was a lot around Hadoop and the Hadoop ecosystem, and with that we think batch. But how have you seen that conversation change? From your perspective, as a real-time database company, are we starting to get to the point where the community understands that, hey, real time is a critical part of this larger ecosystem and architecture that we're building for, you know, the future of data management?
Absolutely. So let me give you an example. We ran a meetup a couple of months ago. It's on YouTube somewhere. One of the guys who was working with Walmart said that Walmart got, in their first year using real time on their website, a lift of $1.2 billion in sales. Okay? $1.2 billion. That's what real time did. So they're using their Hadoop and big data instincts, right? They bought an entire company that knew how to do it. That's how they were able to hire the people. Now, I don't know if this guy was interested. Really, I wasn't there, so, you know, don't quote me on it. But that kind of news is leaking out. Once you can do deals and give the right level of discount, or not give a discount, right? Give someone the right product recommendation without a discount, if they're that kind of customer, and keep your average price up. Companies are out there doing that today, whether it be in telecom with how they write data, or in retail and how they do deals. The real-time aspect of using your Hadoop data, not just for BI and not just tomorrow, but using it today: I think the word is out.
Yeah, talk a little bit, from an architectural perspective, about what has to happen to integrate that into the larger data management architecture at any given company. Because we're seeing Hadoop deployments, but they're often almost siloed from some of the other traditional architecture that they've got.
And when you bring the real-time component in, really, you've got to stitch these all together and orchestrate data movements. And it's a really complex affair. How does the real-time aspect fit into that architecture?
Yeah, that's a great question. So what we see at Aerospike is that these are usually greenfield, side-by-side style deployments. And that's one of the problems in larger enterprises, for people with real money: they've got these legacy systems, and those legacy systems just don't want to deal with 10x more load piled on top of them. You've got these DBAs going, oh my God, don't crush my Oracle servers, right? So you have to do this very delicate dance of how to get the ETLs out of them, how often to do it, and how much of the table you need. We had another meetup where this guy was just talking about how to talk to the DBAs and grab the data out of those legacy databases, so they can put it in Hadoop, drive these insights, and then staple that up to things like their web presence, their mobile application, their mobile vendors, and all of that stuff. So we still see a lot of ETL in the market. It'd be nice if there was a little more real time. I've seen conversations around doing distributed query management, where you can query your NoSQL database and query your older relational systems. But frankly, I don't see that taking off, because the guys who maintain those systems just don't want to deal with any more complexity. So really it's all about these scheduled ETL jobs. That's what you can talk them into. Then that goes into your system that's doing big data, doing real time, in an architecture where you've got Hadoop and you've got a variety of different in-memory and analytic systems. Maybe you've got some Spark running in there. Maybe you've got a lot of Vertica or one of the other guys in there.
But then you need something on that front edge to really be doing it in real time, to be able to take the web load, to take your mobile data load, and that's where Aerospike's been playing.
Well, so you mentioned talking to DBAs and getting them to say, okay, we're willing to go this far and do this level of ETL jobs. I mean, are people potentially talking to the wrong people, and do you need to get the business buy-in first to motivate those people to do that? I mean, when they're getting directed from the business, maybe that'll change their tune.
Yeah, you know, Jeff, there are a lot of tricks. And sometimes it's, you know, talk to the business guys, and then the business guys put pressure on it. Part of it is, you know, this one guy was saying, well, you've got to find the right manager and find out what kind of wine he likes. And if you can figure out what wine he likes and then bring him a bottle of wine, then he'll make sure his DBAs do the right thing for you. Within every organization, there are a million tricks like that.
It's an art and a science, isn't it?
An art and a science.
And then, of course, one other thing: Aerospike is, of course, not only in-memory but NoSQL. Talk a little bit about that component and why that's important to your value proposition.
Sure. So, you know, at Aerospike, we've always believed in polyglot databases. So, NoSQL for us: we were at the first NoSQL conference in 2009 in Atlanta, the NoSQL East conference. And we've always seen NoSQL as the beginning of the polyglot world of databases. So, going beyond SQL. But, you know, frankly, I think the pendulum's swinging back a little bit. There are a lot of folks out there with SQL. There's nothing wrong with SQL other than it's not the perfect fit for every language. So, doing full, you know, break-everything-apart normalization that generates all these seeks, it doesn't work for some cases. You have to do something a little different.
So, I love that there's a lot more developer choice out there in the world. Developers can choose a little bit of NoSQL, and choose a little bit of SQL if that's the kind of analytics they have. And that's what Aerospike stands behind.
So, talk about the open source thing and developers again, because I find that fascinating. You guys have always been on the high end, you know, in terms of the high-end clients that are saving billions. I love that. So, that's kind of a small market. It's still big dollars, but you guys have a hot technology that's appealing to some of the trends we're seeing here. Spark, a lot of in-memory, a lot of discussions. Abhi Mehta was saying, you know, when you look at a graph database at scale, you can traverse it for fraud data, for instance, in banks and other areas. Still, it's the tip of the iceberg, and people are getting more and more data. So, one, open sourcing your stuff is a question I want to get a deeper comment on. And also, the challenge for the enterprises that possibly could use that, and the developers: what kind of developers should be adopting this?
Sure, John. So, what I think is fascinating right now about open source is that open source is about trust. An enterprise needs to be able to trust their database vendor and a lot of their core technology vendors. It's not just a point solution for them. If it's a point solution that'll save a billion dollars, honestly, they don't care. But if you're really making a bet on infrastructure, you have to trust that company and trust their source code. And seeing that source code and understanding that source code is key to the trust that gets created. Now, that's a very different question from: does that product have value? Are you willing to pay for that product? Is that product something you want to have?
But right now, what we're seeing at Aerospike is that enterprises have simply shifted to reading the source code. Having the source code be available for infrastructure is more important than performance, more important than anything, because trust, and not being held up for ransom by your vendors, is what an enterprise needs today. And I'll tell you, I got an email just this week from a very large financial services bank in the UK. I won't mention their name. They just said, look, we've been burned too many times by Oracle and proprietary software. We need to have an open solution, simply so that we don't get held up for ransom on price down the line.
And they can look under the hood too. That's one of the ways OpenStack appeals to folks on the cloud side. But what does that mean for you guys? So, looking at your reference accounts in your case studies, there are a lot of media companies, right? Media companies have always been a predictor of the data-driven model. We look at, you know, Amr Awadallah, who came from Yahoo. The big web-scale companies had to build their own stuff because there was nothing else out there. Okay, that's a trendsetter, a bellwether, whatever you want to call it. Now, as your product seems to be fitting into the mainstream, what do you guys have to do to be successful, one, from a product standpoint, and two, in how you communicate that to the target audience?
Sure, John. So, you know, the fact that we have so many great customers doing massive deployments at massive scale within advertising and media has always been great for us. And you can see that on the website, as you said. I don't want to go into that too much. But word of mouth has always been the most critical aspect of selling enterprise infrastructure, right? I don't necessarily trust any vendor, even one who might be at a wonderful forum like theCUBE. I want to see it myself. I want to talk to a guy who's used it.
And then I'll really know whether a product works, right? So having those reference customers is key. On the other hand, there's the trust, like I said, that comes from source code availability. We've always made our source code available to a variety of our customers. We've always had open clients, where every single line of client source code that you're bringing into your application, every single line that we provide, is open.
You have no problem opening that up.
Absolutely not. The client libraries, you know, you're bringing them into your application. And I think the real shift here is that the enterprise has woken up to the fact that, for infrastructure, they have to be able to trust those vendors, and trust is built around openness.
And what does the developer model look like? I want to drill down. You mentioned DevOps and the Node piece: Node.js, server-side JavaScript, real time, a lot of interactions. What specifically within the developer community are you targeting?
So in the developer community, developers need a couple of things, John. They need great experiences from a documentation perspective. Developers have gotten justifiably used to: I want to be able to just fire this thing up. One of my quotes is, installing software is so 2012. You want me to install anything, a server? You know, forget it. It should be here. It should be right now. You know, my time's being wasted here if I have to actually download something and install it. Everything should be available in the cloud. There should be cloud services running. What we did at Aerospike in order to start working with that is partner with a company called Internap, a wonderful real-time company that does hybrid bare metal and cloud, very popular within the gaming community, and especially with the Node.js folks that we work with. There's a free Aerospike node cluster that you can set up. You don't have to install it.
You just click a button, and suddenly there's a two-node cluster that you can attach to and start seeing the power of Aerospike with SSDs, using the power of flash memory. That's the kind of experience that developers need.
Share with the folks out there, final question: in your own words, what is Aerospike's value proposition, why should they be talking to you guys, and when should they be talking to you guys?
Sure. So you should be talking to Aerospike when you know you need in-memory, when you've already figured out the power and the flexibility of in-memory in your application, and, on the other hand, you have big data insights coming out of your Hadoop cluster, your petabyte Hadoop cluster, and you need to be able to pour that into your front-edge database and build a service that is highly reliable. If you have figured out that you need in-memory for that, you should check out Aerospike because of our reliability as well as our focus on flash, which will both dramatically cut cost and give you the big data that you need.
Okay, Aerospike here inside theCUBE. Brian Bulkowski, CTO. We are bringing you all the action live from Silicon Valley at Big Data SV, covering the Strata Conference, with all the news here inside theCUBE. We'll be right back with our next guest after the short break.
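Editor's sketch: the interview mentions a recommendation engine built on "very common cosine-distance vector approaches" that depends on fast random access. The snippet below is only an illustration of that general technique, not Aerospike's tool. The function names, the sample data, and the use of a plain in-memory dict in place of a real database are all assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=3):
    """Rank items the target user has not seen.

    Each other user's ratings are weighted by how similar that user's
    rating vector is to the target's. Every lookup here is a random read,
    which is why the speaker stresses IOPS for real-time use.
    """
    seen = ratings[target]
    scores = defaultdict(float)
    for user, items in ratings.items():
        if user == target:
            continue
        similarity = cosine_similarity(seen, items)
        if similarity <= 0:
            continue  # no overlap with the target user, so no signal
        for item, rating in items.items():
            if item not in seen:
                scores[item] += similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical sample data: user -> {item: rating}
ratings = {
    "alice": {"tent": 5, "stove": 4},
    "bob": {"tent": 5, "stove": 5, "lantern": 4},
    "carol": {"novel": 5, "lamp": 3},
}

print(recommend("alice", ratings))
```

In this toy data set only "bob" overlaps with "alice", so only his unseen item is suggested. A production system would read the rating vectors from a low-latency store at request time instead of holding them in a Python dict.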