Okay, we're back here live in Silicon Valley in the heart of Big Data Land. This is the Strata Conference in Santa Clara, California. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, founder of SiliconANGLE, joined by my co-host. I'm Dave Vellante of Wikibon.org. And this is our segment on in-memory, or extended memory. We just heard from one CEO about their in-memory grid architecture, and we mentioned on that segment a company that we introduced to our audience last fall, a company called Aerospike. Very interesting company, actually with a little twist on in-memory. And we're here with the founder and CTO, Brian Bulkowski. Brian, good friend of theCUBE. Welcome back. Good to see you again.

Hi Dave, thanks.

So I was joking, Brian, that somebody had said Hadoop was the new tape. And it's ironic: we're hearing all this stuff about in-memory, and you guys focused on the flash layer. The two worlds are coming together. In traditional BI, everybody's talking about real-time, in-line. What's your take on all this? What do you see going on? What's new since we last talked to you?

So since we last talked, the biggest change that's happened in flash and SSD is a dramatic shift in the last six months in price and density. All of our largest customers are doubling or quadrupling the size of each and every node. So instead of having, say, only 500 gigabytes of SSD or flash per node, they're all moving to one-terabyte, two-terabyte, even four-terabyte architectures. That's pretty life-changing for a lot of these guys, because they're going from 40-node clusters of our system, which is satisfying big advertising use cases, big real-time data, and switching over to just a couple of terabytes each. So 40-node clusters are turning into four-node clusters, still giving them hundreds of thousands of transactions a second, with full uptime.
And prices are now down around $1 to $1.50 per gigabyte, which compares to $30 per gigabyte for RAM. So it's a 30X price difference, and a big change in the density that you can have.

Yeah, I'm just pulling up an article that David Floyer initiated back in 2009. Essentially what he did was run different scenarios pricing flash relative to spinning disk. And it was very obvious, no matter which scenario you picked, that the high-end, high-spin-speed, high-performance disk drive, which is obviously an oxymoron, was doomed. And that's what's happening now, isn't it? I mean, we were just talking before: not only the prices, but the productivity impacts of putting data into memory and into flash are so overwhelming that the economics just become more attractive, to the point where it doesn't have to be less expensive from a cost-per-bit standpoint.

Well, it also doesn't have to be slower anymore. That's what's really exciting about where we are right now in terms of flash. The issue we had with flash before was that at 10,000 transactions per second, it was the bottleneck, and you had to stripe a whole bunch of them across each individual server. Essentially that day is over. What we have right now is the network back in its position as the bottleneck. And at the network level, it's hard to get past 100,000 or 200,000 transactions per second per server. Flash has blown past that in the last six months.

Now, the other thing I want-

So if you've got SSD and you've got in-memory, you might as well use flash, because the network is going to be the bottleneck in either case. Sure, in-memory is faster, but if the network's the bottleneck, save yourself a factor of 30.

Well, and if you have to reload the data into the, into the-

That's another issue, right? Absolutely. There are a lot of issues there, but it's counterintuitive to a lot of people, right?
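As a quick sanity check on those figures, here is a back-of-the-envelope sketch in Python, using the prices and node sizes quoted in the conversation above:

```python
import math

# The inputs are the figures from the conversation: ~$1/GB for flash at the
# low end versus ~$30/GB for RAM, and nodes growing from 500 GB to 4 TB.
FLASH_PER_GB = 1.00    # USD per gigabyte of SSD/flash (low end quoted)
RAM_PER_GB = 30.00     # USD per gigabyte of RAM (as quoted)

price_ratio = RAM_PER_GB / FLASH_PER_GB    # the "30X" difference

total_gb = 40 * 500                        # old cluster: 40 nodes x 500 GB each
nodes_at_4tb = math.ceil(total_gb / 4000)  # same data on dense 4 TB nodes

print(price_ratio, nodes_at_4tb)
```

This lands at a 30X price gap and roughly the order-of-magnitude node consolidation (40 nodes down to a handful) described in the interview.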
People assume, oh, well, it's obviously faster, because the specs are faster. And it is faster. But in-memory solves a particular set of problems, and rotational disk, if you've got to put together petabytes, you're on rotational disk. That's still-

Otherwise it's too expensive, right?

Otherwise it's too expensive, because it's 10 cents, a buck, and 30 bucks per gigabyte. At 10 cents, that's attractive, right? On the other hand, at $30 you say, well, it's 30 times the price, but I should get 30 times the speed. I'm not getting 30 times the speed. It's maybe 50% faster, maybe 25%. So yeah, we're big believers in flash and the increase in density we're now seeing. Very excited.

Brian, let me ask you a question. I want to drill down on something, but let's up-level a little bit for the audience first. We've had a lot of conversations over the past two days around Hadoop and the Hadoop distributions. We had some segments around SQL meets Hadoop, NoSQL versus structured data, unstructured data. So can you explain to the folks out there what real-time NoSQL means and why it's getting a lot of attention, and specifically connect it back to in-memory? Because I think that's an important concept to clarify.

Sure, John. The main architecture that we see among our customer base is that there's still what used to be called OLAP and OLTP, or real-time and warehouse, right? And warehouse has been taken over by Hadoop, it's been taken over by Vertica and column databases, Cassandra, all those kinds of things. That's where you generate your insight. But in order to actually make that work on a moment-by-moment basis, you actually have to do something. You have to respond to web requests. And Hadoop is never going to be on the front side actually responding to HTTP requests, except HBase maybe. And there's a bunch of guys coming in that direction, right? MapR has their NoSQL distribution.
That's their attempt to bring NoSQL in that direction. But we're over here at Aerospike, we're over here in real-time, trying to push out into analytics, to change the MapReduce semantics and say, hey, MapReduce isn't just batch. MapReduce may only take a millisecond. MapReduce can be done in 10 milliseconds. MapReduce is a programming technique, and MapReduce is becoming real-time as well.

People like MapReduce, people like the Java; they think they can whip up some MapReduce quickly. That's what we're hearing from developers. But explain the importance of real-time, and which specific applications you see your customers using that are real-time, because that's an elusive term: near real-time, in-line, as Dave points out, and Chris Lynch coined that term. So talk about some of the use cases around real-time, and debunk some of the myths about what's not really gettable right now in terms of performance.

Sure, so let's talk about threat detection. In threat detection, you've got all of these insights, you've got all these patterns, you've got all these signatures, right? And on a moment-by-moment basis, you need to track more and more data that says, hey, look, I've seen this pattern from this IP address, I've seen this particular signature, I've seen this particular payload. Well, that's got to be on the front side. That's your algorithm that's saying, hey, look. And the shift we're going through in the market right now is that what we used to have was extraordinarily fast key-value stores, and that's all you could do: fast key-value stores. Now, with the NoSQL crowd, folks like Aerospike, we're shifting over into, let's do a little MapReduce on that. Let's do a little more data. Let's touch 10, 15 rows. Let's touch 100 rows. With the changes in hardware, you can now do-

And hold on, why do they want to do that? Why do they want to do that quickly?
Sure. Let's say part of my threat pattern is: if I see this particular user coming from sets of IP addresses that are geographically dispersed, maybe it's a threat. So then I want to do a MapReduce over every IP address and say, hey, look, I've got this packet coming in, and I need to look at all the different things I've seen from these different IP addresses really, really fast. So that's, say, network threat detection. You can do that in transactions. You can do that in online gaming, in what we call matchmaking as a use case: the ability to find the best opponent for a game. That's also where you want to say, hey, I've only got 500 milliseconds, a guy's sitting there clicking on a button, right? And I still want to do analytics. I want to run a fairly complicated little matching algorithm and figure out who's the best opponent online right now, not in some batch job.

Yeah, one of the things happening at this show that's really not being talked about or amplified by the press, because it's kind of elusive and hard to get your arms or mind around, is this notion that, oh, the real-time, super-geeky stuff that you're mentioning is a very narrow use case. But in fact, in the keynotes today, they're saying, no, no, that's not the case. You mentioned gaming, you mentioned threat detection. We're breaking news on SiliconANGLE today around major security issues going on in the US. But that real-time operational sense is not just about known things. There are a lot of unknown things. So today, what people will talk about is, oh, no, no, operationally, data warehousing and business intelligence, you know what you're looking for, it's operational. What you're saying is there are some operational needs that are unknown, which require low latency. Is that kind of what you're getting at?

Yes, and use cases are really what drive our industry. That's what actually puts benefit into people's pockets. Find a problem, build a solution. Find a problem.
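The "real-time MapReduce" threat check described above can be sketched in a few lines. This is a toy illustration, not Aerospike's actual API: the event data and country lookup are made up, and a real system would fetch the user's recent rows from the key-value store before running the map and reduce steps.

```python
from functools import reduce

# Hypothetical recent login events for one user: (ip_address, country).
# In practice these rows would come from a fast store keyed by user ID.
recent_events = [
    ("203.0.113.7", "US"),
    ("198.51.100.2", "US"),
    ("192.0.2.44", "RU"),
    ("203.0.113.9", "BR"),
]

def map_event(event):
    # Map step: project each event down to its source country.
    _ip, country = event
    return {country}

def reduce_sets(a, b):
    # Reduce step: union the per-event sets into one set of countries.
    return a | b

def looks_dispersed(events, threshold=3):
    # A user seen from `threshold` or more distinct countries in a short
    # window is flagged as geographically dispersed.
    countries = reduce(reduce_sets, map(map_event, events), set())
    return len(countries) >= threshold

print(looks_dispersed(recent_events))  # three distinct countries -> flagged
```

The point of the interview's argument is that this whole map-and-reduce pass touches only a handful of rows, so it can run in milliseconds on the request path rather than in a batch job.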
But it also means folks who are trying to push the boundaries of what has ever been done before. Let me give you an example. I was talking to some folks a couple of weeks ago, and they realized, as a financial services company, that they were actually a media company. The analysis that they do on stocks and bonds and all of that stuff, they publish it. And if you go to a library, you can find a book with all of their analysis going back 100 years on the financial services industry. Well, guess what? Now media is real-time. They need to vend all of that information second by second. Their next frontier in being a financial services media company is having all of their data for all of these positions online, moment by moment. Well, heck, that's a big analytics problem, first of all. And real-time means getting answers into the hands of people millisecond by millisecond.

Okay, I've got to drill down on this because this is awesome. One piece of in-memory is a hardware issue: you've got flash, SSDs, memory; those are hardware issues. But one of the things that we've been banging hard on at Wikibon.org is software-led infrastructure. It is software at the end of the day. Intel is coming in with their distribution; they're essentially a software company, although they make chips. So talk about the role of software in particular in making all this happen, and some of the nuts and bolts under the covers.

Sure. So software-led infrastructure, for me, is all about flexibility. It's the next step, a maturation of a lot of the cloud technologies. It allows us to mix and match: hey, my router isn't just my physical router. It's really the next step beyond VLANs, et cetera. So what does that mean? It means that a database company like ourselves can flexibly and elastically add capacity when, well, maybe one of those servers is somewhere completely different. Maybe it's not on your local network. Maybe it's here, there, somewhere else.
That level of flexibility. So I really like software-led infrastructure, because it takes us out of all of us database guys saying "elastic" and everyone going, well, what the hell is elastic? And instead saying, hey, look, it's software-led. You can just mix and match stuff. It's good.

And that gives what for the customers out there?

Well, it gives them the ability to respond very quickly to different business conditions.

Dave, what are you finding on software-led infrastructure? Obviously we coined the term at Wikibon and led that analysis as groundbreaking. It takes the notion of old convergence, as defined by HP, and puts some coherency around the hype around software-defined data center, software-defined whatever, right? So what are you hearing on that? Do you agree with what he's saying?

Yeah, absolutely. I would add that database services become a new opportunity that's enabled. But I think what we're really prescribing is a whole new metadata management paradigm, right? We used to manage data on slow disks, and that's not the right way to do it. You can't do that in this near real-time world. So it's a whole new metadata management approach: today, metadata is locked inside whatever it is, some device or network device or switch or array, and you bring it into a place where you can access it and actually leverage it across the infrastructure. So that's something we see as having great potential, particularly to impact, as we were talking about before, Brian, corporate productivity. And that's really kind of what you guys are all about. We had Tapad on earlier. Was it this year? No, it was late last year. No, it was early this year. A fantastic story. Now, that's a story of ad serving, right? And of course everybody quotes Hammerbacher saying, tongue in cheek, that the greatest minds of his generation are figuring out how to get people to click on ads.
But we heard Edd Dumbill talk about other use cases that will be spawned from that, particularly, you know, behavioral science, behavioral economics. So I think the key for us is that software-led means new kinds of productivity impacts for the organization. And that's really what you guys are dealing with.

So let me ask, we have two minutes. I want to get to a quick point, because obviously we've gone to SAPPHIRE now three years in a row with theCUBE, covering SAP on the ground, and we had a chance to talk with the inventor of HANA and also Jim Snabe about big data. So there's kind of a world of in-memory guys; HANA was built way before the big Hadoop world hit, and they have a little bit of a different solution position. You guys have a unique in-memory database. How does in-memory work with this Hadoop culture, which is bottoms-up, software-based, you know, batch or whatever you want to call it? You've got Storm out there for real-time and streaming, and whatnot. How's the relationship between those two worlds? You mentioned MapReduce. How do things like HANA, because we're going to have HANA on next to talk about this, fit into that, or does it not fit in well? Are there architectural mismatches, opportunities? Does it matter?

So the first disappointment I have with HANA is that SAP has decided to create a single layer where other database players can't play. Everywhere else in the SAP infrastructure, you can plug in a different database. You can choose Oracle, you can choose Sybase, you can choose any of the big relational players, right? But in this very important area of new database technology, SAP is taking the position that, hey, we're going to have only one database available for our customers. And frankly, I think that's- The great thing about SAP, what has made everyone very excited about HANA, is how fast you can snap together solutions.
You've got your big tables, you've got all your business infrastructure, you go bam, bam, bam, and suddenly you're doing in-memory analytics. And that's great. But there are more databases than just HANA out there. There should be a plethora, and an ability to bring technologies like ours, flash-based specialists, et cetera, into that infrastructure, because there are a lot of great database companies out there. And I think SAP is really losing out by betting on only one horse.

Do you think they could bring in other databases in the future? Do you think that with their current architecture and roadmap they could open that up a little bit?

I think they could. I think that what they signaled in the last month, by making HANA essentially free with all SAP licenses, is an unwillingness to do that. And I hope that signal was not a sign of the future, because in-memory databases are here to stay; they are an entire market category. And they're distributed resources, so you can't really have a monolithic solution. Yes, you should have a variety of things. HANA is better at some things, and we can talk about the technology of HANA and in-memory versus flash, but I think SAP customers deserve a choice.

Well, we're going to have them on next on this in-memory segment, but the final, parting question is: give us an update. You guys have had a good whirlwind. You've rolled out some really big use cases, big deployments in real time. What's going on with the company? You guys are going mach 100. Give us an update on Aerospike.

So our biggest push right now is user-defined functions. We're taking a page out of some of the other in-memory specialists and bringing in the power of user-defined functions, which essentially allow you to customize your database. By being able to run part of your query essentially in-database, you can do things like MapReduce on top of it.
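The user-defined-function idea described here, running part of the query next to the data instead of pulling every record over the network, can be sketched with a toy in-process store. The `TinyStore` class and its methods are invented purely for illustration; they are not the Aerospike API.

```python
class TinyStore:
    """A toy key-value store that lets callers register functions to run
    'server-side' against records, so only the result crosses the wire."""

    def __init__(self):
        self._records = {}
        self._udfs = {}

    def put(self, key, record):
        self._records[key] = record

    def register_udf(self, name, fn):
        # In a real system this would ship the function to every node.
        self._udfs[name] = fn

    def apply_udf(self, name, key):
        # Run the registered function against one record, in-database.
        return self._udfs[name](self._records[key])

store = TinyStore()
store.put("user:42", {"bids": [1.0, 2.5, 0.5]})
store.register_udf("max_bid", lambda rec: max(rec["bids"]))

result = store.apply_udf("max_bid", "user:42")  # only 2.5 comes back, not the record
print(result)
```

With a function like this running per-record on each node, a map step becomes "apply the UDF to every key" and the client only has to reduce the small results, which is the MapReduce-on-top-of-UDFs direction described in the interview.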
There are other folks out there, like VoltDB, who have specialized in that purely in-memory. We're bringing user-defined functions and that kind of customization into the world of flash and SSD. We've got that out in beta with a number of customers: gaming customers, online retailers doing matchmaking, folks like that. So it's a very interesting new direction for us.

As we were saying yesterday, depending on your world view and how the world spins, there are a lot of different beachheads for everyone. Whether it's HANA or in-memory, will they all converge, or will they all stay separate? Does it matter? We're going to be deconstructing that here inside theCUBE on day three. I'm John Furrier. We'll be back with our next guest after a short break, and we'll hear from SAP directly on some of those questions. We'll be right back.