Okay, we're back here live at the Hadoop Summit. We're in Silicon Valley, the heart of Silicon Valley. It's the San Jose Convention Center. This is where Hadoop Summit's going down. Big-time conference on big data. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier with my co-host.

I'm Dave Vellante at wikibon.org. Brian Bulkowski is here. Brian is the CTO and co-founder of Aerospike, longtime CUBE guest. Brian, welcome back to theCUBE.

Hi, David.

I say long time, actually. It's not really been that long, right? Last year at Oracle OpenWorld. But we've had a lot of great interactions and discussions with you since then. We snooped out some of your customers and talked to those guys, and you guys are really making an impact in the marketplace. Give us the update. How's it going?

So, a lot more customers. Everyone told us that advertising was a small market. My belief now is that if you're going to make money, it's either with advertising or transactions. Transactions have fraud issues, and people are trying to bring more and more intelligence into real-time streaming, real-time fraud analytics. Same thing with advertising: you need more analytics, more power there, and more data about users. So that really covers the entire way to make money in the modern world. It's a great market for us.

Yeah, and we talked at Oracle OpenWorld about how you guys are leveraging flash in a unique way. You guys were early on in that trend, and everybody's now hopping on the bandwagon and starting to see the economic benefits of flash. So what are you seeing in terms of your ability to leverage that? Do you still have a lead in the marketplace, or are others catching up? Talk about that a little bit.

Well, the great thing about flash, Dave, is that flash is getting faster and less expensive every month. My favorite device right now is the Micron P320H.
It's beating every other device hands down, and we get lots of great devices in. That device is seven times faster than Fusion-io at the same price. And I'm not saying that because I'm plugging them or I have stock. I just want people to use the fastest flash. What Aerospike is able to do is drive that flash to its limit. So where another database is going to be bottlenecked on CPU or bottlenecked on I/O, we can drive it all the way to the 700,000 transactions per second that one of these devices can do. And that's really something.

Yeah, so we just heard HStreaming was on here. We were talking about streaming technology, talking about Storm a little bit. Talk about how you participate in that world. Is it complementary? Is it competitive? Let's discuss that a little bit.

I completely believe that streaming is the next Hadoop. It's a complementary technology to Hadoop, and it will be the next metaphor that drives big data forward. We have to act in real time. That's where big data has to go. And with streaming, why store it? Some data is coming past and you just analyze it on the fly. We believe in Storm as the correct metaphor and the correct platform to do that. Storm is simply a processing system and an API where messages flow through and you can add your own analytics. Well, that's great for Aerospike, because Aerospike can be the persistence layer, and the only persistence layer that's fast enough to keep up with Storm. Storm can do a million messages per second per server, and that's how fast Aerospike can go. So the Storm guys are going, "Well, it has to be in-memory computing." In-memory computing with flash, with Aerospike: we believe it's a perfect fit. So we've released a Storm connector and example Storm code on GitHub. Aerospike Storm is the name of the repo, and we have found two customers, one in China and one, Federated Media, here in the States, that are already using Storm with Aerospike to get very fast real-time analytics for advertising.
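As a rough illustration of the pattern described above (messages flow through a processing step, and each step updates a fast persistence layer), here is a minimal Python sketch. Everything in it is an assumption made for illustration: `KeyValueStore`, `EventCounterBolt`, and `run_stream` are hypothetical names, the store is a plain in-memory dictionary, and none of this is the Storm API or the Aerospike client.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a fast key-value persistence layer."""

    def __init__(self):
        self._records = {}

    def increment(self, key, field, amount=1):
        # Create the record on first use, then bump the named counter.
        record = self._records.setdefault(key, {})
        record[field] = record.get(field, 0) + amount
        return record[field]

    def get(self, key):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._records.get(key, {}))


class EventCounterBolt:
    """Hypothetical bolt-like step: one store update per incoming message."""

    def __init__(self, store):
        self.store = store

    def execute(self, message):
        # Count each event type per user as the message passes through.
        return self.store.increment(message["user_id"], message["event"])


def run_stream(bolt, messages):
    """Feed every message in the stream through the bolt, in order."""
    for message in messages:
        bolt.execute(message)
```

The point of the sketch is only that the processing step makes one store call per message, so the store's throughput has to match the stream's.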
Okay, so I was going to ask, what are they doing with it? That's the predominant use case, right?

Yeah.

So talk about the messaging and processing side, because that's obviously MapReduce. There have been a lot of benefits there. So when messages are dropped or lost, that's a big problem, right?

Absolutely.

So it's all about the storage angle there, because we had Amr on earlier, and HDFS and some other alternatives are out there. But mainly, how do you guys fit into that? Because the storage is critical, right? Because if you lose messages, then what do you do?

So the way it works with Storm is there are two dominant processing methods. One is at-least-once delivery, where it's okay if you replay messages, and Storm does that itself. Or you can add on top the Trident overlay, which is another project by the Storm guys, by Nathan. And with Trident, you need a storage device underneath it that can keep track of what message has been at what phase. So you need to build a Trident plug-in that allows very, very high-speed access for the replay and transaction tracking system. That's a perfect spot for Aerospike. The other way to do it is to do a lot more processing with Kafka. Kafka is a natural add-on to Storm, and that's a reliable system using HDFS. We really believe in Trident. We think it's a lightweight, better processing model, although the market will decide eventually.

So basically you're going to get a lift off this because you have the persistence store, right?

That's the key. And the only one running at that speed.

Only one running?

The only database that runs at Storm speeds. So when you need to do a million, 10 million messages a second, we're the right database for that.

So you won't be the bottleneck in that configuration?

That's right. A lot of the Storm guys currently believe, "Hey, if you're in a bolt, you can't make a database call. That's insane. Storm's too fast." And the answer is Aerospike. In-memory processing with flash.
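To make the replay and transaction-tracking idea concrete, here is a minimal Python sketch of one common way to get exactly-once counts on top of at-least-once delivery: store the transaction id of the last applied batch next to each value, and skip the update when a replayed batch carries the same id. This is in the spirit of what is described for Trident, but it is a sketch under stated assumptions, not Trident's implementation. `InMemoryStore` and `apply_batch` are hypothetical names, and the sketch assumes a replayed batch always contains the same messages under the same transaction id.

```python
from collections import Counter


class InMemoryStore:
    """Hypothetical stand-in for the key-value store underneath the stream."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


def apply_batch(store, txid, batch):
    """Add one count per key occurrence in `batch`, at most once per txid.

    Each key is stored as (last_txid, count). If the stored txid equals the
    incoming one, this batch was already applied to that key, so it is skipped.
    Returns the number of keys actually updated.
    """
    updated = 0
    # Aggregate first so each key is written exactly once per batch.
    for key, increment in Counter(batch).items():
        record = store.get(key)
        last_txid, count = record if record is not None else (None, 0)
        if last_txid == txid:
            continue  # replayed batch: this key already holds the update
        store.put(key, (txid, count + increment))
        updated += 1
    return updated
```

Replaying a batch after a failure then leaves the counts unchanged, which is why the store has to answer a read and a write per key at stream speed.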
Yeah, I mean, we always thought that in-memory was key, but now this highlights it, because what it does is highlight what everyone else can't do. Who's the closest second place to this? What other options are out there?

Well, there are a lot of companies, like the fellow who was on earlier, and the IBM Streams guys and the StreamBase guys. Were they acquired this week? I forget. So anyway, there are companies that are building ways that you can have streaming queries. And that's really a different world, because what Hadoop has taught us is: write code, and have that code do the accessing. You don't have to do everything with queries. Sure, some people want to do queries, and that's great. But Storm is all about writing code that is executed, as opposed to having to write queries. So there are a lot of other folks, like the Streams people and StreamBase, that are writing streaming queries. We believe you can just write code, just like you do with Hadoop, and that'll give you a higher level of parallelism.

So we were just at the Velocity Conference, where you guys were there. You weren't on, but your co-founder was. Again, that's the monitoring side. So moving up the stack, what are some of the things that you see technically? And then from a solution standpoint, obviously advertising is great, and with transactions you're talking about banks, talking about high volume. What's on top of the stack? Let's think developer now. How do you guys see yourselves crossing over to the fatter part of the market in terms of the transactional side? Is there anything that you're seeing directly that you're looking at?

Absolutely. What we're seeing is that with Web 2.0 and analytics, being here in Silicon Valley, everyone's already on top of big data. There are companies like BlueKai and eXelate that can buy and sell massive quantities of data and give companies data sources they've never seen before, which is a very important thing. But there's still the rest of the market.
So let's take, for example, merchant risk, which is part of the fraud processing system. Those guys have not fully used the technologies of big data. They're still running queries against Microsoft SQL systems. They have not figured out how, in real time, to make use of the data that's now available and the technologies available. So the financial services guys are always on top of things, they're leading edge, but there are so many new industries where, first of all, they have to get more data than their own individual data. They need to buy and sell at a greater velocity to get access to big data. What we call multi-channel data, because it's not just about your data, it's about the social data that's out there, your customers' social data, when that's available. So for those guys, it's the ability to access that to make sure you are really buying a particular TV and you don't get annoyed by a little pop-up asking you for extra validation.

So that's the key: we're living in a feed world. I mean, an API-based, service-oriented architecture. I've got to ask you this question, because that kind of brings it up. You've been around the block. We've seen SOA, this is going back to the 2000s. I mean, web services promised a slew of tech that just now is starting to hit the mainstream. There's also message-oriented middleware. It's the return of that as well. We have a lot of returns of things. You know, it's like the web: when the bubble burst, everyone was complaining, oh yeah, but all the stuff that people actually wanted to do happened. Right, so what of the old is popping its head out now that's relevant? SOA, or service-oriented architecture, is one. We've seen that with API-based data centers. What other things can you share from your personal perspective that you think are relevant right now?

Well, we are seeing master data management, MDM. We're seeing that sort of continue to struggle on.
And that's the ability to have a centralized data source to bring together and still work with all of your legacy data sources. So as people are adding in Hadoop, it's not like they're phasing out their old systems. So now you've got specialized databases like Hadoop or a Vertica cluster or something like that, or an Aerospike analytics cluster, where you can start doing your processing. How do you know which database to use? So that's one area where, at the application level, I think we're going to have a little more of the federated query architectures and MDM systems that'll allow us to do processing over this diverse set of query engines.

Okay, Brian, thanks for coming inside theCUBE. Always great to have you on theCUBE because, one, you're so plugged in, but really highlighting some of the things like Storm that we're seeing, and other things like Spark are out there. What's your take on that, too?

Great little database.

And what do you think about Amr Awadallah's comment that HDFS is being fragmented when putting storage in Red Hat, for instance?

Well, I actually think one of the benefits of the Hadoop architecture is that it can take different storage plugins. That's a good thing. That shows that there will be places for vendors, people like MapR, to come in and innovate. If Hadoop were a static architecture where you couldn't plug in new architectures, where companies could not make money underneath, that would be a sign of not as good an architecture. So I think it's good, and yeah, you get a little fragmentation. So what? We've got that in Android, too. And I love my Android phone, and it's a great ecosystem.

What he's saying is that if everything's put in HDFS, then that's better for the platform. That's his argument. Hadoop is not just MapReduce; it's HDFS and MapReduce. Is it a legitimate argument or?
Yeah, I think that HDFS does need to be the core system, because that interface, and being able to swap in that interface, is what's important. But having plugins for compatibility and certain things like POSIX compliance, which Red Hat has to have. Absolutely.

It's kind of, okay, cool. All right, we've got a little signal there from the comment here. I had someone on Twitter, Dave, commenting already that we're confused on that comment. So it's always good to get a clarification. Brian, thanks for coming inside theCUBE. Co-founder of Aerospike, a hot company. Again, timing's everything in this world. You guys have been banging on some good tech, in the right place, right now. Streaming, Storm, hits right to your sweet spot: in-memory, persistent store. Really, really nice element. Congratulations, and we'll see how that plays out. This is theCUBE. We'll be right back with our next guest after this short break. I'm John Furrier with Dave Vellante. We'll be right back.