I'm George Gilbert, and we're on the ground at Spark Summit 2015. We're here with Martin Van Ryswyk, who's from DataStax. DataStax was one of the pioneering NoSQL databases, and it has figured very prominently in the Spark ecosystem as the operational database, and perhaps sometimes even the analytic database, in the foundation. Tell us where customers started pulling you in and how the component parts now, you know, Kafka, Spark, Cassandra, fit together in common use cases.

Yeah, great. So Cassandra is a high-throughput, high-availability distributed database, right? It's more for online transaction processing than for OLAP, right, which is traditionally where the Spark ecosystem has been. So what we had customers say was, "We don't want to ETL our data out of that online system and then do Spark analytics on it somewhere else. We need the answer so fast, we want to do it in place." So we had a lot of customers say, "We're moving that analytics further up, closer to when the data is ingested, so please integrate Spark," which we've done. And then they operationalize it back into the same database. It's in the same database. So a lot of times it's not a human doing the analytics or looking at the results. It's a machine. So it's running queries at millisecond levels, not minutes or 10 minutes, using that result, and piping it right back in. So streams of Internet of Things data, fraud detection, recommendation engines, things like that, where you have to analyze, but on streaming data, in real time.

Okay. So just to see the topology: are the streams of data landing in Cassandra, and then you're pulling them out with, let's say, Spark SQL and analyzing them? Or are they coming in through Spark Streaming, doing the machine learning, and then applying it to Cassandra?

Actually, both. We have customers doing both.
So we've integrated Spark Streaming, we've integrated MLlib. We have customers doing the streaming use case, where they're using Cassandra to persist the data after it's come out of Spark Streaming, and we have customers where the data is ingested first into Cassandra through some app or interface using standard CQL, which is the interface for Cassandra, and then they do the analytics after the fact, but after the fact is like seconds or minutes.

Interesting.

Yeah.

Okay. HBase is sort of known as the database for the Hadoop ecosystem, but as the ecosystem sort of shifts on its axis a little bit more towards Spark, where does it become more difficult for HBase to fit into that sort of pipeline, and how much easier is it then for Cassandra?

Yeah. I don't really want to get into where HBase might be going or where it's strong. I can just say that for Cassandra, the strength is customers who have real-time apps and who need availability even with whole data centers going down, because it's that critical of an app. It's not a batch job thing. It must be up all the time, 99.9% of the time. That, and high performance that can scale to thousands of nodes, is why they choose Cassandra. And once they've done that, then it's kind of clear why you would use one database over another, if those were the things that you needed. Then it's, "Do I want to do analytics on it?" We've got the Spark integration to help you. That's kind of the calculus most of our customers go through. They've picked the store already for its abilities, and they all have pros and cons, and then once they've chosen Cassandra, it's, "Now I want to do Spark."

Okay, interesting. We're going to have to save that to drill down more at a future date. This is George Gilbert on the ground at Spark Summit 2015.