This is Dave Vellante, I'm with Jeff Kelly, this is theCUBE, and we're live at Splunk .conf2013. Last night, big party, big customer party. About 2,000 people here, just under 2,000 people. Excellent event. Matt Pfeil is here. He's a co-founder of DataStax, the smokin' hot NoSQL database company. Matt, welcome back to theCUBE.
Thank you for having me again, love hanging out with you guys.
Yeah, thank you. So tell us, what's going on here at Splunk .conf? What's your relationship with Splunk, and then we'll get into what's new with DataStax.
So on the conference side, it's amazing how many people are here. I think they grew it 50% year over year, which is ridiculous, and it looks like they're going to need a new venue if they're going to keep this up. I mean, I think we're at the bursting-at-the-seams point. In terms of Splunk, DataStax is the commercial entity behind the Apache Cassandra project. We do a lot of products and services for that, and we've got a number of customers who are using something called Cassandra Connect with Splunk so that they can, from Splunk, look at all the machine data that's in Splunk and cross-reference it against the stuff that's in Cassandra to have one complete view of their world.
Okay, so it wasn't clear from some of our web searches what the nature of the relationship is between you guys. Do you have a formal relationship, or is it more that you're sort of doing stuff together in the field?
No, we're formal partners. We're doing stuff in the field, all of the above. Godfrey and Billy, the CEOs of the two companies, are really good friends and they're actually talking on a panel. I think Godfrey was in the crowd actively participating yesterday as opposed to on the actual panel. That's their birds of a feather.
So what's new with DataStax? We've covered you guys and Cassandra for quite some time now and you guys are doing great in the marketplace. What's the update?
So on the product side, we just continue to expand the Cassandra community. We've got thousands of deployments now in production where businesses are actually running their company on top of Cassandra. On the DataStax product side, our latest release offered security inside of our main offering, which is DataStax Enterprise. In that offering, you've got integration between Cassandra for your online needs and Hadoop for analytical workloads, in addition to Solr for search across that data set. And the latest release brought security into the mix to really satisfy the needs of Fortune 500s who have strict compliance requirements.
Yes, sort of bringing that enterprise capability to big data and the like. People talk about that a lot. And Jeff Kelly, you and I have talked about this. We talked yesterday, actually. One of the analysts that's here at the show, Peter Goldmacher, was at the Oracle event, Jeff. And you were telling me he mentioned that he asked a question of Mark Hurd: what's your biggest competitive threat? And his answer was, not surprisingly (I think I even said to you it's going to be internal execution), sure enough, that was it. So evidently, you're still under the radar. It's a good place to be.
Yeah, I mean, on one side, we're definitely an up-and-comer in that respect. But on the flip side, Oracle has an offering called NoSQL. Now, I don't know how many people are using it, but they've identified the space and named a product to try to compete in it. So there's something going on there. And as the 800-pound gorilla, I doubt they're going to come out and say, oh yeah, our biggest fear is a startup out of San Mateo.
Well, that's their strategy, or their playbook, right? It's to come in late to market and act like you invented it. And it works, right? So good for them.
Yeah, absolutely.
Matt, talk a little bit about how you differentiate from some of the other players in big data.
This is kind of a basic question, but I think it's important, because sometimes DataStax gets kind of lumped into that group with some of the Hadoop players. Obviously, what you do is different. I mean, there's a Hadoop component to the DataStax Enterprise product. But really, you guys are about online transactional processing, powering web applications. Tell us a little bit more about how you differentiate from those kinds of Hadoop workloads, which I think are generally more analytic in nature.
That's a really good point. The worst thing about big data is that it's basically everything in this day and age that's anything in terms of a data management system. But if you really dive into it, there are sort of two different types of workloads. There's your online system and your offline system. The offline system in the traditional world is the data warehouse. We're seeing that evolve into what Hadoop's ecosystem is today. The online is your traditional relational database, and sort of the up-and-coming aspect of that in today's world is the NoSQL space. We play in that online space. In other words, we are the system that is actively running the business, where results are measured in milliseconds, not minutes or hours. And if it goes down, the business simply can't do business.
So give us some examples. What are some core applications that people are running on top of DataStax that are powering the business, that are critical and can't go down?
So a great example is we recently held the Cassandra Summit 2013 over the summer in San Francisco, and one of the talks given there was by Intuit, the guys behind various tax options. They're moving, and are actually now running, their entire data infrastructure for doing all of their tax workload on top of Cassandra at this point. And that's obviously a multi-billion-dollar-a-year business. If that goes down, especially during tax season, not so good for the rest of us.
Absolutely. So again, and we brought up Oracle earlier.
Talk a little bit about what you're seeing in terms of trends in your customer base. We hear a little bit about people looking to NoSQL options when traditional databases, not just Oracle, hit performance limits, when performance starts to degrade because of either the type of data, the size of the data, et cetera. But also there's the price component as well. I mean, Oracle is not known for being inexpensive. So what are you seeing in terms of people using the traditional databases coming to you? What are the main concerns they're coming to you to address?
Let's tackle that in a couple of parts. First and foremost, we're in this data age, and Gartner's got the three Vs: volume, variety and velocity. Traditional relational systems were never built with that in mind. Datasets 20 years ago were just simply smaller than today. As for the challenge of those three Vs as they come into data management systems, you can overcome it, but it usually requires a lot of complexity, and complexity inevitably leads to downtime, because when something goes wrong in a complex environment, it's usually not easy to fix it quickly. So from the ground up, Cassandra was built to handle various aspects of data, whether it is volume, variety, or velocity of data coming in and out of the system, but maintaining uptime as the number one concern. In the online world, uptime is all that matters. Performance is second, and features are a nice gravy after that. But if you're not up and running, the system's down and the business is down. To your point around cost specifically, I think one of the really interesting things is that the only reason this big data movement is possible is that it's economically feasible for anyone to store practically unlimited data. For example, if I go online today, I can probably buy a three-terabyte SATA hard drive for 100 bucks. 20 years ago, three terabytes would have cost a fortune.
But if you're going to use commodity hardware to make it economically feasible to do all these things with data, you've got to have a system that is resilient to failure, because commodity hardware inevitably fails. As a result, you see a huge cost savings just on the hardware alone in this day and age compared to buying really expensive SANs, et cetera. You can accomplish the same thing by utilizing something like Cassandra on top of commodity hardware. And I think one of the better examples of that is Netflix, who's running their entire business on the Amazon cloud, which is not exactly the most resilient system on the face of the planet.
Very true. Okay, so now let's dive a little bit deeper into NoSQL. There are different flavors of NoSQL out there. You've got competition from companies like MongoDB and others. What is Cassandra and DataStax's sweet spot in that kind of continuum of NoSQL databases and different flavors out there?
Mission-critical environments. We've got a reputation for staying up and running no matter what. You can lose a machine. You can lose a rack. You can lose an entire data center. The system will keep on chugging along. We've got deployments that often start as small as three and five nodes, but we've got ones that are running thousands at this point. And so we're really known for maintaining uptime and performance no matter what comes at you in terms of the challenges of some of the largest companies on the face of the planet.
And now I just want to pivot a little bit back to the relational world. And let's give Oracle their due. I mean, clearly they're a very successful company, and the database is powering a lot of applications and a lot of core business processes at enterprises across the world. So when is a relational database appropriate and when is NoSQL appropriate? In other words, what are some of the things that maybe Cassandra doesn't do great?
So I think that in this day and age, when you've got a data set that always has to be available, even if it's a small amount of data, Cassandra's actually better at that than a relational technology, and that's because of its core architecture. It's a peer-to-peer one as opposed to master-slave. So there are some areas where we have customers on 20 gigs of data, which would honestly fit on my iPhone at this point, but who actually can have a better experience running out of Cassandra compared to a relational system. With that said, I won't say that we should always put everything on that. There are scenarios, for example, sometimes financial transaction type things, that it actually makes sense to put on just a single relational database, because it's more feature-rich when we're doing things with that small amount of data where uptime might not be the primary concern.
So that kind of back office, some of your financials, things like that, where the Oracle database has been powering those things for years? Is that where, okay, that's still a good fit, and the online, some of the customer-facing applications, is where NoSQL makes more sense?
I think you nailed it. Netflix has moved all of their stuff over to Cassandra with, I think, the exception of their HR systems on the back end, and you know what, that just doesn't make sense for us. You could do it, of course, but it's a square peg in a round hole. It just might not be the smartest thing to do.
Matt, why did you and your colleagues start DataStax?
So Jonathan and I were at Rackspace, and we were actually both working on Cassandra from different angles. He was hired to build a distributed database, potentially for product purposes one day, on their cloud team. I was in their cloud apps division, and I saw that there were 20 or so development teams who were all spending more time scaling their databases than actually working on customer-facing features.
So I was working on an internal service where the internal teams would be our customer, and we would focus on scaling the database, provided as an internal service to those teams, so they could focus on core customer-facing features. As our paths crossed, he actually took me out to lunch one day to tell me that he wasn't going to stay at Rackspace much longer because he wanted to do a startup. I got into Rackspace through an acquisition of a previous company, and while I was trying to talk him out of leaving the company, because I spent about half my time recruiting, he convinced me to leave and go with him.
Okay, so that's really how you guys got together, but why did you choose what you chose, and what was the early defining mission of DataStax?
So the really cool thing about open source is that there were lots of companies using Cassandra out in the wild, and they were basically saying, look, we see the real strength and the benefit of using this in our company. We really need a commercial entity behind this so that we can trust our business on it and have some input on where product features are going in the future. We need someone that we can call whenever things are going wrong and we need support, so we can make sure that we're keeping the business up and running. So the nice thing was that literally the first day out the door we had customers, and that's lucky. That's the best way to say that.
You had an established install base. So you guys saw that opportunity. You were users of the database and felt like you could add real value there to the community. The other thing I wanted to ask you is, you feel like sometimes the Hadoop NoSQL open source tail is wagging the data dog. Yet at the same time you see all these statistics on how much of the data is unstructured or multi-structured and how fast that's growing relative to structured data. Do you see that flipping? Do you see that equation flipping?
The other big question is, is the oligopoly, or the cartel as we call it sometimes, just going to grab up all the innovators and subsume them and try to keep the status quo? What are your thoughts on how that plays out going forward?
So I think one of the things we tell people who are just getting involved in the big data space is that there's a little bit of a learning curve, and it's not so much that you're learning something new, it's that you're unlearning the restrictions of the past world. You know, you could only store a certain amount of data, there had to be structure to it. And it's a little bit like the red pill, blue pill scene out of The Matrix, where once you get around the corner and you realize there are no more restrictions on what you do with your data, there's sort of no going back. And honestly, having unlimited access to data no matter what size, shape or form it is, is actually better for the end customer. So I don't see how that's ever going to go away if we can innovate to make sure that the customer experience and user experience just constantly improve.
So the premise there, then, is that the innovation and the disruption and the value add, the business value that's going to be created by that dynamic, will ultimately become the model of the future, whether it's a startup like yours that becomes prominent, does an IPO and takes over the world, or a large company subsumes that. Ultimately that's going to be the model of the future, you believe.
Yeah, in the very near future, because technology is just evolving at a more rapid rate than ever before. I don't even think it's going to be the outlier thing. It's going to be the norm, and people will look back and say, why did they ever do business that way?
Do you think that there is, I mean, it's the software industry, right? If you go back to 1990 and had to predict who was going to be the leader in ERP, you wouldn't have predicted SAP, right? So it's very hard to predict, although things are different now. They've changed. As I said, I referred to the oligopoly. The industry is, you know, more stable now than it was back then. Do you think there's room for a new billion-dollar software company in this space to really come in and thrive and survive and remain independent for a long period of time?
Absolutely, and I think what you're actually going to see is a collection of billion-dollar companies coming out of this big data race in the coming years. As you start to see some of the Hadoop players and the NoSQL players gear up for those IPOs, there's going to be a collection of them.
Do you think there'll be a Red Hat of Hadoop?
I do, and I think that there might even be a couple.
You think those companies will be independent?
That's a big question. I think they'll be independent for some part. It'll be curious to see what the guys with the deep pockets do.
Because you wonder, right? Because, you know, the industry kind of let Red Hat get away with it.
Yeah.
And now you see guys like, you know, EMC buying Greenplum. Whether it's Microsoft making relationships or Oracle, I think they're wiser to the potential of that market value that's been created there. So you have savvier CEOs and boards these days.
They know that there's a general flavor right now where a lot of these players in the Hadoop and NoSQL space are open source, and that's really attractive to most of the end users of the system. So I don't think that the guys with money are going to let these very successful open source players sit on the sidelines for long, even if they go public for a while.
But at the same time, their heft doesn't necessarily ensure their success, right? So it's a really interesting dynamic. I mean, the passion of the startup and the agility of a startup, to a point, seem to have an advantage.
I completely agree with that.
All right, Matt, we'll give you the last word before we've got to go. What thoughts do you want to leave with our audience on this whole space, DataStax, Splunk .conf?
We're just really grateful to be here, and seeing this conference grow so much year over year is, I think, a huge plus for the big data community all in, because you see more and more people understanding that this is the new norm as opposed to just an outlier, and the numbers on every aspect show that, from the financials to the number of people doing it.
Matt Pfeil, thanks very much for stopping by theCUBE. It's great watching you guys build an awesome company, and I really appreciate your time.
Thank you.
Keep it right there, everybody. I'll be back with Jeff Kelly right after this. We're live, this is theCUBE at Splunk .conf.