Dave Vellante: Okay, we're back. This is Dave Vellante, and this is day three; we're winding down here. It's been three great days, jam-packed: executives, practitioners, bloggers, opinion makers. Karthik Kannan — Kannan is how you pronounce your last name, right? — is the head of marketing and products at Cetas. Well, now at VMware.

Karthik Kannan: Now at VMware.

Dave Vellante: Cetas was a very quiet, quasi-stealth analytics company that exploded onto the scene recently with the VMware acquisition. We had heard about them before the acquisition and were squinting through what was going on there, and then of course came the acquisition. In our opinion, it really started VMware's big data play. It's been about a year in the making from our perspective, and we felt there was a big hole in their strategy with regard to big data. You guys are, in part anyway, designed to fill that, and I know there's much more in the roadmap, but maybe you could take us through the beginning. You started Cetas with some amazing people involved, some really big players — I know Bechtolsheim, Dan Warmenhoven, my friend TM Ravi, great guy — some really sophisticated early supporters. But talk about the early days and where we've come from.

Karthik Kannan: Yep, well, I will. First of all, VMware does have an excellent portfolio in the big, fast, flexible big data space, but we'll come back to that later. Going back to Cetas: we started the company in 2010, and when we started off, we were thinking about the data warehouse and BI problem. Going back 20 years, I started off as a data warehouse programmer, and I look back and say nothing has changed other than things speeding up, getting bigger and faster. Really it's the same SQL, the same schemas, the same everything.
So what we thought was: there is all of this big data, and there must be an easier way to mine it. We started talking to several companies, initially in the online space — e-commerce companies, gaming companies, advertising — and the common problem across all of them was not getting insights deep enough, instantly, because they're all about monetizing user behavior. If you have to wait days and weeks to get your insights, you really are not monetizing.

Dave Vellante: You're looking in the rear view mirror.

Karthik Kannan: You really are.

Dave Vellante: "So hey, look what happened."

Karthik Kannan: Exactly.

Dave Vellante: "Boy, I wish I knew that right at that moment."

Karthik Kannan: Exactly right. So it boils down to being just a game of reporting, and reporting is not enough. We founded the company on the premise of big data, but being able to provide insights into the big data easily — that's premise number one. Premise number two: we didn't really want to cater to the DBAs and the IT folks, because the business users in these companies really want information at their fingertips, and they want to be able to make decisions quickly. They don't want chains of people to go through. So premise number two was: how do we empower the business user to get all of these analytics? With those two premises we started up Cetas, and we are continuing in that direction under VMware as well.

Dave Vellante: You touched on a couple of points I'd like to come back to. The first being big data insights, fast. You're absolutely right — it's almost like business intelligence and the whole enterprise data warehousing business was kind of a fail. Not that it didn't create some value — it certainly did — but it failed to deliver the vision that was put forth by all the BI vendors early on. It became rear-view-mirror thinking. And I think the second point is the business users.
So my question is: in your view, what is different about this big data trend, generally and specifically with what Cetas and VMware are doing, that will live up to that promise this time around?

Karthik Kannan: I think the most important thing we have focused on — and it is the answer to your question — is that we are all about delivering real-time analytics as well as batch analytics. The two have to come in combination to deliver a full analytics suite to the end user. What I mean by that is, when most people connect their application up for analytics, they want certain things out of the box — the number of unique users, the number of players who played the game today, things like that. But there is also a ton that is specific to their business that we as a vendor would never know, and we have to support that like a platform: let you define your own custom functions and yet respond instantaneously with the right answers. So it's a combination of real-time analytics out of the box and batch analytics on what is specific to you. Behind the scenes we use technologies like Hadoop, and the good part about what we do is that we productize that whole stack. You don't have to be a MapReduce programmer. You don't have to have 12 data scientists sitting and cranking through the data. We've made that complete layer opaque, so you don't have to divert your developers' attention to becoming Hadoop experts and running your data. Let them do what they do best — develop the game or the e-commerce site — and we'll productize everything on top of it. I think those are the answers that will make this really real for the end user, as opposed to just being a big data fad.

Dave Vellante: So again, I want to make sure I understood what you're saying, Karthik. Bringing together batch and real-time big data analytics.
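The batch-plus-real-time combination Karthik describes can be sketched roughly as follows: a precomputed batch result (say, from a nightly Hadoop job) covers events up to a cutoff, a real-time layer counts everything after it, and a query merges the two. This is a minimal illustrative sketch — the class, method, and metric names are assumptions for this example, not Cetas APIs.

```python
from collections import defaultdict

class AnalyticsStore:
    """Toy sketch of serving combined batch + real-time metrics.

    Batch counts (e.g. from a nightly Hadoop job) cover events up to a
    cutoff timestamp; the real-time layer counts everything after it.
    A query merges the two, so the caller sees one up-to-date number.
    """

    def __init__(self, batch_counts, batch_cutoff):
        self.batch_counts = batch_counts          # metric -> count, precomputed in batch
        self.batch_cutoff = batch_cutoff          # timestamp the batch job covered
        self.realtime_counts = defaultdict(int)   # metric -> count since the cutoff

    def ingest(self, metric, timestamp):
        # Only events newer than the batch cutoff go to the real-time layer;
        # older ones are already in the batch result and would be double-counted.
        if timestamp > self.batch_cutoff:
            self.realtime_counts[metric] += 1

    def query(self, metric):
        # Serve the merged view: batch baseline plus real-time delta.
        return self.batch_counts.get(metric, 0) + self.realtime_counts[metric]
```

For example, with a batch count of 1200 games played as of timestamp 100, a new event at timestamp 101 makes the query return 1201, while a late-arriving event from timestamp 99 is ignored because the batch layer already covered it.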
And if I understand that correctly, you're not positioning Hadoop as just batch — you're saying it can be real-time. Let's define real-time as — we like to define it as — before you lose the customer.

Karthik Kannan: Exactly right.

Dave Vellante: It could be milliseconds, it could be microseconds, it could even be a minute.

Karthik Kannan: Yes, right, absolutely.

Dave Vellante: So you're comfortable with that definition. But you're not depositioning Hadoop, like a lot of companies and people do, as just batch. You're recognizing that Hadoop can be real-time, near real-time. What about those systems feeding the transactional systems of the organization and delivering literally orders of magnitude — maybe not orders of magnitude, but meaningful — impacts in productivity and products to organizations?

Karthik Kannan: I want to remove the technology conversation from this answer completely. Things like Hadoop, being able to leverage search indexes like we do, or combining the two; machine learning algorithms; graph databases — all of these contribute to solving the problem. But they're all technologies. Unless you can leverage all of this in a platform geared toward the business user, you really are not getting the impact of these technologies. So setting the technologies aside, and coming back to the point I was trying to make: combining real-time and batch is one aspect of it, but you touched on a very important thing too — how easy it is to get events flowing from these applications. Feeding the right data into your analytics platform is key to getting results, because the fewer the data points you get from your application, the fewer the analytics are going to be. So another dimension of making this real is to integrate directly into the applications. Don't make them write a lot of code: provide SDKs, provide instrumentation, use syslog services, whatever it is — stream the data as seamlessly as possible, in multiple dimensions.
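The kind of lightweight instrumentation SDK Karthik describes — where the application just reports events and the SDK handles timestamping, serialization, and batched delivery — might look something like this sketch. The class name, wire format, and batching behavior are illustrative assumptions, not a description of Cetas's actual SDK.

```python
import json
import time

class EventTracker:
    """Minimal sketch of an event-instrumentation SDK: the app calls
    track() with an event name and properties; the SDK timestamps,
    buffers, and ships batches via a pluggable transport, so the
    developer never writes transfer code themselves.
    """

    def __init__(self, send, batch_size=10):
        self.send = send            # transport callback, e.g. an HTTP POST or syslog write
        self.batch_size = batch_size
        self.buffer = []

    def track(self, event, **properties):
        # Record one raw application event with a timestamp.
        self.buffer.append({"event": event, "ts": time.time(), "props": properties})
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship buffered events as one JSON batch and reset the buffer.
        if self.buffer:
            self.send(json.dumps(self.buffer))
            self.buffer = []
```

A game or e-commerce app would call something like `tracker.track("purchase", sku="A-1", price=9.99)` at the point of the action, and the platform receives the raw event stream without any ETL in between.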
Dave Vellante: Just raw event data from the applications, coming in real-time.

Karthik Kannan: I think that's the right answer to make all of this possible for real: make sure the application layer doesn't feel the pinch of having to integrate with an analytics platform.

Dave Vellante: Sorry — again, there's a little bit there and I want to make sure I understand it. I'm going to use Oracle as an example. In one use case they might say: okay, use Hadoop for the batch, let's build a connector to Hadoop, and we'll do the transactional piece, the real-time piece. You're describing a scenario that's quite different from that — directly feeding that data in from the application. Now, there are others — I think of Hadapt — trying to build a new platform to do what sounds similar. Would they build that on top of your offering, or is that a competitive offering? I'm just trying to position them in my mind.

Karthik Kannan: Hadapt would be more at the infrastructure layer — not to pitch them into a position they wouldn't want to be pitched into. Like I said, setting the technology layer aside, what we are is an analytics platform layer sitting on top of wherever your data wants to be. If you want to put it in a transactional, relational database, so be it. The right way to, say, extract some data from Salesforce would be to build a connector to Salesforce. No matter where the data is, structured or unstructured, the right approach is to make sure you have all the right connections in place to make that data transfer as seamless as possible. Our way, when we deal with online companies, is to talk to them about integrating the application directly into our platform — thereby bypassing this whole transfer of data, ETL into a database, and then extracting it out into a Hadoop cluster where you do further mining. You can continue to do that in parallel until we can prove the value out.
But really, I think the value can be derived by streaming application data directly into an analytics platform.

Dave Vellante: If what you're saying is true, why would you do it the other way? But you're right, you've got to earn the right to get there. Okay, but essentially you're saying you could work in concert with Hadapt and build applications on top of that.

Karthik Kannan: Absolutely. Absolutely.

Dave Vellante: Okay, that makes a lot of sense. You can see the stack emerging here. What's priority one for Cetas, now that you're part of VMware?

Karthik Kannan: The first thing we want to prove out is the online application analytics space, as we call it — get a good install base of online companies as our customers. We have several trials and several customers at the moment, but we want to expand that to thousands of customers. We're focusing on the online space — e-commerce, gaming, ad publishing, social media, and mobile app companies — and we really want to establish a presence there. That's goal number one. Then we'll think about goals number two and three.

Dave Vellante: So where are you in terms of actually shipping product?

Karthik Kannan: We are in limited availability and servicing customers right now. We have existing customers, and we're in trials in several environments. We will announce GA publicly later this year.

Dave Vellante: We're limited on time — you mentioned a number of examples. Can we go deeper and pick your favorite use case? We can talk about that a little bit.

Karthik Kannan: E-commerce and gaming happen to come out on top as favorite use cases, because both are specifically focused on user behavior analytics. They really are all about monetizing user behavior, either by serving coupons or by increasing loyalty and engagement. Those are the two I gravitate toward, and we have some good trials and customers happening in that space. By the end of the year we'll probably have more there, ahead of mobile apps and social media.
Dave Vellante: Where do you see practitioners now? Have they done a data inventory — what data do we have internally and externally? And then it sounds like they're starting to build a data architecture. Is that right? Or are they still doing an inventory of the data?

Karthik Kannan: Yeah, well, I think that's where the bulk of the exercise remains. The first phase is inventorying the data. We often find that when we engage with customers, they ship a whole lot of data through application feeds to us, do some discovery of the data, get some insights, and then go back and instrument to send us more data. So it becomes an iterative process: they realize they don't have enough analytics because they're not capturing enough event data, so they go back. It's a bit of an iterative process, and it will take time for a customer to get fully engaged and receive the full benefits of a platform.

Dave Vellante: So there's still a lot of learning going on. We know this for sure: there are a lot of big data strategy discussions going on in organizations right now, and some pilot deployments, but not a lot of large-scale production deployments broadly — though they're out there. This is a very, very steep learning curve, and we think it's going to happen. We've been predicting for quite some time now that it's going to make major impacts on business productivity, and we're very excited about the combination of these big data applications and this new data infrastructure that's exploding onto the scene. So first of all, congratulations on your success and your acquisition, and we really appreciate you coming on theCUBE.

Karthik Kannan: Thank you, Dave.

Dave Vellante: Sharing your insights. Good talking to you. Hopefully you can come back again.

Karthik Kannan: We will.

Dave Vellante: All right, keep it right there — we'll be back with our next guest right after this. SiliconANGLE's theCUBE, live from VMworld 2012. Keep it right there.