Hi, welcome back everybody. We're here at the Cassandra Summit 2012, live in Santa Clara at the Hyatt. I'm Jeff Kelly, lead big data analyst with Wikibon. Our next guest is Ben Connors, Worldwide Head of Alliances for Jaspersoft. Jaspersoft, as you know, is an open source business intelligence, reporting and analytics company. Thanks so much for coming on theCUBE.
Great, thanks for having me. Really enjoyed being here.
Great, so why don't we just start by talking a little bit about what Jaspersoft is doing here in a big data world. I know you've got an announcement this morning with DataStax, so talk a little bit about how BI fits in with the big data world.
Great, be happy to. So Jaspersoft, as you mentioned, is an open source business intelligence company. We do reporting, analytics and dashboards from a variety of databases. We started from the relational side, but we've been making a big push now into big data. The announcement this morning is the latest in a chain. We're really excited to be working with DataStax. We see that they have a great team, great product, great ecosystem, lots of customers. Everything's going really up and to the right, so that's really strong. The reason that we're focusing a lot in the big data space with DataStax and others is that we believe the value of big data is not in the traditional three Vs, the velocity, the variety and the volume of data, but really in the time to insight. It's the value, the fourth V, the value that you get from that big data, and that's what we try to bring to the party. And that's what Cassandra does really well, because as you know, it is a very high volume but very low latency data source, and that plays very well to our strategy of low latency reporting and analytics.
So they have the idea of as close to real time as you can get.
Some of these other big data platforms are more batch oriented, not quite as real time or low latency, so it sounds like Cassandra, as a data source for Jaspersoft, is something that can really support more of the real time.
Exactly. Yeah, they have a focus on low latency or near real time, as do we, and so it makes for a very close fit.
Right. So you mentioned that, like most BI companies, you started in the relational world, connecting to a central data warehouse, running reports, providing some level of interactive analytics. Talk to me about the transition to this big data world. I mean, certainly relational databases aren't going away, but we're going to see more and more big data sources come online, and that means BI companies and BI technology now have to shift to be able to support these NoSQL, non-relational tools and more data sources. So what are the challenges associated with doing that, adapting traditional BI that was built in that relational, centralized data warehouse world to a big data world?
Yeah, so great question. It actually, maybe by fortunate coincidence, fits in very well with something that we did a while ago. We're open source, and so we're completely open. We publish all our APIs, including those for custom data sources and custom connectors. We did that before the days of big data, but as it turns out, that fits perfectly into what big data requires, because as you know, these are unstructured and multi-structured data sources. So we use that underlying architecture to build our various connectors, including, originally, curiously, to Cassandra using their APIs. As you know, they've now gone to Cassandra Query Language, CQL, and DataStax is fully supporting that, not only with the data itself but also through the Solr interface for search, so we support all of that.
Now the interesting thing for business intelligence, to your question, is that traditionally companies weren't thinking in those dimensions, and so they're married to SQL. What that means is you have to try to figure out how to make whatever you have in front of you fit SQL. People do that through, for example, ETL, taking whatever the big data is and trying to push it into a relational data source. Well, that has its challenges in terms of latency, in terms of volume, in terms of all the things that big data does very well, which is why people invented big data to begin with. Or they try to use traditional SQL approaches, and that leads to things like, for example, the Hadoop world using Hive as a connector, which is fine, but again, that's a batch-oriented approach, as you alluded to earlier. So what we're doing, and what we think is the way for BI to effectively address these big data sources, is to go natively and directly to them, for the near real-time analytics and the flexibility and power of the underlying database.
So maybe you could walk us through one or two use cases, specifically of Jaspersoft working against Cassandra. What are the most popular use cases you're seeing, what are you hearing from customers, how are your two technologies really going to work together, practically speaking, and where do you bring the real value?
Yeah, okay, so some typical use cases are things like log analysis, or gaming applications where people want to see, across many, many thousands of users, how they're tweaking the game and what that means for usage, and so forth. And what we help to provide is insight into that by visualizing, digesting, and presenting the information in a meaningful way.
So in that scenario, a company might be using Cassandra to support their real-time gaming platform, and Jaspersoft can come in and help you understand that data as it's being created: exactly who's playing, how they're playing, what are maybe some bottlenecks, whatever the case may be.
Exactly. One way I like to think of it is we try to make big data small. Instead of having this sea of information, you have a nicely packaged table or graph or dashboard that reduces it all into something that we mere mortals can digest.
Well, that brings up another interesting point. When you're actually trying to visualize and present big data, what kind of challenges does that present versus smaller data sources, where you knew what you were looking for in a sense? With big data, you don't necessarily know what you're looking for. You want to do more ad hoc analysis, and it might not always be clear what the best way is to visualize so many data sources and all the different potential correlations you can draw between them. So how does that impact product development on your end, and actually building user interfaces that are in fact intuitive and easy to understand? I mean, can you just take traditional BI and put it on top of big data, or did it require a new way of thinking?
It really does require a new way of thinking, and the reasons for that are severalfold. Number one is being able to efficiently access the data sources and take advantage of their capabilities. But it also comes back, I think interestingly, to the near real-time latency aspect. What I mean by that is, as you point out, when you have all this data from all these different sources and all these different structures and formats and so forth, it lends itself very nicely to iterative or live exploration, and that's what we facilitate very well.
So as you point out, you look at it and you say, gee, and let's use gaming just as an example: gee, how does this game compare to that game? Well, that's interesting. I wonder if that's true in both North America and Europe. No, it looks like over in Europe that game is a little more popular. Gee, why is that? How does that compare by age demographics? Or since we introduced that new weapon into the system, or whatever. So one question leads to another. If you have to wait around, again, for latency, if you have to wait until tomorrow to figure out what it is, it makes for a very cumbersome tool, a very cumbersome analysis. When you have this near real-time capability, you can operate more or less at the speed of thought, and the creativity of the human mind can quickly formulate new ways to think about it, new ways to look at it, and derive some interesting interrogations that way.
So, as we've talked about, you're an open-source company, and of course Cassandra got its start in an open-source community. So I wonder if you could talk a little bit about how the two communities mesh, and maybe how they're similar and how they're different. What's it like trying to bring together two different communities like that? Is there a lot of overlap? What are the challenges in terms of negotiating that kind of landscape?
It's very complementary, very synergistic. As you point out, both companies have very similar business models. We have free versions of the product, and then we have commercial versions. As you know, with the open-source model, or so-called freemium model, typically the vast majority of the users use it for free. They're happy to do that, and the company's happy too, because it seeds the market and gets more input into the product and so forth.
People who do pay for the product, in both cases, are really paying for one or several of three reasons: professional support, the ability to embed and resell the software in another product, or thirdly, some premium features, some extras. That's the DataStax model, that's the Jaspersoft model, and so it really plays very well together.
Yeah, interesting. So in terms of the use cases, we mentioned the gaming use case. Looking forward, where do you see this going, not just with Cassandra and DataStax, but in general? How is BI going to evolve in this big data world? I mean, we talked a little bit about the technical challenges, but are we going to be able to recognize BI five years from now compared to what it is today, when the term big data might not even exist and it's just a given that you've got all this data? What's this going to do to the BI industry?
Okay, so what we see happening is a trend towards what we consider self-service BI at scale. What we mean by that is, with all this big data and all these hidden messages, these gems of information that are buried in there, more and more people are going to find value from it, because you're collecting information that's interesting to this group or that group, this function or that function, et cetera. So what that means is you're going to want to have easy-to-use BI tools that will let the user reconfigure, re-sort, and re-examine the information in a way that's meaningful to them. So the days of static reports, I think, are going to fade as we get to the point where you want to empower the users to see the data in a way that makes their life easy, their insights faster, et cetera. Secondly, as I mentioned, it's self-service BI at scale, and the at-scale part is very interesting in a couple of ways.
What it means is that, again, you want to make the data available to lots of users in lots of ways, and things like desktop BI, I think, are going to fade because they're expensive and, technically, hard to scale to support large user communities. Big data, I think, is going to lead to big user communities, and that's where you want both the technical and the economic or financial models to make BI able to serve hundreds, thousands, tens of thousands of users. And I think those will be the trends we'll be seeing: empowering the users to do more of the self-service analytics, and having a technical and financial model that makes it feasible to put it in the hands of all those people.
Well, let's dig into that a little bit. What will that technical and financial model look like?
So, technically, what it means is that you want to have easy-to-use, web-based tools that don't require a desktop installation. It means you want them to be intuitive, flexible and secure, so you can put them in the hands of so many users and let them sort and filter the data. It means you want to support lots of end-user platforms, not only desktop but web and even mobile. For example, at Jaspersoft, we now support the Apple iPhone, the iPad, and Android.