Hi everybody, we're back. This is Dave Vellante of Wikibon.org and this is theCUBE, SiliconANGLE's flagship product. We're here at Strata. theCUBE is the reference point for innovation. We go out to the events, we extract the signal from the noise and try to bring you, our audience, a feel for what the event is like. This is probably the fourth or fifth Strata that we've done and it really is the big data event. And now SiliconANGLE this year has entered into a new partnership with O'Reilly Media and we are actually doing theCUBE at a number of events this year with O'Reilly and very pleased about that. And today's segment, one of the themes that John Furrier talked about earlier was just the importance of visualization, tools for the trade, actually getting big data into the hands of business people. And visualization is a key enabler for that. And we're here with Ben Connors, who's the worldwide head of alliances at Jaspersoft. If you don't know Jaspersoft, it's kind of open source BI, doing a lot in the visualization space. Ben, welcome back to theCUBE. You were on, I guess, last year at the Cassandra Summit. So welcome back to theCUBE. Thanks, delighted to be back here. Yeah, so as I was saying earlier, visualization is a key component of putting value in the hands of users, of business people. You know, people often talk about monetizing big data and you really can't do that without allowing the tools of the trade to get into the hands of business people. So why don't you just give us an update on Jaspersoft, what you guys are up to, and we can get into the whole visualization thing. And I know we have a demo as well. Looking forward to that. Great, happy to. So first of all, thanks for having me. You're welcome. Delighted to be back. Yeah, so let's talk for a minute to your question. Jaspersoft, as you know, is, as you pointed out, a business intelligence product. We do reporting, visualization, analytics, dashboarding. On all kinds of data.
We started out in the relational database world and address all the traditional data from Oracle, MySQL, DB2, what have you. But we have expanded now and provided our front-end visualization, dashboarding, reporting products to the big data world. And we cover a whole range of big data products ranging from Hadoop to Cassandra, to Google's BigQuery, to Amazon Redshift, to MongoDB, et cetera. So quite a broad range. What we're trying to do is really bring the rich set of visualization tools people have become accustomed to for their relational databases and make those same products available for the big data world. Because the way we see it is, to your point, it's really not about the big data, it's about the value from the big data. In fact, I heard an interesting comment on that, which is that we as an industry shouldn't be talking about big data. We should be talking about big answers, because that's really what it's about, is driving the value. And visualization is a key component to that for business users to really get value from and understand what the meaning of the big data is, get the big answers from it. We do that, by the way, in a couple of ways using our visualization front ends, which we'll take a look at. But we're also big into embedding. And what we like to do is, through open APIs and so forth, make these products available from within the context of familiar business applications so that people don't have to go elsewhere to try to figure out a new tool set, et cetera, but rather present the information from within their established applications and business processes. So you've just recently announced this capability for the cloud on AWS, right? So, I mean, this is like, I look at this, Ben, it's like the infrastructure, the plumbing, you know, you're ticking some of that. Really, the more interesting things are what you're doing in terms of putting data into the hands of business users. Why has it traditionally been so difficult?
Okay, so that's a good question, and that's for a couple of reasons. One is the industry has grown up around SQL, around the relational database, and big data often doesn't provide that kind of capability to access the data. The second is a fact that is endemic, not only to big data, but even the small data or traditional data, and that is that, you know, business intelligence has been around for what, 20 years or more. And still, you see survey after survey, even today, that says that most users are not making use of, not big data, but business intelligence, period, that it's penetrated only maybe 5% of the community. Other than say Excel. Other than Excel, exactly, yeah, which has its own challenges, as we know. And the reason for that is because business users don't spend their days in BI tools, nor do they want to. What they're about are their applications that are presenting information to them in a way that is meaningful and contextually relevant. And that's what we're trying to do, is bridge that gap by embedding the BI within these application processes and help to permeate more of the community that way. Okay, so I think it'd be great if we could see an example. I know you have a demo. I do, be happy to show you. So what we're going to look at here is, and because of some technical challenges, I'm going to try to spin this around and see if we can get it visible on camera. And let me just peek. So what we're going to see here, and I'm going to talk while this runs, is an example of Jaspersoft in this case being used with, if we can see that, can we, there we go. In this case being used with Amazon Redshift. This is a product we announced just a few weeks ago. As you know, Amazon Redshift is a big-data-in-the-cloud service that is very scalable and, interestingly, accessible through SQL, through traditional SQL. Jaspersoft now has our products hosted in the Amazon cloud.
And so anyone can go up there, pull what's called an AMI, an Amazon Machine Image of Jaspersoft, and do exactly what we're doing here against Redshift or even against their own data. In fact, we've done some benchmarking and from the time you decide to use Jaspersoft to the time that you can provide the kinds of visualizations we're seeing here is less than 10 minutes. Start to finish. So take us through that process. You got a corpus of data that you want to analyze. What format is it typically in and does it matter and how do you get it into the system? Okay, so in this case, we're going against Amazon Redshift data as I mentioned, but this could be from Amazon RDS, which is Oracle, MySQL, SQL Server, traditional, or it could be from your own data as well. Now- So I could have a CSV file or- Exactly, yeah. And Amazon has a lot of infrastructure. They spend a lot of time doing things like the virtual private cloud and the data pipelining, various ways to move the data around either within the Amazon cloud or even getting it from behind your firewall and on-premises, they have some sophisticated ways to do that. Once you do, what we're doing here is providing interactive visualization. You see this is all HTML5 based, very interactive. And if you pay attention over, where am I? Sorry, over here, you'll see some dragging and dropping of various parameters and you'll see the graph change. So this is the kind of thing that a business user can do on the fly without knowing anything about the underlying structure. In this case, by the way, note some of the response times. We're going against a billion and a half rows of data. This is big data. This particular one happens to be a natality data set.
So this is of all the births in the US over some 50 years, sorted by mother's age and father's age and smoking and baby's health and all kinds of interesting things that, for this example, could lead to some interesting, for example, medical policy issues, et cetera. But the fact is a billion and a half rows, very fast response time, and we cache the data once we query that big data set. You see now it's going, got a billion and a half rows sorted out. We now have that data cached, makes it very interactive, very easy. So how difficult is this setup for a business user to set up the parameters? Okay, yeah, so in this case, what we've done is integrated with the Amazon infrastructure and we do something called auto discovery. So what that means is that for an organization that already has data in the Amazon cloud, we'll go out and find it automatically. So once you log in with your account, we go out and say, okay, Dave, you have this accounting data, this natality data, what do you want to look at? We present it, you click on, in this case, I want to see the natality data and boom, we use your credentials because the credential login is all integrated and you're up and running and you're just picking and choosing from our metadata layer, which is this part, which is providing meaningful business names to what might otherwise be cryptic underlying names. And that's auto classified, essentially, you go out, you just look at the corpus of data and you magically somehow categorize it and then present it in business terms. That's right. What's the tech behind that? Is it some kind of SVM or semantic indexing? So we're using our metadata layer. We use a relational database to essentially catalog the information, map it and provide security so that you can't see my data, et cetera. All that's done behind the scenes. So you built that from scratch, which is non-trivial. I mean, that's a, it's hard to make things easy. That's right. You know what I'm saying?
Exactly, that's part of our value add. Submit the report, you say, if I had more time, I would have made it shorter. Yeah, yeah, exactly. All right, so talk a little bit about the product suite, you know, where you are in the maturity of the company, maybe, you know, customer profile, a number of customers, things like that. Sure, okay. So, Jaspersoft is, what, about 10 years old or so now, we're around 200 people, headquartered in San Francisco with offices around the world. We have a product suite that includes everything from ETL products for extract, transform and load of data sets to a server with security, scheduling, et cetera, to front-end visualization tools to report builders. So it's a full BI suite. We're actually the most widely installed business intelligence product in the world. We have, gosh, over a quarter million members of our community. We are both open source and commercial subscription license. We have over, what, 160,000 customers and deployments. And increasingly, a lot of those are big data on everything from Hadoop to MongoDB to Redshift now to Cassandra and you name it. I mean, do you see that as the future of the company or a nice little niche for you guys? No, you know, I have a personal bias here because my baby, my focus is big data in the cloud and that's where I'm spending a lot of my time, but I really see that as an interesting uptake and I think that especially as these kinds of tools, Jaspersoft and others, make the data more accessible, more available to more people, you're going to see big data becoming the core, the back end of more and more systems. So I really see it as the future. And the key is, as we've talked about earlier, to get that capability, that visualization capability in the hands of business users. And so that's a vision that you've put forth. You're starting to realize that. How far are we into that game? Well, I'd say it's early days but I'm very encouraged.
So we have now people in, for example, not only web analytics but financial services fraud detection, environmental monitoring, et cetera, putting these very large data sets to use. So I think that we're going to see it more and more. It's probably still in the early part of the growth curve but it's the way of the future and I think we're about to hit an inflection point. And talk about the ecosystem leverage and strategy. I know, I think you were up at the Greenplum event yesterday, Jeff Kelly saw you up there, you mentioned. What's the ecosystem strategy? Obviously it's critical for a company like yours. Talk about that a little bit. Okay, so we are very open source from our foundation and that permeates into everything we do, including access to multiple data sources. So our ecosystem includes the traditional relational database space. It includes virtually all of the big data sources. It includes the cloud. It includes system integrators who are bringing the value to customers. So it's quite a broad mix and we're delighted to make the technology available to the broader community that way. Yeah, I mean there's a lot of talk about, for example, you're seeing a lot of talk about bringing real time to Hadoop, which really isn't real time, but let's say it is for a second, integrating SQL and NoSQL. A lot of the demos will have a visualization component. So are you working with all the sort of major platform vendors? I mean, we talked about Greenplum. I mean, but there's, you know, Cloudera, Hadapt, you know, Intel today announced a distribution. I mean, you know, et cetera. MapR, on and on and on. So talk a little bit more about the platform. Yeah, so we do integrate with virtually all the major platforms and we provide some additional value in a couple of ways. Number one, for the SQL players, as you mentioned, it's fairly straightforward with our connectors.
We've also done something pretty unusual and that is we've written custom connectors to the major NoSQL data stores. So let me give you an example. MongoDB from 10gen. Not a SQL interface to it, but we've written custom connectors to the MongoDB data structures, the JSON data structures, and you can use the kind of products we just saw on the screen a minute ago directly against Mongo to provide real-time analytics. So no ETL required, which is quite unusual in the industry. Usually people say, okay, if you want to take NoSQL, first thing you have to do is move it over to SQL, which kind of defeats the whole purpose. Do your filtering and then bring it over. Exactly, we go directly to the NoSQL data source. And with our Amazon announcement, you can buy it by the hour. No upfront cost whatsoever. You go up, you spin up a Jaspersoft instance for one hour, do what you need to, spin it down and you're done. Visualization by the drink, folks, excellent. All right, Ben, well, we're out of time. Thanks very much for coming on. Appreciate the demo. Good luck with everything and we'll see you next time on theCUBE. Great, Dave, great to talk to you. Thanks. All right, everybody, keep it right there. We'll be back with our next guest. Bill Schmarzo is in the house, longtime CUBE alum and the Dean of Big Data, anointed by John Furrier at one of these conferences. So keep it right there. This is SiliconANGLE's theCUBE. We'll be right back after this. This is the O'Reilly Strata Conference and we're covering it wall to wall. Keep it right there.