Okay, we're back live at the Strata Conference. This is siliconangle.com and wikibon.org's in-depth live coverage: eight hours today, eight hours yesterday, four hours on the first day. We love Strata, we love O'Reilly; those guys kick butt here and own big data. We're here to support them and bring great content to you. I'm John Furrier, the founder of siliconangle.com, and I'm joined by my co-host.
I'm Dave Vellante of wikibon.org. Day three for us, John. It's great to have Ian Fyfe here, who's a technology evangelist at Pentaho. Ian, welcome.
Thank you.
Thanks for coming in. I know we had a little mix-up yesterday in terms of the time, so we appreciate you guys coming on. Pentaho, you've got a data integration platform called Kettle. Great, yeah, we've been watching that. Why don't you start by telling the audience a little bit about the company, and then we'll get into it.
Okay, yes. So at Pentaho we provide a complete business analytics suite, including Kettle, the data integration capability. On top of that we also provide reporting, dashboards, interactive analysis, data mining and predictive analytics. So it's really a complete end-to-end business analytics and BI suite based on an open-source model, with commercial products as well.
So open source, I infer from that a new mindset, a new model. We had Bill Schmarzo on before, and he's got a long BI history. We were talking about the traditional BI space, where there are a lot of KPIs and essentially rear-view-mirror types of exercises, building a lot of statistical models on top of that, and some of the challenges with that, the goal of moving beyond reporting. What's your take on the state of the intersection between traditional BI and this new big data movement?
Yeah, I think it's interesting. The market's changing pretty quickly with the rise of big data, and we were actually one of the first BI companies to really start building out big data capabilities, a couple of years ago.
We were the first vendor to support Hadoop, for example, and since then we've been investing heavily, supporting all the multiple distributions of Hadoop as they've grown, and then beyond that into the NoSQL databases: MongoDB, Apache Cassandra, HBase and so on. So really our goal is to be ubiquitous. If you have a big data store, it doesn't matter whether it's Hadoop, a NoSQL database, or one of the high-performance analytic databases like Netezza, Vertica, Infobright and so on. We can provide that glue, that data bus, to tie these things together and move data between them. If you've got data in Hadoop and you want to quickly spin off a data mart, you can do that entirely through a visual interface. It's all about lowering the technical barriers, making big data stores much, much easier to use.
What's your sense of the goal of a single version of the truth? Should we give up on that? Are we further away than we've ever been? Is it just now a single version of a distributed truth? What's your take?
Yeah, I know that's been our goal, and it's been hard to get to. I think the bottom line is that the role of data marts and data warehouses is here to stay. I don't see that we're really going to move away from that, and that's really what the single version of the truth is about: bringing data across from different parts of the enterprise into a single consolidated data store for doing analytics. And so I think that's why data integration and ETL tools will continue to be critical to provide that single version of the truth.
Yeah, so let's talk about that. How does it work? I mean, you're basically extracting data. You mentioned ETL: you're transforming it, you're cleaning it, and then you're loading it into some target. That's sort of the way the industry does it, and I presume that's the way you do it. Is there a different twist on that?
Yeah, that's the way we do it, but we've kind of taken it to the next level now with this dramatic rise of big data.
Now we've got multiple big data stores, and we support more big data stores than anybody else. Again, we got started very early on this. So we're not siloed; we're not just, for example, a Hadoop analytics vendor. Some of our competitors focus on Hadoop and they do a very good job of that, but when you try to go beyond Hadoop, you're kind of stuck. So we want to provide this enterprise-wide view across big data as well as traditional data sources.
So Hadoop, and you mentioned traditional data stores like Vertica. Which ones don't you support?
You know, it's hard to say, because we are so ubiquitous. In fact, we put out a press release a few months ago because we truly believe we support more big data stores, in fact more data stores in general, than any other BI vendor out there.
And I'm intrigued by the whole open-source model, but it's always a discussion, and there are almost gradations of open and gradations of source. So where do you fall in that, and how do you make money?
Yeah, it's a subscription model. So we have our open-source distributions. We certainly get a couple of hundred thousand downloads a month, and that's great. We want to provide very useful, usable open-source software. If we don't, people won't use it. Now, a very small percentage of those people are going to be looking for more. They're going to be looking for additional functionality and capabilities, mission-critical support, services and so on. And that's frankly where we make money: on that commercial upsell to the people who are looking for those additional capabilities and support.
What's the biggest change in your business in the last 24 months, and what do you expect the biggest change to be in the next 24?
I think the biggest change is a reorientation of our corporate strategy. We're very much focused now on what we call the three swim lanes.
The first swim lane is the traditional BI market; people tend to call that business analytics now. The second swim lane is the OEM embedded market. About 40% of our business is other software companies. They want BI and analytics as part of their own products, so they can come to a company that specializes in doing that, like Pentaho, and embed that technology. The third swim lane is big data, and we're making a major, major strategic investment in that area. So that's really broadened the footprint of the company to embedding as well as big data.
My question for you, and I'm sorry we don't have more time to drill down, but I want to get your perspective on the show here at Strata. What is your walk-away takeaway from what's happening this year? Obviously you guys are very active in the space, with a great business and partnerships. What's different this year? What would you share with the folks out there who aren't here? What's it like, and what's the big walk-away? What are the big themes? Just quick, high level, banging out the check boxes.
You know, I would say last year was a lot more about people trying to understand what big data is. What does that term mean? What is Hadoop, and so on. I'm noticing quite a big change this year. People are much more educated. They're up to speed on it. It's more about adopting: how do we adopt this stuff? How do we put it into production? How do we use this in the context of existing databases and the existing IT infrastructure? So I think people are really starting to put this stuff into production, in real projects now, whereas last year it was much more kind of kicking the tires.
What themes are trending, so to speak, here? We heard some folks say HBase is really hot. What other themes have you seen that are like, wow, that's really trending hot, that people are interested in?
Have there been any surprises, or was it what you expected? Can you share with us?
I think HBase definitely. You know, Hadoop continues to go from strength to strength. But I think the other theme is the NoSQL databases. For example, we just put out an announcement with DataStax, one of our partners. They're the commercial company behind Apache Cassandra. We're seeing a lot of growth, a lot of interest. That's a huge community. And so companies like DataStax, and also companies like MongoDB, are seeing a lot of traction, a lot of growth. So that's what we're seeing.
Yeah. Okay. All right. Well, listen, Ian, thank you very much for coming on. We're sorry we're so tight on time, but it was great to have you. I'm glad we could fit you in.
You're welcome. Pleasure to meet you.
Thank you. Thank you so much for watching. Keep it right here. We'll be right back after this word from-