Live from Orlando, Florida, it's theCUBE. Covering Pentaho World 2017. Brought to you by Hitachi Vantara. We are kicking off day one of Pentaho World. Brought to you, of course, by Hitachi Vantara. I'm your host, Rebecca Knight. And along with me, my co-hosts are Dave Vellante and James Kobielus. Guys, I'm so thrilled to be here in Orlando, Florida, kicking off Pentaho World with theCUBE. Hey, Rebecca, twice in one week. I know, this is very exciting, very exciting. So we were just listening to the keynotes. We heard a lot about the big three, the power of the big three, which is internet of things, predictive analytics, big data. So the question for you both is, where is Hitachi Vantara in this marketplace? And are they doing what they need to do to win? Well, so the first big question that everybody's asking is, what the heck is Hitachi Vantara? What is that? Maybe we should have started there. You're right, you're right. And we joke, some people say, well, it sounds like an SUV, you know, okay, Japanese company, blah, blah, blah. But we talked to Brian Householder. Well-engineered SUV. So Brian Householder told us, well, you know, it really is about vantage and vantage points. And when you listen to their angles on insights and data anywhere, however you want it, they're trying to give their customers an advantage and a vantage point on data and insight. So that's sort of interesting and cool branding. The second big point, I think, is that Hitachi has undergone a massive transformation itself. Certainly Hitachi America, which is really not a brand they use anymore, but Hitachi Data Systems. Brian Householder said in his keynote that when he came in 14 years ago, Hitachi was 80% hardware and infrastructure and storage. And they've transformed that. They were about 50-50 last year in terms of infrastructure versus software and services. But what they've done, in my view, is now taken the next step.
I think Hitachi has said, all right, listen, storage is going to the cloud. Dell and EMC are knocking each other's heads off. China's coming into play. Do we really want to try to dominate that business? Rather, why don't we play from our strengths? Which is devices, internet of things, the industrial internet. So they bought Pentaho two years ago. We're going to talk more about that. It brings an analytics platform, and they're sort of marrying IT and OT, information technology and operational technology, together to go attack what is a trillion dollar marketplace. Yeah, that's it. So Pentaho was a very strategic acquisition for Hitachi. Of course, Hitachi Data Systems plus Hitachi Insight Group plus Pentaho equals Hitachi Vantara. Pentaho was one of the pioneering vendors more than a decade ago in the whole open source analytics arena. If you cast your mind back to the middle of the millennium decade, open source was starting to come into its own. Of course, we'd already had Linux and so forth, but in terms of the data world, we're talking about the pre-Hadoop era, we're talking about the pre-Spark era, we're talking about the pre-TensorFlow era. Pentaho at that time, I should say, and it is, by the way, now a product group within Hitachi Vantara, not a standalone company. Pentaho established itself as the spearhead for open source predictive analytics and data mining. They had something called Weka, which is an open source data mining toolkit that was actually developed initially in New Zealand. It was the core of their offering to market. In many ways, they became very much a core player in terms of analytics as a service and so forth, but Pentaho very much established itself as an up-and-coming solution provider taking a more or less by-the-book open source approach to delivering solutions to market.
But they were entering a market that was already fairly mature in terms of data mining, because you're talking about the mid-2000s. You already had SAS and SPSS and some of the others that had been in that space and done quite well for a long time. And so, kind of ahead to the present day, Pentaho had evolved to incorporate some fairly robust data integration, data transformation, all the ETL capabilities into their portfolio. They had become a big data player in their own right, with a strong focus on embedded analytics, as the keynoters indicated this morning. But there was a certain point in this decade where it became clear that they couldn't go any further in terms of differentiating themselves in this space, a space dominated by Hadoop and Spark and AI, things like TensorFlow, unless they were part of a more diversified solution provider. The critical thing, especially, was the edge orientation of the industrial internet of things, which is really where many of the opportunities are now for a variety of new markets that are opening up, including autonomous vehicles, which was the focus of Ella Hilloff. Let's clarify some things a little bit. So Pentaho actually started before the whole Hadoop movement, right? So that's kind of interesting. And then, they were a young company when Hadoop sort of just started to take off and they said, all right, we can adopt these techniques and processes as well. So they weren't too legacy, right? So they were able to sort of ride that modern wave. But essentially, they're in the business of data, I call it data management. And maybe that's not the right term, but they do ingest, they're doing ETL, transformation anyway, they've got analytics, they're embedding analytics. Like you said, they're building on top of Weka.
In the first push of BI as a hot topic in the market in the mid-2000s, they became a fairly substantial BI player, and that actually helped them to grow in terms of revenues and customers. So they're one of those companies that touches on a lot of different areas. So who do we sort of compare them to? Obviously, you think of guys like Informatica, who do heavy ETL. You mentioned BI, you mentioned before guys like SAS. What about Tableau? Well, for BI, you know, there's Tableau and QlikView and so forth, but there's also pretty much Cognos under IBM and of course there's the BusinessObjects portfolio under SAP. Right, and Talend would be? Talend, oh yeah, in fact I think Talend in many ways is the closest analog to Pentaho in terms of a predominantly open source go-to-market approach that involves both robust data integration and cleansing and so forth on the back end and also a deep dive of open source analytics on the front end. So their differentiation, they sort of claim, is their end-to-end integration. Yeah. Which is something we've been talking about at Wikibon for a while, and George is doing some work here, you probably are too. And it's an age-old thing in software. Do you do best of breed or do you do an integrated suite? Now, the interesting thing about Pentaho is they don't own their own cloud. Hitachi Vantara doesn't own its own cloud. So it's an integrated data pipeline, but it doesn't include its own database and other tooling, right? And so there's an interesting dynamic occurring that we want to talk to Donna Prlich about, obviously, which is how they position relative to roll your own and then how they position in the cloud world. And we should ask also, how are they positioning now in the world of deep learning frameworks? I mean, they don't provide, near as I know, their own deep learning framework to compete with the likes of TensorFlow or MXNet or CNTK and so forth.
So where are they going in that regard? I'd like to know. I mean, there are some others who are big players in this space, like IBM, who don't offer their own deep learning framework, but support more than one of the existing frameworks in a portfolio that includes much of the other componentry. So in other words, what I'm saying is you don't need to have your own deep learning framework or even an open source deep learning code base to compete in this new marketplace. And perhaps in Pentaho's or Hitachi Vantara's road mapping, maybe they'll take an IBM-like approach where they'll bundle or incorporate support for two or more of these third-party tools or open source code bases into their solution. Weka is not theirs either, it's open source. I mean, Weka is an open source tool that they've supported from the get-go. And they've done very well by it. Just kind of like early-day machine learning. Yeah. Okay, so we heard about Hitachi's transformation internally and then their messaging today was of course. Well, exactly. And that's where I really wanted to go next. We're talking about it from the product and from the technology standpoint. But one of the things we kept hearing about today was this idea of the double bottom line. And this is how Hitachi Vantara is really approaching the marketplace, by really focusing on better business, better outcomes for their customers and obviously for Hitachi Vantara too, but also bettering society. And that's what we're going to see on theCUBE today. We're going to have a lot of guests who will come on and talk about how they're using Pentaho to solve problems in healthcare data, in keeping kids from dropping out of college, in getting computing and other kinds of internet power to underserved areas. So I think that's another really important approach that Hitachi Vantara is taking in its model.
And the fact that Hitachi Vantara, in other words the Pentaho solution, has been on the market for so long and they have such a wide range of reference customers all over the world in many verticals. That's a great point, yeah. Customers willing to go on camera and speak at some length about how they're using it inside their businesses and so forth, that speaks volumes about a solution provider, meaning they do good work, they provide good offerings that companies have invested a lot of money in and are willing to vouch for. That says a lot. Right. And so the acquisition was in 2015. I don't believe it was a public number. Because it was Hitachi Limited, I don't think they had to report it, but the number I heard was about a half a billion, which for a company with the potential of Pentaho is actually pretty cheap, believe it or not. I mean, you see a lot of unicorns, billion-dollar-plus companies. But the more important thing is it allows Hitachi to further its transformation and really go after this, you know, trillion dollar business, and it's going to be really interesting to see how that unfolds. Because, you know, while Hitachi has a long-term view, culturally it always takes a long-term view, you've still got to make money, and it's fuzzy how you make money in IoT these days. Obviously you can make money selling devices. Open source, anything, you know, so yeah. Right, well, but they're sort of open source with a hybrid model, right? And we talked to Brian about this. There's a proprietary component in there so they can make their margin. But, you know, at Wikibon, we see this three-tier model emerging, a data model where you've got the edge and some analytics, real-time analytics, at the edge. And maybe you persist some of that data, but they're low-cost devices, and then there's this sort of aggregation point or a hub. I think at Pentaho today, they called it a gateway. Maybe it was Brian from Forrester.
But a gateway where you're sort of aggregating data, and then ultimately the third tier is the cloud. And that cloud, I think, vectors into two areas. One is on-prem and one is public cloud. It was interesting what Brian from Forrester was saying. Basically, he put the nail in the coffin of on-prem analytics and on-prem big data. I don't buy that. I don't buy that either. No, I think the cloud is going to go to your data. Wherever the data lives, the cloud model of self-service and agile and elastic is going to go to your data. A couple of weeks ago, of course, we at Wikibon did a webinar for our customers all around the notion of True Private Cloud. And Dave, of course, and Peter Burris were on it, explaining that in hybrid clouds, of course, public and private play together, and the cloud experience migrates to where the data is. In other words, the data will be both in public and in private clouds, but you'll have the same, you know, reliability, high availability, scalability, ease of programming and so forth, wherever you happen to put your data assets. So in other words, many customers, many companies we talk to do this. They combine in a zonal architecture. They'll put some of their resources, like some of their analytics, in the private cloud for good reasons, the data needs to stay there for security and so forth, but much in the public cloud, where it's way cheaper, quite often, and where they can also improve service levels for important things. What I'm getting at is that with the whole notion of a true private cloud, it's critically important to understand that it's all data-centric, it's all gravitating to where the data is. And really, analytics are gravitating to where the data is. And increasingly, the data is on the edge itself.
It's on those devices where it's being persisted, much of it, because there's no need to bring much of the raw data to the gateway or to the cloud if you can do the bulk of the inferencing on that data at edge devices. And more and more, the inferencing to drive things like face recognition from your Apple phone is happening on the edge. Most of the data will live there, and most of the analytics will be developed centrally and trained centrally and then pushed to those edge devices. That's the way it's working. Well, it is going to be an exciting conference. I can't wait to hear more from all of our guests and from both of you, Dave Vellante and Jim Kobielus. I'm Rebecca Knight. We will have more from theCUBE's live coverage of Pentaho World, brought to you by Hitachi Vantara, just after this.