The Cube, at Hadoop Summit 2014, is brought to you by Anchor Sponsor, Hortonworks. We do Hadoop. And Headline Sponsor, WANdisco. We make Hadoop invincible. Okay, welcome back everyone. We're here live in Silicon Valley in San Jose for Hadoop Summit 2014. This is The Cube, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier here with Jeff Kelly, big data analyst. On stage doing the keynotes we had Doug Cutting and Arun Murthy the other day, really getting into all the action here at Hadoop Summit. Our next guest is Mike Hoskins, CTO of Actian. Great to have you back on The Cube. We had a great deep dive in New York at Big Data NYC. So what's new? You guys really put the fork in this show with the announcements. What's happening? Hi, John. Again, Jeff, I love The Cube, love being on The Cube. This event's been tremendous for us. Of course, Tuesday morning we announced, hugely for the industry, our Hadoop SQL Edition, basically SQL on Hadoop, bringing a fully functioning, complete SQL database into the Hadoop ecosystem. But more importantly, the world's fastest analytic columnar database, production ready, right into the Hadoop ecosystem. This obviously had reverberations in the space. We're very excited. This is the greatest feeling and buzz we've had at Actian since I got here. We were commenting on day one, and this has been the theme throughout, that there's a need for speed at many levels: speed to market in terms of better tooling and technology, speed to fill the demand on the other side, which is the data side, and also just speed for market position for all the competitors. So you have a lot of people jockeying, and it's about the social business. You're seeing, talking about outcomes, we saw analyst Tony Baer say that SQL is the gateway drug to the enterprise, meaning that the ease of use equation is huge right now. So what's your take on that? 
Obviously, SQL makes sense to me, certainly the gateway for making it easier. Do you see that as well? I mean, what is the big industry issue that you guys see? You nailed it. I mean, SQL is the lingua franca with which we ask questions of data. There are millions and millions of SQL-savvy users out there, ranging from the power user to the person flinging Tableau and Cognos around, and we want to unleash that army of talent and bring it into the game. Increasingly, that's where the data lives. As I often say, data has gravity, and a lot of it lives in Hadoop, and you want to bring your best game and your best people and your best resources into that game. And we're opening that door into the Hadoop ecosystem. That's really important. Hadoop is a wonderful thing in some ways, but it is very immature. It is about scaling, but it's not about performance, or it hasn't been until Actian got in the game. And it's not just runtime performance. Think design-time performance. The tooling is in a very primitive state inside Hadoop. And then after you design it, you turn it on and it runs really slowly. These are not good things. So to lift Hadoop up, to industrialize it, to bring it to its proper 2.0 status in the world, to be YARN-certified with all of our end-to-end capabilities in our Actian Analytics Platform, we're really excited. I think us sitting right there on top of your Hadoop distro is the big news at the event here. And you're starting to see the application side. We saw a couple of customers here talking about the success they've had, on a clean sheet of paper, using data as a competitive advantage. Certainly Jeff's survey is pointing to that as well. Still early days. But the application market is waiting. What do you guys say to the guy saying, give me better stuff, to the tooling community? What are they looking for? And what are some of the demand pressure points you see? 
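The "SQL is the lingua franca" point can be made concrete with a small sketch. This uses Python's built-in sqlite3 purely as a stand-in for a SQL-on-Hadoop engine (the table and data are invented for illustration); the point is that the query itself is ordinary ANSI-style SQL, the same dialect millions of analysts and BI tools already speak:

```python
import sqlite3

# Stand-in engine; a SQL-on-Hadoop database would accept the same query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("west", "widget", 120.0), ("west", "gadget", 80.0),
     ("east", "widget", 200.0), ("east", "gadget", 50.0)],
)

# A typical BI-style aggregation; a tool like Tableau or Cognos would
# generate something very similar against a warehouse.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
print(rows)  # [('east', 250.0), ('west', 200.0)]
```

No new language to learn, no MapReduce job to write: that is the "army of talent" argument in miniature.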
Well, it's a good question, and it's a challenge for all of us in Hadoop because it's a pretty raw infrastructure. You're talking about racing up the stack. And it's not surprising that customers want us to race up the stack. But think of the BI and data warehousing space: it took years and years to get to where we are today. And it will take years to get there with Hadoop. I think we'll do it faster. The key steps are, of course, moving away from the pure infrastructure into the platform tier like we have, and offering advanced analytics with visual tooling and next-generation parallel software, and giving you the scale you need. But you're right. The real lever is value: how do you turn that into transformational business value? You look at our customers, and they're starting to build these analytic apps on top of that superstructure. We're releasing very soon a series of analytic blueprints so that people can do customer analytics and healthcare analytics out of the box on top of our already best-of-breed analytics platform. But yeah, the race is on to get to analytic applications so that we can really turn that data into value. So Mike, let's take a step back, and tell us a little bit about the guiding philosophy at Actian and your approach to this market. As our viewers may know, you made some acquisitions over the last year or so to really develop what is an end-to-end platform. Is that really your goal, to deliver that end-to-end from data ingestion through consumption? That's exactly right. We didn't arrive at what I think is the leadership position in offering advanced analytics on top of Hadoop just yesterday. This was a hard effort of many years. We've spent hundreds of millions of dollars over the last couple of years acquiring best-in-class, best-of-breed, next-generation, modern, hyperparallel, super-scaling software. The kind that can really take advantage of modern hardware in a way that traditional legacy software products just don't. 
And so you're right. We spent the money to get a full ingest framework. So we have hundreds of connectors to pull your data at high speed from any kind of infrastructure or data source: applications, databases, XML feeds, log files, the internet of things. We can capture that data at any scale. We can land it where you like. Increasingly, customers like to land that in HDFS. We then have the world's fastest YARN-certified dataflow engine. No need for MapReduce Stone Age programming. This is visual tooling, fine-grained thread-level parallelism; it scales up in the node and scales out automatically across your entire Hadoop cluster. All the way into parallel data mining: we've done highly parallelized implementations of the world's most famous statistical and data mining algorithms. So you get that at scale. And now, to add to that story, we have dropped into the mix, like a nuclear bomb, the world's fastest, most fully functioning SQL database engine, native and YARN-enabled. You're like Santa Claus for this industry, bringing gifts every day. Come on. Like, what's happening? How do you guys do this? Jeff's question is good. It's been a multi-year effort. We acquired the technologies. We assembled it over the last 18 months. In January of this year, just a few months ago, we announced it as a single platform, so customers can come to Actian and obtain a single platform with the entire stack for building analytic applications. A modern stack, available, executing natively in your Hadoop cluster. And now with the SQL announcement sort of finishing that complete picture. So no, it doesn't happen automatically. It was hard work. But really shrewd investments and a really good engineering and technical team brought us to this point. Take a step back. Describe to the folks out there what's going on with Actian. The story is deep. It's rich. You know, I joke about Santa Claus, but you guys are delivering gifts and you're bringing a lot of automation. 
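The ingest, transform, and analyze flow described above can be sketched as a toy dataflow. This is not Actian's DataFlow API; it is a minimal illustration of the shape, with the transform stage fanned out across worker threads the way a dataflow engine parallelizes per-record work (all names and data here are invented):

```python
from concurrent.futures import ThreadPoolExecutor

def ingest():
    # Stand-in for hundreds of connectors pulling from apps, logs, HDFS...
    return [{"id": i, "value": i * 10} for i in range(8)]

def transform(record):
    # Per-record work with no shared state, so it can run in parallel.
    return {**record, "value": record["value"] + 1}

def score(records):
    # Right-of-pipeline analytics; here, a trivial aggregate.
    return sum(r["value"] for r in records)

# A real engine handles the threading, queuing, and cluster scale-out
# for you; here we fan the transform stage out explicitly.
with ThreadPoolExecutor(max_workers=4) as pool:
    transformed = list(pool.map(transform, ingest()))

result = score(transformed)
print(result)
```

The visual-tooling pitch is essentially that users draw the ingest, transform, and score nodes and the engine supplies the parallel plumbing shown here.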
This is what, you know, when you see evolutions and continuance of these growth cycles, like, as Merv Adrian at Gartner says, a 10-year cycle of this magnitude, and Jeff's work puts it at something over 50 billion. And on the other side, on the application side, the competitive advantage is in the trillions. It's about massive numbers. Automation and scale and ease of use. These are all things you hear about in these growth businesses. So talk about how you guys are doing that, the company, how this all came together. Give us a refresh on the story. So, you know, at Actian, in some of my speeches, I call this the age of data. And I think we saw this before other people. You know, we had our age of hardware from the 50s to the 80s. Hardware was everything. We had our age of software, I think, from the 80s to just recently, where software was everything. And they're still incredibly important assets. In fact, if you don't have modern parallel software running on modern hardware, you're going to lose. But it's really about data, and it's about extracting value from data now. And I think we saw that before many other people. How do you turn data into your competitive business advantage? How do you grow your revenue? How do you do customer churn analysis better? How do you do next best offer better? How do you do the full gamut of customer analytics? How do you reduce your cost? How do you find fraud? We have customers who are driving fraud close to zero. That's an amazing statement. We cannot do that without advanced data science, without the ability to crunch this much data in this much time. But it's not just increasing revenue and reducing costs, it's getting your arms around risk. We have some of the biggest banks in the world using our platform to do credit risk and portfolio analytics so they can really get a grip on their risk and drive down risk. So raise your revenue, reduce your cost, shrink your risk. 
These are the business drivers that are driving Actian to make these investments. And that's why the platform and its analytic horsepower is so important. So for the folks out there that don't know you guys, that might be great customers of yours, share with them some of the things that you've done with other customers. Because you're not just a johnny-come-lately start-up. You guys have a lot of history. You've done some good corp dev maneuvers and stuff. Explain that, and what you've done with other companies, and why they might want to work with you guys. So if you look at our customers, and these are hundreds of people who are already using various pillars of our Actian Analytics Platform, now being made available to the market as a contiguous whole platform, they cover every single industry. In retail analytics, we have some of the biggest retail players in the United States who've really changed the game around market basket analysis and deep analytics of their store-level sales and retail processes. The largest retailer in India is a customer of ours, and they radically collapsed time. I kind of like that phrase. Time, they're not making any more of it. And so the game is now to collapse time in your favor. And we do that for our customers. We collapse time so that they can drive advanced analytics on enormous data sets. And like the last guest you just had, I thought he made a great point. It's not just about big data. It's about converging multiple data sets. For example, at a telecom vendor, we were able to reduce churn dramatically in a test for them, because we fused in (I think data fusion, isn't that a cool term?) data from five or six sources and converged them into a single analytic pipeline. And therefore the predictive power of our analytics rose dramatically. And so it's across all kinds of industries: healthcare, retail, capital markets, we're huge in capital markets. Digital media, we're very big in digital media. 
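The "data fusion" idea behind the churn example can be sketched in a few lines: converge several customer data sources into one record per customer, so a churn model sees signals no single source carried alone. The source names, fields, and values below are invented purely for illustration:

```python
# Three hypothetical per-customer sources, keyed by customer id.
billing = {"c1": {"monthly_spend": 40.0}, "c2": {"monthly_spend": 90.0}}
usage = {"c1": {"minutes": 120}, "c2": {"minutes": 15}}
support = {"c2": {"open_tickets": 3}}  # c1 has no tickets on file

def fuse(customer_ids, *sources):
    """Merge every source's fields into one record per customer."""
    fused = {}
    for cid in customer_ids:
        record = {"customer_id": cid}
        for source in sources:
            record.update(source.get(cid, {}))
        fused[cid] = record
    return fused

features = fuse(["c1", "c2"], billing, usage, support)
# The fused record for c2 now combines high spend, low usage, and many
# support tickets, a pattern no individual source would have revealed.
print(features["c2"])
```

At production scale this join happens over billions of rows in the analytic engine rather than in Python dictionaries, but the shape of the fusion is the same.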
Some of the biggest properties in the world, you know, Yahoo, are customers of ours. So it covers the entire spectrum. This is a big deal. Businesses are on the cusp of revolutionizing how they make decisions. Well, they're instrumenting their businesses, right? So now, for the first time in modern business history, in business history, you can actually instrument everything about your business. I mean, that's unprecedented in the history of the world. It is. We are living in exciting times. It's not just about IT anymore. It's not just about hardware and software. Our personal lives, the ability to predict disease in our families, is going to be radically improved by using advanced analytics on large data sets. Yeah, we're instrumenting the universe, I say the same thing. We're turning every dumb object in the world into a smart object. And those objects are beep, beep, beep, emitting data events. Who is going to capture those events at scale, land them someplace, process them at scale, and do the advanced analytics on them? It's Actian. So, you know, one of the things we've heard, or one of the things that happens when there's such a disruptive technology approach such as Hadoop and big data and advanced analytics, is that some companies are a little timid, a little afraid to take some of those first steps. So I want to ask you, what are some of the characteristics of your more successful customers? What are some of the things that they do, that they have in common, that make the best use of their data assets? So it takes leadership in a business, obviously. I think part of it is fear. You have this in every early movement, and the early adopters jump sooner. The advice I'd give people, when I look at who's successful in our business, is they look at ROI, they look at the business case to be made from this. And if you can take decisions which are today, you know, how do you know fraud today? How do you know whether to offer somebody a job? 
How do you know whether to give somebody a loan? How do you know if they're going to pay back their credit card? The average human makes thousands of decisions a day. We make those decisions informed by very little data and a lot of emotion. And then we do the same thing in businesses. And so people should have some confidence. The ROI leap that our customers see, our early adopters, is so stunningly big that it should give you confidence to make that jump. You don't have to drink the ocean and change your whole business. Pick a project. There are a couple of domains, customer analytics and marketing analytics, that are just really sweet spots, plus retail analytics and portfolio analytics. There are some domains that I think have ROI just hanging there in front of you to be grabbed. And does it require an enterprise-wide effort, or can one or two individuals take that first step to kind of get their organization on this journey? Absolutely, a small start is very feasible. I mean, it's one of the kind of nice things about Hadoop and the sort of revolution that's going on in software in a lot of ways. It's easy to experiment. It's easy to adopt. Look at one of our OEM customers, Amazon, with Redshift. Redshift is our database. It's a hugely successful offering. People kind of adopt it just like that. And so I think the advent of software as a service, and open source letting Hadoop proliferate more widely in the enterprise, makes it very easy. Customers don't really have an excuse not to jump in the game, at least with a project around a particular analytic opportunity in their business. I think you mentioned something really interesting, which was the word fear. And you know, fear can be a great motivator. And I think one of the things we're going to see is, as your competitors start to pass you by because they're making better use of their data assets, that's going to be a great motivator, I think, for a lot of companies. 
Yeah, we show in our standard deck a statistic that shows how quickly competitors that use data to their advantage, that drive advanced analytics in their business, quickly exceed the competition. And it's really there to be had. And I invite people to contact us at Actian. Our story of success across a huge span of companies and industries is really stellar. And I think we can give them the confidence they need to be able to jump to it. And, you know, again, I go back to the single platform. It's like the famous single throat to choke. Why buy a single product from one vendor, and then have a pinhole feeder, and then have no advanced analytics on it? Think if you could adopt an entire platform, and that platform was super-scaling end to end, and it lived completely natively on your Hadoop ecosystem. Those are big statements. Do you want to buy the Ferrari, or do you want to buy the parts and put it together yourself? So, yeah, so obviously open source is a big part of this community. So how do you guys fit into the whole open source growth? Because that's also growing very fast. So you guys have the super analytics platform to provide really super agility. You talk about collapsing time, a great concept. I love that, I just tweeted it out on our crowd chat. But talk about how you guys blend in with the massive growth of open source as really a first-class citizen in the software development paradigm. You're seeing, you know, you talk about COBOL. MapReduce can be the COBOL of big data, which is, you know, implying that it still needs to get better. But open source is getting better. People are using it. How do you guys fit into all of that? So we love open source. It's a driver of adoption that is really important. I think it's why we will hit the victory around advanced analytics faster than we did around BI and data warehousing. But it tends to live at the low infrastructure level. 
And the end game, as you pointed out at the beginning of this interview, is racing up the stack. How do you get to more advanced analytics and next-generation parallel data mining algorithms? Well, that's not born in Hadoop. And then how do you get to the whole analytic blueprints and analytic applications? I think there's a very nice marriage here to be had with open source. We love it. You know, we're huge partners with Cloudera since the beginning, with Hortonworks since the beginning. Love what Hortonworks is doing at the show here. But we're also partnering with KNIME, for example. KNIME is the number one open source data mining tool in the world. And we've adopted it and brought it into the Actian Analytics Platform. And then we've turbocharged it with massive parallelism so people can have best-of-breed data mining. So we love open source. We embrace and extend it. It provides a fantastic infrastructure platform. But on top of that, we have to race up that stack into advanced analytics and analytic functions. So let's talk about the parallelism, because that's really a big topic that we were talking about a couple of days ago, which is a talent issue, right? Not everyone can be a super coder in parallel processing and have all that magic juju and all the black arts involved in coding. It's really a superior talent. Not many people have that skill. So that's pretty much a fact. And maybe there are some guys who can do it, but you've got to abstract away that complexity. What's the issue around saying, okay, we're massively parallel, but you don't need to be a genius to do it? You don't need to be one of those guys or gals that does that; we make it easier. So on the automation and abstracting away of that complexity, how do you harden that so that people don't have to be super gurus to do it? That's a deep question. Can you indulge me for a couple of minutes to go into that? It's important. People want to know. 
It is important. So here's a brutal and ugly fact. The vast majority of software that we run, that we're proud of, that I earlier referred to as legacy software, is Stone Age crap. It is old, decades old, single-threaded, not written in any kind of modern way for modern hardware. It's just a fact. And you can see it every time you punch run: you buy some glistening, beautiful 16-core supercomputing server, you punch run, and one core is working itself tired while the other 15 are doing nothing, because you've got bad software. And so it is now the challenge to the software industry to lift your game, to rewrite, in most cases, your software to be ready for the avalanche of hardware goodness that's coming at us: scale up for multicore and scale out for commodity clusters as far as the eye can see. And that hasn't been done broadly. At Actian, we did that. We discovered that very early. So let me just give you two examples. In our DataFlow, we have this YARN-certified dataflow engine, which is next-generation data pipelining at any level of parallelism through your entire Hadoop cluster, from left-of-pipeline ETL and data preparation to right-of-pipeline advanced data mining. If you adopt our DataFlow platform, you get visual tooling. You drag and drop a couple of nodes, you're done. You're a parallel programming superstar. You don't have to know anything about memory management. You don't know anything about queuing. You don't know anything about threading. It's impossible to get a deadlock. The banes of parallel programming that you talked about go away. That's abstracted away, because the only way to collapse time, to turn this much data in this much time into analytic power, is through much more aggressive use of parallelism, and that is not prevalent in the software industry, sadly. So we did that. And you guys interviewed Peter Boncz. That's the key to this announcement right now. He is bringing the world's highest-performing database engine. 
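The scale-up point, one busy core and fifteen idle ones, comes down to whether work is partitioned across cores. A minimal sketch, with invented data and a deliberately simple partitioning scheme, of spreading a CPU-bound aggregation across processes so every core contributes:

```python
import os
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    # CPU-bound work that single-threaded legacy code would serialize
    # onto one core.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=None):
    """Split the data into chunks and aggregate the chunks in parallel."""
    workers = workers or os.cpu_count() or 1
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each worker process reduces its chunk; we combine the partials.
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    data = list(range(1000))
    print(parallel_sum_of_squares(data))  # same answer as sum(x*x for x in data)
```

The "you're a parallel programming superstar" claim is that a dataflow engine generates this partition-and-combine structure (plus the memory, queuing, and threading management) from a visual graph, so users never write it by hand.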
He is the pioneer of vector processing. You see IBM, and the Stinger initiative in Hive, now racing to adopt vector processing. He invented it 10 years ago. We've been doing this for 10 years. He's bringing that incredible parallelism straight into the Hadoop ecosystem. Well, the key about that interview was that he talks about large scale. So this is not just a one-off benchmark. The issue around Hadoop is about large scale. Talk about why that is so important and how hard it is. Well, to write software that scales linearly as far as the eye can see is a hard, hard problem. And we're in that watershed moment now, in my opinion, between the last 30 years and the next 30 years. And the software that we have collectively written over the last 30 years is going to fall over and die in the age of data. As we face scale and complexity that we've never seen, the challenges of blending and converging all these sources, the challenges of the instrumented universe and the internet of things, that will break the back of existing software. And so somebody is going to have to come up with new platforms, as Actian has, that are super-scaling, that understand that this is not your father's data warehouse. This is not a 10% increment of data. This is thousands of percent multipliers of data. And so altogether new software architectures must come into the game. We have corralled those software architectures and brought them together into a single platform. And so that was Peter's point. This is a new class of problems with a new class of opportunities: analytic outcomes that transform your business. New user experiences and value experiences. Absolutely. And the killer one, if you go all the way to the end, is making ever more timely and accurate decisions. It doesn't matter what domain we're in, we want to be better decision makers every day. And we do that with informed data science and advanced analytics. Great to have you on theCUBE. 
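For readers unfamiliar with the vector processing being discussed: Boncz's work (MonetDB/X100) replaced tuple-at-a-time query execution with block-at-a-time execution, where each operator processes a whole column block per call, amortizing per-call overhead. A pure-Python sketch can only show the control-flow shape, not the speed; real engines run these loops in tight C code over cache-resident blocks, often with SIMD:

```python
def scalar_filter_sum(rows, threshold):
    # Tuple-at-a-time: one interpreter round trip per row.
    total = 0
    for value in rows:
        if value > threshold:
            total += value
    return total

def vectorized_filter_sum(column, threshold, block_size=1024):
    # Block-at-a-time: each operator invocation handles a whole column
    # block, so per-call overhead is paid once per ~1024 values.
    total = 0
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        selected = [v for v in block if v > threshold]  # filter primitive
        total += sum(selected)                          # aggregate primitive
    return total

data = list(range(10000))
# Both execution models compute the same answer; the vectorized form
# is what columnar engines compile into tight, SIMD-friendly loops.
assert scalar_filter_sum(data, 5000) == vectorized_filter_sum(data, 5000)
print(vectorized_filter_sum(data, 5000))
```

This block-oriented style is also why columnar layout matters: values of one column sit contiguously, so each block fills the CPU cache with exactly the data the operator needs.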
Insight's fantastic, we love riffing on this and you guys had great success at the show here. Congratulations, great story, Actian. This is theCUBE, where we extract the signal from the noise and share that with you. We'll be right back with our next guest after this short break.