Okay, welcome back to theCUBE, live in New York City. We are here at day one of the Hadoop World Strata Conference and also Big Data NYC; the hashtag is #BigDataNYC. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder and co-host of theCUBE, with Dave Vellante. We have some exclusive news here. We have a CUBE alumnus, George Mathew of Alteryx. Welcome back to theCUBE. Good to see you.

John, thanks so much for having me.

So we have some big news here. You guys have got some big announcements. We just heard from the Revolution Analytics folks. Tell us the exclusive news. We're breaking news here on theCUBE. It's going to hit tomorrow morning, but we have a first look. Give us the scoop.

Yeah, absolutely. So as we're seeing Hadoop becoming more of a general-purpose compute platform, John, we're really excited by how much more is now possible in Hadoop. And so what Alteryx and Revolution have been working on is making sure that you can actually take R-based analytics, things that you would basically program inside of the R modeling environment, and be able to push that directly into Hadoop. And so the announcement that we're making with Revolution and Cloudera is the full support for R-based analytics in Hadoop, where Alteryx is now a very seamless orchestration and user experience layer that naturally sits on top of that capability.

Orchestration has been the holy grail at many levels, both in the cloud now and in the application environment. Why is that important? I mean, it just seems easy. R's out there, people use it. They're comfortable with it. What's under the hood that people don't understand that makes it really hard to do?

I think when you look at how this community has emerged, particularly in the Hadoop world, you have this idea that data scientists are very much a scarce resource. And why is that?
Because ultimately, there are only so many people that can actually handle the programming inside of Perl and Python to do the actual scripting and the blending of data and bringing that together. There are only so many people that actually can do the programming inside of R or SPSS or SAS. And so the challenge becomes, how do you get it to a broader set of users most effectively? And how do you make the scalability of that effective for how an enterprise would want to function? And so that's what Alteryx has been largely enabling, this delivery of a new analytics stack, which has Cloudera, for instance, or Hortonworks underlying it as a data management platform for Hadoop. Being able to do in-Hadoop analytics, like R has been able to accomplish with the Revolution capabilities that are now in market. And Alteryx basically operating as an orchestration capability that lets you pipeline data effectively through the analytic function, and then output that to a database, output that back to Hadoop, write it out to Tableau. And that just becomes a much more natural way that data analysts can work today than it ever was before.

So that changes the game. I mean, this is a game changer. And for the folks out there watching, I've been doing a lot of tweeting lately about the whole big debate about women in tech and this big discussion around it being male-dominated. But now you look at the analyst side. There are a lot of really high-powered women who don't program in Python or Perl, but are math geeks and doing a lot of data science. You see a lot of that new tech user, whether it's women or other job functions, not necessarily the Python programmer. Talk about that trend and what needs to happen to make someone who's just really good at their job as an analyst or a professional worker become a full-blown data scientist.

So I think at the end of the day, there are at best 200,000 data scientists in the world. And that's being, I would say, kind of generous.
And there are well over two and a half million data analysts. Now the skills that are involved in terms of being able to blend data, being able to analyze information at scale, being able to share that out, those are the skill sets inside of data science that really matter. It's not actually being a programmer within a framework of Perl or Python. It's not actually being a SAS jockey. It's about being able to run analytics in a way that the organization can succeed, and at scale. And so this is where I think there is a real arbitrage, frankly, in the market, right? Because when you look at how much is needed in terms of being able to deliver better tools, better capability for the modern analyst today, this is where the market's terribly underserved. And we really are looking at ourselves as a way of filling that gap very effectively.

And the ease-of-use issue is huge. Whoever can get the complexity abstracted away will be the big winner. Dave, you want to go?

Yeah, so I want to follow up on that because we did a CrowdChat last week leading up to Big Data NYC, and the topic was what's needed to make Hadoop enterprise-ready. So you heard a lot of stuff like point-in-time consistent snapshots and remote replication, all kind of low-level functions, but the maturity of the tooling came up. The ability to put analytics in the hands of business users came up. Those are things that nobody talks about at the low level, but these are the things that enterprises want, need and expect. So I wonder if you could talk about that a little bit and specifically, what do enterprises need to make Hadoop enterprise-ready?

Sure. So just today, in between the conference, I actually went and visited one of the major financial institutions down on Wall Street. And what's really fascinating about the particular institution that I visited is that they're one of the largest players in the wealth management arena.
And they've actually been looking at ways that they can drive better analytics directly into the hands of their private wealth clients. Now the challenge becomes, when you work inside of that arena where you're either programming MapReduce jobs in Java or you're going ahead and doing esoteric R scripting, that only gets so much accomplished inside of an organization. And so when you look at an enterprise buyer today, there are a few things that they want. One is, can they get a broader set of capabilities, whether that be tools or apps, into the hands of users who are day-to-day functioners inside of an organization? And the second thing is, if you do that, does the actual platform itself scale, right? Does the capability from a data management standpoint, from a data orchestration and lineage standpoint, as well as being able to visualize and convey what the analytics are trying to accomplish, do all of those pieces effectively scale? And so really that's where Alteryx plays extremely well in this market, right? Because we see companies like Revolution Analytics becoming the de facto for R-based scalability in Teradata and Hadoop as well as just directly on their servers. We see folks like Cloudera, and you guys talked a little bit more about Cloudera becoming more focused on the enterprise. I mean, it makes perfect sense, right? Because they're largely now going after the space of really building out an enterprise-ready solution in this emerging Hadoop world. And at the same time, you have companies like Tableau that are taking off like a storm, right? They just had their announcement a few hours ago and they're up 90% for quarter-over-quarter results, mainly because of this new analytics stack that's emerging right in front of us.

Well, in the short term, what's your take on that analytics stack, the new analytics stack?

Well, I think that it is completely displacing the previous generation, right? That's what I think is happening today, right?
When you look at the traditional database providers, for instance, the Oracles of the world, they're actually very, very challenged by what's happening in the Hadoop world today. If you look at the modern analytics...

We've got Red Hat. Red Hat seems to be one of those kind of straddling the fence. Oracle, obviously, you can always poke at Oracle. How does Red Hat do that? Because, you know, incumbents like Oracle either adapt or die, right? And they seem to all buy. They buy people, but what about Red Hat? I mean, they're interesting, right? Are they on this side or that side of the street? How do you view them?

So let me just finish that one thought and I'll go back to the Red Hat comment. So when you look at the rest of the stack, right, you have older incumbents in the market like SAS and you have older incumbents like SPSS. And they're very, very challenged by companies like Alteryx because we're basically built on a modern experience. The same goes for Cognos or BusinessObjects, right? When you look at the BI layer, they're challenged by Tableau and QlikTech. And so this stack is becoming much, much more significant for a broader set of users than it ever was before. Now going back to our friends at Red Hat, I think what ends up happening with most of the infrastructure providers, particularly if you've kind of built yourself on open source, is you have to define what's your material that's open source and what's your material that's basically proprietary that you're going to monetize. And I think Red Hat has actually done a pretty good job of being able to understand what they are making money on and what they are going to be able to contribute back to the open source community. And so any successful provider that's playing the open source game today is going to have to be able to understand what is the monetization element that's commercialized and what are the elements that are going to be contributed back into the open source community.
Having some IP basically, that's a unique advantage. So Cloudera is going down that path.

If you look at what Alteryx is doing, if you look at what Revolution's doing, all of us have a very strong toehold in the open source community, but we're also understanding what is the commercial aspect of what we're delivering.

There's clarity around the role and people are being transparent. Hey, I'll contribute and I'm just going to have my own little differentiator. And we're clear about what's what.

Yeah, and Hortonworks is very clear. We don't have any of that, just services. Or do they have something? How do you see those guys?

Well, I think that they're the last one standing in the Hadoop...

Yeah, we were saying they're now number one in the pure-play Apache Hadoop.

I think they're a great partner of ours. And they have the greatest number of committers in the space from what I understand. I think it will be challenging to be purely an open source provider. And let's look at the Linux space, right? When you look at Linux even a few years ago, or when you look at how much now the Hadoop space has evolved, most of the monetization is still occurring by figuring out what are the commercial elements that make sense, right? Intel's going down this path. Cloudera is going down this path. MapR went down this path a few years ago. And so I still think that it'll be hard over time to be a purely open source provider.

Well, yeah, I mean, if you saw the Mike Olson article about that, there aren't a lot of historical examples of companies that have thrived as pure open source. It's very hard to choose them. At the same time, open source is still this wild card. On one of the other CrowdChats we did, I was sort of having a conversation with Matt, whose last name I can't say, Asay. And we were talking about Canonical and how Canonical is disrupting Red Hat, even though Canonical is not making any money.
But it's such a wild card in terms of being able to predict. We were talking earlier about Pat Gelsinger's statement that there won't be a Red Hat of Hadoop. And I interpreted that as, because we're not going to let there be. But at the same time, there's a lot of money to be made around that ecosystem.

Well, you know, when Canonical comes into this Linux space, for instance, right? It's after quite a few years of the market sedimenting, right? There's an established market; there were winners and losers in the Linux market, even 10 years ago. And now you have this disruptor that's coming in and saying, hey, by the way, we're just going to crush this from an open source standpoint, right? And that's great, right? But that's once the market is made. I don't think that's exactly where we are in the Hadoop game today. I mean, I've talked to you guys about this before.

I agree, not even close.

We're like in the third or fourth inning of this game.

But that's what I think, George, makes open source such an unpredictable animal.

I agree. It's really, really difficult.

I want to go back to something you said about Tableau. Tableau announced about an hour ago, a little more, at the close, right? The shorts were out for the last month on Tableau.

Yeah, they're hurting right now, aren't they?

And the stock is up 4% after hours. And they just announced they're going to float a secondary. And the stock's still up 4%. So the last time we saw you was at the Tableau conference. So my question is, where are we in terms of sort of building your own, rolling your own analytics, build your own viz, if you will?

Yeah, I think we're at a point where there's good stability in that market. If you look at, particularly, where Tableau and QlikTech have taken things forward, you don't have to be in the business of building and rolling your own viz today, right? And there are great options, by the way, in the open source community to do that.
But when you look at, say for instance, a dashboard, a visual analysis, a BI application that you might want to package, you see some very, very solid capability, particularly with those two products in market. And I would know, right? Because with my last job being the general manager of BusinessObjects years ago, I know what was there in terms of where the BI platforms were, versus where the new data discovery folks are.

I wasn't clear. So I mean, in terms of a business user being able to build his or her own. Because here's my thesis on Tableau: when all the shorts came out, it was like, just ignore it, guys, because they are barely scratching the surface. You know, you can talk about QlikTech and Tableau and blah, blah, blah. There's a rising tide for both companies, because the opportunity really is to put analytics in the hands of the business user. Which the decision support guys, we were talking about SAS before and Cognos, never could do, even though that was their vision.

Yeah, so Tableau talks about this as power data to the people, right? And we think about it in a very similar way. We think about analytic freedom, right? And this idea that you have to empower the end users that are driving the outcomes inside of an organization. And so Tableau does that through a great visual analysis experience. And Alteryx is really looking at it and saying, how do we make it easier to integrate all the data that you would want to be able to bring into the analytic workbench? How do you then make the predictive analytics more usable, right? And so all of the things that underlie, say for instance, a Tableau dashboard or a Tableau workbook can naturally be populated using Alteryx as that analytic engine. So we do believe that the user experience is going to trump all of the outcomes in this market.

To me that's so key because it's like a pyramid, right? And this marketplace is just scratching the tip of the pyramid.
And there's this huge opportunity beneath it of millions and millions of users who have an enormous appetite for that. One of the things you said that's critical is being able to deal with different data types. So talk about that a little bit. What's your play there and what's your enabler?

Yeah, I think one of the challenges that has been in market with the traditional players in the space is that they've been very, very rigid in terms of how data is actually managed, right? And what data comes into the analytic pipeline. And so I think the biggest thing, if you look at what the Hadoop vendors, and particularly Cloudera and Hortonworks, introduced into the space, is this: oftentimes, when a traditional database comes into the picture, you have to organize the schema on write. The moment you write into the database, you have to have a very organized schema to manage that. Well, ultimately what those guys have done in the data management space is they've actually provided a schema on read, right? And so once you've put everything in, you can actually construct the schema to read it out most efficiently. And so what that now changes is your ability to go after structured, unstructured, and semi-structured sources, and the analytics that work on top of it have to work in the same way. So the last generation was completely built on structured schema on write. And now we're seeing companies like Alteryx have a very natural experience to bring in XML streams, bring in social media content, bring in data from a Hadoop infrastructure, and naturally analyze that where the schema is actually occurring directly on the read. And that's what we really are excited about here.

Schema on write is kind, right? Because the schema on write comes after you've spent months and months and months...

I'm being very kind there.

Debating, right? What the schema's going to look like. Schema on write after seven months of analysis. Not fun.
That's not the way the world is going. George, my final question for you: obviously, you mentioned great user experiences. That's the goal. What about business outcomes? What have you found talking to customers? What is the current situation, the current status of the kinds of business outcomes they're seeing, and also some of the outcomes they're expecting? What might be the future in this area?

Yeah, so John, I think that there's a very tight relationship between user experience and outcomes, right? So when we look at a customer like Southern States that we've actually worked with at Alteryx for many years, Southern States went into this business to go figure out how they sell seed and grain, right, to farmers. They're the biggest retailer in the US of seed and grain to farmers. And so Southern States went into this to go figure out, can they optimize the merchandise? Can they optimize the inventory so that with the same level of inventory that they're pushing through their stores, they have a faster sell-through rate? And if they had a faster sell-through rate, they could actually buy 40% less inventory and more of the things that are basically needed for better merchandising optimization for that customer. So in the case of outcomes here, in a small project that they were actually doing with Alteryx being that analytic workbench, largely they were able to accomplish over $20 million in savings of just inventory management in a few short months, three months. So literally you're now talking about a retailer with less than 1% margins in their business who is able to accomplish $20 million of additional top line. And that I think is where it's not just a question of the user experience, but what are the business outcomes that are being driven? And so we're really proud to say that we do this for our customers day in, day out, and the user experience creates a business outcome that's extremely positive.
Great, final question for you just as we wrap up. What are you expecting here this week at Big Data NYC to happen around us here at the show?

Yeah, I mean, there are a lot of announcements that are underway. And so you're going to see quite a few of the major distribution providers, my guess is, start to update their capabilities, their flagship products in market. So I would stay tuned for some of those announcements. That's going to be a big thing. And then I think the other thing that you'll see is that Hadoop is becoming a more general-purpose compute platform. So more and more things in terms of graph, in-memory, being able to do R-based analytics in Hadoop. All of the things that were just limitations of the MapReduce world are now being opened up with YARN and the ability to plug other engines directly in.

And making things a little bit easier to use and doing some blocking and tackling, kind of filling in the holes.

Right, and that's where we see a lot of that orchestration occur on the front end.

So enterprise-ready is a valid theme?

Yeah, because now we've gone away from developers and tinkerers to actually people who spend budget and scale this out inside their organizations.

Okay, George, thanks for coming on theCUBE. Really appreciate it. Good to see you again. We're here live at Big Data NYC where there's so much happening. ClearStory is making some big announcements. Revolution, MapR, Hortonworks, Microsoft, Splunk, Cloudera, Alteryx, Kognitio, Aerospike, WANdisco, Platfora, Pivotal, all of them are making big announcements. We're here at theCUBE covering it like a blanket. Three days of wall-to-wall coverage at Hadoop World Strata Conference, with Big Data NYC, that's the hashtag. We'll be right back with our next guest after this short break.