at Big Data SV 2014 is brought to you by headline sponsors WANdisco, "We make Hadoop invincible," and Actian, "Accelerating Big Data 2.0." Okay, we're back here live in Silicon Valley. This is Big Data SV. This is SiliconANGLE and Wikibon's theCUBE coverage of big data in Silicon Valley and all around the world, covering the Strata conference, with all the latest news and analysis here in Silicon Valley. theCUBE is our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, with my co-host, the co-founder of Wikibon.org, Dave Vellante. George Mathew of Alteryx is on theCUBE again, back from Big Data NYC just a few months ago, at our two events. Welcome back. Great to be here. So what fruit has dropped into the blender that changed the colors of the big data space this time? We were in New York, we saw what happened there. A lot of talk about financial services, big business. The Silicon Valley Kool-Aid is more about innovation, partnerships being formed, channel expansion. I'd say the market's hot, growth is still there, and valuations are high. What's your take on the current state of the market? Yeah, great question. So John, when we look at this market today, I remember even a few years ago when I first visited theCUBE, particularly at Hadoop World and Strata a few years back, we talked about this being the early innings of a ball game, right? We said we were probably in the second or third inning of this ball game. And what has progressed, particularly these last few years, is how much the actual productionalization, the actual industrialization of this activity, particularly from a big data analytics standpoint, has emerged. And that's amazing, right?
In a short span of two, three years, we're talking about technologies and capabilities that were considered things that you play with, and now these are things that are keeping the lights on and running major portions of how better decision-making and analytics are done inside of organizations. So I think that industrialization is a big shift forward. In fact, if you listen to guys like Narendra Mulani, who runs most of analytics at Accenture, he'll actually highlight that as one of the key elements of how not only the transformation is occurring among organizations, but even the people that are servicing large companies today are going through this big shift, and we're right in the middle of it. And you mentioned Accenture. We look at CSC, with ServiceMesh on the cloud side. You're seeing the consulting firms really getting build-out mandates, not just POCs, like, let's go in and lock it down. Now for the vendors, what that means is people are looking for reference accounts right now. So to me, I'm reading the tea leaves and saying, okay, who's going to knock down the reference accounts, and what is that going to look like? You know, how do you go in and say, I'm going to tune up this database against SAP, or this against that incumbent legacy vendor, with this new scale-out? All these things are in play. So we're seeing that focus of, okay, tire kicking is over. Real growth, real referenceable deployments, not a POC on steroids, but full-on game-changing deployments. Do you see that? And if you do, what versions of that do you see happening? And what inning is that? Is that like the first pitch of the sixth inning? How would you benchmark that? Yeah, so I would say we're definitely in the fourth or fifth inning of an unending ballgame now. And in these innings, what we're seeing is what I describe as a new analytic stack that's emerged.
And that started years ago, when the major Hadoop distro vendors in particular started to rethink how data management was effectively being delivered. And once that data management layer started to be rethought, particularly in terms of what schema-on-read was, and what the ability to do MPP and scale-out meant in terms of how much cheaper it is to bring storage and compute closer to data, what's now coming above that stack is, you know, how do I blend data? How do I give solutions to data analysts who can make better decisions off of what's being stored inside of that petabyte-scale infrastructure? So we're seeing this new stack emerge where Cloudera, Hortonworks, and MapR are that underpinning, underlying infrastructure, and then there's the R-based analytics that Revolution provides, Alteryx for data blending and for analytic work that's in the hands of data analysts, and Tableau for visual analysis and dashboarding. Those are basically the solutions that are moving forward as capabilities that are packaged and productized better. Is that the game-changing feature right now? Do you think that integration of the stack is the big game changer at this moment? That's the hardening that's happening as we speak right now. If you think about the industrialization of big data analytics, which I think of as the fourth or fifth inning of the ballgame, it's that hardening, that ability to take solutions that the Accentures, the KPMGs, the Deloittes of the world deliver to their clients, but also how people build stuff internally, right? They have much better solutions that work out of the box, as opposed to fumbling with things that aren't stitched together as well, because of the baling wire and bubble gum that was involved for the last few years.
So I've got to ask you. One of the big trends we've seen in the tech world, and you mentioned stacks, is the success of Amazon in the cloud. You're seeing integrated stacks being a key part of, as you said, the hardening of the stack. But "horizontally scalable" is a term that's used in a lot of these open source environments, where you have commodity hardware and open source software, so, you know, it's horizontally scalable. Now, that's very easy to envision, but think about the implementation in an enterprise or a large organization. Horizontally scalable is not a no-brainer. What's your take on that? How does that hyperscale infrastructure mindset of scale-out, horizontally scalable, which is a big benefit of the current infrastructure, fit into the big data software? Well, I think it fits extremely well, right? Because when you look at the capabilities of the last stack, as we'd describe it, we almost think of it as vertical hardware and software that's built up. But right now, for anyone who's building scale in this world, it's all about scale-out and really being able to build that stack on a horizontal basis. So look at examples of this, right? Say, for instance, what Cloudera recently announced with their enterprise data hub. When you look at that capability of the enterprise data hub, a lot of it is about taking what YARN has become as a resource manager, what HDFS has become as a scale-out storage infrastructure, and what the new plug-in engines that have emerged beyond MapReduce have become as a capability for engines to come into Hadoop. And that is a very horizontal description of how you can do scale-out, particularly for data management. When we built a lot of the work that was announced at Strata a few years ago, particularly around how the analytics architecture for the Gallery emerged at Alteryx, now we have hundreds of apps and thousands of users in that infrastructure.
And when we built that, it was actually scaling out on Amazon, where the worker nodes and the capability for us to manage workload were very horizontally built out. If you look at servers today, at any layer of that stack, it is really about that horizontal scale-out, less about throwing more hardware, more high-end infrastructure at it, and more about how commodity hardware can be leveraged and used up and down that stack very easily. So George, I have to ask you a question. Why is analytics so hard for so many companies? You've been in this big data space, and we've been talking to you since the beginning. When is it going to get easier, and what are you guys specifically doing to facilitate that? Sure. So a few things that we've seen to date: a lot of the analytics work that many people do, internal and external to organizations, is very rote, hand-driven coding, right? And I think that's been one of the biggest challenges, because the two endpoints in analytics have been that either you hard code stuff into a C++ or a Java function and push it in-database, or you're doing lightweight analytics in Excel. And really there needs to be a middle ground where someone can do effective scale-out and have repeatability and ease of use in what's been done, where you don't necessarily have to be a programmer in Java or a programmer in C++ to push an analytic function in-database. And you certainly don't have to deal with the limitations of Excel today. And really that middle ground is what Alteryx serves. We look at it as an opportunity for analysts to start work with a very repeatable, reusable workflow for how they would build their initial constructs around an analytic function that they would want to deploy. And then the scale-out happens because all of the infrastructure works on that analyst's behalf.
Whether that be the infrastructure around Hadoop, whether that be the infrastructure or the scale-out of how we would publish an analytic function, or whether that be how the visualizations would occur inside of a product like Tableau. And so that, I think, Dave, is one of the biggest things that needs to shift, so that the only options in front of you for analytics are not either Excel or hard coding a bunch of code in C++ or Java and pushing it in-database. Yeah, correct me if I'm wrong, but it seems you're building your partnerships and your ecosystem around driving that solution, and really driving a revolution in the way in which people think about analytics. Ease of use? The idea is that ultimately, if you can't get data analysts to be able to create work that they can actually self-describe, deploy, and deliver, and deliver success inside of an organization, and scale that out across the petabyte-scale information that exists inside of most organizations, you fail. And it's the job of folks like ourselves to provide great software that delivers that. You mentioned Tableau. You guys have a strong partnership there, and Christian Chabot, I think, has a good vision. You talked about the choices at either end of the spectrum, and neither is good. Can you talk a little bit more about that partnership and the relationship and what you guys are doing together? Yeah, I would say Tableau is our strongest and most strategic partner today. I mean, we were diamond sponsors of their conference. I think I was there at their conference when I was on theCUBE the time before, and they are diamond sponsors of our conference. So our customers, and particularly our users, are one and the same. For Tableau, it really becomes an experience around how visual analysis and dashboarding can be very easily delivered by data analysts. And we think of those same users, the same exact people that Tableau works with, as the ones who need to be able to do data blending and advanced analytics.
And so that's why the two software products, the two companies, and our two customer bases are one and the same, because of that integrated experience. So Tableau is basically replacing Excel. That's the mission that they're after. And we feel that anyone who wants to be able to do the first form of data blending, which I would think of as a VLOOKUP in Excel, should look at Alteryx as a solution for that work. So you mentioned your conference. It's Inspire, right? It is Inspire, yeah. It's coming up in June. June, yeah. Now how many years have you done Inspire? So Inspire is now in its fifth year and... You're going to bring theCUBE this year? Yeah, it would be great to have you guys. Yeah, that would be fun. We should do it. So talk about that conference a little bit. I don't know much about it, but I know of it. It's very centered around business users, particularly data analysts, and many organizations that cut across retail, financial services, and communications, where companies like Walmart, AT&T, Sprint, and Verizon bring a lot of the underlying data problems and analytic opportunities that they've wrestled with, and bring a community together. This year we're expecting somewhere in the neighborhood of 500 to 600 folks attending, largely to figure out how to bring this game forward, really to build out this next-generation analytic capability that's emergent for most organizations. And we think that starts, ultimately, with data analysts. We think that there are well over 2.5 million data analysts who are underserved by the current big data tools in this space, and we've been highly focused on targeting those users, and so far it's been pretty good for us. The data science movement is obviously moving to the casual user at some level. It's going to end up getting there, if not soon. But I want to ask you about the role of the cloud in all this, because what you have underneath the hood is a lot of leverage.
You mentioned the integrated stacks. I want to get your perspective on the data cloud. Not data cloud as in just putting data in the cloud, but the role of cloud, the role of DevOps, and that intersection, because you're seeing DevOps fueling a lot of that growth, certainly under the hood. Now, at the top of the stack, you have the, I guess it's the middle layer, for lack of a better description, to use an old metaphor, for developing. So that's the enablement piece. Ultimately, the end game is fully turnkey data science, personalization. That's the holy grail, we all know. So how do you see that collision of the cloud and the big data movement? Yeah, so cloud has basically become three things for a lot of folks in our space. One is what we talked about, which is scale-out. And scale-out is something that is much more feasible when you can spin up and spin down infrastructure as needed, particularly on an elastic basis. And so many of us who've built our solutions leverage Amazon, it being one of the de facto solutions for cloud-based deployment, and it just makes it easy to do the scale-out that's necessary. The second thing it enables us, and many of our friends and partners, to do is to bring a lower cost basis to how infrastructure is stood up. Because at the end of the day, the challenge for the last generation of analytics and data warehousing in this space was that your starting conversation was $2 to $3 million in infrastructure alone, before you even bought software and services. And so now, if you can rent everything that's involved with the infrastructure, and the software is actually working within days or hours of starting the effort, as opposed to a 14-month life cycle, it's really compressing the time to success and value. And so we see almost a similarity to how Salesforce really disrupted the market 10 years ago.
I happened to be at Salesforce when that disruption occurred, and the analytics movement that is underway, really impacted by cloud and the ability to scale out in the cloud, is driving an economic basis that's unheard of. And a developer market that's robust, right? I mean, you have easy, kind of turnkey development, by tapping data. It is, right? Because there's a robust economy surrounding the APIs that are now available for cloud services. So it's not even just at the starting point of infrastructure. There are definitely higher-level services, all the way to where software as infrastructure is now available. How much growth do you see in that valley of wealth and opportunity that will be created? I mean, not only for the companies involved, but for their customers, who have a top-line focus. And with the movement that you're seeing in analytics, you're seeing the CIO in kind of less of a role, and more the CEO or the chief data officer, who wants the top-line drivers to be app focused. So you're seeing a big shift there. Yeah, I mean, one of the real arguments for the cloud is now the fact that there is an ability for business analysts, business users, and the business line to make an impact on how decisions are made faster, without the infrastructure underpinnings that were needed inside the four walls of an organization. So the decision maker and the buyer effectively has become, to your point, the chief analytics officer or the chief marketing officer, right? Less so the chief information officer of an organization. And so I think that is accelerating at a tremendous pace, right? Because even if you look at the statistics that are out there today, the buying power of the CMO has now outstripped the buying power of the CIO, probably by 1.2 to 1.3X, right? And that used to be a whole different catalyst that was in front of us before. So I would see that go even faster.
So yeah, now let me just kind of bake this out here in real time. So you've got IT, which we all know, right? I've been in the IT world for a long time. Service catalog, self-service, you know, service-oriented architectures, whatever you want to call it, evolving in the modern era. That's good. But on the business side, there's still a need for this same kind of cataloging of tooling, platform, and analytics. So do you agree with that? I mean, do you see that happening that way, where there's still some connection, but it's not a complete dependency? That's kind of what we're thinking in real time. Do you see that happening? Yeah, I think it's pretty spot on, because when you look at what businesses are doing today, they're selecting software that enables them to be more self-reliant. The reason why we have been growing as much among business analysts as we have is that we deliver self-reliant software. And in some ways, that's what Tableau does. And so the winners in this space are going to be the ones that really help users get to results faster through self-reliance. And that's really what companies like Alteryx stand for today. So I want to ask you a follow-up on that CMO, CIO discussion. Given that the CMO is spending a lot more, who owns the data? I don't know if I asked you this before, but do you see the role of a chief data officer emerging? And is that individual part of the marketing organization? Is it part of IT? Is it a separate, parallel role? What do you see emerging there? Well, one of the things I will tell you is that I've seen chief analytics and chief data officers emerge, and that is a real category and title. Real deal. Of folks that have real responsibilities in the organization. The one place they're not is in IT, which is interesting to see, right? Because oftentimes those individuals are reporting straight to the CEO.
Or they have very close access to line-of-business owners, general managers, or the heads of marketing and the heads of sales. So I'm seeing that shift where, wherever that chief data officer is, whether that's reporting to CEOs or line-of-business managers or general managers of large strategic business units, it's not in the information office. It's not in the CIO's purview anymore. And that is kind of telling for how people are thinking about their data. Data is becoming much more of an asset and a weapon for how companies grow and build their scale, and less something that we just have to deal with. Yeah, and that role is clearly emerging in certain industry sectors. Clearly financial services, government, and healthcare. Absolutely. But we have been saying- Big retail, big telecom, big healthcare. I mean, it's just across the board, right? And one of the reasons why I wrote the article at the end of last year, which I literally titled "Analytics Is Eating the World," is this exact idea, right? Because you have this notion that you are no longer locked down, with data and infrastructure holding you back, right? This is now much more in the hands of people who are responsible for making better decisions inside their organizations, using data to drive those decisions. And it doesn't matter the size and shape of the data that's coming in. Data is like the food that just spilled out from the truck, and analytics is the Pac-Man eating it. That's right. Okay, George, final question in this segment: summarize Big Data SV for us this year. From your perspective, knowing what's going on now, what's the big game changer? What should the folks who are watching take note of? What should they pay attention to? What's the big story here at this moment? Well, there are definite swim lanes that are being created, as you can see.
I mean, now the bigger distribution providers, particularly on the Hadoop side of the world, have started to call out what they all stand for, right? You can tell that MapR is definitely about creating a fast, slightly proprietary Hadoop distro for the enterprise. You can tell that the folks at Cloudera are focusing themselves on enterprise scale and really building out that hub for enterprise scale. And you can tell Hortonworks is basically embedding and enabling an open source distro for anyone to be able to take advantage of. And certainly, the previous announcements and some of the recent ones give you an indicator of that. So I see these swim lanes forming in that layer, and now what is going to happen is that focus and attention are going to move away from how that layer has evolved and into what I would think of as advanced analytics, being able to do the visual analysis and blending of information. That's where the next battle for turf is going to be, particularly in the Strata space. So we're really looking forward to that, because it basically puts us in a great position as a company and a market leader, particularly in advanced analytics, to really serve customers as this new battleground emerges. Well, we really appreciate you taking the time. You're an awesome guest on theCUBE. Obviously, you have a company that you're running and a great team, and you come and share your great knowledge with our fans and audience. Appreciate it. What's next for you and the company this year? What are some of your goals? Let's share that as well. Yeah, we have a few things. We mentioned it, of course: Inspire is coming up in June. There's a big product release. Most of our product team is actually here, and we have a release coming up at the beginning of Q2, which is Alteryx 9.0.
So that has quite a bit involved in it, including an expansion of connectivity, and introducing a fair degree of modeling capability, so that the R-based modeling that we do scales out very well, with Revolution and Cloudera in mind, as well as being able to package and deploy analytic apps very quickly, with those data analysts in mind. So it's a release that's been almost a year in the works, and we are very much looking forward to a big launch at the beginning of Q2. George, thanks so much. You've got Inspire coming up. A lot of great success. This is a growing market, valuations are high, and the good news is, this is just the beginning. Call it mid-innings in the industry, but for the customers, I call it top of the first. A lot of build-out, real deployment, real budgets, real deal, big data. It's going to collide with cloud again. I'm going to start a little bit, a lot of innovation. All happening right here at Big Data SV, all the big data Silicon Valley coverage here on theCUBE. I'm John Furrier, with Dave Vellante. We'll be right back with our next guest after this short break.