 Live from New York City, it's theCUBE at Big Data NYC 2014. Brought to you by headline sponsor, WANdisco, with support from EMC, MarkLogic and Teradata, with hosts, Dave Vellante and Jeff Kelly. Welcome back to the Big Apple, everybody. This is day two of Big Data NYC. We're here concurrent with Strata + Hadoop World. This is our Big Data NYC. We're here at the Times Square Hilton. Last night, we had a great event. We had our capital markets event. We had a room full of people listening to Jeff Kelly present his scenario on big data, big data investment angles. Where do you play big data, right? There's only about three pure play companies, you know, and not even really at the heart of the core of big data: Tableau, QlikTech and Splunk. I mean, they're certainly big data related, but the core infrastructure people, the tooling folks in big data, really aren't public yet. The big distro vendors. We talked a lot about that, Cloudera, Hortonworks and MapR. And then Jeff put forth the scenario about investment angles, potentially finding people who are applying big data to their business and really looking at those as investment opportunities. So it was really insightful. Jeff shared a lot of data with the audience and that presentation is gonna be up on SlideShare next week. And then we had a panel, which was absolutely phenomenal. We had Peter Goldmacher, former Cowen analyst. Amy O'Connor, who is an evangelist, big data evangelist at Cloudera, and former head of Nokia's big data operation. And then Abhi Mehta, who built the first data factory at Bank of America and now is CEO of Tresata. And it was a really interesting discussion. Goldmacher was very much sort of negative on existing large software players, although I think he capitulated on one of the scenarios and essentially took your view, Jeff, that they ultimately are gonna win. The rich get richer, he said; they always get richer.
But at the same time he was very antagonistic, I would say, toward Cloudera's valuation. And I thought Amy O'Connor really stole the show. What happened was, I'll sort of set it up, Jeff. Goldmacher was essentially saying, hey, here's the deal, we live in this little echo chamber of big data in this Hadoop world. And I'm in San Francisco and you guys are in Boston and we're now in New York and we're all like, all this love of big data, blah, blah, blah. But the reality is when I fly over, this is Peter Goldmacher talking, when I fly over all those states in the middle of the country, they don't give a, I won't say it, about Hadoop. It's theCUBE, you can say it. No, I'm not saying it. I personally don't swear on theCUBE. But they don't give a hoot about Hadoop. You know, blah, blah, blah. And then they went on and had a discussion. Amy then chimed in and said, well, let me tell you something. That's nice that you fly over those states, but I land in those states. And then she started talking about all the sort of ways in which organizations in different industries are applying Hadoop. And really that's her wheelhouse, which I thought was really fantastic. And Jeff Frick pointed out to me, Jeff, the other day, another guy who's sort of in that Amy O'Connor camp is Bill Schmarzo, the Dean of Big Data that we know well, flying around all the time, talking to those customers. And so that was fantastic. I really thought she held her own. Abhi and Peter were kind of ganging up on Cloudera, but she really knocked it out of the park, in my opinion. It was just great. We're going to have the video of that panel back up. But the essence of it really was that the practitioners are doing some amazing things, but it is hard, and as you pointed out in your presentation, it's going to take a long time. And the return on investment right now is not huge on big data.
You showed data, Jeff, which I thought was really fascinating, that for every dollar spent on big data, on average, the return is 55 cents, which ain't very good. The goal of practitioners was to get three and a half X. And some of them, I forget who it was on the panel, said the real great big data practitioners are getting way more than three and a half X. And my comment was whoever's answering that question is not the CEO, because the CEO is looking for more. And the transformative potential of big data is such that you would think it's a 10X play on an ROI basis. It's a 10 bagger. And I think it has to be. But people in our surveys, a combination of IT practitioners, maybe some line of business folks, are much more conservative in their projections. And I think they're underestimating the potential, but part of that is probably tainted by their existing experiences. Yeah, I agree. Clearly, our thesis, our investment thesis that we put forth yesterday, was that practitioners are gonna be the big winners by far. They're gonna create exponentially more value than big data vendors. And I agree. I think that expectation of $3.50 back on the dollar of investment in big data technologies and services is really a lot lower than the reality at the companies that are gonna be the winners. So the companies that do this well, the companies that leverage, as Abhi pointed out, their data sets that are unique to them. They're unique data sets that provide competitive advantage. Those companies are gonna receive, sorry, are going to achieve significantly higher than a 3X, 4X return on their investment. As you said, it's a 10X. It might be 20X. It could be much larger. That number is an average, which is important to keep in mind. And I think it is in part tainted by what's going on in the early adoption phase. This stuff is challenging. It's difficult from a technology perspective, but it's also difficult from a people-and-process perspective.
Just identifying the best use cases to get started with is a challenge. So I think you'll see that number go up. The question is, where are we in kind of that adoption cycle? Are we in the trough of disillusionment at this point? And will that number go up in the short term or a little bit in the longer term? I think it is going to take a while for this market to really shake out and for a lot of the benefits to happen. And frankly, when we do hit this 10X, 20X return, which we're gonna hit, some companies are going to hit, it's gonna be, you know, we're talking a long-term proposition here, five, 10 years. We're not gonna be calling it big data at that point. It's just gonna be data. I think, you know, to your point around Peter's comment, you know, I fly over these states and nobody cares there, Amy was right on. There are definitely companies in those states that do care and that are doing some interesting things. But I think Peter does have a point that it is going to take a lot longer than I think some believe it's going to take to hit true mainstream adoption because this stuff is so challenging. But there's going to be a huge opportunity for those companies that can harness big data as part of this larger digital fabric. Big data doesn't live in a bubble. There's other areas around cloud, infrastructure as a service, social engagement, trusted transaction processing, privacy, security. There's a whole number of areas that companies need to think about and big data fits into that larger architecture approach. But, you know, there is huge opportunity for those companies that can leverage that digital fabric and part of that is big data analytics. I want to talk about that digital fabric a little bit.
So the concept of digital fabric, really, that term was first put forth to me by a gentleman named David Moschella, who is the team lead on CSC's Leading Edge Forum, which is essentially a think tank within CSC that actually sells services to CIOs and the like. And his premise that he shared with me, one that we sort of have been batting around here and sort of testing, is that historically industries, whether retail, manufacturing, finance, et cetera, have a stack, call it a stack, of design, production, sales, distribution, partnerships, et cetera, that is pretty hardened and optimized for that particular industry. And his premise is that increasingly you're seeing horizontal layers that cut across those industries, the one at the bottom being sort of what I would call infrastructure as a service and cloud, and then above that you've got transaction systems, above that you've got social media now, thinking about our social graphs and social profiles that traverse different industries, whether it's Facebook, LinkedIn, and other types of reputation systems, could be Amazon ratings, could be Yelp, et cetera, et cetera, excuse me, and then big data on the top of the stack. And his premise is that organizations are building on top of that digital fabric and the ones that can leverage that digital fabric the best are gonna be the winners. And ultimately they're gonna be able to use those horizontal pipes to traverse industries, and it's a fascinating theory. And this guy, Dave Moschella, has really been right on with a lot of theories. He was the first to describe the notion, this is back in the 80s, that the industry was moving from a vertically integrated structure to one that was disintegrated. In other words, competition is occurring on individual layers of the value chain. So Intel's gonna dominate microprocessors and EMC's gonna dominate storage and Oracle's gonna dominate database and Microsoft's gonna dominate PC apps.
He made that call way before anybody else understood it. And so I like to follow his thinking because he's always been ahead of the curve and I think he's right on. And this is what you're seeing, it was certainly with Google, Amazon and Facebook, but also Airbnb, Uber, and you used the example of Fitbit yesterday. And also you used a couple of interesting examples, I thought, UPS, Coca-Cola and GE, three existing organizations that are essentially in a way riding on top of that digital fabric, using data at that top. So Abhi said something interesting yesterday on the panel. It's not the data that's the competitive advantage. It's how you create data that's different and how you apply it that creates that competitive advantage. So data in and of itself is not the new source of competitive advantage. It's the differentiation of that data that creates the source of competitive advantage. So I thought that was pretty prescient. What were your takeaways from the Capital Markets event yesterday? Well, first of all, thanks to everybody that turned out. We've been really excited about this, preparing for this for a while, and it was fun for the day to get here and to actually put forth our thesis and present that. So I had a lot of fun doing that. In terms of the panel, it was one of the more interesting panels I've ever been on. It was great with the lineup that we were able to put together. Some outspoken folks with some really interesting opinions. My takeaway, what struck me, I think, was one, Peter Goldmacher's evolving opinion of the role of and the likely outcomes for the startup vendors in this space. He touched on some of them being overvalued and frankly, when you get those kind of valuations, that puts a lot of pressure on these startups. Now you've really got to deliver. You've got investors that obviously want a pretty big return here.
So you've got to start making money, not just revenue, he's talking about profit, and we're a long way from that with most of these startups. So that struck me. He also seems to have evolved his position a little bit around what the outcome's going to be for companies like Oracle and Teradata, the industry heavyweights. And I think his take that the rich get richer is probably accurate. However, I don't think it's gonna happen quite the way he was proposing. I think for the rich to get richer, for Oracle, for Teradata, for SAP to continue their dominance, I think they need to move a little bit higher up the value chain in the big data space, around integrating infrastructure and providing this virtual layer on top of the data lake and higher level services and software where they can actually help companies put a lot of this stuff to use and make the transition from the old to the new. I think that's where they potentially could add value, because I still believe that Hadoop, NoSQL, the open source big data movement is fundamentally disrupting the traditional database and data warehouse market. And so I don't think they can rely on those lines of business to continue fueling growth for them. They've got to look higher up the stack. So I think Peter and I disagree a little bit on that. I think Abhi's comment around, as you mentioned earlier, what you do with that data and your unique data assets was interesting. And what I would say to that is, one way to look at that is companies like Uber, they have unique data. They've got all the data that their drivers are creating and sending and all the transactions they have with their customers. So that's a unique data set and they're using that to drive pricing and to build their business. But unique data isn't necessarily created that discretely.
Another way organizations can create unique data sets is by bringing in outside data, merging that with their existing data, doing analytics on that data, coming up essentially with new data sets, new insights, and then leveraging that to drive their business. So I just wanted to kind of expand on that topic a little bit. But in general, the conversation was great. I thought Amy was fantastic, really provided some great insight on what practitioners are really thinking. She comes from her heritage at Nokia, where she led that big data team. She's got some great experience there and in her stories about what she hears from customers and what they're concerned about. They're looking at what are these net new analytic applications we can build on this new emerging big data infrastructure, this new emerging digital fabric, to really drive our business. She talked a little bit about some of those cost saving things that people are doing around moving workloads, transformation workloads, to Hadoop and using Hadoop as a long-term archive to save on storage costs. Storage costs are going up, budgets are flat, and that makes sense. But the really interesting and really differentiating potential of Hadoop, of big data, are these net new analytic applications and workloads. So overall, that was a great panel. I'm looking forward to doing it again next year. Maybe we'll have a little reunion and get the gang back together again. I thought the other thing that was interesting was, because you and I had been talking about, like, what happens if Oracle takes out a Cloudera? Now at this point, I think Cloudera is practically acquisition proof with the $4 billion valuation and all the money that's been raised. But we'll see. I mean, in theory, it could happen. But if, just again, playing chess moves, theoretical chess moves, if Oracle were to do that, my belief was that it would part the Red Sea for Hortonworks, because everybody who's a non-Oracle customer would say, well, screw it.
I'm not going to work with Oracle on that. They're going to lock me into the Red Stack. I'm going to go work with Hortonworks. Goldmacher had a different take on that. He said, well, Hortonworks and Cloudera, they're in this urinary Olympics. Cloudera says, well, we have 100 sales reps. Hortonworks says, well, we have 75 and blah, blah, blah. We have a bigger distribution channel. Now we got into it. His point was if Oracle, with 20,000 sales reps or whatever they have, buys a Cloudera or whomever, all of a sudden, that distribution advantage goes away. I'm not sure. I'm not sure that a virtual ecosystem of distribution cannot match a single whale's distribution channel. It certainly can't within that whale's customer base, but I think it can if you have the right connection points. And I think that there are examples of that in the industry. I think EMC is an example, but maybe not the best, because they're already a very large company, but they're working that ecosystem probably as well as anybody else. But I think that Peter's point, nonetheless, was interesting from the standpoint of if that is your primary advantage, it could be somewhat neutralized, and maybe dramatically neutralized, on an acquisition. Other things we talked about, just sort of riffing on IPOs, you made the statement that you thought that Hortonworks would do an IPO first, probably because they need to first. And what are your thoughts on the timing of that? I know you don't have any inside information, but what does your gut tell you? No, I don't have any inside information. My gut tells me that we're looking at some time in the first half of 2015 for an IPO from Hortonworks. Maybe file an S-1 sometime in Q1, go public in Q2. That might be the timeframe I'm looking at. I've heard some interesting conversations with folks here at the show that think it could be much sooner than that, simply because, as you mentioned, it's an arms race to some degree.
They need to keep up with the massive investment that Cloudera received from Intel. And they need to raise money to keep up in that race. So, perhaps it could be earlier, but my timeframe is putting it around maybe late Q1, early Q2, next year. To your question around, or your point around, distribution channels and how that could be nullified if one of these big players scoops up one or the other, Cloudera, Hortonworks, or MapR for that matter. Yeah, again, I tend not to agree wholeheartedly with that. I think one of the challenges is that, as you mentioned, one of the dynamics that's happening in the big data space is that people are frustrated by the lock-in that they got into in the database world, the traditional database world, whether that's with Oracle or with Teradata in the data warehouse space. And I think they don't want to repeat that. That's one of the real factors that people are considering. So, if Oracle were to make an acquisition of one of the Hadoop vendors, I do think that would have an impact on their ability to pick up new customers that are not already part of the Red Stack and have that mentality. Whether they could continue the growth to some degree, I think probably they could. Whether the Red Sea would part for a Hortonworks in that case, if it was Cloudera that got acquired by Oracle, maybe not the Red Sea, but maybe a smaller opening than that, but it would be an interesting dynamic. And then it depends on timing as well. I mean, if there's an acquisition a little bit further down the road when the market is not quite as frothy and there's less hype around it, that could impact things as well. So, we'll see. I mean, it's an exciting market to cover; there's a lot happening. A lot of moving parts, and a lot can change quickly, as we saw when the Intel investment kind of came out of nowhere. The market was not expecting that.
So, who knows, maybe something as dramatic could happen again. All right, Jeff, we're gonna wrap there. So, stay with us. We'll be here all day today. Big Data NYC, concurrent with Strata + Hadoop World. Check out crowdchat.net slash Big Data NYC. Obviously, check out wikibon.org. Siliconangle.tv has all the videos that we did yesterday. It's gonna have, if it doesn't already, and I'm sure it does, the panel discussion yesterday and Jeff Kelly's presentation. Check out siliconangle.com for all the news. Keep it right there, everybody. We'll be back with our first guest right after this word. This is theCUBE.