From Midtown Manhattan, it's theCUBE, covering Big Data New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors. Hello everyone, welcome back to our day one at Big Data NYC, three days of wall-to-wall coverage. This is theCUBE, I'm John Furrier with my co-hosts, Jim Kobielus and Peter Burris. We do this event every year. This is theCUBE's Big Data NYC. It's our event that we run in New York City. We have a lot of great content. We have theCUBE going live. We don't go to Strata anymore. We do our own event in conjunction; they have their own event. You can go pay over there and get the booth space. But we do our media event and attract all the influencers, the VIPs, the executives, the entrepreneurs. We've been doing it for five years. We're super excited, we thank our sponsors for allowing us to get here, and we really appreciate the community for continuing to support theCUBE. We're here to wrap up day one of what's going on in New York. Certainly we've had a chance to check out the Strata situation, Strata Data, which is Cloudera and O'Reilly Media, mainly O'Reilly Media, they run that. Kind of an old school event, guys. Let's discuss the impact of the event in the context of the massive growth that's going on outside of their event. Their event is a walled garden. You've got to pay to get in, they're very strict. They don't really let a lot of people in. But okay, outside of that, the event is going global. This activity around Big Data is going global. It's more than Hadoop. We've certainly talked about how that's old news, but what's the big trend this year, as the horizontally scalable cloud enters the equation? I think the big trend, John, and we've talked about it in our research, is that we have finally moved away from Big Data being associated with a new type of infrastructure.
The emergence of AI, deep learning, machine learning, cognitive, all of these different names for relatively common things, is an indication that we're starting to move up into people thinking about applications, people thinking about services that they can get access to in order to build their applications. There aren't enough skills. So I think that's probably the biggest thing: the days of failure being measured by whether or not you can stand your cluster up are finally behind us. We're using the cloud, other resources, the amount of expertise; the technologies are becoming simpler and more straightforward to do that. And now we're thinking about how we're going to create value out of all of this, which is how are we going to use the data to learn something new about what we're doing with the new organization, and combine it with advanced software technologies that actually dramatically reduce the amount of work that's necessary to make a decision. And the other trend I would say on top of that, just to put a little cherry on top, is the business focus, which is, again, not the speeds and feeds, although under the hood there's a lot of great innovation going on, from deep learning and a ton of other stuff. However, the conversation is the business value, how it's transforming work. But the one thing that nobody's talking about, and this is why I'm not bullish on these one-show-meets-all kinds of things like O'Reilly Media does, is that there are multiple personas in a company now. In the ecosystem, there is now a variety of buyers of some products. In the old days, you'd go talk to the IT CIO and you're in. Not anymore. You have an analytics person, a chief data officer. You might have an IT person. You might have a cloud person. So you're seeing a completely broader set of potential buyers that are driving this change. We heard Paxata talk about that. This is a dynamic. Yeah, definitely.
We see a fair amount of... What I'm sensing about Strata, how it's evolving, these big tent shows around data, is that it's evolving towards addressing a broader, what we call maker culture. It's more than software developers. It's business analysts. It's the people who build the hardware and the like for the Internet of Things, into which AI and machine learning models are being containerized and embedded. One of the takeaways from today so far, and the keynotes are tomorrow at Strata, but I've been walking the atrium at the Javits Center having some interesting conversations, in addition, of course, to the ones we've been having here at theCUBE. And what I'm... What other hallway conversations are you having? Yeah. What's going on over there? Yeah, in the conversations I've had today, the chief trend that I'm starting to sense here is that the productionization of the machine learning development process or pipeline is super hot. And it spans multiple data platforms. Of course, you've got a bit of Hadoop in the refinery layer. You've got a bit of in-memory columnar databases, like the one that Actian discussed today, their own. But just as important is that what users are looking at is how we can build these DevOps pipelines for continuous management of releases of machine learning models, for productionization but also for ongoing evaluation and scoring and iteration and redeployment into business applications. You know, I had conversations with MapR, I had conversations with IBM. I mean, these are atrium conversations about things that they're doing. IBM had an announcement today on the wires and so forth with some relevance to that. So I'm hearing, I'm sensing a fair amount of: it's the apps.
It's more than just Hadoop, but it's very much the flow of these... these are the core pieces, like AI, the core pieces of intellectual property in the most disruptive applications that are being developed these days in business and industry and the consumer space. So I did not go over to the show floor yet and I've been over to the atrium, but I'll bet you dollars to donuts that this is indicative of something that always happens in a complex technology environment. And again, this is something we've thought about and predicted and we've talked about here in theCUBE. In fact, we talked about it a little bit as well. And that is, as an organization gains experience, it starts to specialize. But there are always moments, there are always inflection points in the process of gaining that experience. And one of the indications of that is that you end up with some people starting to specialize, but not quite sure what they're specializing in. And I think that's one of the things that's happening right now: the skills gap is significant. At the same time that the skills gap is significant, we're seeing people start to declare specializations that they don't yet have the skills necessary to perform, and the tools aren't catching up. So there's still this tension in the model: open source not necessarily focusing on the core problem, skills looking for tools, and an explosion in the number of tools out there, not focused on how you simplify and streamline and put into operation how all these things work together. It's going to be an interesting couple of years, but the good news ultimately is that we are starting to see for the first time, even in theCUBE interviews today, the emergence of a common language about how we think about the characteristics of the problem.
And I think that heralds a new round of experience and a new round of thinking about what the role is of the business analyst, the data scientist, the developer, the infrastructure person, the business person. You know, you bring up those comments about the specialisms and the skills. Jim and I talked on the segment this morning about the tool shed. We were talking about how there are so many tools out there and everyone loves a good tool, a hammer. But the old expression is, if you have a hammer, everything looks like a nail. It's cliche, but what's happened is there's a plethora of tools, right? And tools are good. Platforms are better. As people start to re-platformize their thing, they could have too many tools. So we asked the chief data officer, and he goes, yeah, I try to manage the tool tsunami, but his biggest issue was he buys a hammer and it turns into a lawnmower. That's a vendor mentality of... Oh, true, but that's a classic example of what I'm talking about. Or someone who's trying to use a hammer to mow the lawn, right? Again, this is what you're getting at. The companies out there are groping for relevance, and that's how you can tell the pretenders from the winners. Well, a tool fundamentally is pedagogical. A tool describes the way work is going to be performed. And that's been a lot of what's been happening over the course of the past few years. Now businesses, as they get more experience, are describing their own way of thinking through the problem. And they're still not clear on how to bring the tools together, because the tools are being generated and put into the marketplace by an expanding array of folks and companies, and they're now starting to shuffle for position. But I think ultimately what we're going to see happen over the course of next year, and I think this is an inflection point, and to go back to this big tent notion, is greater specialization over the next few years.
My guess is that this show probably should get better or should get bigger. I'm not certain it will, because I think it's focused on the problems that we've already solved and not moving into the problems that we need to focus on. Yeah, a lot of the problem I have with the O'Reilly show is that they try to throw the thought leadership out there, and there are some smart people that go to that. But the problem is that it's too focused on monetization, and they try to make too much money from the event. When this action's happening, and this is where the tool, the hammer, becomes a lawnmower, what's happening is the vendors are trying to stay alive. And you mentioned this earlier, to your point. The customers that are buyers of the technology don't want to have something that's not going to be a fit, that's going to be agile from this. They don't want the hammer they bought to turn into something that they didn't buy it for. And sometimes teams can't make that leap skill set wise to literally pivot overnight, especially as a startup. So this is where the selection of the companies makes a big difference. And a lot of the clients, a lot of the customers that we're serving on the end user side, are reaching the conclusion that the tools themselves, while important, are clearly not where the value is. The value's in how they put them together for their business. And again, that's a maturation process. Roles, responsibilities, the chief data officer: are they going to have a role in that or not? But ultimately they're going to have to start defining their pipelines, their process for ingestion after analysis. Let me get your reaction to this take, because one of the things I heard today, and again, this validates a bigger trend as we talk about the landscape of the market, from the event to how people are behaving and promoting and building products and companies.
The pattern that I'm hearing was said multiple times on theCUBE today, and one from the guy who was basically reading the script of his interview, explaining, because it's so factual. I asked him this straight-up question: how do you deal with suppliers? What's happening is the trend is: don't show me sizzle, I want to see the steak. Don't sell me hype. I've got too many business things to work on right now. I need to nail down some core things. I've got application development. I've got security to build out big time. And then I've got all those data challenges underneath. I don't have time for you to sell me a hammer that might not be a hammer in the future. So I need real results. I need real performance that's going to have business impact. That is the theme, and that trumps the hype. So I see that becoming a huge thing right now. Your thoughts, reactions, guys. Why don't you start, I'll do it. Yeah, what's your reaction? Okay, true or false on the trend? True. Get down to business. I'll say that much true, but go ahead. I'll say true as well, but let me just add some context. I think a show like O'Reilly's, like Strata, is good up to a point, especially to catalyze an industry, a growing industry like Big Data's own understanding of the value that all these piece parts, Hadoop and Spark and so forth, can provide when deployed according to some emerging patterns, whatever. But at a certain point, when a space like this becomes well-established, it just becomes a pure marketing event, and customers at a certain point are saying, I come here, I want to get ideas about things that I can do in my environment, in my business, that can actually, in many ways, help me to do new things. You can't get that at a marketing-oriented show. You can get that, as a user, more at a research-oriented show.
When it's an emerging market, like say Spark has been, like Spark Summit was in the beginning... when industries go through that phase, those are sort of research-focused shows in the beginning, where the people who are doing the development of this new architecture talk ideas. Now, I think where we're at in 2017 is that the ideas everybody is trying to get their heads around are all around AI, what the heck that is. For a show like an O'Reilly or any show to have relevance in a market that's in this much ferment of real innovation around AI and deep learning, there needs to be a core research focus that you don't get at this point in the life cycle of a Strata, for example. So that's my take on what's going on. So my take is this, and first of all, I agree with everything you said, so it's not in opposition to anything. Many years ago, I had this thought that I think still is very true, and that is that the value of infrastructure is inversely correlated with the degree to which anybody knows anything about it. So if I know a lot about my infrastructure, it's not creating a lot of business value. In fact, more often than not, it's not working, which is why people end up knowing more about it. But the problem is, the way the technology has always been sold is as some sort of differentiated, value-add thing. And so you end up with this tension in an application domain, a very, very complex application domain, like big data. The tension is: my tool is so great, and it's differentiated and all this other stuff; yeah, but it becomes valuable to me if and only if nobody knows it exists. So I think, and one of the reasons why I bring this up, John, is because many of the companies in the big data space today that are most successful are companies that are positioning themselves as a service.
There are a lot of interesting SaaS applications for big data analysis, pipeline management, all the other things we could talk about, that are actually being rendered as a service and not as a product. So all you need to know is what the tool does. You don't need to know the tool. And I don't know that that's necessarily going to last, but I think it's very, very interesting that a lot of the more successful companies that we're talking to are themselves weird infrastructure SaaS companies. Well, AtScale is interesting though. They came in, this is a service, but their service has an interesting value proposition. They can allow you to virtualize the data to play with it, so people can actually sandbox data. And if it gets traction, they can then double down on it. So to me, that's a freebie. I mean, to me, I'm a customer. I've got to love that kind of environment, because you're essentially giving almost a developer-like environment. The value without necessarily having to... Yeah, the cost. And the guy gets the signal from the marketplace, his customer, of what data resolves. To me, that's a very cool scene. You're saying that's bad, or? No, no. I think it's interesting. I think it's reflective. So you're saying services. So what I'm saying is that the value of infrastructure is inversely proportional to the degree to which anybody knows anything about it, but you've got a bunch of companies who are effectively selling infrastructure software as though it's a value-added thing, and that creates a problem. And a lot of other companies, now that we have the ability to sell something as a service as opposed to a product, can just put the service forward, and people are using the service and getting what they need out of it without knowing anything about the tool. I like that. Let me just maybe possibly restate what you just said.
When a market goes towards a SaaS go-to-market delivery model for solutions, the user's, the buyer's focus is shifted away from, I mean, how the solution works under the covers, to what it can do potentially for you. The business, that's right. You don't get distracted by the implementation details. You then, as a user, become laser focused on, wow, there's a bunch of things that this can do for me. I don't care how it works, really. You, SaaS provider, you worry about that stuff. I can worry now about somehow extracting value. I'm not distracted. This show, or this domain, is one of the domains where SaaS has moved. Just as we're thinking about moving up the stack, the SaaS business model's moving down the stack in the big data world. All right, so in summary, the stack is changing. Predictions for the next few days: what are we going to see come out of Strata Data and our Big Data NYC? Because remember, this show was always a big hit, but it's very clear from the data on our dashboards, we're seeing all the social data. Microsoft Ignite is going on, and Microsoft Azure just in the past few years has burst on the scene. Cloud is sucking the oxygen out of the big data event. Or is it? And it was sucking it out of the event, but theCUBE is not at Ignite. Where's theCUBE right now? Big Data NYC. Oh, it's here, but it's also at the Splunk show. That's true. And isn't it interesting? We're sucking the data out of two events. There are a lot of people coming in the computer. Exactly. A lot of people coming in theCUBE. We're live streaming, and the streaming data just said that we suck. That's not a record saying we're sucking all the data. So we are sharing data. These videos are data driven. Yeah, absolutely. But the point ultimately is that Splunk is an example of a company that's putting forward a service about how you do this, and not necessarily a product focus.
And a lot of the folks that are coming on theCUBE here are also going on theCUBE down in Washington DC, which is where the Splunk show's running. And so one of the predictions I'll make is that we're going to hear over the next couple of days more companies talk about their SaaS strategy. Yeah. I mean, I agree with you, but I also agree with the comments about the technology coming together. And here's one thing I want to throw out on the table. I've kind of said this a few times and I've been connecting the dots on it. We'll put it out publicly for comment right now. The role that communities will play outside of developers is going to be astronomical. I think we're seeing signals. Certainly open source communities have been around for a long time and continue to grow on the shoulders of the giants before them. Even these events like O'Reilly, which are a small community that they relied on, are now not the only game in town. You're seeing the notion of a community strategy in things like blockchain. You're seeing it in business. You're seeing how people are rolling out their recruitment for, say, data scientists. You're seeing a community model developing in business. Yes or no? Yes, but I would put it this way, John: it's always been there. The difference is that we are now getting enough experience with things that have occurred, for example, in communal collaboration in open source software, and people have developed a bunch of social networking techniques where they can actually analyze how those communities work together. But now they're saying, hmm, I figured out how to do an assessment and understanding of that community. I'm going to see if I can take that same concept and apply it over here to how sales works, or how B2B engagement works, or how marketing gets conducted, or how sales and marketing work together.
And they're discovering that the same way of thinking is actually very fruitful over there. So, I totally agree, 100%. So they don't rely on other people's version of a community. They can essentially construct their own. Or enable their own. That's right. They are bringing that approach to thinking about a community-driven business and they're applying it in a lot of new ways. And that's very exciting. As the world gets connected with mobile and the Internet of Things, as we're seeing, it's one big online community. I'm writing a post right now on what B2B markets could learn from the fake news problem. And that is, content and infrastructure are now contextually tied together and related. The payload of the fake news was also related to the gamification of the network effect, hence the targeting, hence the weaponization. We wrote a piece on the three C's of strategy a year and a half ago: content, community, context. And at the end of the day, the most important thing about what you're saying is that right now, when people talk about social networking, social media, they think Facebook. Facebook is a community with a single context: stay in touch with your friends. Connections. Well, connections. But what you're really saying is that for the first time we're now going to see an enormous amount of technology being applied to the fullness of all the communities. We're going to see a lot more communities being created with the software, each driven by what content does, creating value against the context of how it works, where the community is defined in terms of what we do. Let me focus on using community as a framework for understanding how the software world is evolving. The software world is evolving towards, and I've said this many times in my Wikibon research, the data scientists, or people with data science skills, being the core developers in this new era. Now, what is data science all about at its heart?
Machine learning: building and training machine learning models. And so, training machine learning models is everything towards making sure that they are fit for their predictive purpose or classification. Training data: where are you going to get all this training data from to train all these models? Where are you going to get all the human resources to do the labeling of the data sets and so forth? You need communities, crowdsourcing and whatnot, and you need sustainable communities that can supply the data and the labeling services and so forth to be able to sustain the AI and machine learning revolution. So content, training data and so forth, really rules in this new era. I'm liking that. And the interest in machine learning is at an all-time high, yes or no? Oh yeah, very much so. God, I agree. I think the social graph, interest graph, now value graph is emerging. I think content, context and communities are relevant. I think a lot of things are going to change, and the scuttlebutt that I'm hearing in this area now is it's not about the big events anymore, it's about the digital component. I think you're seeing people recognize that, but they still want to do the face to face. You know what, that's right. Let's put it this way: the whole point of community is we do things together. And there are some things that are still easier to do together if we get together. But in B2B marketing, you just can't say we're not going to do events when there's a whole machinery behind events: lead gen, batch marketing, we call it. There's a lot of stuff that goes on in that funnel. You can't just say, hey, we're going to do a blog post. People still need to connect. So it's good. But there are some online tools that are happening, so of course... you want to say something? Yeah, I just want to say one thing. Face to face validates the source of expertise.
I don't really fully trust an expert, I can't in my heart, to engage with them till I actually meet them and figure out in person whether they really do have the goods or whether they're repurposing some thinking that they got from elsewhere and gussying it up. So there's no substitute for face to face to validate the expertise, the expertise that you value enough to want to engage in your solution or whatever it might be. Awesome, I agree. Online activities, the content, we're streaming the data, theCUBE. This is our annual event in New York City. We've got three days of coverage, Tuesday, Wednesday, Thursday, here at theCUBE in Manhattan, right around the corner from Strata Hadoop at the Javits Center. Influencers, we're here with the VIPs, with the entrepreneurs, with the CEOs and all the top analysts here from Wikibon and around the community. We'll be here tomorrow all day. Day one wrap-up, it's done. Thanks for watching. See you tomorrow.