From New York, it's theCUBE. Covering Big Data New York City 2016. Brought to you by headline sponsors, Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris. Welcome back to New York City, everybody. This is theCUBE, the worldwide leader in live tech coverage. We're here at 37 Pillars, which is on 37th Street, just a short distance from the Javits Center, where Strata + Hadoop World is going on. We call this Big Data NYC, the event within the event. Jacque Istok is here. He is the field CTO for the Americas at Pivotal. Good to see you again, Jacque. Good to see you as well. Thanks for having me. So how's the show going for you guys? What's going on so far? We're right in the thick of things. It is fantastic. So I was just saying a little earlier, it seems a little bit like a reunion. A lot of the same companies, a lot of new companies, a lot of the same people at different companies. I think one of the most notable things is that we've had pretty steady traffic coming through, kind of checking out what Pivotal's been doing over the last year or so. This year, we open sourced all of our data products. And so I think it's been a huge benefit for our customers to avoid vendor lock-in, which is one of the key things that they're looking at from an initiative standpoint. So, a big theme at this event, I guess they tacked on sort of a Monday theme of machine learning, deep learning, AI. What do you make of that? What is Pivotal's point of view there? So it's a great question. In the second half of 2016, Pivotal took a step forward and actually combined our data science teams with our data engineering teams. And I think the general notion there is that all of these data problems that folks are trying to solve are better solved with a holistic data team.
And I think the other thing is that when we open sourced all of our products, all the data products, really their genesis or their main focus is around analytics and machine learning and data science. To such an extent that we open sourced a tool called Apache MADlib. Apache MADlib runs basically a bunch of advanced algorithms within our database products in parallel across all the data. It is one of the Apache projects that we sponsor. We just did another release a couple of weeks ago. What that allows you to do is, if you look at the market today, while there is a plethora of new ways to do analytics, there is still, in the enterprise, a lot of, you can say a majority of folks that really love and are really good at SQL and R. And our ability to transcend all of the data with that tool is really unparalleled. And then the last thing I would say is internally we do a lot of professional service help for our customers. And the data science team specifically releases a set of tools that we call PDL tools. So that's our Pivotal Data Labs tools, part of Pivotal. It's something that you get when you engage with us, but it takes all of the open source tools that we leverage that customers use and extends them with actual use cases and real world solutions that we've done in the past. So, Jacque, when you're in the field talking to customers, I mean, what are some of the hardest problems you're getting that people are asking you to solve? So strangely enough, it's actually one or two things. One, cost is still an issue. So folks are looking to better figure out, in a cheaper way, how to analyze and store all of this data. One of the moves that's obviously very prevalent is the cloud. And so I think our notion around that particular sentiment is Pivotal makes the perfect bridge to modern. So if you're going to build a modern data architecture, if you're a new company, if you're a startup, it's relatively easy to just go.
Most of the enterprises have been around for a long time. There's a lot of legacy technical debt, a lot of legacy applications and products and projects that they have. By leveraging our products, you can actually quickly get on a path to modern, and with all of our products being open sourced and software, you can take your existing Greenplum deployment, for example, or your existing HAWQ deployment, for example, and move that to Amazon or move that to Google or move that to Azure seamlessly. Okay, so go ahead. So, Jacque, one of the, we were talking at a previous show not too long ago about the rate at which the developer community is entering the fray. And we're moving from the art of big data more into the science of creating value through software. Now Pivotal's got a very, very strong legacy in that world as well. When you talk about creating, using PDL to almost create templates that are more easily repeated across different industries, different businesses, you're starting to encourage common practice, common methods, common approaches to think about how to build big data apps. How's that going? So, you're spot on, 100%. If Pivotal's data legacy is nothing other than we show you how to best wrangle your data, I think it would be a pretty good legacy. That would be a great legacy. I think the, you know, again, our first step, combining data engineering, data science together was part of that. You know, what we see is enterprises moving to the story of DevOps and actually creating, you know, microservices, specifically for us, like data microservices, and being able to interact on data as it comes in real time, as opposed to, you know, kind of the legacy. And moreover, being able to modify that without impacting all the other services and all the other things that are running is enabled by one side of the Pivotal house, which is our Cloud Foundry platform as a service. All of those microservices, all of the apps eventually have to hit a data store of some sort.
And so, Pivotal data is really focused now on teaching customers how to build that modern data architecture from the app side all the way down to the storage side. And that storage, what routinely we used to call like a data lake, we kind of see a deprecation of that term a little bit, but the general story is we need to have some cheap scalable place to store all of this data, and then any number of tools that our users, our data scientists, our business analysts can actually use to hit those. And again, I think there's a new class of folks that are coming up that are using some of the newer tools that you see here at Strata, but there's still a real longevity among existing subject matter experts that are very comfortable with something like SQL. And I think that's why we say that bridge to modern exists. So you've had a lot of experience in helping customers walk this journey, move through this journey to the point where they're actually going to be more successful at generating business value and the outcomes that they seek with a lot of these technologies. And therefore you've had, I presume, some visibility into some of the patterns of things that they tend to do over and over and over. What are some of the kind of things that people don't think about that are crucial to going down that path? Because normally people think about, I got to buy this tool, I got to hire this consulting firm. And I'm not saying there's anything wrong with that, but there's always, there's always unanticipated sidebars that often are the determinant of success or failure. You know, identify, so you mentioned, for example, storage. Can you identify some of the things down low and some of the things higher up in the stack that you find people doing over and over and over that are more likely to lead to success? So I'd say the number one thing that I've seen lately has been, again, huge interest in the cloud.
I think there's a general idea that we will replicate what we're doing internally exactly in the cloud. We can spin up databases, we can spin up servers, you know, we have a scalable storage substrate. But what the cloud provides is actually a different type of thinking and infrastructure. So how you did it over here, even though you can replicate that over there, is not the right way to do it. So if we take a little bit more of a holistic approach of a data pipeline that goes from, you know, creation all the way down to storage, and then, you know, enrichment of that stored data, within the cloud we have a bunch of capabilities that we didn't have here. So your old way of doing ETL to get it into a system so you could do analysis does not have to exist in the same fashion. We provided an ability to do that, but that should be, you know, step one. Step two should be, okay, now, instead of thinking of it as my legacy pipeline, let's think about how we could leverage some of the other services within the cloud, some of the other technologies and development capabilities that we have in-house, and some of the new tools that are being developed and fostered, you know, within the open community. So if I can, let me ask you this. It sounds as though what you're saying is that the typical IT organization, and I completely agree based on the research we've done, thinks about, well, here's my design, I'm going to extend it. And what you're saying is you say, no, no, no, think about a new design. Yes. And how you can incorporate these other resources in that new design. So start with the new design and think how it can consume as opposed to what you have and how you can extend. Exactly, spot on. And, you know, along the way, the business still has to make money, so we can't just do a wholesale change, you know, overnight. It has to be a gradual shift, which is where, you know, products like ours can kind of help. 
So thinking about the customer base, what percent would you say are pursuing, actively pursuing that type of strategy that Peter just articulated? Is it presumably less than a half that are actively pursuing it, or maybe not? Sorry. So actively pursuing, I would say, it's probably approaching half in some fashion. We have customers, of course, that are 100% all in and everything new goes to the cloud. And then we have other customers that have said everything that we have, we're moving and we have a timeline and it is 18 months to get from here to here. But everybody's looking. And I think, you know, what we've seen a lot is they're looking, but if you look at the cloud as a one-to-one replacement of what you have on-prem, your actual costs end up going up, you know? So, and I think that gets lost so many times when folks are planning. In reality, what is the cloud really good for? It's self-service, scalability, and ease of support. So if you take that mindset, a hybrid cloud solution actually isn't bad and something that runs in both is super important, which again, if you look at like Cloud Foundry, we enable developers to deploy apps on-prem or off-prem, same infrastructure kind of support. And same thing for our database products. HAWQ, for example, runs on top of an ODPi-compliant Hadoop distribution. And again, on-prem, off-prem, same basic solution. Same operating model. Exactly, yeah. So I want to come back to this notion of the data engineering and the data science teams. You brought those together. We had a panel of eight data scientists on yesterday and we were talking a lot about, you know, what's the right regime for the team, the data team? Sure. What are you seeing in, well, so first of all, what was the catalyst for that within Pivotal? And what are you seeing within the customer base? Are you bringing those learnings to the customers? So we've learned a lot from Pivotal Labs, which we acquired and is our namesake, Pivotal.
And in that, one of the things that we have learned the most is a single person operating on any problem is actually not as effective as having multiple people operating on a problem. You have different perspectives, you have different experiences, et cetera. So two years ago, we organized our engineering group into what we call data pods. And they're essentially small, almost Navy SEAL unit teams of practitioners. And they're centered around what we call an anchor, a technical practitioner, and they basically go and tackle problems. These groups do both pre-sales and post-sales. So I can take a person who has deployed a multi-terabyte implementation and have them talk to a prospect about that implementation. That in and of itself solves two things. One, everybody that we have is super technical. Two, when I'm talking to you from a sales standpoint, I'm talking to you from real experience that I was sitting at the keyboard doing. I think that's super helpful. And then three, when you're a buyer, you pretty much know that everything I said was true because I'm the one who's actually going to do it with you. So I think that solved a lot of things. We integrated engagement management so that the technical folks aren't burdened by a lot of the day-to-day meetings. And then what we realized about a year and a half into it is many of the projects that we work on are what we call data science labs. And within a data science lab, the first thing that has to happen is all data needs to be wrangled together so the data scientists can get to work. And so because of that, there were, you know, subgroups, there were just a lot of inefficiencies that we wanted to correct. But let me build upon that because as I recall, one of the other things that you did is, because we have this huge skills gap in big data and it's having an enormous impact. The rate at which we are moving from pilot to production is nowhere near what it should be. We are failing a lot in pilots. They're too expensive.
We're not articulating the use case. We're not getting the numerator or the denominator right. One of the things that you guys, and you in particular, did is you found a way to take normal people, people with good people skills that could build out those use cases, could speak at that level, and you found ways to make them technical. Talk a little bit about that. So it actually goes back. Well, talk a little bit about that, and do you think that that points in a general way to how we can think about overcoming the skills gap? So I think it does and I'll go back. I mean, again, we've learned a lot from our developers and our labs process. And I think we don't send our folks through a lot of training. We don't have a curriculum that they go learn. They basically are learning from each other and it's not in a classroom, it's onsite with customers actually solving problems. So, in my past, I owned a professional services firm before Greenplum and we did a very similar practice, which was, there's nobody that is better at actually learning something than somebody who's forced to do it. And if you do it with some guidance, with some expert tutelage, which is kind of our anchor model, then it's effectively almost like a mentorship or apprentice kind of model. And it allows us to do, again, kind of two things. One, we can take folks that are a little more junior, a little more green but very passionate in what they want to learn and turn them into data engineers. Actually we do the same thing on the data science side. Turn them into data scientists. And now that we've mingled the two groups, the thing that we see a lot is there are a lot of data engineers who have a super high interest in data science and never figured that was a path for them. Well, now that we have them all in one group, we suddenly have opened up a path for data engineering to actually do data science. And that's been super, you know, rewarding for a lot of our folks.
One of the interesting discussion points that we had last night with the data scientists panel, these were hardcore data scientists, you know, was that term citizen data science, which I think is a Gartner term, I don't know. And they bristled at the term. They don't like the term. They think it's dangerous. But many people talk about it. You're in a way talking about extending the reach of data scientists. You know, it's an interesting conversation. I can see why the hardcore data scientists maybe don't like it. Maybe they're threatened by it somewhat. I don't know. But to me it's a fundamental measure of success. The extent to which you can extend those data science skills is key to growth in the marketplace. 100%, this is, you know, from a Pivotal standpoint, we exist to sell software, bottom line. In order to do that, we have to enable customers to use that software. We do that on the data side with our data science team and our data engineering team. We're not there for the long haul. We're not destined to be, you know, your consulting partner for the next 10 years. We would love to get in, show you how to do it, give you all the tools you need, and then let you go for it. Yeah, well, you came from a services background, so you know that business. It's a great business, but the scale and the marginal economics of that business are not consistent with a software company. So one of the challenges that Peter's team has been researching is the degree to which big data customers need hand-holding, you know, versus, you made the analogy to Linux, they didn't need hand-holding. They knew Unix, give them Linux, boom, they took off. That's been a challenge for a lot of companies. How have you been dealing with that? I mean, automating, putting it more into software, but...
So certainly automating, you know, I think, you know, I kind of go back to, if you look at our product suite, it is literally the suite to help get you to a modern place, even if you're not modern. So, I just saw a study last week of what is the favorite tool for data scientists. And data science has a broad swath, right? So you have the earlier generation and maybe more seasoned folks. And by and large, SQL and R are still the most popular. And of course, you know, for us, both of those run in parallel within our technologies and have for, you know, for as long as I've been around since 2007. But I actually talked to some customers after we talked and Pivotal's reputation in these projects is that your guys are pretty good at staying focused on the use cases and not getting lost in the underlying details. And so I don't know that I would call your guys in that citizen data science class, but I think of the citizen data science person, it's someone who knows a little math, knows a little statistics, and just found out that you can make an enormous amount of money if you change the title on your business card. I don't see, our customers are not seeing your guys in that way. Yeah, that's fair. So that is fair. And actually, I mean, we have one of the larger data science teams, you know, as far as I know, in the Americas for sure, if not. Because of Pivotal Labs, right? Pivotal Labs, and we've been, you know, building it up. We've been building data science since the Greenplum days, you know, back in 2006, 2007. All right, we got to leave it there. Jacque, thanks very much for coming to theCUBE. It was great to see you again. You too, thanks for having me. All right, keep right there, everybody. We'll be back with our next guest on theCUBE. We're live from New York City. Right back.