From Las Vegas, it's theCUBE. Covering InterConnect 2017, brought to you by IBM. Okay, welcome back everyone. We are live in Las Vegas at IBM InterConnect 2017, IBM's cloud and now data show. I'm John Furrier with my co-host Dave Vellante. This is theCUBE. Our next guests are Derek Schoettle, the general manager of Watson Data Platform, and Adam Kocoloski, who's the CTO of the Watson Data Platform. Guys, welcome to theCUBE. Good to see you again, Derek. Great to see you. Welcome Adam. So obviously data was a big part of the theme. You saw Chris Moody from Twitter up there. Obviously they have a ton of data. I like the joke about how they have a really active user right now in the President of the United States. Daily State of the Union, I think it was the one two way. Daily State of the Union. But this is the conversation that's happening all over IT and enterprise and cloud, both public and enterprise: the data conversation in the context of cloud. Super relevant right now. And architecturally, as it plays out, it impacts app developers, it impacts architectures. And that's the holy grail: the so-called app data layer or cloud data layer. What's your vision, guys, on this? Derek, start with you. Your vision on this data opportunity. How does IBM approach it? And what's different from, or could be different from, the competitors? Yeah, one, it's an exciting time. What we were just chatting about before we went live is there's so much change taking place in and around data, right? It used to be it's the natural currency. It's everything that everyone's talking about. The reality is it's changing business models, right? It introduces a whole new set of discussions when you introduce cloud, self-service and open source.
So when we step back and think about how we can differentiate, how we can make IBM's offer to clients in the broader market interesting, it's a shift to a platform strategy. Instead of discrete composable services that act independent of one another, that are not, I'll say, self-aware, you shift to a platform where you have common governance, you have common management, and you have really a collaborative-by-design approach where data is at the epicenter. Data is what starts every conversation, whether you're on the app dev side, whether you're a data scientist, someone who's at the edge of discovery. And cloud's what's enabling that, self-service is what's enabling that, and operationalizing is what we do. I mean, we spend our days thinking about and then operationalizing feature function and then performance for a lot of different workloads. Because it used to be, I was at Vertica, right? And so that was the introduction of volume, variety and velocity, right? Now with the introduction of AI and cognitive, it's really about taking any and all data and rationalizing it, and any and all meaning what's sitting within your corporate structure as well as what's more broadly out on the internet, available within social media, right? That to me is the shift that's taking place: all companies are realizing they made a lot of investments, they have a lot of data, and they're not taking advantage of it. And the big shift we see is, people are saying data scientist, but what we think about is the merging of data and science. You think of science as cognitive and AI, right? That's a small population that really understands it and can take advantage of it. You have a whole big market that's out there in traditional data and analytics. Our platform is about merging those two. It's really about merging those experiences, so everyone takes advantage of the benefits of data and science. What are the conversations that you're having, Derek, with customers?
Because I think there's a lot of bells going off for the CXO or even practitioners when you hear about machine learning, you see AI, cognitive, autonomous vehicles, sensor networks. Obviously the alarms are going off, like, I better get my act together. So how do they pull that off? How do your customers pull off making that happen? Because now you've got to be cloud ready. They have all these decoupled component parts. You've got to operate them in the cloud and then you have to kind of have an on-prem component that's hybrid. What are the conversations that you're having with customers on how they're pulling this off? Yeah, so I'll cover the first piece. And I know Adam is spending certainly this week and a lot of time as well with clients on this topic. The first part of the discussion is, do you believe that the cloud can help you? Most folks are saying, yes, we believe it can help. Second piece is, how do I take advantage of emerging technologies that are moving at a rate and pace that perhaps my skills, my existing IT architecture, and my business model can't fully grasp, if not take advantage of? So what we've introduced is a methodology, a data-first method. It sounds simple, but at the end of the day, it is a common, uniform, agile way for us as IBM to engage with partners and clients that literally starts with a discovery workshop that says, how does data inform your business, right? It's not static reporting anymore. It's, what is the data that's sitting within your organization? You heard it from James at PlayFab. I mean, data is changing the way people are building games today, thinking about how to enrich games, so on and so forth. Data-first method is what we've introduced. So you'll see going forward, IBM will sell data-first. We will engage data-first. So in any conversation with someone who says, how do I take advantage of AI, machine learning, or Data Science Experience?
Well, let's step back for a second and talk about data, because 30 years ago, 20, that's how every conversation started. You get on a whiteboard, you design a schema, you talk about the relationships. That's how it started, and we're kind of cycling back to that, right? We've got to put data first. So Adam, the geeks are always arguing speeds. And if you've got a Hadoop cluster here, I've got this over here. I mean, there's a lot of variety and diversity in terms of how people can manage either databases in middleware or whatnot, right? So how do you see data-first? How does that play out architecturally, and how does that play out for the solution? I think one of the big advantages we have in the world of the cloud platform, right, is this opportunity to, on the one hand, use a broader variety of composable services, but also be able to take different parts of the business that were historically a little bit more separated from one another and bring them together. So you look at a Hadoop-flavored data lake on-premises, right? It's a good area to do discovery, a good area to do exploration, but what clients really care about, time and time again, a common refrain, is the operationalization of the analytics, of the machine learning models. How do I take this insight that my data science team has discovered and have it really influence the business process or be incorporated into an application, right? And in the on-premises architecture, that's oftentimes quite a challenge. In the world of the cloud platform and the Watson Data Platform, we have an opportunity to be a little bit closer to things like the world of Kubernetes, which are really ideally suited for deploying and scaling microservices and APIs in a cloud-native, fault-tolerant, reliable fashion, right?
So you're seeing us take that menu of composable services in the cloud platform and treat the data platform as one such composition, an opinionated way to put together this menu of services specifically to help data professionals collaborate and drive the business forward. So when you guys announced the Watson Data Platform, I think you called it DataWorks, and then changed the name, about five, maybe six months ago, your message was that 80% of data professionals' time is spent wrangling data, not enough time doing the fun stuff. And the premise was you're coming up with a platform for collaboration that sort of integrates those different roles and, as you pointed out just now, allows you to operationalize analytics. Okay, so we're five months in, six months in. What kind of proof points do you have? Have you seen it? I mean, some people were skeptical, saying, okay, well, it's IBM, they're putting a nice wrapper on this thing, bolting in some different legacy components, and, you know, nice name. Okay, so what do you say to that, and what evidence do you have that what you said is going to come true is actually coming true? What are you attacking? I can do customer? Yeah, go for customer first. Yeah, so what we've seen is, and if you think about why we ended up at a platform, so if you roll the tape back to when Cloudant got acquired in 2014, the journey that we were on was, everyone was building rich applications, they wanted to be smarter, they wanted to understand what exhaust was coming off, and they wanted to add different ingredients to it. So instead of a do-it-yourself kit that is a bunch of proprietary interoperability issues, a ton of expense and inefficiency, and can't take advantage of the cloud, we decided, and very much have been on a path towards, let's build a platform that allows you to easily ingest, govern, curate, and then, I'll say, present and deploy. So starting in actually June, and this started first with Spark, right?
We made a huge bet on Spark because we believe that to be kind of the operating system, if you will, for an analytic fabric. So it started at Spark. Then we announced the Watson Data Platform in October. It was, here's how we're going to take our heritage around governance, our heritage around traditional structured and unstructured data repositories, and here's how we're going to take visualization and distribution of data. So that then next went into how we bring it to market, that's data-first. So we've been working with large insurance companies, large financial services companies, retailers, gaming companies, and the net that we see is three things. First is, yes, everyone agrees the platform is the right place to go. It's, where do we get started? How do I take my existing investment and take advantage of this platform? And that invariably is, I'm going to build a net new application, whether it be, well, Watson Conversation, so that runs in the Watson Data Platform. We want to ingest data. Well, we want that data to be resident on-prem, we want it to be native to the cloud, and so we're going to work through the architectural change to adopt that. Another great example is, we want to start with just an analytic application because we're already hosting a mobile app with you. Well, we're going to run it into your analytic fabric using dashDB, and dashDB works with Watson Analytics, and we're going to build an application that's resident. The really creative, and I think compelling, piece here, back to your comment on IBM, is that historically it's really hard to buy things from this company. Buying things from IBM is not easy, right? So we've built a platform, we've built a methodology to help you understand how to take advantage of it, and we now have a subscription, the Bluemix subscription, with which you can come in and draw down those services, be it object store, be it a SQL data store, be it a visualization layer. Composability basically.
Yeah, but in a common governed framework, right? I mean, the big takeaway, you know, past the admin, governance and security and operationalizing the platform is what we can bring to bear, right? Because we're bringing open source, we're bringing proprietary technologies, but if it's done independently, it doesn't really deliver on the promise of a platform. I will say that architecturally, that's incredibly liberating, right? To know that there is this one common, it's also highly requested by customers too. That's what they want. That's what they want. It's the path to get there; I think we're at that intersection right now. It's crossing the chasm. So what's liberating? Give us some. Oh, just the fact that you know that if there's a common access control layer under the hood, if there's a common governance layer under the hood, you don't have to compromise and come up with an alternative proposition for taking some capability, maybe deploying a model to a scoring engine, right? You can have the one purpose-built scoring engine and know that I can call that in on demand from the discovery phase, right, to go to production, and I don't have to sort of engage in another separate buying conversation, a separate entitlement conversation, a separate enablement conversation. This catalog is, you know, allowed to work together. It's kill velocity. I mean, that to me, from a team sport perspective, is the steps you have to take. So think of ETL, right? ETL really in a modern real time, like getting away from batch and going to real time, that's just flow, right? So the skill set and the ownership and the infrastructure associated with that has evolved, especially in cloud, where that's just a dynamic where it's going to be a team deciding, here's the data I want, here's how I want to enrich it, here's how I want to govern and curate it. It's a team sport, I love that.
We were just at Strata Hadoop, we had our BigData SV event, and the collision between batch and real time, they're not mutually exclusive, and some people just made bets on batch and forgot real time. They have real time people who don't do batch. So you have to see that kind of coming together. Convergence. So the question, Adam, for you is that, with the world kind of moving in that direction, right, how do you rationalize to the customer saying, hey, I'm cloud native but I also have a hybrid here and I want to be cloud native purely on these net new applications. So there's a conversation happening, I call it the DevOps of data, which is like DataOps. Hey, I'm a programmer, I just want data as code. I just don't want to get in the weeds of setting up a data warehouse and prepping an ETL, that batch stuff that someone else does. I'm writing some software. I want data native to my app, but I don't want to go in and do the wrangling. I don't want to go out. I just want stuff to magically work. How do you tackle that premise? I mean, I think the DevOps of data piece is certainly a topic we're going to be hearing a lot more about over the next, you know, six months and a year. And I think the reason for that is precisely because of this earlier topic of operationalization. You've got lots of people building up budding, you know, data science teams and so on, right? And the first thing they're going to do is be working in a discovery area. They won't be in the world of pushing things to production. When they do, it's going to become more important that the folks who truly understand the details of the algorithm are close enough to the deployed, you know, assets so that they can understand how this model is behaving over time, so that they can understand, you know, new data quality issues that might have cropped up, right? And get close to that without, you know, obviously sort of breaking the separation of duties that is important for a production system.
So I think that is one part of the DataOps conversation that hasn't yet been worked out and is going to be a real opportunity for folks to have this. That's an emerging area, you would agree, right? It's a cultural shift too. I mean, that is a rethinking, because most companies keep data in stovepipes. They're highly regulated. There are rules, there's kind of personalities that own them, so to speak. The proposition that we've been on and every client asks for is, how do I create a common fabric that gives access to people that is governed and curated? So you can almost give a shopping experience. People that work with data do not want to talk about and say, how long does it take to stand up a server? Well, when can I get the data stood up in this staging area so I can actually access it? That's over. It's interesting, we're doing Wikibon research on this, and this is the point where people look at value extraction of the data, so it's kind of, if you're a hammer, everything looks like a nail. So if you're in IT, it's infrastructure. If you're on the business line, it's the app. So you're seeing the shift where apps are creating the value, but the infrastructure is more elastic, more composable. So it's enablement by itself. So, interesting, so your thoughts on that, guys: where is that value of the data coming from most right now? Is it the apps, is infrastructure still evolving? The hybrid not? We think about it. I mean, we think there's a value model here, right? Like there's certainly elements of the data pipeline that are purely operational, reporting based and things like that, right? Which drive value on their own. But we also recognize that it's new uses of data and new business processes that are primarily driven by applications, driven by conversational interfaces, driven by these sort of emerging paradigms. And one of our goals in the data platform is to ensure that clients can move along that curve more aggressively.
How are people getting started with the Watson Data Platform? Are they jumping all in? Is there a community edition? You can try it before you buy it kind of thing? Yeah, so you're signing up at Bluemix. You have access to a set of services around the platform. You have a 30 day window where you can try everything included within it. And then at some point you've got to commit to a credit card or you've got to commit to a 12 month term agreement. I think in parallel we see a lot of the companies that, you know, and the blessing slash challenge for IBM is, we have a lot of clients. We have a lot of clients that we're working with today in traditional architectures and infrastructure, helping them, you know, through a methodology, helping them with the right skills. That is a more traditional, hey, come in and try an analytic workload on the platform. We'll give the skills, we'll help do the enablement, and then we're off and running. I think the big difference is, more often than not, clients are paying for it and they're willing to pay for it, because we are helping them get to this new model. We're helping them get to the platform. And I think the big thing we're working through is, how do we get to velocity? Right, I think, you know, when you look at, you know, these workloads that are happening, the reason they're happening is, now data is not just in some dark corner. With AI, with machine learning, it's always on. Right, so there's a lot of different ways in which you can unleash that, and then how do you take advantage of it? And that is a cultural shift. It's, you know, rethinking business models, it's rethinking how you have skills deployed, which is incredibly exciting for us. And I think the market in general, I think back to how, you know, AI I think is cast in many cases as, you know, the robots are going to rule the world.
There's a lot of good that can come from exposing vast amounts of data to AI and to frameworks where you can get a lot of value out of it. From how to better position products, to, you know, better design of medicines, to, you know, fulfillment chains in, you know, countries that need help. So guys, in the last minute that we have, I want you to take a minute to, either together or one of you guys, talk about how IBM is helping solve what seems to be the number one question we get asked on theCUBE. Hey, how do you help me build the hybrid architecture? I have more data-rich workloads coming on board now. I have some heavy, heavy, data-rich workloads that I run on-prem. I've got more cloud action coming. I've got IoT and I'm investing in data science. So how do you guys specifically help me build a hybrid cloud architecture that's going to scale and support data-rich workloads and propel my data science operation? Yeah, so I'll take the basics for me. It is the data-first method. It is dashDB, which is extensible on-prem, hybrid, in the cloud, so that's a common analytic fabric. There's Data Connect, which is our ability to move data, batch or continuous, into different end states in the cloud. And then there's Data Science Experience. So Data Science Experience is our offering that brings together community, it brings together content, brings together various tooling for the data scientist or data engineers. And I think the other piece of this is we have something called solutions assurance. So we're literally designing patterns that we stand up in our own environments that reflect what we see on-premise and what we see workloads going to the cloud with, and stamping those as hybrid architectures that are repeatable, and we remove risk, the operational risk. But the reality is, where you have data, clients have to make sacrifices in getting to the cloud. You have to deprecate, you have to rethink.
And that's where some of the smoothing of those rough edges comes in, the discipline of us saying, here's a supported architecture, here's the destination that you're going to, and we're going to have to work together to get there. Which is the fun part. And that's what we're all in this for. It's getting to outcomes. I think the key is not to pretend that these environments are completely identical to one another, right? There are things that the public cloud is uniquely well suited for. So let's make sure that those kinds of use cases are really nailed there, right? And then there are other cases where you're dealing with mainframe systems running critical business processes and you want to be able to infuse that process with some analytics. So you have to look at the use case. Maybe it's training a machine learning model in the cloud, being able to export that model and run it. So you use proven solutions and be prepared to handle new ones coming on board? All right, Derek Schoettle, general manager, and Adam Kocoloski, the CTO. The leaders of IBM Watson data group. IBM Watson platform. This is theCUBE. Back with more live coverage after this short break.