Live from Munich, Germany. It's theCUBE, covering IBM Fast Track Your Data. Brought to you by IBM.
We're back in Munich, Germany. This is Fast Track Your Data, and this is theCUBE, the leader in live tech coverage. We go out to the events. We extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host, Jim Kobielus. We just came off of the main stage. IBM had it very choreographed, really beautiful. Kate Silverton was there, of BBC fame, talking to various folks within the IBM community, IBM executives, practitioners, and it was quite a main stage production, Jim. IBM always knows how to do it right. Manish Goyal is here. He's the director of product management for the Watson Data Platform, something we covered on theCUBE extensively, that announcement last year in New York City. Manish, welcome to theCUBE. So it really was your signature moment last fall at Strata in New York City. We covered that big announcement, a lot of customers there. You guys demonstrated sort of the next generation of platform that you guys were announcing. So bring us up to date. How's it going, where are we at, and what are you guys doing here?
So again, thank you for having me. Let me take a minute to just let all the viewers know what the Watson Data Platform is, right? So the Watson Data Platform is our cloud analytics platform. And it's really three things, right? It's a set of composable data services, right? For ingest, analyze, persist. It's a set of tailor-made experiences for the different personas, right? Whether you are a data engineer, a business analyst, a data scientist, or a data steward. And connecting all of these, or both of these, is a data fabric, right? Which is really the secret sauce, right? And think of this as being the governance layer, right?
That ensures that everything that is being done by any of these personas is working on trusted data and that the insights that are being generated can be trusted by the risk folks, the business folks, as they put the analytics into production.
So just a review for our audience. There are a number of components to the Watson Data Platform. There's a governance component you mentioned, there's visualization, there's analytics. Now, many people criticized the Watson Data Platform. They said, oh, it's just IBM putting a bunch of, you know, disparate products together, some acquisitions, and then wrapping some services around it. When we talked to you guys in October, you said, no, no, that's not the case. But can you affirm that?
Yeah, no, that is exactly right. That is not the case. It's not just us putting stuff together, calling it a new name, and saying that's the platform, right? That it's just a set of disparate services. That is absolutely not the case. And that's why I was emphasizing this common data fabric, right? It's got a couple of components. Let me sort of dive a little deeper into it, right? So the biggest problem that customers and data users in general complain about is that it's extremely hard to find data, right? The tools that they're working with are all siloed, right? So if you and I are working on, you know, an analytics project, it's very hard for me to share what I'm working on with you, the environment that I'm running on with you, et cetera. And the third piece is a real issue with, you know, is the data that I'm working with trusted, right? Can I actually believe that this is the best data that I can use, so that when I put something into production, when I create my machine learning models and put them into my, you know, production environment, the risk guys are going to be fine with it, and, you know, I'm going to be fine with the results I'm actually getting?
And so getting this data fabric, right, which is addressing these issues, one is addressing it first and foremost with a data catalog, right, a governance layer, so that it's very clear, irrespective of whether you're a data engineer, a business analyst, a data scientist, or the data steward, right, from the CDO's office, you're all working off the same version of the truth, right?
So is that something like a DevOps platform? Is it like DevOps for data science or for machine learning development, or how would you describe it? Does that make sense? An automated release pipeline?
In a way, yes. In a way, that's one way to describe it. So that's one aspect, right, making sure that you're working with trusted data, making it very easy to find the data, right, that's sort of the governance aspect. The second piece that sort of really makes this a platform is that you're working off the same notion of a workspace, we call it a project, right? So you may start out as a data engineer, right, being asked to sort of, you know, take all these different data sources that are coming in and, you know, create and publish a data set that can be consumed for dashboarding, for data analysis, whatever, right? And you're working on that in a project. Now if you have a data science team that needs to be working off the same thing, you can just invite them to the same project, right? So they're working off the same thing, and similarly for a business analyst, et cetera. And all of these results, right, and when we talk about governance, it's not about just data sets, it's all analytical products, right? So the models that you're creating are being put back into the catalog and governed. Data flows.
Model governance, model governance and data governance.
Exactly, right, so it's a huge problem that customers have. I was just talking to a large insurance company yesterday and, you know, their question was, what are you doing to make sure that I don't have to spend the enormous amount of time that I have to with the risk group before I can put a model into production? Because they want complete lineage all the way back, saying, okay, you created this model, you're going to put it into production, whether it's for allowing credit card insurance or whatever your product is that you're selling, right? How do you make sure that there's no bias in the model that was created? Can you show me the data set on which you trained it? And then when you retrained it, can you show me that data set, right? So in case they're audited, there's a complete way to go back from the production model all the way to the data set that was created, right? And that goes even further back, to all the different data sources, right? Where it was cleansed, et cetera, right? The ETL process, where it was published and then picked up by the data science team. So all of these things, right? Putting it together with this data fabric, right? Governance being a huge, huge portion of that, which goes across everything that we're doing, giving these tailor-made experiences for the different business personas, or sort of the data personas, and just making it extremely simple to generate insights that can be trusted. So that is what we're trying to do with the Watson Data Platform. And, you know, since last fall when we announced it, we've had a huge update on our Data Science Experience. You heard a lot about that in the presentation this morning, as well as all of our other cloud data services and the governance portfolio.
Data Science Experience is embedded, fundamental to the platform.
It is, it is.
And, you know, I want to ask you about that because I don't know if you remember, Jim and Manish, a few years ago, several years ago, Pivotal announced this thing called Chorus. It was a collaboration platform and it really went nowhere. Now, part of the reason it went nowhere was because it was early days, but also there wasn't the analytics solution underneath it. But a lot of people questioned, well, do we really need to collaborate across those personas? Again, maybe they were immature at the time. So convince me that there's a need for that and that this is actually, you know, getting used in the world.
I mean, you've probably all seen these, the Venn diagram of a data scientist, right? With all the different skills that they need, they are a unicorn, and there are no unicorns, right? It's extremely hard for our customers. In fact, you know, just finding a really good data scientist is extremely hard. There's a very limited supply of that talent. So that's one thing, right? You can't find enough of these folks to scale out the level of analytics that is needed if you want to use data for a competitive advantage. So that's one aspect, talent being a huge issue. The second aspect of it is you really do need specialized skill in data engineering. You don't want your PhD data scientists spending 60% of their time finding and cleansing data, right? You have folks who really do that well, right? And you want to enable them to work closely with the data science team. And you really do need business analysts, who are the key to sort of understanding the business problem that needs to be solved, because that's where you always want to start any analytics product. What is it that you're trying to improve, you know, or reduce cost on, or whatever your problem is that you're addressing. And so it really is a team sport. You can't just do it without. Now, if it is a team sport, how are these folks going to collaborate, right?
And that is why, you know, in all of our interactions with our customers and their data science teams, you know, they absolutely love the collaboration features that we have put in. We've put a lot of effort into Data Science Experience, and the same collaboration features are actually going to extend across the portfolio of these experiences on the data platform.
The whole notion of personas is so fundamental to Watson Data Platform. And I'm wondering, is IBM evolving the range and variety of personas for which you're providing these experiences? What I mean by that is, for example, we see more and more data science application development projects around chatbots, and that involves human conversation. You possibly need another persona there, a computational linguist. Or, you know, cognitive IoT, like Watson. You know, IoT, that's sensors, that's hardware devices, maybe hardware engineers, hardware engineering experiences. What I'm getting at is that data science-centric projects are increasingly moving from the totally virtual world to being very much embedded in the physical world and the world of human, you know, machine learning guided conversation. What are your thoughts about evolving the personas mix?
So application developers, a persona I actually missed when I was talking about this before, are actually central, right? Because almost anything that the data science team is doing is going to, at the end of the day, create models, but the hope is that they're going to be put into production systems, right? And that job, typically, is the role of an application developer, right? Now, Jim, you mentioned, you know, there's a lot of emphasis these days on conversation, chatbots, right? And again, right, I mean, at the end of the day, you know, with data science projects you are in many ways trying to improve the experience that you're giving your customers, right?
Or personalizing the experience that you're giving your customers, the celebrity experience that Rob talked about this morning. And there are other personas involved in that sense, right? So to get a chatbot right, right? I mean, there is data that you can obviously harvest and use to create that flow, right? And the intelligence in the chatbot. But there are elements where you do need a subject matter expert to curate that, right? To make sure that it doesn't seem robotic, that it does feel genuine, right? And so there is a role for a subject matter expert, you know, we sort of club it with the business analyst role, right, or persona. But yes, right, all of these roles play an important part in sort of putting together the entire package, where it just feels seamless. And that's why I sort of come back to saying that, you know, it is a team sport. And if you do not enable the teams to work closely together, right, and enhance their productivity, you can't go after all the data that's being generated and all the opportunity that data is presenting to enterprises to gain a competitive advantage.
One of the things, Manish, you demonstrated last fall was this sort of recommendation engine, and it was very personalized. And it was quite a nice demo, and it wasn't a fake demo from what I understood, it was real data. Can you share with us, in the time we have remaining, just some of your favorite examples of how people are applying the Watson Data Platform and affecting business?
Sure, yeah, so I'll take a couple of examples. So I was actually in London earlier this week meeting with a customer, and they are using DSX, Data Science Experience, with a couple of utility companies, right? One is a water company, a water utility company. And the problem that they're trying to solve is, you know, they're supplying water in a hilly area, right? And they want to optimize the power that they use to power the pumps, to pump out water, right?
Because it can be very expensive if the pumps are running all the time, et cetera. And so they're using Data Science Experience to optimize when and how, right? And how long the pumps need to run, right? To ensure that, you know, the customers are happy with the level of water supply that they're getting and the force that they're getting it at, while the utility company is optimizing the expense of actually powering these things, right? So that's just a recent example that comes to mind. There are others. There's a huge logistics and transportation company that is using Data Science Experience to optimize the refrigeration of the storage units that are going all across the globe, transporting food and other articles like that. How they can optimize the temperature of the goods that they're transporting. Again, to make sure that, you know, there's absolutely the minimum amount of wastage that occurs in the transportation process, but at the same time, right, optimizing the cost that they incur, because all of that sort of shows up in the end product that you and I buy from retailers.
And is there instrumentation in the field involved in that? Is that kind of a semi-IoT example?
Absolutely, right? So in actually both of these cases, right? In one case, there are smart meters that are throwing out data every 15 minutes. In the other example, the logistics one, right? It is, you know, data that is almost streaming as it comes in, right? So in one case, you can use batch processing, even though the data is coming in at 15-minute intervals, to predict what you want to do. In the other case, it's streaming data, which you want to analyze as a stream.
Excellent, all right, well, exciting times here for you and your group. Congratulations on getting the product out and getting it adopted, so glad to see that. And thanks for coming on theCUBE.
Thank you.
All right, keep it right there, everybody. Jim and I will be back.
We're live from Munich, Germany, unscripted, bringing theCUBE to you, bringing you Fast Track Your Data. We'll be right back.