Live from New York, it's theCUBE, covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. Welcome back to New York City, everybody. This is theCUBE, the leader in live tech coverage. We're here at the IBM Machine Learning Launch Event, bringing machine learning to the Z platform. Steve Astorino is here. He's the VP of development for the IBM Private Cloud Analytics Platform. Steve, good to see you. Thanks for coming on. Hi, how are you? Good, thanks. How are you doing? Good, good. Down from Toronto. So this is your baby. It is. This product, right? You developed this thing in the labs and then pointed it at platforms, right? So talk about what's new here today, specifically. So today we're launching and announcing our IBM Machine Learning product. It's a new solution that automates machine learning and lets data scientists and line-of-business analysts work together to create models, apply machine learning, make predictions, and ultimately build new business models, right? So, provide better services for their customers. So how is it different from what we know as Watson Machine Learning? Is it the same product pointed at Z, or is it different? It's a great question. Watson is our cloud solution, our cloud brand. We're building something on private cloud for private cloud customers and enterprises. So, the same product, but built for private cloud as opposed to public cloud. Think of it more as branding; Watson is a bigger solution set on the cloud. So again, it's your product, your baby. Tell us what's so great about it. How does it compare to what else is on the market? Why should we get excited about this product? So, actually, a bunch of things. It's great from many angles. What we're trying to do, obviously, is based on open source.
It's an open platform, just like the other products we've been launching over the last six months to a year. It's based on Spark. We're bringing all the open source technology to your fingertips, and we're integrating it with the top-notch research and capabilities that IBM is driving in-house, providing one experience for doing machine learning. That's at a very high level. Also, there are three things we're calling out. There's freedom: being able to choose what tools, environment, and language you want to use, whether it's Python, Scala, or R, right? There's productivity: we make it simple to be productive and build these machine learning models, which an application developer can then leverage within their application. And the other one is trust: IBM is very well known for its enterprise-level capabilities, whether it's governance or trust of the data and how to manage the data. But also, more importantly, we're creating something called the feedback loop, which allows the models to stay current and lets the data scientists and administrators know when a model is degrading, for example, to make sure it's giving you the right outcome. Okay, so you mentioned it's built on Spark. When I think about the effort to build a data pipeline, I've got to ingest the data, explore it, process it, clean it up, and then ultimately serve the business somewhere. What pieces of that does Spark unify and simplify? So we leverage Spark for the analytics, obviously. But when you're building a model, you have your choice of tooling, whether it's programmatic or not, and that's one of the value propositions that we're bringing forth.
But then we create these models, we train them, we evaluate them, and we leverage Spark for that. And then, obviously, we're trying to bring the models to where the data is. So one of the key value propositions is that we operationalize these models very simply and quickly. At the click of a button, you can say, hey, deploy this model now, and we deploy it right where the data is. In this case, we're launching it on the mainframe first. So with Spark on the mainframe, we deploy the model there, and you can score the model directly in Spark on the mainframe. It's a huge value add; you get better performance, right? Right, okay. And then in terms of differentiation from the competition, you're the only company, I think, providing machine learning on Z, right? So. Definitely, definitely. That's pretty easy. But in terms of the capabilities that you have, how are you different from the competition? When you talk to clients and they say, well, what about this vendor or that vendor, how do you respond? So let me talk about one of the research technologies that we're launching as part of this as well. It's called CADS, the Cognitive Assistant for Data Scientists. It's a feature that takes the complexity out of building a model: you give it the algorithms you want to work with, and the assistant basically returns which one performs best. All of a sudden, you have the best model to use without having to spend potentially weeks figuring out which one that is. So that's a huge value proposition. Okay, so that automates the choice of algorithm: an algorithm to choose the algorithm. Right. And what have you found in terms of its level of accuracy, in terms of the best fit? It actually works really well.
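At a high level, this kind of assistant tries each candidate algorithm on held-out data and keeps the best performer. A minimal toy sketch of that idea, with trivial stand-in "algorithms" (this is an illustration of the concept, not IBM's implementation):

```python
# Toy sketch of automated algorithm selection: score each candidate
# model on a held-out split and keep the best performer.
# The candidates here are trivial rules, purely for illustration.

def majority_rule(features):
    """Baseline: always predict the most common class (0)."""
    return 0

def threshold_rule(features):
    """Predict 1 when the first feature exceeds a cutoff."""
    return 1 if features[0] > 0.5 else 0

def accuracy(model, data):
    """Fraction of (features, label) pairs the model gets right."""
    correct = sum(1 for x, y in data if model(x) == y)
    return correct / len(data)

def select_best(candidates, holdout):
    """Return (name, accuracy) of the best-scoring candidate."""
    scores = {name: accuracy(m, holdout) for name, m in candidates.items()}
    return max(scores.items(), key=lambda kv: kv[1])

# Tiny held-out set: single-feature examples with 0/1 labels.
holdout = [([0.9], 1), ([0.8], 1), ([0.2], 0), ([0.1], 0), ([0.7], 1)]
candidates = {"majority": majority_rule, "threshold": threshold_rule}
best_name, best_score = select_best(candidates, holdout)
```

The real system searches over statistical learning algorithms and their configurations rather than hand-written rules, but the shape of the loop is the same: evaluate, compare, return the winner.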
And in fact, we have a live demo that we'll be doing today where CADS comes back with a 90% accurate model, in terms of the data we're feeding it and the outcome it gives you on which model to use. It works really well. And choosing an algorithm is not like choosing a programming language, right? I mean, there's bias if I like Scala or R or whatever, Java, Python. Right. I've got skill sets associated with it, but the algorithm choice is one that's more scientific, I guess, right? It is more scientific. It's based on the statistical algorithms themselves, right? And the selection of the algorithm, or the model itself, is a huge deal, because that's what you're going to drive your business on. If you're offering a new service, that's where you're going to be providing that solution from, right? So it has to be the right algorithm, the right model, so that you can build that more efficiently. What are you seeing as the big barriers to customers adopting machine learning? So I think it's the hottest thing around right now. Everybody wants machine learning. It's great. It's a huge buzz. I think the hardest thing is they know they want it, but they don't really know how to apply it in their own environment, or they think they don't have the right skills. So that's actually one of the things we're going after, enabling them to do that, right? For example, we're building different industry-based examples to showcase how you would use it in your environment, right? Last year, when we did the Watson Data Platform, we did a retail example. Today we're doing a finance example: a churn example, with customers potentially churning and leaving a bank, right? So we're looking at all those different scenarios, and we're also creating hubs, locations we're launching and announcing today. Actually, Dinesh will be doing that.
There's a hub in Silicon Valley where we'll allow customers to come in and work with us, and we help them figure out how they can leverage machine learning. So this is a great way to interact with our customers. So Steve, the nirvana in this world, and you gave that retail example in September when you launched the Watson Data Platform, is that you can use data and maybe put out an offer, or save a patient's life, or effect an outcome in real time. The retail example was just that. If I recall, you were making an offer in real time, and it was very fast, and it was a live demo; it wasn't just a fakey. In the example on churn, the outcome is to affect that customer's decision so that they don't leave. Is that right? Pretty much. Essentially, we're using live data, we're using social media data, bringing in Twitter sentiment about a particular individual, for example, and trying to predict whether this user is happy with the service they're getting or not. For example, people will go and socialize: oh, I went to this bank and I hated the experience, or they really got me upset, or whatever. So we bring in that data from Twitter, open data, and merge it with the bank's data; banks have a lot of data which they can leverage and monetize, right? Then we compute a sentiment and use machine learning to predict: is this customer going to leave me or not? What probability is there that they're going to leave, based on the machine learning model? And in the scenario we're using now, if we think they're going to leave us, we make special offers to them, right? So it's a way to enhance your service for those customers so that they don't leave you.
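The churn scenario described above, combining a social sentiment signal with the bank's own account data to score whether a customer will leave and then deciding whether to make an offer, can be sketched in miniature. The weights, feature names, and offer threshold below are all hypothetical; a real deployment would learn them from historical data:

```python
import math

# Toy churn score combining bank account features with a social
# sentiment signal. Weights and thresholds are made up for
# illustration, not learned from real data.

def churn_probability(avg_balance, complaints, sentiment):
    """Crude logistic-style score in [0, 1]; higher means more
    likely to leave. sentiment is in [-1, 1], -1 = very negative."""
    z = 0.5 * complaints - 2.0 * sentiment - avg_balance / 5000
    return 1 / (1 + math.exp(-z))

def next_action(prob, threshold=0.5):
    """Decide whether to make a retention offer."""
    return "make_offer" if prob >= threshold else "no_action"

# A content customer vs. one with complaints and negative tweets.
happy = churn_probability(avg_balance=8000, complaints=0, sentiment=0.6)
unhappy = churn_probability(avg_balance=1200, complaints=3, sentiment=-0.8)
```

The point of the sketch is the shape of the decision: merge the signals, score the risk, and trigger an action while the customer can still be kept.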
So operationalizing that would be: a call center has some kind of dashboard that shows red, green, yellow, and boom, here's an offer you should make, and that's done in near real time. We define real time as before you lose the customer. That's right, that's right. As good a definition as any. But it's actually real time, when we do what we call the scoring of the data, right? As the data is coming in, as the transaction is coming in, you can actually make that assessment in real time. It's called in-transaction scoring: you can make that call right on the fly, determine whether this customer is at risk or not, and then make a smarter decision about the service you're providing, whether you want to offer something better, right? So is the primary use case for this those streams, those areas where I'm getting, whether it be the Twitter data you mentioned, or maybe IoT, where you're getting sensor data? Or can we point machine learning at just archives of data, looking historically, or is it mostly the streams? It's both, of course. Machine learning is based on historical data, right? That's how the models are built. The more historical data you have, the more accurately you pick the right model, and the better your prediction of what's going to happen next time, right? So exactly, it's both. How are you helping customers with that initial fit? How big of a data set do you need? Do I have enough to really build a model? How do you help customers work through that? So in my opinion, obviously to a certain extent, the more data you have in your sample set, the more accurate your model is going to be. If your sample is too small, your prediction is going to be inaccurate, right? So it really depends on the scenario. It depends on how many features or fields you're looking at within your data set.
So it depends on many things, and it varies by scenario. But in general, you want to have a good chunk of historical data that you can build expertise on, right? Yeah, so you've worked on both the Watson services in the public cloud and now the private cloud. Yes. Is there any differentiation? Do you see a significant use-case difference between those two, or is it just kind of where the data lives, and we're going to do similar activities there? So it is similar. At the end of the day, we're trying to provide similar products on both public cloud and private cloud. But for this specific case, we're launching it on the mainframe, right? So that's a different angle, I guess. But we know that's where the biggest banks, insurance companies, and retailers in the world are, and that's where the biggest transactions are running. And we really want to help them leverage machine learning and get their services to the next level. I think it's going to be a huge differentiator for them. So you gave an example before of Twitter sentiment data. How would that fit into this announcement? So I've got this ML on Z, and then what? I API into the Twitter data? How does that all get ingested and consolidated? So we allow hooks to access data from different sources, right? To bring in data. That's part of the ingest process. And then once you have that data there, in data frames, in the machine learning product, you're feeding it into a statistical algorithm to figure out what the best prediction, the best model, is going to be. So I have a slide that you guys are sharing on the data scientist workflow. It starts with ingestion and then selection, preparation, generation, transform, model. It's a complex set of tasks.
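The workflow stages just named, ingestion, selection, preparation, transform, and model, can be sketched as one toy pipeline in plain Python. Every stage function here is an illustrative stand-in for what a real product wires together:

```python
# Toy end-to-end sketch of the data scientist workflow: ingest raw
# records, select the relevant fields, transform them into numeric
# features, and feed them to a model. All stages are stand-ins.

raw_records = [
    {"id": 1, "balance": "8000", "complaints": "0", "left_bank": "no"},
    {"id": 2, "balance": "1200", "complaints": "3", "left_bank": "yes"},
]

def ingest(records):
    """Pull raw rows in from a source (here, an in-memory list)."""
    return list(records)

def select_features(row):
    """Keep only the fields the model will use."""
    return {"balance": row["balance"], "complaints": row["complaints"],
            "label": row["left_bank"]}

def transform(row):
    """Convert strings to numeric features and a 0/1 label."""
    return ([float(row["balance"]), float(row["complaints"])],
            1 if row["label"] == "yes" else 0)

def model(features):
    """Trivial stand-in model: flag accounts with any complaint."""
    return 1 if features[1] > 0 else 0

# The pipeline: each stage feeds the next.
dataset = [transform(select_features(r)) for r in ingest(raw_records)]
predictions = [model(x) for x, _ in dataset]
```

The criticism the interview turns to next is precisely that these stages have historically lived in separate tools; the pitch is that one platform carries a record through all of them.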
And typically, historically, at least for the last five or six years, we've had different tools to do each of those, and not only different tools, but multiples of different tools that you have to cobble together. If I understand it correctly, the Watson Data Platform was designed to consolidate and simplify that, and provide collaboration tools for different personas. So my question is this, because you were involved in that product as well. I was. And I was excited about it when I saw it, and I talked to people about it. Sometimes I hear the criticism of, well, IBM just took a bunch of legacy products, threw them together, put an abstraction layer on top, and is now going to wrap a bunch of services around it. Is that true? Absolutely not, actually. In fact, you might have heard a while back that IBM has made a big shift to a design-first methodology. So with the Watson Data Platform and the Data Science Experience, we started with a design-first approach. We looked at this: what do we want the experience to be? Which persona do we want to target? Once we understood what we wanted the experience to be, we leveraged IBM's analytics portfolio to feed in and provide those services, integrating them together to fit that experience. It's not a dumping ground of, I'll take this product, oh, it's part of Watson Data Platform. No, that was not at all the case. It was design first, and then integrate for that experience. Okay, so there are some so-called legacy products in there, but you're saying you picked the ones that were relevant. And then, was there additional design done? There was a lot of work involved to take those from traditional products, to componentize them, to create a microservices architecture; the whole works, to redesign them to fit into this new experience.
So a microservices architecture runs on cloud; really, I think it only runs on cloud today, right? Correct, correct. Okay, maybe roadmap, without getting too specific. What can you tell us? What should we be paying attention to in the future? What should we expect? So right now we're doing our first release. Definitely we want to target any platform behind the firewall. We don't have specific dates, but right now we started with machine learning on the mainframe, and we want to target the other platforms behind the firewall, in a private cloud environment. Definitely we should be looking at that. I talked about the feedback loop a little bit: essentially, once you've deployed a model, you can schedule an evaluation automatically within the machine learning product that asks, is this model still good enough, right? And if it's not, we automatically flag it, and then we look at the retraining process and the redeployment process to make sure you always have the most up-to-date model. So this is truly machine learning, where it requires very little to no intervention from a human. And we're going to continue down that path, continuing that automation and providing those capabilities, right? So there's a bigger roadmap, but there are a lot of things we're looking at. Our big data analyst, George Gilbert, has talked about how you had batch, and you had interactive, and now the emergent workload is this continuous streaming data. First of all, is that a valid assertion that this is a new class of workload, and how do you see that adoption occurring? Is it going to be a dominant force over the next 10 years? Yeah, I think so. I mean, right now, like I said, there's a huge buzz around machine learning in general, artificial intelligence, deep learning, all of these terms you hear about.
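The feedback loop described above, a scheduled evaluation that flags a degraded model and triggers retraining and redeployment, might look like this in miniature. The accuracy threshold, the model representation, and the retraining rule are all hypothetical stand-ins:

```python
# Minimal sketch of a model feedback loop: evaluate a deployed
# model on freshly labelled data, flag it when accuracy falls
# below a threshold, and retrain. Everything here is a toy.

def evaluate(model, labelled_batch):
    """Accuracy of the deployed model on newly labelled data."""
    hits = sum(1 for x, y in labelled_batch if model(x) == y)
    return hits / len(labelled_batch)

def retrain(history):
    """Stand-in retraining: pick the cutoff that best fits history."""
    best = max((0.3, 0.5, 0.7),
               key=lambda c: sum(1 for x, y in history if (x > c) == bool(y)))
    return lambda x, c=best: 1 if x > c else 0

def feedback_loop(model, batch, history, min_accuracy=0.8):
    """One scheduled evaluation cycle; returns (model, was_retrained)."""
    if evaluate(model, batch) < min_accuracy:
        return retrain(history), True   # degraded: retrain and redeploy
    return model, False                 # still good enough: keep it

old_model = lambda x: 1 if x > 0.9 else 0          # stale cutoff
fresh = [(0.6, 1), (0.8, 1), (0.4, 0), (0.95, 1)]  # recent labelled data
new_model, retrained = feedback_loop(old_model, fresh, fresh)
```

Running the cycle on a schedule is what makes the loop "very little to no intervention from a human": the evaluation, the flag, and the retrain all happen without anyone watching a dashboard.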
But yeah, I think as users and customers get more comfortable with understanding how they're going to leverage this in their enterprise, this real-time streaming of data and being able to do analytics on the fly and machine learning on the fly is a big deal, right? And it really helps them be more competitive in their own space and in the services they're providing, right? Okay, Steve, thanks very much for coming on theCUBE. I'm going to give you the last word. The event, a very intimate event, a lot of customers coming in very shortly here, just in a couple of hours. Give us the bumper sticker. Oh, it's very exciting. We're very excited. This is a big deal for us. Whenever IBM does a signature moment, it's a big deal for us, and we've got something cool to talk about. So we're very excited about that. Lots of clients are coming. There's an entire session this afternoon, which will be live streamed as well. And yeah, it's great. I think we have a differentiated product, and we're already getting that feedback from our customers. Well, congratulations. I love the cadence that you're on. We saw some announcements in September, we're here in February, and I expect we're going to continue to see more innovation coming out of your labs in Toronto and across IBM. So thanks very much for coming on theCUBE. Thank you. You're welcome. All right, keep it right there, everybody. We'll be back with our next guest right after this short break. This is theCUBE, live from New York City.