 Live from Las Vegas, it's theCUBE. Covering IBM Think 2018, brought to you by IBM. Hello everyone, I'm John Furrier. We are here in theCUBE at IBM Think 2018. Great conversations here in the Mandalay Bay in Las Vegas for IBM Think, which is six shows wrapped into one, all combined into a big tent event. Good call by IBM, great branding. Our next guest is Rob Thomas, CUBE alumni, general manager of IBM Analytics. Great to see you. John, great to see you, thanks for being here. We love having you on theCUBE; you've been on many times. I mean, you've seen the journey. I can remember when I first talked to you, almost four or five years ago. Data, Hadoop, big data analytics, data lakes, it's all evolved significantly, to the point where Ginni's major keynote speech has data at the center of the value proposition. I mean, we've said that before. Yes. The data is the center of the value proposition. Every company is finally waking up. And then I had coined the term, the innovation sandwich: blockchain on one side of the data, and you've got AI on the other, essentially software. This is super important with multi-cloud. You've got multiple perspectives. You've got regions all around the world, GDPR, which everyone's been talking about and you guys have been working on lately. But the bigger question is, the technical stacks are changing. 30 years of stacks evolving. The technology under the hood is changing, but the business models are also changing. This puts data as the number one conversation. That's your division. Your keynote here, what are you guys talking about? Are you hitting that note as well? So number one is, think of this ladder to AI. We've talked about that before. Every client's on a journey towards AI, and there's a set of building blocks everybody needs to get there. We used the phrase once before: there's no AI without IA. Meaning, if you want to get to that endpoint, you've got to have the right information architecture. We're going to focus a lot on that. 
We've got a new product release called IBM Cloud Private for Data, which takes all of the assembly out of the data process, a really elegant solution to see all your enterprise data. That's going to be the focus for me this week. I want to get into that, but I also heard Scott, your VP of marketing now, talk about how bad data can cripple you. I want to explore what that actually means, because it's always been dirty data. It's been kind of a data science word or data warehouse word: clean data, data cleanliness. But if you're going to use AI as a real strategic thing, you need high-quality data. You do. Your thoughts? Think backwards from the shiny object, because everybody loves the shiny object, which is some type of an AI outcome, customer centricity, making your customer feel like a celebrity. There's two things that have to happen before that, really three. One is you need some type of inferencing, a model layer, where you're actually automating a lot of the predictive process. Before that, you need to actually understand what the data is. That's the data governance, the data integration. And before that, you need to actually have access to the data, meaning know where it's stored. Without those things, you just have a shiny object, and not necessarily an outcome. That's why these building blocks are fundamental. The clients that don't get to this point are the ones that try to jump straight to the shiny object, and they don't have the data to support it. And then you've got companies going through digital transformation, where basically all their data is legacy and they're trying to modernize it. Then you have the modern companies like Uber, and we saw the first fatality of an Uber car this week. Again, it points out the reality that real-time is real-time, and the importance of having data, whether it's sensor data. We're not fully there yet, but you can start to see it happening. Real-time data is key. That means data mobility is critical. And you mentioned private, public. 
Storing the data and moving data around, having data intelligence is the most important thing. Real-time data in motion, intelligence, you know, where are we? Is that a setback with the Uber incident? Is it a step forward? Is it learning? What's your view of the data quality of movement and real-time? I think data ingestion is one of the least talked-about topics, yet one of the most important. With IBM Cloud Private for Data, we can ingest 250 billion events a day. Let me give you some context for that. In 2016, the entire credit card industry, everywhere in the world, did 250 billion transactions. So what credit cards do in a year, we can do in a day. The biggest stock trading day ever on the New York Stock Exchange, what got done in that entire day, we can do in the first 40 minutes of trading. The value there is, how fast can you bring data in to be analyzed? And can you do a decent bit of that pre-processing or analytics on the way in? That's how you start to solve some of the problems you're describing, because it's instant, and at unsurpassed amounts of data. So ingestion is a key part of the value chain, if you will, on data management, the new kind of data management. Ingesting and understanding context, is that where AI kicks in? Where does the AI kick in? Because the ingestion speaks to the information architecture. Now I've got to put AI on top of that data, so is the data different? Talk about the dynamic between, okay, I'm ingesting data for the sake of ingesting, and where AI connects. So you've got the data. AI starts where you're saying, all right, now we want to automate this. We're going to build models. We're going to use the data that we've got in here to train those models. As we get more data, the models are going to get better. Now we're going to connect it to how humans want to interact. Maybe it's natural language processing. Maybe it's visualizing data. That's the whole lineage of how somebody gets towards this AI idea. 
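The in-flight pre-processing idea Rob describes can be sketched in a few lines. This is purely illustrative Python, not anything from IBM Cloud Private for Data: the event shape and the flagging threshold are invented for the example. It also checks the throughput arithmetic, since 250 billion events a day works out to roughly 2.9 million events per second.

```python
def preprocess(events):
    """Filter malformed events and tag notable ones on the way in,
    the 'analytics during ingestion' idea described above."""
    for event in events:
        if "value" not in event:                  # drop malformed records in-flight
            continue
        event["flagged"] = event["value"] > 1000  # simple in-flight analytic
        yield event

# 250 billion events/day is roughly 2.9 million events/second:
EVENTS_PER_DAY = 250_000_000_000
per_second = EVENTS_PER_DAY / 86_400

raw = [{"value": 50}, {"bad": True}, {"value": 5000}]
clean = list(preprocess(raw))
print(round(per_second))  # ~2.89 million events per second
print(clean)
```

A generator is used deliberately: at streaming rates you process events as they arrive rather than buffering a day's worth before any analysis happens.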
What are some of the conversations you're having with customers, and how have they changed? Give some color. I mean, only a few years ago, we were talking about data lakes. Right. Okay, what is the conversation now? Give some context on how far that conversation has gone down the road towards advancement. I think we're going from data lakes to an idea of a fluid data layer, which is all of your data assets managed as a single system, even if they sit in different architectures, because there's no one, we all know this, we've been around this industry forever, there's no one way to support or manage data that's going to support every use case. So this idea of a fluid data layer becomes critical for every organization. That's been one big change. The other big change is containers. What we're doing with Cloud Private for Data is based on Kubernetes. That's how people want to consume applications, but nobody's really solved that for data. I think we're solving that for data. Let's dig into that. It was one of the topics I wanted to drill down on. Containers have been great for moving workloads around. Certainly, Kubernetes has been a great orchestration tool. How does that fit for data? Am I just putting datasets in a container? Who's addressing the envelope of that container? How is it addressable? I mean, how does it work? Let me give you an analogy. Go back to 1955: there are no standards in any shipping port around the world. Everybody's literally building their own containers, building their own ships, building their own trucks. It's incredibly expensive, and it takes forever to get cargo to move from one place to the next. 1956, a guy named Malcom McLean invents the first intermodal shipping container. He patents it, and it becomes the standard. So now every port, every container looks identical. What's the benefit? Well, sure, it saved money and created more flexibility; 90% of the cost came out of shipping a container. 
But the biggest thing is it changed commerce. You look at GDP at that time, it took off, all because of the standardization around a form factor that made it accessible to everybody. Now, let's put that in the IT world. We've got containers for the application world, which made it much easier to deploy, a standard again. And to program around. More cost-effective. Yep, exactly. What's the cargo in IT? It's data. Data is the cargo. That's what's sitting inside the container. Now you have to say, how do we actually take the same concepts that we did for applications and make them available for data, so that my data can fit anywhere? That's what we're doing. How does that work, and what's the impact to the customer? Is it IBM software that you're doing? Is it Kubernetes open source software? Just tie that together for me. So IBM Cloud Private is our Kubernetes distribution with some different pieces we've put on it. When you add Cloud Private for Data, it's got a Spark engine. Everything we do is based on open source to start with. And then we have an experience for a data scientist, an experience for a data analyst. It's your view to your enterprise data. You'll love the UI when you see it. First, above the fold: all my machine learning models in the organization, what's working, what's not working. Below the fold: what's my data, structured or unstructured? Sensitive, non-sensitive? I click on it, and I can see all of my data: Hadoop, Cloud A, Cloud B, Cloud C, on-premise systems. You get a view to all of your data. So is the purpose to move the data around? No, the purpose is actually the exact opposite. Leave the data in place, but be able to treat it as a single data environment. We're doing a lot of work with federation, our SQL technology, which historically, as we all know, hasn't really performed well. 
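The "leave the data in place, treat it as a single data environment" idea can be illustrated with a toy federation layer. This is a sketch under assumptions, not IBM's federation technology: two in-memory SQLite databases stand in for separate platforms (say, a Hadoop cluster and a cloud warehouse), and the table and column names are invented.

```python
import sqlite3

def make_source(rows):
    """Create an isolated in-memory database standing in for one data platform."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

hadoop = make_source([("east", 100.0), ("west", 250.0)])
cloud_a = make_source([("east", 75.0)])

def federated_query(sql, sources):
    """Run the same query against every source in place and union the results,
    so the caller sees one logical dataset without any data being moved."""
    results = []
    for db in sources:
        results.extend(db.execute(sql).fetchall())
    return results

rows = federated_query("SELECT region, amount FROM sales", [hadoop, cloud_a])
print(sorted(rows))  # rows from both sources, queried where they live
```

The point of the sketch is the shape of the abstraction: the query fans out to the data rather than the data being copied to the query, which is what makes the "fluid data layer" different from a data lake.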
Okay, so in the use case in my head, I've stored the data in my private cloud, secure, comfortable, feeling good about it, but I have a public cloud app. How does that work? Is it a replica of the data? Is it just a container that makes it addressable? How does that move across? So click a button, move the data. If you want it to be a replica, click a button and say replicate. If you want to just move it, just click a button and move it. It's literally that easy. And so the customers can choose where to put the data. Yes. Can they do a public version of this, or only private? Both, it connects to public as well. Okay, that's worth mentioning. Okay, cool, what's the most exciting thing for you this week going on in your world? Obviously, you know, data at the center of the value proposition, and Ginni used your line, so I'm sure you fed her some good sound bites there, because she was basically taking your pitch as the headline for the keynote. Is that the highlight, or is it customer activity? I think the exciting thing, and Ginni did talk about it, is connecting data to AI. I'd say many clients have kind of thought of those as two different topics. We do that in three ways. One, a common machine learning fabric: you can build a model in Watson and deploy it where your enterprise data is, or vice versa. Two, we do it with the metadata: you create business or technical metadata on premise, and you can push that to Watson, or vice versa. And three, like we just talked about, we make the data movement incredibly easy. So we're uniting these two worlds of data and AI that have tended to be different parts of an organization in many clients. We're uniting that. I think that's pretty interesting. All right, so final question, I've got to ask the tough one, which is: okay, Rob, I love it, but I'm really not paying attention to data, because I have my hands full in my IT transformation, and we're making critical decisions on cloud. Globally, you have multiple regions to deal with. 
I've got different issues in each digital nation, but I'm going to get to the data after. What's in it for me, your whole pitch? I'm dealing with cloud right now. So what should I be cross-connecting between the cloud decision and the cloud conversations that relates to the benefits of what you're doing? If you're not paying attention to data, you're not going to be around. So your cloud decisions are kind of worthless, because you're not going to be around if you're not paying attention to the data. So I'm going to make a bad cloud decision if I don't factor in what? I believe you have to think about your data strategy. Look, every organization is going to be multi-cloud, but you have to have a single data strategy regardless of what your cloud strategy is. You've got to think about all those building blocks I talked about: manage data, collect data, govern data, analyze data. That has to be one strategy, regardless of cloud. If you're not thinking about that, you're in trouble. Or making sure that I have Kubernetes. Is that a good decision? That is a great decision. It makes it really easy, seamless, to deploy applications, to deploy data, to move it around clouds. And what's the business model for containers? Does it kind of shift to being a commodity? I think over time, yes, but there's so much to do around containers, because, again, go back to the analogy: it's just the crate. It's not the cargo. It's not the ship. It's just the crate. It's one piece. Yeah, and there's no lock-in. A lot of choice there. Clients can do whatever they want. Yeah. All right, we love Kubernetes. We'll be at KubeCon in Copenhagen next month, so keep a lookout there for us. This is Rob Thomas here inside theCUBE at IBM Think, breaking down all the action in the data science world, the data world. It's the center of the value proposition. The main story here at IBM Think is data at the center of the value proposition for the modern enterprise. 
I'm John Furrier inside theCUBE. Be back with more after this short break.