Live from San Francisco, it's theCUBE, covering IBM Think 2019, brought to you by IBM.

Everyone, welcome back to the live coverage here in Moscone North in San Francisco for IBM Think. This is theCUBE's coverage. I'm joined by Dave Vellante, and we've got two great guests here: Carlos Gavila, Chief Data Officer of Claro Colombia, and Carlos Appagacy, engagement manager on IBM's Data Science Elite team. An IBM customer, and a conversation around data science. Welcome to theCUBE, thanks for joining us.

Thanks for having us.

All right, so we're here, the streets are shut down, AI anywhere is a big theme, multi-cloud, but it's all about the data, everywhere. People are trying to put end-to-end solutions together to solve real business problems. Data's at the heart of all this: moving data around from cloud to cloud, using AI and technology to get insights out of it. So take a minute to explain your situation, what you guys are trying to do.

Okay, perfect. Right now we're working a lot on the business side, because we need to use machine learning models and other artificial intelligence to make the best decisions for the company. We were working with Carlos on a churn model, in order to know how we can keep customers from leaving the company, because for us it's very important to retain our customers, to understand their behavior. Artificial intelligence is an excellent way to do it. We have a lot of challenges around that, because we have a lot of data across different systems, and we need to put all the information together to run the models. The elite team that Carlos is leading right now is helping us a lot, because we need to know how to handle that, how to clean the data, and how to do the right governance for the data, and the IBM team is very committed to us on that. Cephi, one of the engineers who is very close to us right now, was working a lot with my team to run the models.
She was doing a lot for us, and right now we are trying to do it over the Hadoop system running Spark, and we think that's the path that's going to get us to the goal. We need to retain our customers.

So you guys are the largest telecommunications provider, Claro, for voice and home services. Are those the segments you guys are targeting?

Yeah.

Just scope and size, how big is that?

Claro is the largest telecommunications company in Colombia. We have maybe 30 million customers in Colombia, more than 50% of the market share. Also, we have maybe 2.5 million homes in Colombia, which is more than 50% of the market for home services. And you know that's a big challenge for us, because the competitors are trying all the time to take our customers, and the churn model helps us avoid that, and machine learning is a very good way to do it.

So a classic problem in telecommunications is churn. Right, so it's a data problem. So how did it all come about? These guys came to you?

Yeah, they came to us, we got together, we talked about the problem, and churn was at the top. These guys have a ton of data. So the team got together. The way the Data Science Elite team works is we really help clients in three areas: it's all about the right skills, the right people, the right tools, and then the right process. So we put together a team, we put together some agile approaches on what we were going to do. And then we started by spinning up an environment. We took data in, and there was a lot of data, terabytes of data. We took their user data, their usage data, which is things like how many texts and cell phone minutes, and their billing data. We pulled all that together in an environment. Then the data scientists, alongside Carlos's team, really worked on the problem.
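As an aside on that consolidation step: the team describes joining user, usage, and billing records from separate systems before any modeling. A minimal sketch of that kind of join, using plain Python and hypothetical field names (the real pipeline ran on Spark over Hadoop, and these records are illustrative, not Claro's data):

```python
# Hypothetical stand-in for the consolidation step: user, usage, and
# billing records from separate systems are merged on a customer ID
# before modeling. All field names here are illustrative.

users = [{"customer_id": 1, "plan": "postpaid"},
         {"customer_id": 2, "plan": "prepaid"}]
usage = [{"customer_id": 1, "texts": 120, "minutes": 340},
         {"customer_id": 2, "texts": 15, "minutes": 60}]
billing = [{"customer_id": 1, "late_payments": 0},
           {"customer_id": 2, "late_payments": 3}]

def join_on_customer(*tables):
    """Merge rows from each table into one record per customer_id."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row["customer_id"], {}).update(row)
    return list(merged.values())

records = join_on_customer(users, usage, billing)
# Each record now combines all three sources, e.g. for customer 2:
# {'customer_id': 2, 'plan': 'prepaid', 'texts': 15, 'minutes': 60,
#  'late_payments': 3}
```

On Spark the same idea would be expressed as DataFrame joins keyed on the customer ID; the point is just that one record per customer, combining all sources, is what the models consume.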
And they addressed it with machine learning, obviously targeting churn. They tried a variety of models, but XGBoost ended up being one of the better approaches. And they came up with pretty good accuracy, about 90 to 92% precision on the model.

So predicting...

Predicting churn.

So what did you do with that data?

That is a very good question, because the company is preparing to handle that. I have a funny story. I said to the business people, okay, these customers are going to leave the company. And then I forgot about it. Two months later, I asked, okay, what happened? They said, okay, your model is very good: all those customers left. Oh my God, what happened there? They were not acting on the information. That is the reason we are now thinking the right way is to work from right to left, starting from the purpose, and the purpose is to retain our customers. In that case, we lost 15,000 customers because we didn't do anything. Now we are closing the circle, we are taking care of that, and prescriptive models can help us do it. And if it turns out to be, say, an invoicing problem, we need to correct it, to fix the problem in order to avoid the churn. The first part is to predict, to get a score for churn and have people act on it, and obviously to also work on root cause analysis, because we need to fix things from the root.

Carlos, walk us through the scope of the project, because this is a concern we see in the industry: I've got a lot of data, how do I attack it? What's the scope? Do you just come in and ingest it into a data lake? How do you get to the value of these insights quickly? Because obviously they're starving for insights. Take us through that process.

Well, you know, every problem's a little different. We help hundreds of clients in different ways, but this particular problem was a big data problem, because we knew we had a lot of data. They had a Hadoop environment, but some of the data wasn't there.
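A note on the 90 to 92% precision figure quoted above: precision measures, of the customers the model flags as churn risks, what fraction actually churned. A toy illustration with made-up numbers (not Claro's data or model):

```python
def precision(predicted, actual):
    """Of the customers predicted to churn, what fraction actually did.

    predicted, actual: dicts mapping customer ID -> bool (churn flag).
    """
    flagged = [c for c in predicted if predicted[c]]
    if not flagged:
        return 0.0
    true_positives = sum(1 for c in flagged if actual[c])
    return true_positives / len(flagged)

# Toy data: 10 customers, the model flags 5 (IDs 0-4),
# and 4 of those 5 really churned (0, 1, 2, 3 but not 4).
predicted = {i: i < 5 for i in range(10)}
actual = {i: i in {0, 1, 2, 3, 7} for i in range(10)}
print(precision(predicted, actual))  # 0.8
```

High precision is what makes a retention campaign economical: almost everyone you spend an offer on was genuinely at risk. It says nothing about the churners the model missed, which is why recall is usually tracked alongside it.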
So what we did was spin up a separate environment. We pulled some of the big data in there, we also pulled some of the other data together, and we started to do analysis on that separately, in the cloud, which was a little different. But we're working now to push that down into their Hadoop data lake, because not all the data's there, but some of it is, and we want to use some of that compute.

So you had to almost do an audit, almost figure out what you want to pull in first.

Absolutely. Tie it to the business.

On the business side, what were you guys doing, waiting for the answers? On your side of the process, how did it go down?

Thinking about the business, we were talking a little bit about the architecture to handle that. ICP for Data, we think, is a very good solution for this, because we need the infrastructure to help us get the answers. Because, finally, we have a question: why are the customers leaving us? And the answer is in the data: the data handled in a good way, with governance, with data cleaning, with the right models. And right now our concern is business action and business offers, because the solution for the company lies in all the new products that are coming from the data.

So 10 years ago, you probably didn't have a Hadoop cluster to solve this problem. And the data was maybe in a data warehouse, or maybe it wasn't. And you probably weren't a chief data officer back then; that role kind of didn't exist. So a lot has changed in the last 10 years. My question is, first of all, I'd be interested in your comment on that, but then, do you see a point at which you can take remedial action, or maybe even automate some of that remedial action, using machine intelligence and that data cloud, or however else you do it, to actually take action on behalf of the brand before humans, or even without human involvement?
Do you foresee a day?

Yeah. So, just a comment on your thought about the times: I've been doing technology for twenty-something years, and data science is something that's been around, but it's kind of evolved the way software development did. My thought is, we have these roles of data scientists, but a lot of the feature engineering and data prep does require the traditional people who were DBAs and are now data engineers, and a variety of skills come together. That's what we try to do in every project, just to add that comment. As far as predicting ahead of time, help me understand your question a little bit.

So you've got 93% accuracy, okay? I presume you take that, you give it to the business, the business says, okay, let's maybe reach out to them, maybe do a little incentive or some kind of action. Can the machines take action on behalf of your brand? Do you foresee a day where that can happen?

Yeah. So, my thought, for Claro Colombia and Carlos, is that what remains is that the predictive models we built will obviously be deployed, and then they'll interact with their digital mobile applications. So in real time, it'll react for the customers. And then obviously you want to make sure that Claro and company trust that, and that it's making accurate predictions. That's where we have to do some model validation and evaluation, so they can begin to trust those predictions. I think that's where we're at.

I want to get your thoughts on this, because you're doing a lot of learning here. So can you guys each take a minute to explain the key learnings from this? As you go through the process, certainly on the business side, there's a big imperative to do this. You want a business outcome that keeps the users there. But what did you learn? What were some of the learnings you guys got from the project?

The most important learning for the company was cleaning the data.
That sounds funny, but as we say in analytics: garbage in, garbage out. That was very, very important for us. One of the things we learned is that we need to put data cleaning across the systems. Also governance: many people forget about the governance of the data, and right now we're working again with IBM in order to put that governance in place.

So, a data quality problem.

Yeah, data quality.

And do you report into, like, a COO or the CIO? Are you a peer of the CIO? How does that organization work?

That is another funny story, because right now I am working for planning, business planning for the company. I came from engineering, and now I'm working in planning, trying to make money for the company. And you know, that's an engineer thinking about how to get more money for the company. For example, there's a kind of analytics, geospatial analytics, that I was using as an engineer to understand how the network is performing, the quality of the network. And right now I'm using the same software, the same knowledge, to know which are the best points to make sales. It's a good combination. So, finally, I'm working for planning, and my boss, the planning chief, reports to the CEO. I've heard about different organizations: sometimes the CDO sits in finance, or the CDO reports to IT; it depends on the company. Right now I'm working for planning: how to handle things, how to make more money for the company, how to handle churn. And it's interesting, because all the knowledge I have from engineering is perfect for it.

Well, I would argue that's the job of a CDO: to figure out how to make money with data, or save money.

Yeah, absolutely.

That's number one anyway, start there.

Yeah, the thing we always talk about is really proving value.
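The "garbage in, garbage out" lesson in practice often comes down to simple validity checks before modeling: drop records that can't be keyed, and repair obviously invalid values. A minimal sketch of that idea; the field names and rules here are hypothetical illustrations, not Claro's actual pipeline:

```python
# A minimal data-cleaning pass of the "garbage in, garbage out" kind:
# drop rows with no customer key, clamp impossible values.
# Field names and rules are illustrative only.

def clean(rows):
    cleaned = []
    for row in rows:
        if row.get("customer_id") is None:
            continue                      # unusable without a key
        minutes = row.get("minutes", 0)
        if minutes < 0:
            minutes = 0                   # negative usage is a data error
        cleaned.append({**row, "minutes": minutes})
    return cleaned

raw = [{"customer_id": 1, "minutes": 120},
       {"customer_id": None, "minutes": 50},   # no key  -> dropped
       {"customer_id": 3, "minutes": -7}]      # invalid -> clamped
print(clean(raw))
# [{'customer_id': 1, 'minutes': 120}, {'customer_id': 3, 'minutes': 0}]
```

The governance point in the interview is the complement to this: cleaning rules only help if they are applied consistently at the systems feeding the models, not ad hoc per project.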
It starts with the use case, identify where the real value is, and then the technology can come and the development can happen after that.

So I agree 100%. Carlos, thanks for coming on. Largest telecommunications provider in Colombia, great customer reference. Carlos, take a minute to explain real quick, get a plug in for your Data Science Elite team. What do you guys do? How do you engage? What are some of the projects you work on?

Right, yeah. So we're a team of about 100 data scientists worldwide. We work side by side with clients, and our job is to really understand the problem from end to end and help in all areas, from skills to tools to technique. And we prototype in about three agile sprints. We use an agile methodology, about six to eight weeks, and we try to develop what we call a proof of value. It's not an MVP just yet, or a POC, but at the end of the day we prove out that we can build a model, we can do some prediction, we get a certain accuracy, and it's going to add value.

You guys just jump in. It's not a freebie, right?

It actually is a freebie.

I'm sorry, I'm sorry. It's not a for-pay service. It's a freebie, right?

Yeah, it's no cost. We don't like to use "free," but that's okay.

But you've got to be interesting, right? That's the short of it. It's a good lead.

We don't charge. But it's something clients can take advantage of if they've got an interesting problem and they're potentially going to do some business.

If you're the largest telecommunications provider in the country, you get a freebie, and then...

That's exactly it.

The key is you guys dig in. You guys dig in.

We dig in. It's practitioners, real practitioners, with the right skills, working on the problem. And great praise, by the way, for Claro Colombia's team; they were amazing. We had a really good time in Colombia, six to eight weeks, working on the problem, and those guys all loved it too.
Before they knew it, they were coding in Python and R. They already knew a lot of this stuff, but they were digging in with the team, and it came together well.

This is the secret to modernization and digital transformation: co-creating together.

Absolutely.

You guys do a great job, and I think this is a trend we'll see more of. Of course, theCUBE is bringing you live coverage here in San Francisco at Moscone North; that's where our set is. They're shutting down the streets for IBM Think 2019 here in San Francisco. More CUBE coverage after this short break. We'll be right back.