Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks.

Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have with us Vimal, the global data business group ecosystem lead at Accenture. He's coming to us straight from the Motor City. So welcome, Vimal.

Thank you, Rebecca. Thank you, Jim. Looking forward to talking with you for the next ten minutes.

So before the cameras were rolling, we were talking about data veracity, and how managers can actually know that the data they're getting, that they're seeing, is trustworthy. What's your take on that right now?

Right. So in today's age, data is coming at you at a velocity you never thought about, right? Organizations today are gathering data on the order of petabytes, and that's the new normal. We used to talk about gigabytes, and now it's petabytes. And the data is coming in the form of images and video files, from the edge, you know, edge devices and sensors, and from social media and everything else. This data is becoming the fuel for the new economy. The companies that can find a way to take advantage of it, and figure out a way to use it, are going to have a competitive advantage over their competitors.

But just because it's coming at that volume and velocity doesn't mean it's useful, right? The thing is, if they can find a way to make the data trustworthy to the organization, and at the same time governed and secured, that's what's going to matter. I mean, it used to be called data quality, right? When the data was structured and everything was maintained in SAP or some system, it was good when it came to you.
But now you need to take advantage of tools like machine learning and artificial intelligence, combining those algorithms and tool sets with the ability of people's minds, and making it so things can largely happen by themselves while staying trustworthy. We have offerings around that, and essentially it differs from industry to industry. In some cases the data coming in is only worth something for 15 seconds; after that it has no use, other than, you know, for understanding how to prevent something, right? Sensor data, for instance. So we have offerings in place to make the data trustworthy, governed, and secured for the organization to use, and to help organizations get there. That's what we're doing.

Is the standard user of your tools a data steward in the traditional sense? Or is it a data scientist or data engineer who's trying to, for example, compile a body of training data for use in building and training machine learning models? Do you see those kinds of customers for your data veracity offerings, that customer segment, growing?

Yes, we see both sides of it, pretty much every walk of customer, right? You've hit the nail on the head: yes, we do see that type of user. And alongside the data scientists, you're also getting another set of people: the citizen data scientists.

What is that? That's a controversial term. I've used it on a number of occasions, and a lot of my colleagues and peers among other analysts shoot me down and say it demeans the profession of data science. So tell me how Accenture is defining it.

So the thing is, it's not demeaning. The fact is, to become a citizen data scientist, you need the help of a data scientist. Basically, every time, you need to build a model, then feed it some data to learn from, and then have an outcome to put out.
So you have a data scientist create the algorithm. But what citizen data scientist means is: even if I'm not a data scientist, I should be able to take advantage of a model built for my business scenario, feed in whatever data I need to feed in, get an output, and have that tool tell me, go do this, or, you know, don't do this. So I become a data scientist by using a predefined model developed by an expert, or by the minds of many experts together. Rather than going and hiring 100 experts, I go and buy a model, and one person can maintain or tweak that model continuously. So how can I enable that larger volume of people by using more models? That's what it's about.

If a predictive analytics tool that you license from whatever vendor includes pre-built machine learning models for particular tasks, do you, as a user of that tool, automatically become a citizen data scientist? Or do you need to do some actual work with that model or data to live up to the notion of being a citizen data scientist?

It's a good question. In my mind, I don't want to do that work. My job is something else: to make something for the company. My job is not creating a model. My job is, I know where my sets of data are; I want to feed them in and get an outcome, so I can go and say, I can increase my profit, increase my sales. That's what I want to do. So I may become a citizen data scientist without even knowing it, right? I won't even be told that I'm using a model. I will take this set of data, feed it in here, and it's going to tell me something. From our data veracity point of view, we have these models built into some of our platforms. That can be a tool from Hortonworks, taking advantage of their data storage tools, or our own algorithms put in.
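The flow Vimal describes — an expert builds and publishes a model once, and a business user only loads it, feeds in their own data, and acts on the answer — can be sketched minimally in Python. Everything here is illustrative: the "model" is just stored coefficients with made-up feature names, not an actual Accenture or Hortonworks artifact.

```python
import pickle

# --- Expert side: a data scientist builds and publishes a model once. ---
# Here the "model" is just stored coefficients for a toy churn score;
# in practice it would be a trained ML model (names are illustrative).
expert_model = {"weights": {"support_calls": 0.4, "months_inactive": 0.6},
                "threshold": 2.0}
blob = pickle.dumps(expert_model)  # stands in for a model file in a repository

# --- Citizen data scientist side: load the model, feed in data, act on it. ---
model = pickle.loads(blob)

def predict_action(model, record):
    """Return a recommended action, not the math: the user never edits the model."""
    score = sum(model["weights"][k] * record[k] for k in model["weights"])
    return "offer retention discount" if score >= model["threshold"] else "no action"

customer = {"support_calls": 3, "months_inactive": 2}
print(predict_action(model, customer))  # → offer retention discount
```

The point of the split is exactly what the interview describes: the user on the second side never sees the weights, only the recommended action.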
So that helps you create and maintain data veracity on a scale of, say, one to five, with one being low and five being high, and maintain it at the five level. That's the objective.

Yeah, so you're democratizing the tools of data science for the rest of us to solve real business problems. On the data veracity side, are you saying that the user of these tools is doing something to manage, to correct or enhance or augment the data that's fed into these pre-built models to achieve these outcomes?

Yes. I mean, I wouldn't say they augment the data; they feed the data, the training data, and it comes out with an outcome that says, go do something. It tells you to perform something, or not to. It's still an action; it comes out with an action to achieve a target.

Very good. You mentioned Hortonworks, and since we are here at DataWorks, the Hortonworks show, tell us a little bit about your relationship with that company.

Definitely. Hortonworks is one of our premier strategic partners, and we've been their number one implementation partner for the last two years in a row, implementing their technology across many of our clients. From a partnership point of view, we have jointly developed offerings. What works best is that we have very good industry knowledge, so with our industry knowledge and their technology together, we're creating offerings we can take to market. For example, we used to have data warehouses built on Teradata and older data warehouse technologies. They're still good, but at the same time people also want to take this unstructured data, image files, incorporate it into the existing data warehouses, and get value out of the whole thing together, right?
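The "value out of the whole thing together" idea — legacy warehouse rows and new semi-structured feeds landing side by side so one question can span both — can be pictured with a small Python sketch. The table names, fields, and the analytics question are illustrative assumptions, not part of any actual Accenture or Hortonworks offering.

```python
import json

# Legacy warehouse extract: structured rows (think a Teradata export).
legacy_sales = [
    {"customer_id": 1, "total_spend": 1200.0},
    {"customer_id": 2, "total_spend": 450.0},
]

# New semi-structured feed: raw JSON events from the web or edge devices.
raw_events = ('[{"customer_id": 1, "event": "viewed_support_page"},'
              ' {"customer_id": 1, "event": "opened_complaint"}]')

def combine(legacy, events_json):
    """Land both sources in one 'lake' view keyed by customer."""
    view = {r["customer_id"]: dict(r, events=[]) for r in legacy}
    for e in json.loads(events_json):
        if e["customer_id"] in view:
            view[e["customer_id"]]["events"].append(e["event"])
    return view

# One analytics question now spans both worlds: which high-value
# customers (legacy data) are showing complaint behavior (new data)?
view = combine(legacy_sales, raw_events)
at_risk = [c for c, r in view.items()
           if r["total_spend"] > 1000 and "opened_complaint" in r["events"]]
print(at_risk)  # → [1]
```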
So for that, that's where Hortonworks-type tools come into play. We have developed an offering called Modern Data Warehouse: taking advantage of the legacy systems you have, plus this new data coming in together, and immediately you can create an analytics use case to do something. We have pre-built programs and scripts that take in different types of data, move it into a data lake, a Hortonworks data lake, and then use it together with your existing legacy data to help you create analytics use cases. That's what we call our Data Modernization Offering.

So these are pre-built models for specific vertical industry requirements or specific business functions, predictive analytics, anomaly detection, natural language processing. Am I understanding correctly?

Yes, we have industry-based solutions as well, but it begins with the data supply chain itself, right? Bringing the data into the lake to use it. That's one of the offerings in play.

The data solution pipeline, so it's pre-packaged models and rules and so forth?

Right, pre-packaged data ingestion and transformation, packaged to take advantage of the new data sets along with your legacy data. That's one offering, the Data Modernization Offering. And it extends to cloud, right? We can take it to cloud, Hortonworks in the cloud, whether it's Azure, AWS, HPE, any cloud, plus moving the data. So that's one type of offering. And today we actually announced another offering jointly with Hortonworks, using the Atlas and Ranger tools to help with GDPR compliance.

Oh, right. Can you explain what that tool does specifically to help customers with GDPR? And does it work out of the box with Hortonworks Data Steward Studio?

Well, on that one I'd have to get you an answer from my colleagues, who are much more technical. But I can tell you functionally what the tool does.

Okay, please.
So today with GDPR, basically, there are regulations saying you need to know about your personal data.

Yes.

And you have control, you have your own destiny, over your personal data. If you're an EU resident, you can call a company and say, forget about me. Or say, modify my data. They have to do it within a certain timeframe; if not, they get fined, and the fine can be up to 4% of the company's revenue. So it can be a very large fine.

So what we do is basically take this tool, working with the Hortonworks Atlas and Ranger tools, and we can go in and scan your data lake at the metadata level, and come and showcase, put in one place, where personal data about a consumer lies. In a legacy situation, you know, data originates someplace, somebody takes it and puts it in a system, then somebody else downloads it to Excel, somebody else puts it in an Access database, these kinds of things. Now your data is pollinated across everywhere, and you don't know where it lies. In this case, in the lake, we can scan it and capture the metadata and the lineage information, so you immediately know where this data lies. When somebody calls, when Rebecca calls and says, no longer use my information, I know exactly that it's stored in this place, in this table, in this column. Let me go and take it out of there, so that Rebecca doesn't exist anymore. That's the idea behind it. And we can also catalog the entire data lake, not just the personal information but everything, all the other dimensions as well, and use that to our business advantage. So that's what we announced today.

We're almost out of time, but I want to finally ask you about talent, because this is a pressing issue in Silicon Valley and beyond, really across the tech industry: finding the right people, putting them in the right jobs, and then keeping them happy there.
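Before moving on, the scan-and-erase flow Vimal just walked through can be pictured with a toy catalog in plain Python. The catalog structure, table names, and fields are illustrative stand-ins for what a metadata scan would produce; this is not the actual Apache Atlas or Ranger API.

```python
# Toy metadata catalog: which tables/columns in the lake hold personal data.
# In the real offering this index would come from scanning the lake's
# metadata and lineage with tools like Apache Atlas; all names are made up.
catalog = {
    "crm.customers":   {"pii_columns": ["name", "email"]},
    "web.clickstream": {"pii_columns": ["email"]},
    "ops.inventory":   {"pii_columns": []},
}

lake = {
    "crm.customers":   [{"name": "Rebecca", "email": "r@example.com"},
                        {"name": "Jim", "email": "j@example.com"}],
    "web.clickstream": [{"email": "r@example.com", "page": "/home"}],
    "ops.inventory":   [{"sku": "A1", "qty": 7}],
}

def locate_subject(lake, catalog, email):
    """Step 1: use the catalog to find every table holding the subject's data."""
    return [t for t, meta in catalog.items()
            if meta["pii_columns"]
            and any(row.get("email") == email for row in lake[t])]

def erase_subject(lake, catalog, email):
    """Step 2: honor a right-to-erasure request in each located table."""
    for table in locate_subject(lake, catalog, email):
        lake[table] = [r for r in lake[table] if r.get("email") != email]

erase_subject(lake, catalog, "r@example.com")
print(locate_subject(lake, catalog, "r@example.com"))  # → []
```

The catalog is the key piece: without it, step 1 would mean scanning every copy of the data, which is exactly the "pollinated across everywhere" legacy problem described above.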
So recruiting, retaining, what's Accenture's approach?

Especially in this area, talent is the hardest part. Thanks to Hortonworks, though, and the Hortonworks partnership...

Maybe send some of them to Detroit, where the housing is far more affordable.

Not a bad idea. Exactly. But the fact is...

We're both Detroiters, yeah.

What we did was, Accenture has access to Hortonworks University, all their educational material. So we decided to take advantage of that and enhance our talent by retraining our people, taking people who know the legacy data world and bringing them to the new one. Using Hortonworks University, their materials and their people's help, together we're going to train about 500 people in each of several geographies, 500 apiece, and also in our delivery centers in India, the Philippines, these places. So we have a larger plan to retrain legacy people for the new world, as well as to go and hire people straight out of college and build them up from there, from analyst to consultant to the technical level. That's the way we're doing it. And actually, in the group I work with, our group technology officer, Sanjeev Vohra, is basically in charge of training about 90,000 people in different technologies around that space. The magnitude is that high, but that's our approach: go and train people, and use that to help clients.

Are you training them all to be well-rounded professionals in all things data? Or are you training them for specific specialties?

Very, very good question. We have what's called a master data architect program. At the different levels, after people go through these trainings, they have to do so many projects, come back, have an interview with a panel of people, and get certified within the company at a certain level.
At the master architect level, you go and help a customer drive their data transformation, their architecture, their vision of where they want to go. So we have that program within our university, and that's the way we take people, step by step, to that level.

Great. Vimal, thank you so much for coming on theCUBE. It was really fun talking to you.

Thank you so much for having me. Thank you.

I'm Rebecca Knight, for James Kobielus. We actually will not have any more coming up from DataWorks; this has been the DataWorks show. Thank you for tuning in, and we'll see you next time.