So today's talk is going to be on how we leverage data and analytics to improve business performance. As Professor was saying, I'm a data analytics consultant and an author. I've consulted for some big names such as SAP, P&G and GE, and for small companies as well. But whether you talk about the big companies or the small companies, the question is the same: how do we get value out of data analytics? That's the question companies have been asking me. The size, complexity and scale might vary, but the fundamental question remains the same. And that is today's presentation: I'll give you a quick overview of how I approach this problem.
I've written three books, which you can see right behind me: Data for Business Performance, Analytics Best Practices, and Data Quality, which is the recent one. I'm a blogger at Forbes, SAP Insider and cfo.university, and I teach data analytics at IE Business School in Madrid, Spain. I have a PhD, an MBA, and so on and so forth.
In most of the places I've been going these days, people have been asking, "Hey Prashanth, can we trust ChatGPT? Is it really good?" and all those things. So I put the question to ChatGPT. I asked ChatGPT, "Who is this guy, Prashanth Southekal?" And ChatGPT gave this answer. It says he is an expert in the field of data analytics, specifically in the area of data-driven decision making. I didn't put this anywhere on the internet, so it looks like ChatGPT compiled all this information and presented it to me. It looks pretty good, so I've been using a version of this in my recent talks. Long story short, if you are going to ask me about the accuracy and reliability of ChatGPT, you know which direction I would be going.
Okay. So this is what I have for you today. First we'll talk about data and business value in my own words.
How do I define data, and how is it tied to business value? Then we'll talk about five strategies for transforming data into a business asset. And I'll talk about a case study, a project I did for a retail and CPG firm based in Washington state, which is just south of Canada, to give you an idea of how it's actually done. It's not some theoretical or conceptual thing I'm talking about; it's how I applied all the things I've been talking about.
So let's start with analytics. If somebody were to ask you, "Hey, what's analytics? How do you define analytics?", what would you tell them? Type your answers in the chat box. What's your definition of data analytics? "Getting insight from data." That's good. I like two keywords there, insight and data, in Mingfang's response. Lauren says "information", which is synonymous with insight most of the time. Very good, Lauren; very similar to what Mingfang was saying. I'll go for one more answer. Wisdom: Viana says "data to wisdom". Great answers.
So we have a lot of definitions, especially on the industry side. When we talk about the definition of data analytics, some people talk about insight, some say decision making, some talk about performance management, and so on. Especially in industry, it all depends on whom you ask. For example, take oil and gas. If you ask people in oil and gas what data analytics is, their definition is mainly centered on data capture, because, all said and done, data analytics in oil and gas has been pretty much regulatory-centric: they need the data to show to the regulators if something goes wrong. On the other hand, if you ask the CPG and retail firms, "What's your definition of data analytics?", they say it is about deriving insight so that they can better understand the product, the customer, and so on.
On one hand, the oil and gas companies are focused mostly on markets and commodities. On the other hand, the retail and CPG firms are focused mostly on products and customers. So it all depends on where the focus is.
Okay, now within an industry, let's talk about functions. If you ask people in finance for their definition of data analytics, it is mostly about reporting, because in North America we have GAAP, in the rest of the world we have IFRS, and they need to produce reports to comply with those standards. Whereas if you talk to people in supply chain, they say data analytics is more about understanding the customer. Again, on one hand we have the regulators, and on the other hand we have the customers.
And then the tech companies. I worked at SAP for a long time, and if you talk to SAP, their definition of analytics is pretty much what they're selling. When I left SAP in 2012, they were selling over 200 products: ERP, CRM, GRC, BI, BW, you name it, SAP has a product. So SAP talks about the whole data lifecycle, from data capture to data integration, data science and decision science. Whereas SAS, one of the companies I advised, doesn't do much work on data capture. Their work is mostly on data science and ML, so their definition of analytics is centered on data science. So SAP looks at all four stages of the data lifecycle, but when you talk to SAS, Power BI and Tableau, their definition of analytics is more centered on data science.
So it's almost like six blind men looking at an elephant and describing the elephant in their own words. Each one comes with their own perspective and describes analytics in their own way.
So to encompass all these definitions, all the varied value propositions of what analytics is about, my definition of data analytics is: asking questions to derive insights from data in order to measure and improve business performance. If you don't believe in the concepts of measuring and improving business performance, you are not going very far with data analytics. When I talk to my clients, they say, "Hey Prashanth, should we talk about moving data to the cloud? Should we hire a Python developer?" and all those things. I tell them that's all important, but what's required first is the questions for which you are seeking answers. If you don't have questions, stop your analytics projects right now and find the questions for which you are seeking insights. No questions practically means no analytics. And the purpose of analytics is measuring and improving business performance; if you don't believe in measurement and improvement, there is no point in doing analytics. So, any questions on this definition? In fact, I wrote a Forbes blog on this, on data culture, which is all about measurement and the role of KPIs in analytics. You can read that blog later. Just go to Forbes and search for Prashanth Southekal. I have written about 20 blogs for Forbes; you can read all of them.
With this, let's look at the challenges. One of the key things is getting good-quality data, and in many places there are a lot of challenges. Gartner said that 80% of analytics projects fail. A Harvard Business Review report says that just 3% of the data in a company meets data quality standards. And on unused data, Forrester said that 73% of the data in a company is unused; IBM and Carnegie Mellon said it's 90%.
So a lot of data is captured that is not very useful for running the company or running analytics; it is called dark data. In most projects there is either no data or, even if there is data, there is no quality in it. So most companies are challenged on quality data.
So my question to you: what's your definition of quality data? Of course, it is a very broad term. If somebody were to ask you, "How do you define quality data?", what would you tell them? Type your answers in the chat box. I said analytics is all about questioning, right? So I'll be asking a lot of questions here and getting insights from you. How do you define quality data? "Data can be used for answering the question." Good. "Data fit for use." I like Ivana's answer: fit for use. So let's elaborate on the term "use", Ivana. What do you mean by the utility of the data? What is it used for? "Quality requirements": Liz's answer is also very good. Claire has got a little bit technical, getting into metadata and so on. "Use depends on the application." Okay, what are the applications, Ivana? Fit for use depends on the application, so what are the applications of data? What is it used for? Okay, great answers. I love all your answers on data quality.
So in my book Data Quality, which was published by Wiley this year, I define data quality as good if the data is fit for use for three main purposes. What are the three main applications of data? Number one, operations. Number two, compliance. Number three, analytics. These are the three main reasons why enterprises, whether universities or private, for-profit companies, capture data: to run their business operations, for compliance activities, and to derive insights.
So let's take a simple example of a procurement department in a company, whichever company that might be. You are issuing purchase orders to all your vendors. Why are you issuing purchase orders? What is the purpose of this purchase order data? First, to run your operations. You say, "I need 1,000 ball bearings", you issue the purchase order to the vendor, and you want to know how many have been delivered, when they are coming, where they are delivered, and so on. So it's an operational document.
Number two, it's also a compliance document. When you give a purchase order to the vendor to supply 100 ball bearings and something goes wrong with the delivery, you can hold the vendor accountable for not delivering those 100 ball bearings. The vendor can also hold you accountable if you do not make the payment on time based on the prices agreed in the purchase order. I've done many projects in the oil and gas sector, where most of them are capital projects. Once a vendor gets a multimillion-dollar purchase order, their stock price just goes up: "Oh, I got a $1 billion purchase order from this big oil company, Shell, Chevron and so on." Their stock price goes up because they got a huge deal. So it's also a compliance document.
And lastly, it is for analytics. Take a period of time: say I issued 1,000 purchase orders in the year 2020. I can look at the data of all those purchase orders and mine it for different insights. For example, as the quantity goes up, the price goes down, because I'm leveraging economies of scale, and so on.
So, what Ivana was talking about, fit for use: what is that use? Three main purposes: operations, compliance and decision making.
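
The analytics use of purchase-order data described above (unit price falling as order quantity rises) can be sketched roughly as follows. This is only an illustration: the purchase orders are invented numbers, and the function name `price_quantity_slope` is hypothetical, not something from the talk. It fits a least-squares line of unit price against quantity and looks at the sign of the slope.

```python
from statistics import mean

# Hypothetical purchase orders: (quantity ordered, unit price paid).
# These numbers are invented for illustration only.
purchase_orders = [
    (100, 5.00),
    (250, 4.80),
    (500, 4.50),
    (1000, 4.10),
    (2000, 3.90),
]


def price_quantity_slope(orders):
    """Least-squares slope of unit price against order quantity."""
    quantities = [q for q, _ in orders]
    prices = [p for _, p in orders]
    q_mean, p_mean = mean(quantities), mean(prices)
    covariance = sum((q - q_mean) * (p - p_mean) for q, p in orders)
    variance = sum((q - q_mean) ** 2 for q in quantities)
    return covariance / variance


slope = price_quantity_slope(purchase_orders)
# A negative slope means unit price tends to fall as order quantity rises.
print(f"price change per extra unit ordered: {slope:.5f}")
```

A negative slope on real purchase-order history would be consistent with the economies-of-scale point, though it would not by itself prove the cause.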
So how do you measure whether the data quality is good for your purpose? Basically by looking at the 12 key dimensions that really matter. In my book Data Quality I've spent almost 25 pages on these 12 key dimensions. In the interest of time, I'll talk about two here, correctness and accuracy, which I've seen are often misunderstood in the projects I work on. I'll give you a story about how these terms really helped me in one of my projects.
This was a big telecom company in Toronto. I was hired by this company, and we had a Big Four consulting company also doing the project along with us. One day, their team comes to us and says, "Hey guys, your customer data is really messed up." What do you mean? "99% of the customer records you have are practically useless." Oh my gosh, this is a big number, 99%, and trust me, this was a $20 billion telecom company. How did we run the business for so many years when 99% of the customer data is of bad quality? So I said, "Can we know more about what you are talking about? How do you define bad quality?" "The address data is wrong." So I said, "So what?" "We are not able to send invoices or bills to them on time." But hold on: people move, and when they move, most of them don't call the call centre to say their address has changed, because they have signed up for email notifications for their bills. So out of this 99%, how many have signed up for email notifications? And it turns out 98% of them had signed up for email bills. Okay, so now the discussion is not about 99%; the real data quality issue, the thing that matters, is 1%. But even 1% is really bad. So let's discuss the 1%. What is this 1%?
For example, in Canada we have addresses where street is abbreviated as ST, court as CT, boulevard as BLVD, and so on. I'm sure in Australia and New Zealand it is very similar. So there are a lot of abbreviations. But it's not that the postman will not understand. If you put ST in the address, he knows it is street. If you put CT, he knows it is court. So what really matters is knowing, for each attribute, whether it needs accuracy or correctness. For example, the telephone number needs to be correct and the zip code needs to be correct, whereas the address needs to be accurate. There is a degree of correctness associated with accuracy. So we said, let's look at those attributes that have to be correct, and then define what the data quality really is. When we looked at the zip code, it came down to 0.0004% issues; we used a separate API to run that comparison. So the issue was not 99%: just 0.0004% of the customer data really had data quality issues.
Why am I saying this? Two things. Number one, analytics is all about asking questions: getting to the root of things, what the impact is, what will happen, and so on. Number two, the definitions of these data quality dimensions matter, especially accuracy and correctness. The telephone number needs to be correct. First name and last name need to be accurate. In many places where I go, people drop the H at the end of my first name. It's not a big deal. But if my phone number is wrong or my email address is wrong, that data becomes practically useless for the company. So this is where the 12 key dimensions really matter. If you're interested in knowing more, you can read the book on data quality, which is published by Wiley.
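
The idea in the telecom story, counting a record as bad only when a must-be-correct attribute fails, can be sketched as below. This is a minimal sketch under stated assumptions: the field names, the regular expressions and the sample records are all invented for illustration, and simple format checks stand in for the API comparison mentioned in the talk, which is not reproduced here.

```python
import re

# Assumed split between the two dimensions, following the talk:
# these fields must be exactly right ("correct"), while address and
# name only need to be close enough to be usable ("accurate").
MUST_BE_CORRECT = {
    "phone": re.compile(r"^\d{10}$"),
    "postal_code": re.compile(r"^[A-Z]\d[A-Z] ?\d[A-Z]\d$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def has_real_issue(record):
    """True if any must-be-correct field fails its format check."""
    return any(
        not pattern.match(record.get(field, ""))
        for field, pattern in MUST_BE_CORRECT.items()
    )


def issue_rate(records):
    """Share of records with at least one must-be-correct failure."""
    if not records:
        return 0.0
    return sum(has_real_issue(r) for r in records) / len(records)


# Invented sample records; address abbreviations are deliberately ignored.
customers = [
    {"phone": "4165550101", "postal_code": "M5V 2T6",
     "email": "a@example.com", "address": "12 King ST"},
    {"phone": "4165550102", "postal_code": "M4B 1B3",
     "email": "b@example.com", "address": "7 Elm CT"},
    {"phone": "41655", "postal_code": "M4B 1B3",
     "email": "c@example.com", "address": "9 Lake BLVD"},
    {"phone": "4165550104", "postal_code": "K1A 0B1",
     "email": "d@example.com", "address": "1 Main Street"},
]

print(f"records with real issues: {issue_rate(customers):.0%}")
```

Abbreviated addresses such as "ST" or "CT" never count against a record here, which mirrors the point that the address only needs to be accurate enough for the postman.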
So before we go to the next topic, which is transforming data into a business asset, any questions or thoughts on what we covered so far? We said what analytics is. The key component for analytics is quality data. We define quality data by its utility in operations, compliance and decision making, and we use the 12 key dimensions to measure the level of data quality. Thanks to Ivana and all of you for the help. Any questions on what we discussed so far? Is my speech clear? "Yes. No questions for now. Please go ahead." Good.
Okay, so now let's get into the heart of the topic: how do you transform data into a valuable business asset for the company? We have been hearing a lot of phrases like "data is oil", "data is blood", "data is oxygen", which is all good; I take it with a pinch of salt. What really matters is how you improve the quality, or the performance, of data in the business. When you look at the discussions at the top, at the CEO level, they need data analytics to help with three major things. All the profitability discussions I've been having with my clients ultimately boil down to three main goals. So what do you think are the three main goals of a business? Why do businesses exist? What are the three primary objectives of a CEO, for example? Type your answers in the chat box. Profit, very good. Profit is one thing; what else? I'll break profit into two components: one is revenue, the second is cost. What else is the CEO responsible for? I'll wait for one answer. Okay. So at the highest level, the C-suite level, there are three goals of the company: number one, increase revenue; number two, decrease cost; and number three, mitigate risk, which includes reputation, investor relations and everything else.
So the data analytics projects we do in the company should be directly or indirectly associated with these three main goals, because that's where the value is.
So let's start with transforming data into a business asset. These five strategies will help you do that. The first one is that data should be tied to a purpose. Can somebody remind the group why data is there in a company? Why do you even require data? Just recap what I said, or extend Ivana's utility or application discussion. Companies need data for three main reasons, not because somebody said it's oil, blood or oxygen, or because somebody said the most valuable company in the world is Facebook, or Meta, because they have tons and tons of data. Companies need data for three main reasons: operations, compliance and analytics. You capture data for these three reasons. If you capture data that you cannot attribute to these three reasons, you will end up with lots and lots of dark data in the company. Forrester did a study and found that 73% of the data in a company is unused. IBM and Carnegie Mellon did a study and found that 90% of the data in a company is unused. So if you capture data that is not needed for these three reasons, you will end up with a lot of dark data, which is practically useless for your company.
Number two, use different types of data. Broaden the horizon of your data. Most of the data we talk about in a company is first-party data, which is the data the company originates. For example, invoices, remittances, orders and contracts are data the company generates. But there are other types of data the company is concerned with as well. What are they?
Zero-party data, which is about the prospects, the potential counterparties the company is working with. Then second-party data, which is the data about the partners. For example, Whirlpool appliances are sold in North America through Home Depot. L'Oreal's cosmetic products are sold in Sephora's retail stores, so the data about all of L'Oreal's cosmetic products is there with Sephora, and L'Oreal would be interested to know how consumers are buying its products. Whirlpool would likewise be interested to know how customers are buying its products at Home Depot, Best Buy and so on. So second-party data is all the data about your counterparties. And third-party data is the data you can buy for a fee or even get for free. For example, if you want weather data, you can go to weather.com. If you want crude oil prices, you can go to Argus or Bloomberg. This is data you can buy for a fee, or which is easily available, not just to you but to your competitors as well.
But unfortunately most of the data quality discussion, or the data discussion, is happening on first-party data, which in my experience is just about 30% of the data that is needed in the company. The majority of the data the company is concerned with is zero-party, second-party and third-party data. All that data is outside the company, and companies are not thinking about it: how to access it, how to integrate it into the data landscape so that they can better use it for operations, compliance and decision making. Cambridge Analytica got into trouble because the data they used from Facebook didn't meet privacy standards, and that resulted in the ultimate closure of Cambridge Analytica. So how did it impact Facebook?
It impacted Facebook because they were fined millions of dollars. This was second-party data between Facebook and Cambridge Analytica that was not managed properly, and it resulted in serious issues for both Cambridge Analytica and Facebook. So what's the bottom line? Many companies are working only on first-party data. Good, but that's just about 30% of the data that really concerns you. The remaining 60 to 70% is outside your company: zero-party data on potential prospects, second-party data, which is the data about you that sits with your counterparty, and third-party data, with which you can enhance your own data and get the right insight. So broaden your horizon and expand your data footprint.
Again, three main purposes of data. The data that is used in operations and compliance is defined and predictable, is in a proprietary format, and the focus is on the business process, like purchase orders. It is defined in a standard format, it's captured in a native format, and so on. But analytics is very different.
It depends on the hypothesis, on the questions you are asking. The data might be there or it might not be there, but the data is always required in a structured format. For example, if you are running a linear regression, both the dependent and the independent variables have to be numeric so that you can derive those insights. And unlike operations and compliance, where the focus is on process, here the focus is on business insight. MIT took my quote and put it in one of their articles on data analytics. What I said, basically, was that companies today have lots and lots of data; it's not about data collection but about using it in the right way. So that's the third one, which is all about the utility of the data.
The fourth one is about focusing on business transactions. Transactional data, like order data, invoicing data, remittance data and delivery data, is what is really needed, for five reasons: it is focused on business resources, there is money associated with it, there is a relationship between the counterparties, it has a twofold impact on accounting, give and take, and all transactions focus on performance and decision making. So you might have tons of data in your company, but if you want to get value from your data, you have to focus on transactions, for these five key reasons.
Lastly, do data governance, because if you don't, they say data quality degrades by two to seven percent every month. And the data has to be governed throughout the data lifecycle, from data capture to data integration, data science and decision science. Basically, wherever data is used, you need to govern it. So below I have put a typical system architecture, from MDM all the way to the BI and analytical systems, showing where data governance needs to happen, because
each of the systems could have its own system of record to manage the data. So these are the five key strategies you can use to transform your data into an asset, because if you don't manage the data well, it might become a liability, which is very hard to dispose of.
Now, before I go to the example of how I applied all these things in a project, any questions or thoughts? "Questions probably need more time to discuss, so we can ask questions later, after your presentation." Okay, sure, no problem.
So let me talk about an example, a project I did on profitability analysis in a retail company in the US. The analytics we discussed, which is all about measuring and improving performance for a company, can be broadly classified into three types: descriptive analytics, what happened, which is about historical performance; predictive analytics, what will happen, which is about the future; and prescriptive analytics, which is about the best course of action to reach that future state.
For example, say we have Nancy, who is the CFO, and Bob, who is the data scientist. Nancy goes to Bob and asks, "Bob, how much money did we make in 2020?" Bob says, "350 million dollars." That is descriptive analytics. Nancy says, "Bob, assuming business conditions remain the same, how much money will we make in 2024?" Bob says, "This is predictive analytics. Let me run some trend analysis and regression analysis, and I'll come back to you in a few minutes, Nancy." After one hour he goes to Nancy's office and says, "Hey Nancy, assuming business conditions remain the same, for the year 2024 we are going to make 375 million dollars." Nancy looks disappointed. She says, "Hey Bob, my CEO tells me that we need to make 400 million dollars. Can you use data analytics and tell me what factors will help me get a profit of 400 million dollars?" Bob says, yes, I can run some scenario planning, predictive analysis and all the prescriptive
analytics techniques, and I'll come back to you in a few minutes. After a few minutes Bob goes to Nancy's office and says, "Hey Nancy, to get 400 million dollars as profit, there are four things you need to do: one, improve the price; two, sell more; three, reduce the direct cost; and four, reduce sales and administrative expenses. These are the four factors that will help you get a profit of 400 million dollars."
So when we talk about analytics, it's not a siloed conversation. It's a continuum between Nancy, who is an insight consumer, and Bob, who is an insight producer. There is a back-and-forth conversation between insight production and insight consumption. The more questions you have, the more questions you get; questions generate more questions, so that you get a holistic understanding of what is happening.
What I applied in this project was basically these three types of analytics. The first is descriptive analytics. We wanted to know what percentage of SKUs, or stock keeping units, generated 80% of the revenue. Same with customers: how many customers generate 80% of the revenue, given that you have approximately 1,600 customers and 480 SKUs? We did a Pareto analysis over 18 months and found that just 17 SKUs out of 480 generated 80% of the revenue, and just 28 customers out of the 1,600 in the network generated 80% of the revenue. So that is simple descriptive analytics, a Pareto analysis.
The next one is predictive analytics. We asked, for these 17 SKUs and 28 customers, what is the predicted revenue for the next 18 months? We ran a regression analysis on the last three years and said that, assuming business conditions remain the same, in the next 18 months the revenue from these 17 SKUs and 28 customers will go up by 4.6%. Good.
The third one is prescriptive analytics. What is the impact on gross profit if I increase the price of the 17 SKUs for these 28 customers by just 1%? So
we ran a sensitivity analysis and found that if we increase the price of the 17 SKUs by just 1%, the gross profit will go up by 8.2%. Simple questions, but powerful insight for the company.
So these are the key takeaways. Most companies focus just on first-party data; use the four different types of data and get a holistic understanding of what is happening in the company. Data quality is very contextual. We have the 12 key dimensions and the definitions, and thanks to Ivana for defining the purpose, but quality is still very contextual, based on time, location and your objective. Data quality, or data governance, is not a one-time project that you do and then walk away from; it's an ongoing continuous improvement initiative. Transactional data is the most important when you talk about data quality. Data should move; that's what transactional data is all about. And ultimately, data management is about change management, which includes process, data, IT systems and people.
More on this is in my book on data quality; it's available in all the stores, including Amazon. The reason I wrote this book: in most of my consulting projects, the company's needs and aspirations are very high. It's like drawing a horse. They start on the left side and want a picture like this, but when the project starts the standards come down, so they are somewhere in the middle, and ultimately when the project goes live you get something like the one on the right side. So I wrote this book based on the successes and failures of the different projects I have seen, the research, and the discussions I had with many smart people. These are a couple of pieces of feedback I got; there are a few more on Amazon as well.
So, ultimately, the last slide of the presentation: things become much, much clearer if you put the customer at the center of your data analytics project, because
customer is a reason why the business organization exists in the first place so thank you very much for your time so this is my email id and I am active in LinkedIn as well please connect me in LinkedIn if you are there if you are not in LinkedIn my email id is here and I am here to answer your any questions or thoughts or comments stop sharing thanks to professor Yana and Mingfeng thanks for the invite can you guys all hear me can I have a while and see my slides yes all where okay good sorry for the technical issue which happened okay so today's talk is going to be on how to leverage data analytics and get improved business performance so as professor was talking about I am a data analytics consultant and an author I have consulted for some big names such as SAP D&G but for small companies as well but whether you talk about the big companies or the small companies the problem is the same or the question is the same how do we get value out of data analytics that's a question the companies have been asking me the size complexity and scale might vary but fundamental question remains the same how do we get value out of data analytics and this is today's presentation I'll give a quick overview about how I approach this problem I've written three books which you can see right behind me data for business performance analysis best practices and data quality which was which is a recent one I'm a blogger at Forbes SAP insider and cfo.university and I teach data analytics at IE business school in Madrid Spain I have a PhD MBA and so on and so forth so what I did was in most of the places I've been going these days people have been asking about hey Prashant can we trust chatGPT is it really good and all those things so what I did was I put this question to chatGPT I asked chatGPT who is this guy Prashant Sothika and chatGPT gave this answer so he says it's an he's what is it expert in the field of data analytics specifically in the area of data driven decision making I 
didn't put this anywhere on the internet, so it looks like ChatGPT compiled all this information and presented it to me. It looks pretty good, so I've been using a version of this in my recent talks. Long story short, if you are going to ask me about the accuracy and the reliability of ChatGPT, you know which direction I would be going in.

Okay, so this is what I have for you today. First we'll talk about data and business value in my own words: how I define data and how it is tied to business value. Then we'll talk about five strategies for transforming data into a business asset. And I'll talk about a case study, a project which I did for a retail and CPG firm based in Washington state, which is south of Canada, just to give you an idea of how it's actually done. It's not some theory or conceptual thing which I'm talking about; it's how I applied all those things.

So let's start with analytics. If somebody were to ask you, "Hey, what's analytics? How do you define analytics?", what would you tell them? Type your answers in the chat box. What's your definition of data analytics?

"Getting insight from data." That's good. I like two keywords there, insight and data, in Mingfeng's response. Lauren says "information", which is synonymous with insights most of the time. Very good, Lauren; again, very similar to what Mingfeng was talking about. What else? I'll go for one more answer. "Wisdom": Diana says data to wisdom. Great answers.

So we have a lot of definitions, especially on the industry side. When we talk about the definition of data analytics, some people talk about insight, some people say decision making, some people talk about performance management, and all those things. So overall, especially in the industry, it all depends on whom you ask. For example, let's take oil and gas. If you talk to people in oil and gas and ask them what data analytics is, their definition is mainly centered around data capture, because, all said and
done, data analytics in oil and gas has been pretty much regulatory centric: they need the data to show to the regulators if something goes wrong. On the other hand, if you go and ask the CPG and retail firms, "Hey, what's your definition of data analytics?", they say it is about deriving insight so that they can understand the product, the customer, and so on and so forth better. The oil and gas companies are focused mostly on markets and commodities, whereas the retail and CPG firms are focused mostly on products and customers. So it all depends on where the focus is.

Now, within an industry, let's talk about functions. If you ask people in finance for their definition of data analytics, it is mostly about reporting, because in North America we have GAAP, in the rest of the world we have IFRS and other standards, and they need to produce reports to comply with those standards. Whereas if you talk to people in supply chain, they say data analytics is more about understanding the customer. Again, on one hand we have the regulators and on the other hand we have the customers.

And then there are the tech companies. I worked at SAP for a long time, and if you talk to SAP, their definition of analytics is pretty much what they are selling. When I left SAP in 2012 they were selling over 200 products, right from ERP to CRM to GRC, BI, BW; you name it, SAP has a product. So SAP talks about the whole data life cycle, right from data capture and data integration to data science and decision science. Whereas if you talk to SAS, one of the companies I advised, they don't do much work on data capture; their work is mostly on data science and ML, so their definition of analytics is more centered around data science. So SAP looks at all four stages of the data life cycle, but when you talk to SAS and
Power BI and Tableau, their definition of analytics is more centered around data science. It's almost like six blind men looking at an elephant and defining the elephant in their own words: each one comes with their own perspective and starts describing analytics in their own way.

So, to encompass all the definitions and all the varied value propositions of what analytics is about, my definition of data analytics is: asking questions to derive insights from data to measure and improve business performance. If you don't believe in the concepts of measuring and improving business performance, you are not going very far with data analytics.

When I talk to my clients, they say, "Hey Prashanth, should we talk about moving data to the cloud? Should we hire a Python developer?" and all those things. I tell them that's all important, but what's required first is the questions for which you are seeking answers. If you don't have questions, stop your analytics projects right now and find out the questions for which you are seeking insights. No questions practically means no analytics. And the purpose of analytics is measuring and improving business performance; if you don't believe in measurement and improvement, there is no point in doing analytics.

Any questions on this definition? In fact, I even wrote a Forbes blog on this, on data culture, which is all about measurement and the role of KPIs in analytics. You can read that blog later: just go to Forbes and search for Prashanth Southekal. I have written about 20 blogs on Forbes; you can read all of them.

With this, let's look at the challenges. One of the key things is getting good quality data, but in many places there are a lot of challenges. Gartner said that 80% of analytics projects fail. A Harvard Business Review report says just 3% of the data in a company meets data quality standards. And IBM and Carnegie Mellon said
that 90% of the data in a company is unused; Forrester said 73%. So overall, a lot of the data that is captured is what is called dark data, which is not very useful for running the company or running analytics. In most projects there is no data, or even if there is data, there is no quality in it. So overall, most companies are challenged with quality data.

So my question to you: what's your definition of quality data? Of course, it is a very broad term. If somebody were to ask you how you define quality data, what would you tell them? Type your answers in the chat box. I said analytics is all about questioning, so I will be asking a lot of questions here and getting insights from you.

"Data can be used for answering the question." Good. "Data fit for use": I like Ivana's answer, fit for use. So let's elaborate on the term "use", Ivana. What do you mean by the utility of the data? What is it used for? "Quality requirements": Liz's answer is also very good. Claire has got into slightly more technical things, such as metadata. "Use depends on the application." Okay, what are the applications, Ivana? What are the applications of data? Where is it used? Great answers; I love all your answers on data quality.

In my book Data Quality, which was published by Wiley this year, I define data quality as good if the data is fit for use. And what is that use? There are three main applications of data: number one, operations; number two, compliance; number three, analytics. These are the three main reasons why enterprises, whether universities or for-profit companies, capture data: to run their business operations, for compliance
activities, and to derive insights.

Let's take a simple example of a procurement department in a company, whichever company that might be. You are issuing purchase orders to all your vendors. Why? What is the purpose of this purchase order data? First, to run your operations: you know that you need a thousand ball bearings, so you issue the purchase order to the vendor, and you want to know how many have been delivered, when they are coming, where they are delivered, and so on. So it's an operational document.

Number two, it's also a compliance document. When you give a purchase order to the vendor to supply 100 ball bearings and something goes wrong with the delivery, you can hold the vendor accountable for not delivering those hundred ball bearings. The vendor can also hold you accountable if you are not making the payment on time based on the prices agreed in the purchase order. I have done many projects in the oil and gas sector, and most of them are capital projects. Once a vendor gets a multi-million dollar purchase order, say a one billion dollar purchase order from a big oil company like Shell or Chevron, their stock price just goes up because they got a huge deal. So it's also a compliance document.

And lastly, it is for analytics. Say I issued a thousand purchase orders in the year 2020. I can look at the data from all those purchase orders and mine different insights from them: for example, if the quantity goes up, the price goes down, because I'm leveraging economies of scale.

So overall, when Ivana was talking about fit for use, what is that use? Three main things: operations, compliance, and decision making. So how
do you measure this? How do you measure whether the data quality is good for your purpose? That is basically by looking at the 12 key dimensions which really matter. In my book Data Quality I spent almost 25 pages writing about these 12 key dimensions, but in the interest of time I'll talk about two of them here, correctness and accuracy, which I've seen are often misunderstood in the projects I work on.

I'll give you a story about how these terms really helped me in one of my projects. This was a big telecom company in Toronto. I was hired by this company, and we had a Big Four consulting company doing the project along with us. One day their team comes to us and says, "Hey guys, your customer data is really messed up." "What do you mean?" "99% of the customer records which you have are practically useless." Oh my gosh, 99% is a big number, and trust me, this was a billion-dollar telecom company. How did we run the business for so many years if 99% of the customer data is of bad quality?

So I said, "Can we know more about what you are talking about? How do you define bad quality?" "The address data is wrong." I said, "So what?" "We are not able to send invoices or bills to them on time." But hold on: people move, and when people move, most of them don't call the call center to say that their address has changed, because they have signed up for email notifications for their bills. So let's look at that 99% and ask how many of them have signed up for email notifications. It turned out that 98% of them had signed up for email bills. Now the discussion is not about 99%; the real data quality issue is 1%.

But even 1% is really bad, so let's discuss that 1%. What is this 1%? For example, in Canada we have addresses where Street is abbreviated as ST, Court is abbreviated as CT, Boulevard is
abbreviated as BLVD, and so on. I'm sure in Australia and New Zealand it is very similar. So there are a lot of abbreviations, but it's not that the address will not be understood: if you put ST in the address, people know that it is Street; if you put CT, they know that it is Court. So what really matters is identifying the attributes where you distinguish between accuracy and correctness. For example, the telephone number needs to be correct and the zip code needs to be correct, whereas the address needs to be accurate; with accuracy there is a degree of correctness. So what we did was look at those attributes which have to be correct, and define data quality on that basis. When we looked at the zip code, using a different API for the comparison, the issue was not 99%; it came down to just 0.0004% of the customer data that really had data quality issues.

Why am I telling you this? Two things. Number one, analytics is all about asking questions and getting to the root of things: what is the impact, how did it happen, and so on. Number two, the definitions of these data quality dimensions matter, especially accuracy and correctness. The telephone number needs to be correct; the first name and last name need to be accurate. In many places where I go, people drop the letter H at the end of my first name. It's not a big deal. But if my phone number is wrong or my email address is wrong, that data becomes practically useless for the company. This is where the 12 key dimensions really matter. If you are interested in knowing more, you can read the book Data Quality, published by Wiley.

Before we go to the next topic, which is transforming data into a business asset, are there any questions or thoughts on what we covered? We said what analytics is; the key component for analytics is quality data; we defined quality data as
utility in operations, compliance, and decision making; and we use the 12 key dimensions to measure the level of data quality. Thanks to Ivana and all of you for the help. Before I go to the next topic, any questions on what we discussed so far? Is my speech clear?

"Yes, no questions for now."

Good. So now let's get into the heart of the topic: how do you transform this data into a valuable business asset for the company? We have been hearing a lot of terms like "data is oil", "data is blood", "data is oxygen", which is all good, and I take it with a pinch of salt. What really matters is how you improve the quality or the performance of data in business. At the CEO level, they need data analytics to help with three major things; all the profitability discussions I've been having with my clients ultimately boil down to three main business goals. So what do you think are the three main goals of a business? Why do businesses exist? What are the three primary objectives of a CEO, for example? Type your answers in the chat box.

"Profit." Very good, profit is one thing. What else? I'll break profit into two components: one is revenue, the second is cost. What are the other things which the CEO is responsible for? I'll wait for one answer.

Okay. At the highest level, the C-suite level, there are three goals for the company: number one, increase revenue; number two, decrease cost; and number three, mitigate risk, including reputation, investor relations, and everything else. So the data analytics projects which we are doing in the company should be directly or indirectly associated with these three purposes, because that's where the value is.

So let's start on transforming data into a business asset. These five strategies
will help you transform the data into a business asset.

Let's talk about the first one, which is that data should be tied to a purpose. Can somebody remind the group why data is there in a company? Why do you even require data? This is just a recap of what I said, or an extension of Ivana's utility or application discussion. Companies need data not because somebody said it's oil, blood, or oxygen, or because somebody said the most valuable company in the world is Facebook, or Meta, because they have tons and tons of data. Companies need data for three main reasons: to run their operations, for compliance, and for analytics. If you capture data that you are not able to attribute to one of these three reasons, you will end up with lots and lots of dark data in the company. Forrester did a study and found that 73% of the data in a company is unused. IBM and Carnegie Mellon did a study and found that 90% of the data in a company is unused. So if you capture data which is not needed for these three main reasons, you will end up with a lot of dark data, which is practically useless for your company.

Number two, use different types of data; broaden the horizon of your data. Most of the data we talk about in a company is first party data, which is the data the company originates and creates: invoices, remittances, orders, contracts, and so on. But there are other types of data which the company is concerned with as well. There is zero party data, which is all about the prospects, the potential counterparties the company is working with. There is second party data, which is the data about the partners. For example, Whirlpool appliances are sold in North America through Home Depot, and L'Oréal's perfume products are sold in Sephora retail stores. So the
data about all of L'Oréal's perfume and cosmetic products is there with Sephora, and L'Oréal would be interested to know how consumers are buying its products. Whirlpool would likewise be interested to know how customers are buying its products at Home Depot, Best Buy, and so on. So second party data is all the data about your counterparties. And third party data is the data which you can buy for a fee or even get for free. For example, if you want data about the weather, you can go to weather.com; if you want crude oil prices, you can go to Argus or Bloomberg. This is data which you can buy for a fee, or which is easily available, not just to you but to your competitors as well.

Unfortunately, most of the data quality discussion, or the data discussion in general, is happening on first party data, which in my experience is just about 30% of the data that is needed in the company. The majority of the data the company is concerned with is zero party, second party, and third party data. All of that data is outside the company, and companies are not thinking about how to access it and how to integrate it into their data landscape so that they can better use it for operations, compliance, and decision making.

Cambridge Analytica got into trouble because the data they used from Facebook didn't meet privacy standards, and that resulted in the ultimate closure of Cambridge Analytica. And how did it impact Facebook? They were fined millions of dollars. This was second party data between Facebook and Cambridge Analytica which was not managed properly, and it resulted in serious issues for both of them.

So what's the bottom line? Many companies are just working on first party data. Good, but that's just about 30% of the data which is really
of concern to you. The remaining 60 to 70% of the data is outside your company: the zero party data on potential prospects, the second party data, which is the data about you that sits with your counterparty, and the third party data, which you can use to enhance your data and get the right insights. So broaden your horizon and expand your data footprint.

Next, again, the three main purposes of data. The data that is used in operations and compliance is defined and predictable, it is in a proprietary format, and the focus is on the business process. Purchase order data, for example, is required for operations and compliance, is defined in a standard format, is captured in a native format, and so on. But analytics is very different. It depends on the hypothesis, on the questions which you are asking. The data might be there or it might not be there, but the data is always required in a structured format. If you are running linear regression, for example, both the dependent and the independent variables have to be numeric so that you can derive those insights. And unlike operations and compliance, where the focus is on process, here the focus is on business insights. MIT took my quote and put it in one of their articles on data and analytics. What I said, basically, is that companies today have lots of data; it's not about data collection but about using it in the right way. So that's the third one, which is all about the utility of the data.

The fourth one is about focusing on business transactions. Transaction data such as order data, invoicing data, remittance data, and delivery data is what is really needed, because it is focused on business resources. There is money associated with it, there is a relationship between the data, it has a two-fold impact on accounting, give and take, and all
transactions focus on performance and decision making. So you might have tons of data in your company, but if you want to get value from your data, you have to focus on transactions, for these key reasons.

Lastly, do data governance, because if you don't, they say data quality degrades by two to seven percent every month. And the data has to be governed throughout the data life cycle, right from data capture and data integration to data science and decision science; basically, wherever data is used, you need to govern it. Below, I have put a typical system architecture, right from MDM all the way to the BI analytical system, showing where data governance needs to happen, because each of those systems could have its own system of record to manage the data.

So these are the five key strategies which you can use to transform your data into an asset. If you don't manage the data well, the data might become a liability, which is very hard to dispose of.

Now, before I go to the example of how I applied all those things in a project, any questions or thoughts?

"I think there are questions, but they probably need more time to discuss, so we can ask questions later, after your presentation."

Okay, sure, no problem. So let me talk about a project which I did on profitability analysis in a retail company in the US. Analytics, which as we discussed is all about measuring and improving performance for a company, can be broadly classified into three types: descriptive analytics, which asks what happened and is about historical performance; predictive analytics, which asks what will happen and is about the future; and prescriptive analytics, which is about the best course of action to reach that future state.

For example, say we have Nancy, who is the CFO, and Bob, who is the data scientist. Nancy goes to Bob and asks, "Bob, how much money did we make in
2020?" Bob says, "350 million dollars." That is descriptive analytics. Nancy says, "Bob, assuming business conditions remain the same, how much money will we make in 2024?" Bob says, "This is predictive analytics. Let me run some trend analysis and regression analysis, and I'll come back to you, Nancy." After one hour he goes to Nancy's office and says, "Hey Nancy, assuming business conditions remain the same, in 2024 we are going to make 375 million dollars." Nancy looks disappointed. She says, "Hey Bob, my CEO tells me that we need to make 400 million dollars. Can you use data analytics and tell me what factors will help me get to a profit of 400 million dollars?" Bob says, "Yes, I can run some scenario planning, sensitivity analysis, and other prescriptive analytics techniques, and I'll come back to you in a few minutes." After a few minutes Bob goes to Nancy's office and says, "Hey Nancy, to get 400 million dollars in profit, there are four things which you need to do: number one, improve the price; number two, sell more; number three, reduce the direct cost; and number four, reduce SG&A, which is selling, general, and administrative expenses. These are the four factors which will help you get a profit of 400 million dollars."

So overall, when we talk about analytics, it's not a siloed conversation. It's a continuum between Nancy, who is an insight consumer, and Bob, who is an insight producer; there is a back-and-forth conversation between insight production and insight consumption. Questions generate more questions, so that you get a holistic understanding of what is happening.

What I applied in this project was basically these three types of analytics. The first one is descriptive analytics. We wanted to know what percentage of the stock keeping units generated 80% of the revenue, and the same with customers: how many customers generate 80% of the
revenue, given that you have approximately 1,600 customers and 480 SKUs? We did the Pareto analysis for 18 months and found that just 17 SKUs out of 480 generated 80% of the revenue, and just 28 customers out of the 1,600 in the network generated 80% of the revenue. So that was simple descriptive analytics using Pareto analysis.

The next one is predictive analytics. We asked: for these 17 SKUs and 28 customers, what is the predicted revenue for the next 18 months? We ran regression analysis on the last 3 years and said that, assuming business conditions remain the same, the revenue from these 17 SKUs and 28 customers will go up by 26% in the next 18 months.

The third one is prescriptive analytics. What is the impact on gross profit if I increase the price of the 17 SKUs for these 28 customers by just 1%? We ran sensitivity analysis and found that if we increase the price of the 17 SKUs by just 1%, the gross profit will go up by 8.2%. Simple questions, but powerful insights for the company.

So overall, these are the key takeaways. Most companies just focus on first party data. Use the data flywheel, use the 4 different types of data, and get a holistic understanding of what is happening in the company. Data quality is very contextual. I have given the definitions of the 12 key dimensions, and thanks to Ivana for defining the purpose and everything, but quality is still very contextual, based on time, location, and your objective. Data quality or data governance is not a one-time project that you just do and then run away from; it's an ongoing continuous improvement initiative. Transactional data are the most important when you talk about data quality. Data should move; that's what transactional data is all about. And ultimately, when we talk about data management, it's about change management, which includes process, data, IT systems, and people.

More on this is in my book on data quality, which is available in all the stores, including Amazon. The
reason why I wrote this book is in most of my consulting project the companies needs and aspirations are very high it's just like drawing a horse for example they start on the left side and they want to have a picture like this but when the project starts the standards come down so the standards are somewhere in the middle and ultimately when the project goes live you get something on the right side so I wrote this book based on the success and failures of different projects I've seen in my experience and the including the research and all the discussions I had with many smart people and I wrote this book and these are a couple of feedbacks which I got as well there are few more in Amazon as well so ultimately the last slide of the presentation as things become much much clearer if you think up by putting the customer at the center of your data analytics projects because customer is a reason why the business or the organization exists in the first place so thank you very much for your time so this is my email ID and I'm active in LinkedIn as well please connect me in LinkedIn if you are there if you're not in LinkedIn my email ID is here and I'm here to answer your any questions or thoughts or comments and stop sharing thank you for sharing for your very clear presentation and very interactive interesting one so today's talk is going to be on how do we leverage data analytics and get improved business performance so as professor was talking about I'm a data analytics consultant and an author I've consulted for some big names such as SAP and PNG and GE but for small companies as well but whether you talk about the big companies or the small companies the problem is the same or the question is the same how do we get value out of data analytics that's a question the companies have been asking me the size complexity and scale might vary but fundamental question remains the same how do we get value out of data analytics and this is today's presentation I'll give a quick 
overview about how I approach this problem I've written three books which you can see right behind me data for business performance analysis best practice and data quality which was which is a recent one I'm a blogger at SAP inside at university and I teach data analytics at business school in Madrid Spain I have a PhD MBA and so on and so forth so what I did was in most of the places I've been going these days people have been asking about hey Prashant can we trust chat GPT is it really good and all those things so what I did was I put this question to chat GPT I asked chat GPT who is this guy Prashant Sothika and chat GPT gave this answer so he says it's an he's what is it expert in the field of data analytics specifically in the area of data driven decision making I didn't put this anywhere in the internet so it looks like chat GPT compiled all this information and presented this to me so it looks pretty good so I've been using a version of this in my recent talks so long story short so if you are going to ask me about the accuracy and reliability of chat GPT so you know where I would be going in that direction okay so this is what I have for you guys today so first we'll talk about data and business value in my own words how I define data and how is it tied to business value then we'll talk about five strategies how do you transform data into a business asset and I'll talk about a case study or a project which I did for a retail and CPG firm which is based in the Washington state which is south of Canada so I'll just to give you an idea about how it's actually done it's not some theory or conceptual thing which I'm talking about how did I apply all those things which I've been talking about so let's start about analytics so if somebody were to ask you hey what's analytics how do you define analytics what would you tell them type your answers in the chat box what's your definition of data analytics getting insight from data that's good I like two keywords there 
insight and data in Mingfang's response Lauren says information which is synonymous to insights most of the time very good Lauren again very similar to what Mingfang was talking about I'll go for one more answer wisdom Diana says data to wisdom great answers so we have a lot of definitions especially in the industry side when we talk about the definition of data analytics some people talk about insight some people say decision making some people talk about performance management and all those things so overall especially in the industry it all depends on whom you are for example let's take oil and gas so if you talk to people in oil and gas and ask them what data analytics is so their definition of data analytics is mainly centered around data capture because all said and done data analytics in oil and gas has been pretty much regulatory centric so they need the data to show to the regulators if something goes wrong but on the other hand if you go and ask the CPG and the retail firms hey what's your definition of data analytics they say our definition of data analytics is about deriving insight so that we can understand the product customers so on and so forth better but on one hand if you look at the oil and gas companies which are focused mostly on markets and commodities but on the other hand the retail and the CPG firms are focused mostly on products and customers so it all depends on where the focus is ok now within this industry let's talk about functions so if you talk to people in finance what's your definition of data analytics their definition of analytics is mostly on reporting because in North America we have GAAP and the rest of the world we have IFRS and all the standards so they need to produce reports to comply to those standards so if you talk to finance they say data analytics is mostly on reporting whereas if you talk to people in supply chain they say data analytics is more about understanding the customer again on one hand we have the 
regulators, and on the other hand we have the customers. And if you talk to the tech companies, say SAP (I worked at SAP for a long time), their definition of analytics is pretty much what they are selling. When I left SAP in 2012, they were selling over 200 products, right from ERP to CRM to GRC, BI, BW; you name it, SAP has a product. So if you talk to SAP, they talk about the whole data life cycle, right from data capture to data integration, data science, decision science, everything. Whereas if you talk to SAS, one of the companies I advised, they don't do much work on data capture. Their work is mostly on data science and ML, so their definition of analytics is more centered around data science. So SAP looks at all four stages of the data life cycle, but when you talk to SAS, Power BI, and Tableau, their definition of analytics is more centered around data science. It's almost like six blind men looking at an elephant and defining the elephant in their own words: each one comes with their own perspective and starts describing analytics in their own way. So to encompass all the definitions, all the varied value propositions of what analytics is about, my definition of data analytics is: asking questions to derive insights from data to measure and improve business performance. If you don't believe in the concepts of improving and measuring business performance, you are not going very far with data analytics. When I talk to my clients, they say, hey Prashanth, should we talk about moving data to the cloud? Should we hire a Python developer? And all those things. So I tell them that's all important, but what's required first is the questions for which you are seeking answers. If you don't have questions, stop your analytics projects right now and find out the questions for which you are seeking insights. No questions practically means no analytics. And the purpose of analytics is basically about measuring and improving business
performance. If you don't believe in the concepts of measurement and improvement, there is no point in doing analytics, once again. So, any questions on this definition? In fact, I even wrote a Forbes blog on this, on data culture, which is all about measurement, with a focus on the role of KPIs in analytics. You can read that blog later; just go to Forbes and put in Prashanth, and you will see I have written about 20 blogs in Forbes. You can read all of them. So with this, let's look at the challenges. One of the key things is getting good quality data, but in many places there are a lot of challenges. Gartner said that 80% of analytics projects fail. A Harvard Business Review report says that just 3% of the data in a company meets data quality standards. On unused data, Forrester said 73% of the data in a company is unused; IBM and Carnegie Mellon said it's 90%. So overall, a lot of data is captured that ends up as what is called dark data in the company, which is not very useful for running the company or running analytics. In most projects there is no data, or even if there is data, there is no quality in it. So overall, most companies are challenged with quality data. So my question to you: when it comes to quality data, what's your definition? Of course it is a very broad term, a very big term. If somebody were to ask you how you define quality data, what would you tell them? Type your answers in the chat box. You know I said analytics is all about questioning, right? I will be asking a lot of questions here and getting insights from you. So how do you define quality data? Okay: data that can be used for answering the question. Good. Data fit for use. I like Ivana's answer, fit for use. So let's elaborate on the term use, Ivana. What do you mean? The utility of the data, where is it used? Fit for use, great answer. Quality requirements, from Liz, is also very good. Claire has got a little bit technical, getting into metadata
and so on. Use depends on the application. Okay, what are the applications, Ivana? Fit for use depends on the application, so what are the applications of data? Where is it used? Okay, great answers; love all your answers on data quality. So basically, in my book Data Quality, which was published by Wiley this year, I say data quality is good if the data is fit for use for three main purposes. What are the three main applications of data? Number one, operations; number two, compliance; number three, analytics. These are the three main reasons why enterprises, whether universities or for-profit companies, capture data: to run their business operations, for compliance activities, and to derive insights. Let's take a simple example of a procurement department in a company, whichever company that might be. You are issuing purchase orders to all your vendors. Why are you issuing purchase orders? What is the purpose of this purchase order data? To run your operations: you say, I need a thousand of these ball bearings, you issue the purchase order to the vendor, and you want to know how many have been delivered, when it is coming, where it is delivered, and so on and so forth. So it's an operational document. Number two, it's also a compliance document, because when you give a purchase order to the vendor to supply a hundred ball bearings and something goes wrong with the delivery, you can hold the vendor accountable for not delivering those hundred ball bearings. The vendor can also hold you accountable if you are not making the payment on time based on the prices set out in the purchase order. So it's also a compliance document. I have done many projects in the oil and gas sector, and most of them are capital projects. Once a vendor gets a multi-million-dollar purchase order, their
stock price just goes up: oh, I got a one-billion-dollar purchase order from this big oil company, Shell, Chevron, and so on. The stock price goes up because they got a huge deal. So it's also a compliance document. And lastly, it's for analytics. Let's say over a period of time I issued a thousand purchase orders in the year 2020. I can look at the data of all the purchase orders that have been issued and mine insights from those thousand purchase orders. For example, if the quantity goes up, the price goes down, because I'm leveraging economies of scale, and so on and so forth. So overall, what Ivana was talking about, fit for use: what is that use? Three main purposes: operations, compliance, and decision making, or analytics. So how do you measure whether the data quality is good for your purpose? Basically by looking at the 12 key dimensions which really matter. In my book Data Quality I spent almost 25 pages writing about each of these 12 key dimensions, but in the interest of time I'll talk about two here, correctness and accuracy, which I've seen are often misunderstood in most of the projects I work on. I'll give you a story about how these terms really helped me in one of my projects. This was a big telecom company in Toronto. I was hired by this company, and we had a big consulting company also doing the project along with us. One day the team comes to us and says, hey guys, your customer data is really messed up. What do you mean? 99% of the customer records which you have are practically useless. Oh my gosh, this is a big number, 99%. And trust me, this was a 20-billion-dollar telecom company. How did we run the business for so many years when 99% of the customer data is of bad quality? So I said, can we know more about what you are talking about? How do you define bad quality? The
address data is wrong. So I said, so what? We are not able to send invoices or bills to them on time. But hold on: people move, and when people move, most of them don't call the call center to say that their address has changed, because they have signed up for email notifications of their invoices and bills. So let's look at those people: out of this 99%, how many of them have signed up for email notifications? It turns out that 98% of them are signed up for email bills. Okay, so now the discussion is not about 99%; it's about 1% with data quality issues, which is the real thing that matters. But even 1% is really bad, so let's discuss the 1%. What is this 1%? For example, in Canada we have addresses where street is abbreviated as ST, court is abbreviated as CT, boulevard is abbreviated as BLVD, and so on and so forth. I'm sure in Australia and New Zealand it might be very similar. So there are a lot of abbreviations, but it's not that the postman will not understand. If you put ST in the address, he knows that it is street; if you put CT, he knows that it is court, and all those things. So what really matters is identifying the attributes where you have to distinguish between accuracy and correctness. For example, the telephone number needs to be correct and the zip code needs to be correct, whereas the address needs to be accurate. There is a degree of correctness associated with accuracy. So what we said was, let's look at those attributes which have to be correct, and then we will define what the data quality really is. When we looked at the zip code, we had a different API to compare against, and with that comparison the issue was not 99%; it came down to just 0.0004% of the customer data that really had data quality issues. So why am I saying this? Number one, analytics is all about asking questions, getting to the root of things: what is the impact, how did it happen, and all those things. Number two, the definition of all
these data quality dimensions matters, especially accuracy and correctness. The telephone number needs to be correct; the first name and last name need to be accurate. In many places where I go, people drop the letter H that my first name ends with. It's not a big deal. But if my phone number is wrong or my email address is wrong, that data becomes practically useless for the company. So this is where the 12 key dimensions really matter. If you're interested to know more, you can read the book on data quality, which is published by Wiley. So before we go to the next topic, which is transforming data into a business asset, any questions or thoughts on what we covered? We said what analytics is; the key component for analytics is quality data; we defined quality data by its utility in operations, compliance, and decision making; and we use the 12 key dimensions to measure the level of data quality. Thanks to Ivana and all of you for the help. But before I go to the next topic, any questions on what we discussed so far? Is my speech clear? Yes, no question for now, please go ahead. Good. Okay, so now let's get into the heart of the topic: how do you transform this data into a valuable business asset for the company? We have been hearing a lot of terms like data is oil, data is blood, data is oxygen, and all those things, which is all good; I take it with a pinch of salt. What really matters is how you improve the data quality, or the performance of data, in business. In business, when you look at the top, the CEO-level discussions, they need data analytics to help with three major things. All the profitability discussions which I've been having with my clients ultimately boil down to three main purposes, the reasons why business goals are there. So what do you think are the three main goals of a business? Why do businesses exist? What are the three primary objectives of a CEO, for example? Type your answers in the chat box. Profit, very good. Profit is one thing.
What else? Profit I'll again break into two components: one is revenue, the second is cost. What are the other things which the CEO is responsible for? I'll wait for one answer. Okay. So overall, at the highest level, at the C-suite level, the goals of the company are: number one, increase revenues; number two, decrease costs; and number three, mitigate risk, including reputation, investor relations, and everything. So pretty much all the data analytics projects which we are doing in the company should be directly or indirectly associated with these three main purposes, because that's where the value is. So let's start with transforming data into a business asset. These five strategies will help you transform data into a business asset. Let's talk about the first one, which is that data should be tied to a purpose. Can somebody remind the group why data is there in a company? Why do you require data? Just to recap what I just said, as an extension of Ivana's utility or application discussion: companies need data for three main reasons. Not because somebody said it's oil, blood, or oxygen, or because somebody said, hey, the most valuable company in the world is Facebook, or Meta, because they have tons and tons of data. Not for those reasons. Companies need data for three main reasons: to run their operations, for compliance, or for analytics. You capture data for these three reasons. If you just capture data and you are not able to attribute it to one of these three reasons, you will end up with lots and lots of dark data in the company. Forrester did a study and found that 73% of the data in a company is unused. IBM and Carnegie Mellon did a study and found that 90% of the data in a company is unused. So if you capture data which is not needed for these three reasons, you will end up
with a lot of dark data, which is practically useless for your company. Number two: use different types of data; broaden the horizon of data. Most of the data which we talk about in a company is first-party data, which is the data the company originates, the data the company creates. For example, invoices, remittances, orders, contracts, and all those things are data which the company generates. But there are other types of data as well which the company is concerned with. What are they? Zero-party data, which is all about the prospects, the potential counterparties the company is working with. Then second-party data, which is the data about you that sits with your partners. For example, Whirlpool appliances are sold in North America through Home Depot, and L'Oreal's perfume products are sold in retail stores other than L'Oreal's own, so the data about all the perfume and cosmetic products of L'Oreal is there with Sephora. L'Oreal would be interested to know how consumers are buying their products, and Whirlpool would also be interested to know how customers are buying their products at Home Depot, Best Buy, and so on and so forth. So second-party data is the data held by your counterparties. Then there is third-party data, which you can buy for a fee or even get for free. For example, if you want data about the weather, you can go to weather.com and get it; if you want to know crude oil prices, you can go to Argus or Bloomberg and get the crude oil price data. This is data which you can buy for a fee or which is easily available, not just to you but even to your competitors. Unfortunately, most of the data quality discussion, or the data discussion, is happening on first-party data, which in my experience is just about 30% of the data that is needed in the company. The majority of the data the company is concerned with is zero-party, second-party, and third-party data, and all that data is
outside the company, and companies are not thinking about this data and how to access it and integrate it into the data landscape so that they can better use it for operations, compliance, and decision making. Cambridge Analytica got into trouble because the data they used from Facebook didn't meet privacy standards, and that resulted in the ultimate closure of Cambridge Analytica. So how did it impact Facebook? It impacted Facebook because they were fined millions of dollars. This was second-party data between Facebook and Cambridge Analytica which was not managed properly, and because of that it resulted in serious issues for both Cambridge Analytica and Facebook. So what's the bottom line? The bottom line is that many companies are just working on first-party data. Good, but that's just about 30% of the data which really concerns you. The remaining 60 to 70% of the data is outside your company: the zero-party data on potential prospects; the second-party data, the data about you which is there with your counterparty; and the third-party data, which you can use to further enhance your data and get the right insights. So broaden your horizon and expand your data footprint. Again, there are three main purposes of data. The data that is used in compliance and operations is defined, predictable, and in a proprietary format, and the focus is on business processes. Take purchase orders: the data is required for operations and compliance, it is defined in a standard format, it is captured in a native format, and so on and so forth. But analytics is very different. It depends on the hypothesis; it depends on the questions you are asking. The data might be there, or the data might not be there, but the data is always required in a structured format. For example, if you are running a linear regression, both the dependent and the independent variables have to be numeric in nature to get those insights. And unlike operations
and compliance, where the focus is on process, here the focus is on business insights. MIT took my quote and put it in one of their articles on data and analytics. What I said, basically, was that companies today have lots of data; it is not about data collection but about using it in the right way. So that is the third one, which is all about the utility of the data. The fourth one is about focusing on business transactions. Business transactions are the data that is really needed, because transaction data, like order data, invoicing data, remittance data, and delivery data, is focused on business resources. There is money associated with it; there is a relationship between the counterparties; it has a two-fold impact on accounting, give and take; and all transactions bear on performance and decision making. So you might have tons of data in your company, but if you want to get value from your data, you have to focus on transactions for these five key reasons. Lastly, you do data governance, because if you don't, they say data quality degrades by 2 to 7% every month. And the data has to be governed throughout the data life cycle, right from data capture to data integration, data science, and decision science. Basically, wherever data is used, you need to govern it. So below I have put a typical system architecture, right from MDM all the way to the BI and analytical systems, showing where the data governance needs to happen, because each of those systems could have its own system of record to manage the data. So these are the five key strategies which you can use to transform your data into an asset, because if you don't manage the data well, it might become a liability, which is very hard to dispose of. Now, before I go to an example of how I applied all those things in a project, what are some of the questions or thoughts?
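The degradation figure quoted in the governance point can be made concrete with a little arithmetic. The following is only a minimal sketch: it assumes the 2 to 7% monthly degradation compounds, which the talk does not state, and the function name `share_still_good` is hypothetical.

```python
def share_still_good(monthly_decay: float, months: int) -> float:
    """Share of records still fit for use after `months` months.

    Assumes the monthly decay rate compounds; that is an assumption
    made for this illustration, not something stated in the talk.
    """
    if not 0.0 <= monthly_decay <= 1.0:
        raise ValueError("monthly_decay must be a fraction between 0 and 1")
    if months < 0:
        raise ValueError("months must be non-negative")
    return (1.0 - monthly_decay) ** months


# The two ends of the quoted range, over one ungoverned year.
for rate in (0.02, 0.07):
    remaining = share_still_good(rate, 12)
    print(f"{rate:.0%} per month -> {remaining:.0%} still good after 12 months")
```

Under that compounding assumption, roughly 78% of records would still be good after a year at 2% a month, and roughly 42% at 7% a month, which is one way to read the point that governance has to be continuous.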
Questions probably need more time to discuss, so we can ask questions later, after your presentation. Okay, sure, no problem. Let me talk about an example, a project which I did on profitability analysis in a retail company in the US. The analytics we discussed, which is all about measuring and improving performance for a company, can be broadly classified into three types: descriptive analytics, what happened, which is about historical performance; predictive analytics, what will happen, which is about the future; and prescriptive analytics, which is about the best course of action to achieve that future state. For example, let's say Nancy is the CFO and Bob is the data scientist. Nancy goes to Bob and asks, Bob, how much money did we make in 2020? Bob says, $350 million. That is descriptive analytics. Nancy says, Bob, assuming business conditions remain the same, how much money will we make in 2024? Bob says, this is predictive analytics; let me go and run some trend analysis and regression analysis, and I'll come back to you in a few minutes, Nancy. After one hour he goes to Nancy's office and says, hey Nancy, assuming business conditions remain the same, for the year 2024 we are going to make $375 million. Nancy looks disappointed. She says, hey Bob, my CEO tells me that we need to make $400 million. Can you use data analytics and tell me what factors will help me get to a profit of $400 million? Bob says, yes, I can run some scenario planning, sensitivity analysis, and all the prescriptive analytics techniques, and I'll come back to you in a few minutes. After a few minutes Bob goes to Nancy's office and says, hey Nancy, to get $400 million in profit there are four things you need to do: number one, improve the price; number two, sell more; number three, reduce the direct costs; and number four, reduce SG&A, which is selling, general, and administrative expenses. These are the four factors which will help you get a profit of $400 million. So overall, when we
talk about analytics, it's not a siloed conversation; it's a continuum between Nancy, who is an insight consumer, and Bob, who is an insight producer. There is a back-and-forth conversation happening between insight production and insight consumption. The more questions you have, the more questions you get; questions generate more questions, so that you get a holistic understanding of what is happening. So what I applied in this project was basically these three types of analytics. The first one is descriptive analytics. We wanted to know what percentage of SKUs, or stock keeping units, generated 80% of the revenue. Same with customers: how many customers generate 80% of the revenue, given that you have approximately 1,600 customers and 480 SKUs? We did the Pareto analysis over 18 months, and we found that just 17 SKUs out of 480 generated 80% of the revenue, and just 28 customers out of the 1,600 in the network generated 80% of the revenue. So that is simple descriptive analytics, a Pareto analysis. The next one is predictive analytics. We asked, for these 17 SKUs and 28 customers, what is the predicted revenue for the next 18 months? We ran regression analysis on the last 3 years, and we said, assuming business conditions remain the same, in the next 18 months the revenue from these 17 SKUs and 28 customers will go up by 4.6%. Good. The third one is prescriptive analytics: what is the impact on gross profit if I increase the price of the 17 SKUs for these 28 customers by just 1%? We ran sensitivity analysis, and we found that if we increase the price of the 17 SKUs by just 1%, the gross profit will go up by 8.2%. Simple questions, but powerful insights for the company. So overall, these are the key takeaways. Most companies just focus on first-party data; use the data flywheel, use the four different types of data, and get a holistic understanding of what is happening in the company. Data quality is very
contextual. I have given the definitions of the 12 key dimensions, and thanks to Ivana for defining the purpose and everything, but still, quality is very contextual: it depends on time, location, and your objective. Achieving data quality, or data governance, is not a one-time project where you just do it and run away; it's an ongoing, continuous improvement initiative. Transactional data is the most important when you talk about data quality. Data should move; that's what transactional data is all about. And ultimately, when we talk about data management, it's about change management, which includes process, data, IT systems, and so on. More on this is in my book on data quality, which is available in all the stores, including Amazon. The reason why I wrote this book is that in most of my consulting projects, the companies' needs and aspirations are very high. It's just like drawing a horse: they start on the left side, and they want to have a picture like this. But when the project starts, the standards come down, so the standards are somewhere in the middle, and ultimately, when the project goes live, you get something like the right side. So I wrote this book based on the successes and failures of different projects I've seen in my experience, including the research and all the discussions I had with many smart people. These are a couple of pieces of feedback which I got as well; there are a few more on Amazon. So, ultimately, the last slide of the presentation: things become much clearer if you start by putting the customer at the center of your data analytics projects, because the customer is the reason why the business or organization exists in the first place. So thank you very much for your time. This is my email ID, and I'm active on LinkedIn as well, so please connect with me on LinkedIn if you are there. If you're not on LinkedIn, my email ID is here, and I'm here to answer any questions, thoughts, or comments. I'll stop sharing.
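Two of the case-study steps described in the talk, the Pareto analysis and the 1% price sensitivity, can be sketched in a few lines. This is a hedged illustration only, not the method used on the project: the function names, the SKU labels, and all revenue and cost figures are hypothetical, and the sensitivity calculation assumes that sales volume and cost of goods sold do not change when the price rises.

```python
from typing import Dict, List


def pareto_head(revenue_by_item: Dict[str, float], share: float = 0.80) -> List[str]:
    """Smallest set of top-revenue items whose cumulative revenue reaches `share`."""
    total = sum(revenue_by_item.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    ranked = sorted(revenue_by_item.items(), key=lambda kv: kv[1], reverse=True)
    head: List[str] = []
    running = 0.0
    for item, revenue in ranked:
        head.append(item)
        running += revenue
        if running >= share * total:
            break
    return head


def gross_profit_lift(revenue: float, cogs: float, price_increase: float) -> float:
    """Relative change in gross profit for a given fractional price increase.

    Assumes volume and cost of goods sold stay the same (an assumption
    made for this sketch), so the extra revenue falls straight to profit.
    """
    gross_profit = revenue - cogs
    if gross_profit <= 0:
        raise ValueError("gross profit must be positive")
    return (revenue * price_increase) / gross_profit


# Hypothetical revenue by SKU, not data from the project.
sales = {"SKU-A": 520.0, "SKU-B": 310.0, "SKU-C": 100.0, "SKU-D": 40.0, "SKU-E": 30.0}
print(pareto_head(sales))

# Hypothetical revenue and cost figures, chosen only for illustration.
print(f"{gross_profit_lift(1000.0, 878.0, 0.01):.1%}")
```

With these made-up numbers, two SKUs carry 80% of revenue, and a 1% price increase on a business with a gross margin of about 12% lifts gross profit by about 8%. That is the general reason a small price change can have a large effect on gross profit when margins are thin.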
Yana and and Mingfeng thanks for the thanks for the invite can you guys all hear me well and see my slides yes all where okay good so sorry for the for the technical issue which happened okay so today's talk is going to be on how to be leverage data and analytics and get improved business performance so as a professor was talking about I'm a data analytics consultant and and an author I've consulted for some big names such as SAP and PNG and GE but for small companies as well but whether you talk about the big companies or the small companies the problem is the same or the question is the same how do we get value out of data analytics that's a question people the companies have been asking me the size complexity and scale might vary but fundamental question remains the same how do we get value out of data analytics and this is today's presentation I'll give a quick overview about how I approach this problem. I've written three books which you can see right behind me data for business performance and it's best practice and data quality which was which is a recent one I'm a blogger at Forbes SAP Insider and CFR. 
University and I teach data analytics at IE Business School in Madrid Spain I have a PhD MBA and so forth so what I did was in most of the places I've been going these days people have been asking about hey Prashant can we trust chat GPT is it really good and all those things so what I did was I put this question to chat GPT I asked chat GPT who is this guy Prashant Sothika and chat GPT gave this answer so he says it's an he's what is it expert in the field of data analytics specifically in the area of data driven decision making I didn't put this anywhere in the internet so it looks like chat GPT compiled all this information and presented this to me so it looks pretty good so I've been using a version of this in my recent talks so long story short so if you are going to ask me about the accuracy and the reliability of chat GPT so you know where I would be going in that direction okay so this is what I have for you guys today so first we'll talk about data and business value in my own words how I define data and how is it tied to business value then we'll talk about five strategies on how do you transform data into a business asset and I'll talk about a case study or a project which I did for a retail and CPG firm which is based in the Washington state which is south of South of Canada so I'll just so just to give you an idea about how it's actually done it's not some theory or conceptual thing talking about how did I apply all those things which I've been talking about so let's start about analytics so if somebody were to ask you hey what's analytics how do you define analytics what would you tell them type your answers in the chat box what's your definition of data analytics getting insight from data that's good I like two keywords there insight and data in Ming Fang's response Lauren says information which is synonymous to insights most of the time very good Lauren again very similar to what Ming Fang was talking about or that's I'll go for one more answer 
wisdom Yana says data to wisdom great answers so we have lot of definitions especially in the industry side when we talk about the definition of data analytics some people talk about insight some people say decision making some people talk about performance management and all those things so overall especially in the industry it all depends on whom you ask for example let's take oil and gas so if you talk to people in oil and gas and ask them what's data analytics is so their definition of data analytics is mainly centered around data capture because all said and done data analytics in oil and gas has been pretty much regulatory centric so they need the data to show to the regulators if something goes wrong but on the other hand if you go and ask the company and the retail firms hey what's your definition of data analytics they say our definition of data analytics is about deriving insight so that we can understand the product the customer so on and so forth better but on one hand if you look at the oil and gas companies which are focused mostly on markets and and commodities but on the other hand the retail and the CPG firms are focused mostly on products and customers so it all depends on where the focus is ok now within this industry let's talk about functions so if you talk to people in finance what's your definition of data analytics their definition of analytics is mostly on reporting because in North America we have gap and the rest of the world we have IFRS and all the standards so they need to produce reports to comply to those standards so if you talk to finance they say data analytics is mostly on reporting whereas if you talk to in supply chain they say data analytics is more about understanding the customer again on one hand we have the regulators and on the other hand we have the customers so and if you talk to the tech companies if you talk to SAP I worked in SAP for a long time and if you talk to SAP their definition of analytics is pretty much what 
they are selling when I left SAP in 2012 they were selling over 200 products right from ERP to CRM to GRC BI, BW SAP has a product so if you talk to SAP they talk about the whole data lifecycle right from data capture data integration data science decision science everything whereas if you talk to SAS one of the companies are advised they don't do much work on data capture their work is mostly on data science and ML their definition of analytics is more centered around data science so on SAP looks at all four stages of the data lifecycle but when you talk to SAS and Power BI and Tableau their definition of analytics is more centered around data science so it's almost like six blind men looking at an elephant and defining the elephant in their own words each one comes with their own perspectives and start describing analytics in their own ways so to encompass all the definitions all the kind of varied value proposition of what analytics is all about my definition of data analytics is asking questions to derive insights from data to measure and improve business performance so if you don't believe in the concepts of improving and measuring business performance you are not going very far with data analytics so when I talk to my clients they say hey Prashant should we talk about moving data to the cloud should we talk or should we hire a python developer and all those things so I tell that's all important but what's required first is questions for which you are seeking answers for if you don't have questions stop your analytics projects right now and find out those questions for which you are seeking insights for no questions practically means no analytics and the purpose of analytics is basically about measuring and improving business performance if you don't believe in the concepts of measure measurement and improvement there is no point in doing analytics once again so any questions on this definition in fact I even wrote a Forbes blog on this on data culture which 
is all about measurement and the role of KPIs in analytics. You can read that blog later. Just go to Forbes and put in Prashanth Southekal. You will see I have written about 20 blogs in Forbes, and you can read all of them.
So with this, let's look at the challenges. One of the key things is getting good quality data, but in many places there are a lot of challenges. Gartner said that 80% of analytics projects fail. A Harvard Business Review report says that just 3% of the data in a company meets data quality standards. Forrester said that 73% of the data in a company is unused, and IBM and Carnegie Mellon said it's 90%. So overall, a lot of the data that is captured is what is called dark data, which is not very useful for running the company or for running analytics. In most of the projects there is no data, or even if there is data, there is no quality in it. So overall, most companies are challenged with quality data.
So my question to you: what's your definition of quality data? Of course, it is a very broad term, a very big term. How do you define quality data? What would you tell them? Type your answers in the chat box. I said analytics is all about questioning, right? So I will be asking a lot of questions here and getting insights from you. So how do you define quality data? Okay, data that can be used for answering the question. Got another one, data fit for use. I like Ivana's answer, fit for use. So let's elaborate on the term use, Ivana. What do you mean by the utility of the data? What is it used for? Fit for use, great answer. Quality requirements, Liz's is also very good. Claire has got a little bit technical, getting into metadata and so on. Use depends on the application. Okay, what are the applications, Ivana? Fit for use depends on the application, so what are the applications of data? What is it used for? Okay, great answers. Love
all your answers on data quality. So basically, in my book Data Quality, which was published by Wiley this year, I say data quality is good if the data is fit for use, and data is used for three main purposes. What are the three main applications of data? Number one, operations. Number two, compliance. Number three, analytics. These are the three main reasons why enterprises, whether universities or for-profit companies, capture data: to run their business operations, for compliance activity, and to derive insights.
So let's take a simple example of a procurement department in a company, whichever company that might be. You are issuing purchase orders to all your vendors. Why are you issuing purchase orders? What is the purpose of this purchase order data? To run your operations. You say that I need a thousand ball bearings, you issue the purchase order to this vendor, and you want to know how many have been delivered, when it is coming, where it is delivered, and so on and so forth. So it's an operational document. Number two, it's also a compliance document. When you give a purchase order to the vendor to supply a hundred ball bearings and something goes wrong with the delivery, you can hold the vendor accountable for not delivering those hundred ball bearings. The vendor can also hold you accountable if you are not making the payment on time based on the prices agreed in the purchase order. So it's also a compliance document. I have done many projects in the oil and gas sector, and in oil and gas most of them are capital projects. Once a vendor gets a multi-million dollar purchase order, their stock price just goes up. Oh, I got a one billion dollar purchase order from this big oil company, Shell, Chevron and so on. So the stock price goes up because they got a huge project. So it's also a compliance document, and
lastly it's for analytics. Let's say over a period of time I issued a thousand purchase orders in the year 2020. I can look at the data of all the purchase orders that have been issued, mine those thousand purchase orders and derive different insights. For example, if the quantity goes up, the price goes down, because I am leveraging economies of scale, and so on and so forth. So overall, what Ivana was talking about, fit for use, what is that use? Three main reasons: operations, compliance, and decision making, or analytics.
So how do you measure whether the data quality is good for your purpose? That is basically by looking at the 12 key dimensions which really matter. In my book Data Quality I've spent almost 25 pages writing about each of these 12 key dimensions, but in the interest of time I'll talk about two things here, correctness and accuracy, which I've seen are often misunderstood in most of the projects I work on. So I'll give you a story about how these terms really helped me in one of my projects. This was a big telecom company in Toronto. I was hired by this company, and we had a Big Four consulting company also doing the project along with us. So one day the team comes to us and says, hey guys, your customer data is really messed up. What do you mean? 99% of the customer records which you have are practically useless. Oh my gosh, this is a big number, 99%. And trust me, this is a $20 billion telecom company. How did we run the business for so many years when 99% of the customer data is of bad quality? So I said, can we know more about what you are talking about? How do you define bad quality? The address data is wrong. So I said, so what? We are able to send invoices or bills to them on time. But hold on, people move, and when people move, most of them don't call the call centre to say that my address has changed, because
they have signed up for email notifications on their invoices, on their bills. So let's look at those people. Out of this 99%, how many of them have signed up for email notifications? And it turns out that 98% of them have signed up for email bills. Okay, now the discussion is not about 99%. It's 1% with data quality issues, and that is the real thing that matters. But even 1% is really bad, so let's discuss the 1%. What is this 1%? For example, in Canada we have addresses where street is abbreviated as ST, court is abbreviated as CT, boulevard is abbreviated as BLVD, and so forth. I'm sure in Australia and New Zealand it might be very similar. So there are a lot of abbreviations, but it's not that the postman will not understand. If you put ST in the address, he knows that it is street. If you put CT, he knows that it is court, and all those things.
So what really matters is the attribute, where you can distinguish between accuracy and correctness. For example, the telephone number needs to be correct, the zip code needs to be correct, whereas the address needs to be accurate. There is a degree of correctness associated with accuracy. So what we said was, let's look at those attributes where it has to be correct, and then we are going to define what the data quality is all about. We had a different API to compare the zip codes, and with that comparison the issue was not 99%. It came down to just 0.0004% of the customer data that really had data quality issues.
So why am I saying this? Two things. Number one, analytics is all about asking questions and getting to the root of things: what is the impact, how did it happen, and all those things. Number two, the definition of all these data quality dimensions, especially accuracy and correctness. The telephone number needs to be correct. The first name and last name need to be accurate. In many places where I go, people drop the H at the end of my first name. It's not a
big deal. But if my phone number is wrong or my email address is wrong, that data practically becomes useless for the company. So this is where the 12 key dimensions really matter. If you are interested to know more, you can read the book on data quality published by Wiley.
So before we go to the next topic, which is transforming data into a business asset, any questions or thoughts on what we covered so far? We said what analytics is. The key component for analytics is quality data. We defined quality data as utility in operations, compliance and decision making, and we use the 12 key dimensions to measure the level of data quality. Thanks to Ivana and all of you for the help. But before I go to the next topic, any questions on what we discussed so far? Is my speech clear? Yes? No questions for now? Okay, good.
Okay, so now let's get into the heart of the topic. How do you transform this data into a valuable business asset for the company? We have been hearing a lot of terms like data is oil, data is blood, data is oxygen and all those things, which is all good. I take it with a pinch of salt. What really matters is how you improve the data quality, or the performance of data in business. When you look at the CEO-level discussions, they need data analytics to help in three major things. All the profitability discussions which I've been having with my clients ultimately boil down to three main business goals. So what do you think are the three main goals of a business? Why do businesses exist? What are the three primary objectives of a CEO, for example? Type your answers in the chat box. Profit, very good. Profit is one thing. What else? I'll break that profit into two components: one is revenues, the second one is cost. What are the other things which the CEO is responsible for? I'll wait for one answer. Okay, so overall, at the highest level, at the C-suite level, the
goals of the company are: number one, increase revenues; number two, decrease cost; and the third one is to mitigate risk, including reputation, investor relations and everything. So pretty much all the data analytics projects which we are doing in the company are directly or indirectly associated with these three main purposes, because that's where the value is.
So let's start with transforming data into a business asset. These five strategies will help you transform the data into a business asset. Let's talk about the first one, which is that data should be tied to a purpose. Can somebody remind the group why data is there in a company? Why do they require data? Just to recap what I said earlier, as an extension of Ivana's utility or application discussion. Basically, companies need data for three main reasons. Not because somebody told them it's oil, blood or oxygen, or somebody told them, hey, the most valuable company in the world is Facebook or Meta because they have tons and tons of data. Not for all those reasons. Companies need data for three main reasons: to run their operations, for compliance, or for analytics. You capture data for these three reasons. If you just capture data which you are not able to attribute to one of these three reasons, you will end up with lots and lots of dark data in the company. Forrester did a study and they found that 73% of the data in a company is unused. IBM and Carnegie Mellon did a study and they found that 90% of the data in a company is unused. So if you capture data which is not needed for these three reasons, you will end up with a lot of dark data, which is practically useless for your company.
Number two, use different types of data. Broaden the horizon of data. Most of the data which we talk about in a company is first party data, which is the data the company originates, the data the company creates. For example, invoices, remittances, orders, contracts and all those things are data which the company generates. But there are other types of data as well which the company is concerned about. What are they? There is zero party data, which is all about the prospects, the potential counterparties the company is working with. Then there is second party data, which is the data about the partners. For example, Whirlpool appliances are sold in North America through Home Depot. L'Oreal's perfume products are sold in Sephora's retail stores. So the data about all the perfume products, all the cosmetic products of L'Oreal, is there with Sephora. So L'Oreal would also be interested to know how the consumers are buying their products. Whirlpool would also be interested to know how the customers are buying its products in Home Depot, Best Buy, and so on and so forth. So second party data is all the data about your counterparties. And third party data is the data which you can buy for a fee or even get for free. For example, if you want to get data about the weather, you can go to weather.com and get the weather data. If you want to know the crude oil prices, you can go to Argus or Bloomberg and get the crude oil price data. This is data which you can buy for a fee, or which is easily available, not just for you but even for your competitors.
But unfortunately, most of the data quality discussion, or the data discussion, is happening on first party data, which in my experience is just about 30% of the data that is needed in the company. The majority of the data the company is concerned with, zero party, second party and third party data, is outside the company, and companies are not thinking about this data and how to
access it, how to integrate it into the data landscape so that they can better use it for operations, compliance and decision making. Cambridge Analytica got into trouble because the data they used from Facebook did not meet the privacy standards, and that resulted in the ultimate closure of Cambridge Analytica. So how did it impact Facebook? It impacted Facebook because they were fined millions of dollars. So this was like second party data between Facebook and Cambridge Analytica which was not managed properly, and because of that it resulted in serious issues both for Cambridge Analytica and for Facebook as well. So what's the bottom line? The bottom line here is that many companies are just working on the first party data. Good, but that's just about 30% of the data which really concerns you. The remaining 60 to 70% of the data is outside your company. It could be zero party data on potential prospects, second party data, which is the data about you that is there with your counterparty, and third party data, which you can further use to enhance your data and get the right insights. So broaden your horizon and expand your data footprint.
Again, three main purposes of data. The data that is used in compliance and operations is defined, predictable, in a proprietary format, and the focus is on the business process. Take purchase orders. The data is required for operations and compliance, it is defined in a standard format, it's captured in a native format, and so on and so forth. But analytics is very different. It depends on the hypothesis, on the questions which you are asking. The data might be there or the data might not be there, but the data is always required in a structured format. For example, if you're running linear regression, both the dependent and the independent variables have to be numeric in nature so that you can derive those insights. And unlike operations and compliance, where the focus is on process, here the
focus is on business insights. So overall, MIT took my quote and put it in one of their articles on data analytics. Basically, it said companies today have lots and lots of data, and it's not about data collection but about using it in the right way. So that's the third one, which is all about the utility of the data.
The fourth one is about focusing on business transactions. Business transactions are the ones which are really needed, because transactional data like order data, invoicing data, remittance data and delivery data is focused on business resources. There is money associated with it. There is a relationship between the counterparties. It has a two-fold impact on accounting, give and take. And all transactions focus on performance and decision making. So you might have tons of data in your company, but if you want to get value from your data, you have to focus on transactions, for these five key reasons.
And then basically you do the data governance, because if you don't do the data governance, they say the data quality degrades by 2-7% every month. And the data has to be governed throughout the data lifecycle, right from data capture through data integration, data science and decision science. Basically, wherever data is used, you need to govern it. So I put below the system architecture, a typical system landscape right from MDM all the way to the BI analytical systems, showing where the data governance needs to happen, because each of those systems could have its own system of record to manage the data. So these are the five key strategies which you can use to transform your data into an asset. Because if you don't manage the data well, the data might become a liability which is very hard to dispose of.
Now, before I go to an example of how I applied all those things in a project, any questions or thoughts? I think there are questions,
probably need more time to discuss, so we can ask questions later, after your presentation. Okay, sure. Okay, no problem.
So let me talk about an example, a project that I did on profitability analysis in a retail company in the US. The analytics which we discussed, all about measuring and improving performance for a company, can be broadly classified into three types. Descriptive analytics, what happened, which is on historical performance. Predictive analytics, what will happen, which is on the future. And prescriptive analytics, which is about the best course of action to arrive at that future state. For example, let's say we have Nancy, who is the CFO, and Bob, who is the data scientist. Nancy goes to Bob and asks the question, Bob, how much money did we make in 2020? Bob says 350 million dollars. That is descriptive analytics. Nancy says, Bob, assuming the business conditions remain the same, how much money will we make in 2024? Bob says, this is about predictive analytics. Let me go and run some trend analysis, regression analysis, and I will come back to you in a few minutes, Nancy. After one hour he goes to Nancy's office and says, hey Nancy, assuming business conditions remain the same, for the year 2024 we are going to make 375 million dollars. Nancy looks disappointed. Nancy says, hey Bob, my CEO tells me that we need to make 400 million dollars. Can you use data analytics and tell me what are the factors which will help me get a profitability of 400 million dollars? Bob says, yes, I can run some scenario planning, sensitivity analysis and all the prescriptive analytics techniques, and I will come back to you in a few minutes. After a few minutes Bob goes to Nancy's office. Hey Nancy, to make it happen, to get 400 million dollars as profit, there are four things which you need to do. Number one, improve the price. Number two, sell more. Number three, reduce the direct cost. And number four, reduce SG&A, which is selling, general and administrative expenses. These are the four factors which will help you get a profit of 400 million dollars. So
overall, when we talk about analytics, it's not a siloed conversation. It's a continuum between Nancy, who is an insight consumer, and Bob, who is an insight producer. There is a back and forth conversation happening between insight production and insight consumption. Questions generate more questions, so that you get a holistic understanding of what is happening.
What I applied in this project was basically these three types of analytics. The first one is descriptive analytics. We wanted to know what percentage of SKUs, or stock keeping units, generated 80% of the revenue. Same with customers: how many customers generate 80% of the revenues, given that you have approximately 1600 customers and 480 SKUs? We did the Pareto analysis for 18 months, and we found that just 17 SKUs generated 80% of the revenue, and just 28 customers out of the 1600 customers you have in your network generated 80% of the revenues. So just simple descriptive analytics, a Pareto analysis. The next one is predictive analytics. We said, for these 17 SKUs and 28 customers, what is the predicted revenue for the next 18 months? So we ran regression analysis on the last 3 years, and we said, assuming business conditions remain the same, in the next 18 months the revenue from these 17 SKUs and 28 customers will go up by 4.6%. Good. The third one is prescriptive analytics. What is the impact on gross profit if I increase the price of the 17 SKUs for these 28 customers by just 1%? So we ran sensitivity analysis, and we found that if we increase the price of the 17 SKUs by just 1%, the gross profit will go up by 8.2%. Simple questions, but powerful insights for the company.
So overall, these are the key takeaways. Most companies just focus on first party data. Use the data flywheel, use the four different types of data, and get a holistic understanding of what is happening in the company. Data quality is very contextual. We have the 12 key dimensions,
definitions, thanks to Ivana for defining the purpose and everything, but still, quality is very contextual, based on time, location and your objective. Data quality achievement, or data governance, is not a one-time project where you just do it and run away from your project. It's an ongoing continuous improvement initiative. Transactional data is most important when you talk about data quality. Data should move, and that's what transactional data is all about. And ultimately, when we talk about data management, it's about change management, which includes process, data, IT systems and people.
More on this is in my book on data quality. It's available in all the stores, including Amazon. The reason why I wrote this book is that in most of my consulting projects, the company's needs and aspirations are very high. It's just like drawing a horse. They start on the left side, they want to have a picture like this. But when the project starts, the standards come down, so the standards are somewhere in the middle. And ultimately, when the project goes live, you get something like the right side. So I wrote this book based on the successes and failures of different projects I've seen in my experience, including the research and all the discussions I had with many smart people. And these are a couple of pieces of feedback which I got as well. There are a few more on Amazon.
So ultimately, the last slide of the presentation: things become much, much clearer if you put the customer at the center of your data analytics projects, because the customer is the reason why the business or organization exists in the first place. So thank you very much for your time. This is my email ID, and I'm active on LinkedIn as well. Please connect with me on LinkedIn if you are there. If you are not on LinkedIn, my email ID is here, and I'm here to answer any of your questions.
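The Pareto analysis and the 1% price sensitivity check from the retail case study can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual code: the function names `pareto_count` and `gross_profit_uplift` and all the numbers are hypothetical, and the sensitivity calculation assumes that volume and cost of goods sold stay unchanged when the price goes up.

```python
from itertools import accumulate


def pareto_count(revenue_by_item, threshold=0.80):
    """Smallest number of top-ranked items whose combined revenue
    reaches `threshold` (a fraction) of the total revenue."""
    ranked = sorted(revenue_by_item.values(), reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0
    for count, running in enumerate(accumulate(ranked), start=1):
        if running >= threshold * total:
            return count
    return len(ranked)


def gross_profit_uplift(revenue, cogs, price_increase):
    """Relative change in gross profit for a fractional price increase,
    assuming (hypothetically) that volume and COGS do not change."""
    base_profit = revenue - cogs
    new_profit = revenue * (1 + price_increase) - cogs
    return (new_profit - base_profit) / base_profit


if __name__ == "__main__":
    # Hypothetical SKU revenues, not the client's data.
    sku_revenue = {"A": 50, "B": 30, "C": 10, "D": 5, "E": 5}
    print("SKUs needed for 80% of revenue:", pareto_count(sku_revenue))

    # Hypothetical figures with a gross margin of about 12%.
    uplift = gross_profit_uplift(revenue=1000.0, cogs=878.0, price_increase=0.01)
    print(f"Gross profit uplift from a 1% price increase: {uplift:.1%}")
```

Under the fixed-volume assumption, the uplift equals the price increase divided by the gross margin, so a 1% price increase producing an uplift of roughly 8% corresponds to a gross margin of roughly 12%. A real sensitivity analysis would also have to model how demand responds to the price change.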