Hello, and welcome to DATAVERSITY Talks, a podcast where we discuss with industry leaders and experts how they have built their careers around data. I'm your host, Shannon Kemp, and today we're talking to Dr. Prashant Southekal, the founder and managing principal at DBP Institute and author of the recently published book Data Quality: Empowering Businesses with Analytics and AI.
With a robust catalog of courses offered on demand and industry-leading live online sessions throughout the year, the DATAVERSITY Training Center is your launchpad for career success. Browse the complete catalog at training.dataversity.net and use code DBTalks for 20% off your purchase.
Hello and welcome. My name is Shannon Kemp, and I'm the chief digital officer at DATAVERSITY, and this is My Career in Data, a DATAVERSITY Talks podcast dedicated to learning from those who have careers in data management to understand how they got there, and to talking with people who help make these careers a little bit easier. To keep up to date on the latest in data management education, go to dataversity.net/subscribe. Today we are joined by Dr. Prashant Southekal, the founder and managing principal at DBP Institute and author of the recently published book Data Quality: Empowering Businesses with Analytics and AI. Normally, this is where a podcast host would read a short bio of the guest, but in this podcast, your bio is what we're here to talk about. I love the book. Prashant, hello and welcome.
Thanks, Shannon. It's a pleasure to talk to you again. I love talking to you, and thanks for the opportunity to discuss this interesting topic. Glad to help you.
I'm so glad to have you with us. I've known you for a while now, and I'm just so excited about all the things that you do, and I'm excited to hear and learn about how you got there. You're the founder and managing principal at DBP Institute. What is DBP Institute, and what is it that you do?
Yeah.
DBP Institute stands for Data for Business Performance. This is the name of the first book which I wrote, in 2017. I took the name of my book and made it into a company as well. The reason why I started this was mainly to help companies get value out of data and, of course, analytics. When I talk to business people and stakeholders from numerous companies, they have varied definitions or interpretations of what data can do and what data cannot do. We started this company to help them better leverage data and get improved business performance. I am a strong believer that data is an asset only if you know how to manage it well. If you don't know how to manage it well, it becomes a huge liability which is very hard to get rid of. So data is definitely an asset; data is oil, blood, oxygen, everything. True, I'm not disputing that, but all those things will happen only if data is managed well. So we started this company in 2012, and this is its 11th year in existence. We have served organizations in Vietnam, Australia, Spain, the Netherlands, the US, Canada, India, all those countries. We have served almost 35 to 40 companies so far, helping them get value out of data and analytics. And there are three things which we do to help them get that value. Number one, consulting. Number two, education. I strongly believe consulting is also education, because if you are not handling the education aspect well, which we call data literacy, you are not going very far with data analytics. And the third service which we do is research: helping companies understand what the industry is seeing, what trends are happening in the world of data analytics, and all that kind of work, which I call research services.
So, basically, DBP Institute does three things: consulting, education, and research.
Very nice, very important things. And, you know, in looking at your bio, you have so many other titles and engagements that you're also doing currently, including adjunct professor of data and analytics at the IE Business School and advisor on the CFP editorial board for the MIT CDOIQ program. Can you tell me about all the additional titles that you're carrying and what you're doing?
Yeah, so fundamentally, Shannon, there are two things that resonate in all my engagements. The first one is data analytics. All the work I do is related to data analytics. The second thing which I do is education, because even if I do consulting, which is our biggest revenue stream, in my view consulting is also education. You're also educating your stakeholders about what to do, what not to do, and so on and so forth. So whether it takes different shapes and forms, I'm an advisor here, I'm a consultant here, I'm a professor there, it basically has two themes. Number one, it's all on data analytics. Number two, it's all about educating the community or the clients, or building the community as well, so that they get to know better and can get better leverage out of data analytics. That's what I do wherever I go.
Well, that's again so very impressive, and I really enjoy that about you. And so I'm guessing too that your book fits right into that theme. You're also the author of three books now, right? And the most recent publication, as I mentioned, is Data Quality: Empowering Businesses with Analytics and AI. So tell me a little bit about why you wrote this particular book.
OK, so just a little bit of context. You know, the first book which I wrote, Data for Business Performance, is basically to help business people get value out of data.
Because when I used to go to my consulting engagements and work with the supply chain folks, the CFO community, treasury, controllers, and all those people, they were under the impression that data means it's the IT guys. Let the techie guys take care of the data stuff; our job is something else. So I said, no, both IT, or IT/data, and business have to come together. That's the reason why I wrote the first book: to educate the business community about data analytics. The second book which I wrote is Analytics Best Practices, which incidentally was ranked as the best analytics book of all time last year. It is about finding out what works and what doesn't work. There are a lot of studies which have been done by Gartner, McKinsey, Forrester, and all these organizations, which talk about the poor success rate of data analytics projects. I see the same thing in my practical experience as well. Not many analytics projects are successful. So based on the kind of patterns which I saw, as well as the secondary research which I've done, I took out the top best practices on what companies can do to leverage data and analytics and get better results. So that's the second book. Then the third book, which is on data quality, is primarily to help companies get good quality data. We all know that garbage in is garbage out. So many companies are saying, hey, let's talk about data analytics, let's talk about insights, let's talk about decision making and all those things. I said, all those things are good, but one of the fundamental ingredients for that is good quality data. So I wrote this third book. To give a little bit of context, before I wrote the book, I did a research survey to see what the industry is talking about. We did a survey in my company and we said, we are seeing a poor success rate of data analytics projects. Can you tell us why this is happening?
What are the main reasons, the root causes, why we are having a poor success rate in analytics projects? And three things came out from that research. The first one is data culture. The second one is data quality. And the third one is data literacy. So we took out the top three things and said, let's do something about it. And last year, when I started writing the book, I took data quality, put in all my practical experience, did a lot of interviews, spoke to a lot of stakeholders, thought leaders, practitioners, and so on and so forth, and came up with this book based on a framework called DARS, which I've implemented in my consulting work. The D stands for define the data analytics problem or data quality problem. Number two, assess where you are standing with regard to data quality, which is all about the world of data profiling. The third one is remediate, or realize, or improve the data quality: you assess and then you implement. And the last one is sustain, which is to control the initiatives which you have implemented so that the data quality remains good. Research says that data quality degrades by two to seven percent every month if you don't do anything about it. So you might have great strategies about how to realize or remediate data quality, but if you say, this project is done, I'm going from here, then the data quality goes back to square one, which would be the initial state of poor data quality. So you need to come back with sustainment measures, which include data governance as well, to make things happen. So this whole book, which is 12 chapters, is broken down into four phases based on the DARS framework, define, assess, remediate, and sustain, so that you can get good quality data. There are three chapters in each of those phases, which gives the 12 chapters in this book.
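
To make the two-to-seven-percent figure concrete, here is a minimal sketch of how such a monthly rate would add up over a year. It assumes the loss compounds month over month, which is our reading and not something stated in the conversation; the function name and the 0-to-100 quality score are hypothetical, chosen only for illustration.

```python
def remaining_quality(initial: float, monthly_decay: float, months: int) -> float:
    """Quality score left after `months`, ASSUMING the decay compounds monthly.

    `initial` is a hypothetical quality score (for example 100), and
    `monthly_decay` is a fraction such as 0.02 for two percent per month.
    """
    if not 0.0 <= monthly_decay <= 1.0:
        raise ValueError("monthly_decay must be between 0 and 1")
    if months < 0:
        raise ValueError("months must be non-negative")
    return initial * (1.0 - monthly_decay) ** months


if __name__ == "__main__":
    # The low and high ends of the range quoted in the conversation.
    for rate in (0.02, 0.07):
        score = remaining_quality(100.0, rate, 12)
        print(f"{rate:.0%} monthly decay leaves {score:.1f} out of 100 after 12 months")
```

Under that compounding assumption, a score of 100 falls to roughly 78 after a year at two percent a month and to roughly 42 at seven percent, which is one way to see why the sustain phase matters.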
And it went live on the 1st of February, and so far it's been doing pretty well, Shannon. In three months, more than 2,000 copies of the book have been sold, which has exceeded my expectations.
I love that. It's so timely and it is so important. We were just having a conversation with a previous guest on data quality and how important it is, and how this person was specializing in data governance for exactly that reason, because she saw so many companies try to stand up machine learning and it flopped because there was no quality to the data. So very timely.
Yeah, yeah.
So Prashant, tell me, when you were very young and in elementary school, is this what you wanted to be when you grew up? What was the dream when you were little?
You know, Shannon, I was born and raised in India, and when you are in India, every child has probably two important goals. One is to become a cricketer, or else to join Bollywood. And I wanted to be a cricketer.
Oh, nice. Do you play at all?
Not these days. These days, as I'm getting older, I've gone to less strenuous sports, like pickleball, for instance, versus cricket. But I used to play cricket, and that was my goal or ambition when I was in school. But I love this as well. This is also a lot of data and fun and everything. Yeah, so I finished my schooling in India, and I did my engineering in India and my master's as well. Then I was picked up by Procter & Gamble for their office in Brussels, in Belgium. So I went to Belgium in 2000. That was the first time I actually sat in a flight. I was born and raised in a small town in India. So till I was probably 24, I never sat in a flight.
Oh, wow.
Yeah. And my first flight was an international flight, straight to London.
Because why start small? You know, go very big.
Yeah, I went to Belgium.
It was a big change for me, from a lifestyle perspective, a culture perspective, and the work as well, because it was my first job and all those things. I was there until 2004, and because of family reasons I moved back to India. I came and joined GE in Bangalore, and from GE, after a couple of years, I went to SAP. And SAP transferred me to Calgary in 2009 for a project. After a couple of years, I got my residency, and all the immigration things got sorted out. And I left SAP and went out on my own. I started DBP Institute basically to help companies get value out of data. So that was my career journey. And when I was in Belgium, I met a couple of my friends from school as well, and they said they had enrolled in the PhD program. Whereas I went to the corporate world, they went to the academic world for the PhD program. And they said, why don't you join as a part-time student for a PhD? So I started my PhD in Belgium in 2000. And I was going very slowly because of work and family; I got married, I had kids, and all those things. And finally, in 2014, I got my PhD. So I took almost 14 years as a part-time PhD student to do my PhD. And that helped me as well, because it gave me a lot of credentials, skills, and expertise about how to look at data, how to analyze a problem critically, and so on and so forth, which helped in my career in consulting and teaching and all those things. Then, when I was in Canada, I took the MBA program at Northwestern Kellogg, which also gave me a different perspective about looking at a data or technology problem from a business angle, so that I could talk to the right stakeholders in their language and convey the value of data analytics. So the MBA program also gave me good packaging skills on top of my technical skills. So the PhD, the MBA, and my core skills in engineering and computer science, all those things came together.
And so far it's been going well, Shannon. It's been 11 years since I started my company. I have not gone back to my previous bosses asking for work, asking for a corporate job, that's what I mean. So far it's okay, it's going on.
Well, let's back it up again. Your PhD is in what?
My PhD, the degree, was in technology management, but like in most PhD programs, you do a lot of research, which is to collect data, analyze data, and so on and so forth. So that's where I got those kinds of skills as well. But when I was at GE, I was a Lean Six Sigma instructor as well. We had the Green Belt program, which is a five-day course, and I used to teach that course at GE. So it also gave me a different flavor of how to use data and analytics for business results. In my view, what people today call data analytics, GE used to call Lean Six Sigma in 2000. It's practically the same, something like old wine in a new bottle. The name has changed, but the content is pretty much the same. The GE experience with Lean Six Sigma, deploying those techniques to solve real-world problems, to come closer to the customer and understand the problems and everything, was also useful. So it's a bunch of different things which I did in my career, the PhD, the MBA, teaching at GE, learning the Lean Six Sigma skills, and of course the core thing, which is data and programming, that all came together. And now I'm able to package it so that I can help my clients get more value out of data and analytics.
Well, I love that you went on to get your MBA to get that well-rounded education, to talk to your clients and understand the larger strategy for an enterprise company.
Correct, yeah. You know, the MBA, in my view, is not a core skill which you can get, like a master's in computer science or a PhD or those kinds of things. It's like a packaging skill.
It gives you a lot of confidence when you do an MBA from a good program. So that's what helped me as well.
Very nice. So tell me, you've worked with data so much through most of your career and in school, so what is your definition of data, and how do you work with it?
Okay. You know, technically, as you must know, data is actually a plural word: it is basically a bunch of attributes which make up that particular data. So if you take a purchase order, we say a purchase order is data, but what exactly are we talking about? We are talking about a bunch of attributes which make up the purchase order, which could be the vendor ID, item name, price, quantity, unit of measure, and so on and so forth. So when you talk about data, it's practically a bunch of attributes or fields in a specific format which constitute those data elements. So we are not talking about one specific value; we are often talking about a business object which represents your process. Now, in my experience, data has three main purposes from a business perspective. Companies need data for three main reasons. The first one is operations. Number two, compliance. And the third one is analytics, which is to support decision-making. Let me use the same purchase order example. Let's say you want to buy 100 units of this pan for your company. So you say, I'm going to issue a purchase order for this vendor to supply these 100 pans. Okay, good. Why do you issue a purchase order? Because it's an operational document. You have made a request to some company to supply this thing, so that you know how many you have ordered, at what price, and so on and so forth. So you have an operational document. It is also a compliance document, because if something goes south with this purchasing activity, you can hold the vendor accountable for not delivering these pans. And vice versa for the vendor too.
If you are not honoring the prices, the vendor can say, you set this price, and now you are not paying me this price based on this payment term and all those things. So the purchase order is also a compliance document. Now the third thing becomes interesting, because one purchase order per se will not give you much for analytics. You'll not be able to glean much just by looking at one purchase order as such. So let's assume that over a period of time, every week, you are issuing many purchase orders to buy this bread pan from vendor one, vendor two, vendor three, and so on and so forth. So over a period of one year, let's assume that you have issued 100 purchase orders to many different vendors. Okay, I have a critical mass. I have a critical number of purchase orders. Let's start looking at patterns. So what is the pattern which I see from these purchase orders? I look at all these 100-plus purchase orders and I see, hey, one of the insights which I get is that the larger the quantity which I order, the lower the unit price of the pan. Oh, this is a great insight. So instead of issuing purchase orders to 100 different vendors, let me create an MSA, a master service agreement or a contract, and let's rationalize everything and talk to just 10 vendors, so that we get economies of scale, make things happen, and get the business benefit. So, bottom line, if you look at the whole analytics or the whole data journey, most often it's operations and compliance which are driving the data origination and data management activities, and analytics is a by-product of it, provided you have captured those elements in a critical number, in a critical mass. So I always tell my clients, when you want to leverage data and do analytics, the first thing you have to do is make sure that your business processes which pertain to operations and compliance are good.
Only then will you be able to do analytics. That's number one. Number two, analytics is all hypothesis-based. It's about asking questions. The more powerful the questions you have, the stronger the insights you get. So develop a knack for asking questions. Analytics is about questions on the data. No questions practically means there is no analytics. I tell my clients, moving to the cloud, developing Python skills, and data storytelling are all important. I'm not disputing that, but the first thing which you need for analytics is the ability to ask good questions.
I love it. It's a very descriptive answer.
Visit dataversity.net and expand your knowledge with thousands of articles and blogs written by industry experts, plus free live and on-demand webinars covering the complete data management spectrum. While you're there, subscribe to the weekly newsletter so you'll never miss a beat.
Well, let me back up here a little bit. So with all of that, do you see the importance of data management and the number of jobs working with data increasing or decreasing over the next 10 years? And why?
You know, I believe the number of jobs working with data will increase, despite automation being discussed, ChatGPT being discussed, and all those things. Why? The first thing is that the amount of data that is going to be generated and captured is going to increase at an exponential rate. They're saying that by 2025, the amount of data in a company is going to double every eight hours. So there's going to be more and more data that's going to be captured. Now you might say, what data is captured? In my experience, it's going to be the TAVI data. 80 to 90% of the data that is going to be captured is TAVI data. What is TAVI? It's a new acronym which I came up with in my new book, which stands for Text, Audio, Video, and Images.
So most of the data that's going to be captured, 80 to 90% of it, is going to be in an unstructured format, which is this TAVI data. So what is the good thing about the unstructured format? The good thing about unstructured data is that it allows quick digitization. You can digitize that artifact quickly. But what is the flip side of it? The flip side is that it doesn't have a defined data model. So you need to build the data model to suit that particular data object. For that, you need to come up with rules and all those things, which requires a lot of thinking. Automation happens only when you standardize, and standardization requires a lot of rules and a lot of thinking, which require the human mind to work on. So that's the reason why I say a lot of data jobs will be created going forward. But at the same time, in my experience, most of the data jobs that are going to be created will not be the ones which we would have seen a few years back. They will require a different level of skills, a lot of questioning skills, collaborative skills, and all those things. So I've been talking about the hard skills and soft skills which you require to be a data professional. So overall, Shannon, to your question, number one, the jobs will increase, and the jobs will require different kinds of skills, which we'll talk about later as well. But overall, more jobs are going to be created, because fundamentally every process is going to be digitized and there's going to be more and more data that's going to be created.
I totally agree with that. And that's a huge percentage of unstructured data, and that's such a big change from what it used to be even just five years ago.
Absolutely.
Yeah, yeah, that's amazing. So what advice then would you give to people who are looking to get into a career in data management, whether it be analytics or any other aspect?
Yeah.
Okay, that's a great question, which goes very well with the previous question which you asked about the jobs. So I would say there are probably two buckets of skills people can look for. The first one is the hard skills. If I had to name the top five hard skills, the first one is SQL, because SQL is like the language of data management. So you need to be strong on SQL, number one. Number two is data profiling skills: when somebody gives you data, you need to tell a story about what this data is all about. Nobody wants to hear pages and pages of information. This is what is called exploratory or descriptive analytics, or data profiling. So that's number two: give summary statistics about what's happening in the data, or what this data is all about. The third one is predictive analytics, and the most important skill one can have in predictive analytics is regression. In my experience, if you are able to get a good hold of regression, almost 60 to 70% of the heavy lifting on predictive analytics is taken care of. Regression is like the default skill, or the most important skill, when it comes to predictive analytics, especially linear regression. That's one of the most important skills to have. A combination of linear regression and logistic regression, I think, would take care of almost three fourths of the use cases in predictive analytics. The fourth one I would talk about is prescriptive analytics, which is all about scenario planning, sensitivity analysis, optimization, everything. When you talk about the future, people generally have the question, what if? In any situation where you talk about the future, there's always a what if. So you say the revenue is going to be 300 million; then you might say, what if this happens? So what-if analysis is all about prescriptive analytics.
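
The what-if idea can be sketched in a few lines of code. This is only an illustration under stated assumptions: the 300 million revenue figure comes from the conversation, while the scenario names, the percentage changes, and the simple model (revenue scales with both volume and price) are hypothetical and not something the speaker describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A hypothetical what-if scenario expressed as fractional changes."""
    name: str
    volume_change: float  # e.g. -0.10 means 10% fewer units sold
    price_change: float   # e.g. 0.05 means a 5% price increase


def projected_revenue(base_revenue: float, scenario: Scenario) -> float:
    """Scale the base revenue by the scenario's volume and price changes.

    Assumes revenue = volume * price, so the two changes multiply.
    """
    return base_revenue * (1 + scenario.volume_change) * (1 + scenario.price_change)


BASE_REVENUE = 300_000_000  # the 300 million forecast mentioned in the conversation

# Illustrative scenarios only; the numbers are made up for the sketch.
SCENARIOS = [
    Scenario("baseline", 0.0, 0.0),
    Scenario("demand drops 10%", -0.10, 0.0),
    Scenario("price rises 5%, demand drops 3%", -0.03, 0.05),
]

if __name__ == "__main__":
    for scenario in SCENARIOS:
        revenue = projected_revenue(BASE_REVENUE, scenario)
        print(f"{scenario.name}: {revenue:,.0f}")
```

Each scenario answers one "what if this happens?" question against the same baseline. Varying one input at a time, as in the second scenario, is the simplest form of sensitivity analysis.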
So the fourth skill which I would name on the hard skills side would be prescriptive analytics, and lastly it's about communication, which is data storytelling, the ability to tell a good story about your data analytics. So these are the top five hard skills, just to summarize: SQL, data profiling, regression, prescriptive analytics, and the last one being data storytelling. But at the same time, just having the hard skills will not help you grow in your job. So I would also say that you need to build your soft skills as well. In Forbes, I wrote a blog on the five C's of soft skills. Number one is communication, the ability to communicate your insights very well. When I say communication, it is also about listening skills: listen to what the stakeholders have to say, what the business has to say about their problems and their pains, and be empathetic to their needs. So the first one is communication. Number two would be critical thinking, the ability to question; we discussed questioning skills as well. The third one would be curiosity. The fourth one would be collaboration, working with different teams. And the fifth one is creativity, being creative. So overall the five C's of soft skills are communication, creativity, critical thinking, collaboration, and curiosity. So these are the top five hard skills and soft skills which I would advise. But at the same time, it applies to me as well. Even I'm not done. Just because I'm saying this doesn't mean that I'm the guru in this. I'm also learning every day.
Aren't we all, right? I think that's something that we all kind of learn, that no one is perfect, right? We're all constantly learning, and we get to keep getting better, right? Just constantly getting better.
It's the only way. And I love that you have five bullets for each one and emphasize the soft skills, because it's so important, right? I don't think we learn that enough or spend enough time on those.
Yeah, yeah.
And they are skills you can practice.
Absolutely, absolutely. You can practice and measure and grow and all those things, yeah.
So, you know, Prashant, just to kind of expand on that a little bit, where do you go to learn? You work with your clients, and you learn a lot from your clients. Where else do you find your growth?
So, you know, the first thing is that at the beginning of the year, I generally set up goals: these are the things which I want to learn, both for my company and for myself. I came across a statistic which says that writing down your goals is 30 times more effective than just thinking or talking about them. So I write down my goals and I have them on my whiteboard. I review them regularly, and I make a review of my progress every month, where I'm standing both on the company side and on the personal side. When it comes to learning, I have been a big investor, not just in teaching, but also in receiving knowledge. There are a few skills which I want to learn this year, mostly on the ESG side, which is analytics on ESG or carbon capture and all those things. So I've been working on that, building skills in that area. I've been trying to learn more about ChatGPT and generative AI and all those things, how the whole thing works. That's another area which I've focused on. So I have selected a couple of topics, mostly in the world of artificial intelligence as well as in the finance world, because that's where our stakeholders are. As a company, we approach the financial stakeholders to help them with their analytics projects. That's our ICP, which is the ideal customer profile. So that's what we do.
And as for the places where I go, number one is universities. Just last year, I took a six-month course at the Rotman School of Management at the University of Toronto, to build my skills on all the latest business trends which are happening. This year, on the future of finance, I'm training myself on those skills as well. You never know, there could be other places where I might go to get more skills. But it could be a combination of universities, the internet, YouTube, and the sites for Microsoft and Coursera and all those places where I go and take courses. Then there are also places such as DATAVERSITY, where I take courses. And I also teach at DATAVERSITY, so it's like a give-and-take kind of thing. So it's a bunch of different places where I go: the internet, the regular universities, as well as specialized providers such as DATAVERSITY.
Oh, impressive. I love that you write out goals not just for yourself, but for your company. Yeah, yeah, that's really impressive. And that you're still taking college classes. Oh my gosh. So how many acronyms do you have after your name right now? You have PhD, MBA. It's a chain. It's almost as long as your name. It's really impressive for sure.
But I still have room. It's not coming to the second line. It's still on the first line.
Goals, hashtag goals. I love it. Well, Prashant, okay, so tell me. I'd be remiss if I didn't ask: if somebody wants to reach out to you and work with you and your company, how do they find you?
So the best way to reach out to me is LinkedIn. I'm pretty active on LinkedIn. We can kickstart the conversation on LinkedIn, and then we can have a Zoom call, email exchanges, and so on and so forth. So the best way to reach out to me is to connect with me on LinkedIn.
Very nice. And of course, how does somebody get a copy of your most recent book, or any one of your three books?
So if somebody wants a copy of this book, which is the latest one, they can go to the Wiley website, or they can go to Amazon and buy it. And if they need a copy of the other two books, which are Data for Business Performance and Analytics Best Practices, again, they can go to Amazon and get them, or they can go to my other publisher, Technics Publications, which has published those two books, and get them from their website. But all three books are available on Amazon and all other e-commerce sites.
Very nice. And as you mentioned, we have some of your training in our training center, which is very nice, which we really appreciate, and which people have been loving. So we will put links on the podcast page to all of those things so that you can connect with more people. Well, Prashant, thank you so much. This has been fascinating, and I love your passion around it and the analytics, and how much you love the education and helping people. That's really impressive.
It's a pleasure to talk to you once again. And I'm glad to have the community and build a good data analytics group. And thanks to you and DATAVERSITY for all the work you guys are doing as well.
Oh, thank you. Thank you so much. Well, that is all the time we have for today. To all of our listeners out there, if you'd like to keep up to date on the latest podcasts and the latest in data management education, you may go to dataversity.net/subscribe. Until next time.
Thank you for listening to DATAVERSITY Talks, brought to you by DATAVERSITY. Subscribe to our newsletter for podcast updates and information about our free educational articles, blogs, and webinars at dataversity.net/subscribe.