Hello and welcome. My name is Shannon Kempe and I'm the Chief Digital Manager of DATAVERSITY. We'd like to thank you for joining today's webinar, How to Strengthen Enterprise Data Governance with Data Quality, sponsored today by Syncsort. Just a couple of points to get us started, due to the large number of people attending these sessions. You will be muted during the webinar. For questions, we will be collecting them via the Q&A panel in the bottom right-hand corner of your screen. Or, if you like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DATAVERSITY. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. And if you'd like to chat with each other or with us, feel free to do so; just click the chat icon in the bottom middle of your screen.

Now, let me introduce our speakers for today, Divinity and Harald Smith. Harald is the Director of Product Management at Syncsort, responsible for the Trillium Software product line, and co-author of Patterns of Information Management, published by IBM Press. Harald has spent the past 20 years specializing in information quality, integration, and governance products, with a focus on accelerating customer value and delivering innovative solutions. He has written extensively on the integration, management, and use of information, and has been issued four patents in the field of data management and integration. Divinity is a pre-sales consultant for Syncsort Trillium and specializes in data quality, data governance, data integration, and big data. She has considerable expertise in data quality and enrichment, having spent 16 years in the data marketing arena. She started and ran her own database marketing agency for 10 years before its acquisition, after which she stayed on as group head of data and insights for a further four years.
And with that, I will turn the floor over to Divinity and Harald to get today's webinar started.

Thank you, Shannon, and welcome, everybody. We're glad to have you join us today, so let's get underway. What we wanted to cover today: give a brief introduction, talk a little bit about why data quality and data governance are top of mind, and then get into an understanding of the symbiotic relationship between these two core capabilities, looking at how data quality strengthens the overall enterprise data governance framework. As Shannon said, I'm Harald Smith, and Divinity is with me as well. So let's get underway.

We live in a world of data, with more and more data all the time. We've heard various phrases about data being the oil of the future, the fuel of the future, a driver of growth and change. It's changing how we do business, how we approach business, and how we need to manage and govern our overall environment, because it's a different kind of commodity: it's not something that's just extracted and refined. This is digital information, but it's core and central to each and every one of us. It is, in effect, our digital presence; it's how we're recognized by all the organizations we deal with. And as we interact with these organizations, we have an expectation that our information is going to be managed and governed appropriately. Similarly, we see regulations being applied to reinforce the view that everybody has a right to have the right information collected about them and used in an appropriate manner. And this is really fueling a lot of thinking about data governance.
But it's also the way that organizations look to extend their businesses. How can they best reach out to us as consumers? How can they reach out to their suppliers and vendors, do more with those relationships, and really use the information they have to help drive business benefit? So data governance and data quality are central to this mission, and this comes into play in a number of different ways. First of all, we all know that the volume and complexity of data is growing, and growing in multiple different ways. It's not just structured information anymore; it's unstructured information, it's sensor data, it's all the different things coming in from our mobile phones, our connected devices, our social media activity. Huge growth. How do we address that challenge? We're also looking at how to get more out of that information. With all this data coming in, how can we find more effective correlations? How can we dissect or analyze this in a way that gives us new insight and allows us to do our jobs more effectively? So we're in a world where our expectations about what data can do, and the value it can provide, are growing, and the expectations about how it will be managed are growing. Yet at the same time, we consistently hear that trust and confidence are lacking; there is a large segment of business leaders who simply don't trust the information they're getting to drive decisions. And similarly, we see that same lack of trust and confidence extended into some of the challenges that we as consumers have around data. One of the consequences of that is broader and deeper compliance and regulation, which is certainly one of the aspects we need to look at from a governance perspective. And we see this with GDPR.
We see this with the California Consumer Privacy Act. And there is more and more focus on how our organizations are managing data and how we're ensuring that we are in fact compliant. So our CDOs have lots of questions, lots of questions. And it's not just the CDO: this is across the board. I see this in interactions with our CEO and each and every one of our C-level leaders, and in our interactions with various organizations we see it consistently as well. How do we get a handle on this? Can I trust it? Are we compliant? Do we have the right internal training and policies? How do I democratize this data? How do I get people data literate? A huge range of issues. And that's really what central data governance is about: how do we begin to wrap our heads around these particular challenges?

Thanks, Harald. Hi everyone, it's Divinity here, calling from a windy UK. So why is data quality so important? When we think about it, most people start straight away with the data quality that affects the sales and marketing parts of an organization, in terms of analysis, dashboards, reporting, segmentation models, targeting and so on. But of course it affects all parts of the business: workloads and scheduling, logistics, finance reports; anything that needs any form of analytics or decision-making is impacted by the data. And none more so than the risk side: governance and compliance. So data is really important, and the quality of that data is imperative across the entire business. Let's just have a look at some terminology, because we all have different flavors of terminology. If we think about data governance, simply put, we're talking about the set of policies, processes, rules and responsibilities that ensure the data is available, usable, accurate, compliant, and secure.
If we think about that in practice, what we're talking about is things like the key data elements, glossaries, dictionaries, the data stewards, the council, and the availability and compliance of that data, and that's pretty much exclusive to data governance. And when we talk about data quality, again everyone has their own flavor and meaning. From a data quality perspective, I talk about making sure the data is fit for purpose for its intended operational use in the business, and we're talking specifically about the accuracy of that data, its completeness, consistency, relevance, and validity. Tangibly, we're talking about things such as data cleansing, data parsing, matching, suppression, de-duplication, enrichment, and profiling, and again those are fairly exclusive to data quality. But of course there are areas of overlap, areas of common interest so to speak, such as policies and rules. We might come to them from different sides of the circle, one from the business side, one from the technical side, but we still have that area of common interest: the consistency and standardization of data, reporting, analytics, dashboarding, monitoring, and also that key one, data lineage, being able to tie data back all the way through to source to see how it came about. It therefore creates what I like to call a symbiotic relationship between data quality and data governance. As soon as I start talking about symbiotic relationships, everyone looks at me and says, what are you talking about, Divinity? It comes from that word symbiosis, meaning a relationship between two parties which sit together for mutual benefit but don't compete with each other, which is rather unusual, and of course we can see that here.
If we think about data quality, it's all about making sure we have the highest-quality data. But in order for that to work effectively within the business, it needs to sit alongside and within a data governance framework, to make sure the data is clean and high quality for the business's needs. So data quality relies on data governance to perform to perfection. Equally, data governance, which looks after the more commercial side, the business rules, the policies, the dictionaries and so on, can't work in isolation either, because it needs those data quality tools, and this is important, not only to clean the raw data but also to help compile the standards and rules and monitor the data over time. This matters to all of us because, as Harald alluded to already, we all use information and intelligence every day to make the right decision, or what we think is the right decision. We all love dashboards and maps and charts, and we use them to make business decisions. So here, for example, is a simple dashboard showing sales by region for parts of the UK. I might look at that and decide to target the highest-performing region and talk to them in a certain manner, and talk to the lowest-performing region in a different manner, on the assumption that the data is accurate. But in this particular case, without my knowing it, my dashboard is actually showing incorrect information; the data that sits underneath it is giving me false positives. Here, for example, I have three different instances, three different spellings, of the same county name, so my understanding of that particular value has already been diluted by three. And lo and behold, comparing before and after data quality, we can see straight away the difference it makes.
So, for example, the region that ranked highest before data quality falls to third, while the one that was in sixth place rises to the top. Had I used the information beforehand, I'd have been targeting and talking to the wrong people in the wrong manner, and that's really important. So data quality affects everything, including data governance, which is what we're going to show today.

One of the things we always have to keep in mind as we look at this is that what you don't know can hurt you. This is really critical as we look at the data governance perspective and ask: how are we complying, how are we monitoring this information, how are we assessing the overall value of governing data to the organization? Across all these pieces, whether it's particular business information or even the metadata we've captured in the systems we use to facilitate data governance, we want to be able to trust that information, and these key elements of data quality really help address a lot of core issues. And those issues are growing as we see more and more types of information. This ranges not only over the traditional questions, here are your particular sets of fields, are they complete, are they valid for their particular values, but now we also have to deal with a huge range of information, such as sensor data, which may be dropping readings; we may have signal loss, noise, and extraneous information. If we think about social media, there's simply a lot of information coming in that may not be relevant for what we're trying to put together, along with differing levels of aggregation and invalid correlations. How we understand these challenges, and the approach we take to addressing them, is very dependent on core data quality capabilities. These are things we need to be able to see in this particular context.
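The dashboard example above can be made concrete with a tiny sketch. The county names, sales figures, and the canonical-spelling mapping here are all invented for illustration; the point is just how variant spellings split one region's total and distort the ranking, and how a standardization pass fixes it.

```python
# Hypothetical sales-by-region data with three spellings of the same county,
# illustrating how unstandardized values dilute a dashboard's aggregates.
from collections import defaultdict

sales = [
    ("Hampshire", 120), ("Hamshire", 95), ("Hants", 80),   # same county, 3 spellings
    ("Surrey", 210), ("Kent", 180),
]

# Raw aggregation: the split county looks like three weak regions.
raw = defaultdict(int)
for county, amount in sales:
    raw[county] += amount

# After a data-quality standardization pass (mapping is illustrative):
canonical = {"Hamshire": "Hampshire", "Hants": "Hampshire"}
clean = defaultdict(int)
for county, amount in sales:
    clean[canonical.get(county, county)] += amount

print(sorted(raw.items(), key=lambda kv: -kv[1]))    # Surrey appears to lead
print(sorted(clean.items(), key=lambda kv: -kv[1]))  # Hampshire (295) now leads
```

Before standardization I would have targeted Surrey as my top region; afterwards Hampshire is clearly on top, exactly the ranking flip described above.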
How do we enumerate what we're even trying to do, what we're even trying to look at? That's really a central piece of data governance. Whether it's policy-based, standards-based, or simply what we're trying to address, measure, and communicate to a broader audience, this is central to supporting the data governance initiative. And it's important that we can acquire that information in a way that we can understand: discover what it's about; validate it, not just along our traditional dimensions but looking at the broader, holistic issues we now face with this volume and variety of data; capture and document it in a way that communicates our findings to our business leaders, our peers, and the organization as a whole, so people understand what they want to be able to do with this information; and do that repeatably, in a catalog, so that people can find it.

So the role of quality in data governance simply can't be overstated. It's very difficult for organizations to respond to regulation quickly, as Harald alluded to already, and that's for a number of reasons. Data now comes from many different source systems, disparate systems, and data silos; the number of touch points with the end customer has grown hugely; and on top of that there is huge demand for real-time data, real-time as in sub-second, and that comes from both the business and the end customer too. If I went to buy something from a shop, I'd expect them to know about everything I've bought online, and vice versa, in real time. That's what we expect nowadays. The bottom line is that all of these facets demand that the source data is accurate. Rubbish in, rubbish out, or garbage in, garbage out, is more pertinent than it ever was before.
We all know what the regulations are for, so I'm not going to go through these in any detail. They're there to protect privacy and control disclosure, and for risk management, fraud prevention, anti-money laundering, counter-terrorism, and so on. And there's obviously a huge variety of types of regulation too, depending on the industry. The most pertinent to Europeans at the moment is GDPR, which came in 10 months ago, and there's a similar one coming in America, the CCPA. And of course if you're in financial services you've got a whole plethora of regulations, FSCS, Basel, AML and so on, or if you're in healthcare, HIPAA; there's a whole host of them. What I thought would be useful for you all today is to take three of them and look at specific areas where data quality actually helps a particular part of that regulatory process. We're going to pick GDPR to start, which I'll run you through. GDPR obviously has a whole host of things in there, but it's essentially about knowing what personal and sensitive data you hold on customers: is it up to date, what are you doing with it, how are you processing it, and do you actually have permission from the customer to do something with that data? Where is it stored, is it held on multiple systems, and how much is it duplicated? Who has access to it, and how are you keeping it safe? That's, I guess, quite straightforward. But on top of that, the end customers have now recognised that they actually have teeth; they have a lot more power than they had before, so they have new demands on the business too. What do you know about me? The data you hold about me is actually wrong, so you've got to fix it. The so-called right to be forgotten: erase all my data for good, I don't want to hear from you again, which has a huge impact for businesses. And if my data has been breached, you've got to acknowledge it within 72 hours.
How are you using my data? And an unusual one: demanding that a human deals with your data rather than machine learning or artificial intelligence. And suddenly, of course, it's serious. Google has been hit with the first 44-million-pound GDPR fine, and there's an estimate that had GDPR been in place over the prior five years, fines would have reached 25 billion pounds. So suddenly it's very, very serious. So let's think about it from a business perspective: how it actually impacts the business, and how data quality strengthens the governance in terms of GDPR. We've got a customer here who has interacted through different touch points with the business: a retail shop, an online shop, a call centre, a service centre, a website, whatever it is. And of course, as we all do, they've left behind a data trail: different amounts of information, of different quality, in different formats and structures. On top of that we also have permission flags, so on three or four systems I've said yes, I can be communicated with, and on one I haven't. We all know this from a personal perspective, and of course what we're striving for is that perfect record, that single view, that golden record, customer 360, call it what you want, it's the same thing: the record which picks up the best bits from each record but, more specifically, manages those permissions. So if we have five records of the same customer, we have the ability to identify that customer by one record and manage those customer preferences and privileges accurately. And if you have a single view, we all know the fluffy marketing benefits you get from that: the analytics, to start with, will be accurate, and accurate analytics allows good segmentation, marketing, profiling, targeting and so on.
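The golden-record idea above, picking the best bits from each duplicate while managing the permissions, can be sketched in a few lines. The field names, records, and merge heuristics (longest non-empty value as a completeness proxy; opt-in only when every system agrees) are illustrative assumptions, not how any particular product works.

```python
# A minimal golden-record sketch: merge duplicate customer records by taking
# the most complete value per field, and resolve the marketing permission
# conservatively (opt-in only if every source record opted in).
records = [
    {"name": "J. Smith",   "email": "",                    "phone": "0123456", "opt_in": True},
    {"name": "John Smith", "email": "j.smith@example.com", "phone": "",        "opt_in": True},
    {"name": "John Smith", "email": "",                    "phone": "",        "opt_in": False},
]

def golden_record(dupes):
    merged = {}
    for field in ("name", "email", "phone"):
        # "best bits": longest non-empty value as a crude completeness proxy
        merged[field] = max((r[field] for r in dupes), key=len)
    # permissions: one opt-out anywhere means no contact
    merged["opt_in"] = all(r["opt_in"] for r in dupes)
    return merged

print(golden_record(records))
# -> name and contact details filled in, but opt_in is False,
#    because one of the three systems recorded an opt-out
```

The conservative permission rule is the key design choice here: under GDPR it is far safer to honour a single recorded opt-out than to let a majority of stale records override it.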
And of course all the reports, dashboards, and visualisations will be accurate; you won't get those false positives. When we think about single view and high-quality data, we almost always think about the business benefits, but of course there are benefits for the customer too: the customer experience. They will be receiving materials, content, service, and products that are relevant to them, at the right time, through the right channels, if they've given the right permissions, so their whole experience will be better. It's about understanding those customers, which enables the business to make appropriate communications, sending the right things at the right time. And from a general business perspective, of course, it's simply about making the right decisions, the best decisions for the business. But GDPR is about much more than simply what data you're holding, what you're doing with it, whether you have permissions and so on. That's a really important tenet of GDPR, don't get me wrong, but it goes further. Because regulation demands proof, evidence, and documentation: under these four articles you have to prove that you've adhered to the principles, prove that you've processed the data correctly, and provide documentation that you've done it in a secure manner and that you've understood all of that. So it's all very well having high-quality data; GDPR goes much wider than that, and data quality has to provide proof that you've actually gone through the right measures to achieve that quality. So I guess, as a summary: data quality tools are no longer a nice-to-have. They're absolutely fundamental, critical to any business, especially when it comes to anything to do with data governance. So let's have a look at a few specific areas where DQ helps that compliance.
The first is discovery, and that is all about looking at the data the business has and highlighting where data is bad: typos, data in the wrong field, data out of date, incorrect data, or data which doesn't comply with the required format, structure, or syntax. A key one: it also exposes where you have personal or sensitive data buried in comments fields, date of birth fields, or wherever, so it helps you find key information that you wouldn't otherwise be able to identify. And on top of that, as well as exposing and understanding the data, it also helps build the rules which then work with data governance to monitor the data on an ongoing basis. So it's about auditing and monitoring at the same time. Moving on from discovery, DQ also of course helps put in place the actual mechanisms to clean that data, in both real time and batch: putting into production all those cleansing processes to make sure the data is as high quality as possible, and that duplicates are identified, matched, and merged. And on top of that, again, full traceability, full data lineage, so that at any point in time, as well as cleansing, matching, and merging that data where necessary, you have proof of what you're doing, to make sure you're meeting those GDPR requirements. And the final thing, of course, is integrating with other systems. Data quality is all very well, but it has to fit into other parts of the business too: it has to fit in with a data governance tool, with a dashboarding tool or a business intelligence tool or a cube, to actually show what the DQ process has done. So that's just a simple snapshot. In summary, if you were to sum up GDPR in one sentence, it simply mandates the tightest possible control of customer data.
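The discovery step just described, exposing personal or sensitive data buried in free-text comment fields, can be approximated with simple profiling rules. This is a toy sketch, not a product feature: the patterns, field names, and sample rows are all invented, and real discovery tooling uses far richer detection than two regular expressions.

```python
# A toy discovery pass: profile a free-text comments field for buried
# personal or sensitive data (email addresses, dates of birth).
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
DOB   = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")

rows = [
    {"id": 1, "comments": "Customer prefers post"},
    {"id": 2, "comments": "Contact at jane@example.com, DOB 12/04/1980"},
]

findings = []
for row in rows:
    for label, pattern in (("email", EMAIL), ("date_of_birth", DOB)):
        if pattern.search(row["comments"]):
            findings.append((row["id"], label))

print(findings)  # [(2, 'email'), (2, 'date_of_birth')]
```

Each finding here would feed the governance side: it tells the stewards where personal data is hiding, and the same patterns can be promoted into ongoing monitoring rules, which is exactly the audit-and-monitor loop described above.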
And it's fair to say that without DQ, duplicate and bad data will generally propagate across the business, as it does, and over time that will inevitably escalate into non-compliance with GDPR in some form. So data quality effectively helps ensure GDPR compliance; that's how important data quality is to GDPR.

So, Divinity was talking about GDPR, and I think one of the things that is really central there is just how critical data quality is to inform the overall picture of the policies and compliance. And we see this, of course, in other regulations as well. I want to run through FATCA, which is a little different case. FATCA, at the general level, is very specific to certain financial institutions, particularly in relation to U.S. citizens and their relationships with financial institutions outside the United States, because the aim is to look at and understand financial crimes and their overall enforcement. Central to this, and you can just go back one slide please, is the requirement for non-U.S. financial institutions to find who is a U.S. citizen and report back to the U.S. government. One of the interesting things I found when I was going through my taxes just a week or so ago is how many of your documents actually have little FATCA flags on them. This is something that's now reaching the broad range of consumers, the individuals who have accounts with organizations, and the institutions need to validate that those flags are in fact correct. So data quality really becomes a central piece in delivering compliance. While this is a specific case, it's a strong example of where data quality needs to go beyond simply identifying whether a particular field is populated or not, and really look at the data in a way that ensures the information is linked together so that we can validate each component.
It's really always about getting closer to that sense of accuracy, which is often a challenging dimension to assess, but getting to the accuracy of the data being held. We need to be able to address conflicting information, conflicting flags and indices, that really help determine who's who. And once you've dealt with those issues, remediated and harmonized them, you're at a point where the right decisions can be made, in this case ensuring that the organization is in fact compliant. The type of example here is where I have codes: I've captured information that says, okay, I have a FATCA country code, it's US, it's something I need to report on. But is that actually correct? That's where we need to go beyond just looking at particular values, to looking at the relationships within the information. This is central to any type of information we're applying to compliance or analytical purposes. We don't want to waste time or resources; we want to understand that this is in fact the right value and identifies the real country of origin, irrespective of how the data has been captured. Similarly, we have to look at the other relationships. In a lot of these cases you're going to need to look at correlations in multiple directions: this information says it's US, does the address information in fact correspond to what I'd expect? Or if it's not identified as US, what do we actually see? Being able to look in both directions is really central, and here we've highlighted some of those corresponding relationships. Next slide. We want to be able to identify where the duplicates are.
Where records have conflicting values, we can continue to expand on that type of information to look at those conflicts, and identify them in a way that lets us hand them to data stewards and subject-matter experts, who can really dig in and make decisions about the data, decisions which have implications for the regulations we're dealing with. As we go through the process of looking for those conflicts, identifying the differences and flagging them, we then ask: how do we harmonize the data so that the records have the right flags and the right indices, and feed our ongoing reporting? This is a very central part of data quality, and a very central part of meeting compliance regulations. From the results, we get a good sense of which records are implicated and have values we need to address, which are potentially suspect, and which are not implicated in the process at all. That then drives the rest of the exercise. If a record is implicated, that's something we report on; we know what to do with it. If it's suspect, how do we approach it? How do we bring manual review in to make sure we're making the right decisions? And if it's not implicated, then we shouldn't be reporting on it. Again, this goes back to the core process pieces of data governance: making sure we apply the right level of compliance. This is what our policy said; how are we meeting those policies? We do that through capabilities built on core data quality approaches that really lend themselves to getting the answers we need at the right time.
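The two-directional cross-check described above, does the country flag agree with what the address implies, and the resulting triage into consistent, suspect, and conflicting records, can be sketched as follows. The rules are deliberately crude illustrative assumptions (a tiny state list and a numeric-postcode heuristic stand in for real reference data); actual FATCA remediation uses much richer address validation.

```python
# Sketch of cross-checking a FATCA-style country flag against the address.
# Returns a triage label that could route records to reporting, steward
# review, or no action. All rules and sample records are illustrative.
US_STATES = {"NY", "CA", "TX"}  # abbreviated reference set for the example

def check_fatca_flag(record):
    """Classify a record's US country flag against its address evidence."""
    implied_us = record["state"] in US_STATES or record["postcode"].isdigit()
    flagged_us = record["country_code"] == "US"
    if flagged_us == implied_us:
        return "consistent"
    # flagged US but address disagrees -> hard conflict;
    # not flagged but address looks US -> suspect, needs review
    return "conflict" if flagged_us else "suspect"

print(check_fatca_flag({"country_code": "US", "state": "NY", "postcode": "10001"}))    # consistent
print(check_fatca_flag({"country_code": "US", "state": "ZZ", "postcode": "SW1A 1AA"})) # conflict
print(check_fatca_flag({"country_code": "GB", "state": "CA", "postcode": "90210"}))    # suspect
```

The third case is the interesting one for compliance: the flag says non-US, but the address evidence points the other way, so the record goes to a data steward rather than silently falling out of the report.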
If you're getting the wrong information, if you're missing information, if you're not capturing the right level of information, you're at risk of failing your regulations. That has implications for you as an organization, and for each of us as individuals who may have relationships with these organizations. That information comes back to you as an individual and may flag you as somebody at risk, somebody who may not be compliant, and that has much broader implications. So it's incumbent on these organizations to provide the right information, so that they're not being challenged by their customers, and losing customers as a result, because they couldn't establish the right levels of control. Data quality certainly helps ensure data governance and compliance with the regulations, but it's also a core factor in your overall business, because your business is about serving your customer base, and if you can't identify the right level of information about your customers, they're not going to stay with you. So it's a broader business criterion that comes into play here.

Thanks, Harald. So we'll pick up one more regulation, anti-money laundering, which primarily affects financial services organizations. Essentially, that regulation mandates that the organization check the identity of their customers, in particular checking the beneficial owners and recipients of financial transactions, trades, money transfers and so on, monitor the business activities, and put control systems in place for that data as well. So again, the regulation leans very heavily on the data. As with FATCA, data quality isn't itself the governance aspect, but for the governance aspect to work correctly, the quality has to be there; data quality doesn't provide AML per se.
What it does is act as a prerequisite to a bank's internal AML process, by making sure that the data is of the highest quality so that it stands the best chance of passing or failing the AML checks accurately, as intended. So let's have a look. Probably the most obvious case is what we call matching and suppression against sanctions lists. A sanctions list is a list of known money launderers, terrorists, criminals and so on. In a very simple example, we have some bank data here on the left-hand side, and we have the corresponding person on the sanctions list which is shared between banks. In this particular case we might not get a match, because the data doesn't align correctly, or we might get a mismatch: we might include people we shouldn't, or miss people we should, simply because the bank's data isn't of good enough quality. Whereas once it's been put through a DQ process and been standardized, parsed, and structured correctly, it has the optimum chance of matching against the sanctions list. That's really key. It's quite obvious, I guess, but it's such a fundamental aspect of making sure financial organizations don't transact with known criminals and terrorists: simply making sure the raw names and addresses are accurate in the first place. A possibly slightly more unusual one: when banks transfer money and data to each other, they do so in the form of a SWIFT message, which is effectively an XML structure of data. And the quality of that XML file is paramount both to the governance reporting on it and to the actual operation of moving that money and data around. Let's have a look. On the right-hand side is effectively a SWIFT message; it looks like XML once you parse it out.
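The standardize-then-match step behind the sanctions-screening example can be sketched in miniature. Real screening uses dedicated matching engines with far more sophisticated rules; this sketch (fictitious names, a crude title-stripping and token-sorting standardizer, and a generic string-similarity ratio) just shows why parsing and standardization raise match rates.

```python
# Minimal standardize-then-match sketch for sanctions screening.
import re
from difflib import SequenceMatcher

def standardize(name):
    name = re.sub(r"[^\w\s]", " ", name.upper())  # uppercase, drop punctuation
    tokens = [t for t in name.split() if t not in {"MR", "MRS", "MS", "DR"}]
    return " ".join(sorted(tokens))               # normalize token order

def similarity(a, b):
    return SequenceMatcher(None, standardize(a), standardize(b)).ratio()

bank_record = "Mr. Smith, John A"        # as captured by the bank
sanction_entry = "JOHN A SMITH"          # as published on the list

raw = SequenceMatcher(None, bank_record, sanction_entry).ratio()
clean = similarity(bank_record, sanction_entry)
print(round(raw, 2), round(clean, 2))    # the standardized score is far higher
```

On the raw strings, the pair scores too low to match reliably; after standardization the two names become identical, so the record would be correctly flagged rather than missed.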
So this is how a raw SWIFT message might come in. Once we've actually parsed it and we look at it, we can identify that we've got missing information. Where we've got red, we've got missing information, and where we've got yellow, we've got abbreviated information. And again, to stand the optimum chance of matching the recipient's data, or the optimum chance of matching against sanctions lists, we want that data to be of the highest possible quality. So again it sounds straightforward, but simply by applying the data quality processes, and structuring and parsing and standardizing that data as correctly as possible, it stands the best chance of matching the recipients as well. So that's a key part of anti-money laundering: checking on the beneficial owner, the recipient of that data packet. And at the same time it's also going to help with the data quality on the actual bank information too. And as bizarre as it sounds, a surprising number of SWIFT messages have missing key information such as postcodes or BIC numbers or whatever. So in this case, we parsed and stripped out that SWIFT message, we've identified there's missing information, and we can do matches on it and actually bring it back, and we can do that in real time. And that of course is fundamental to the operation of that business. If they are sending SWIFT messages from A to B in a split second, they need to ensure that the quality of the data within that message is the highest possible, to make sure that it doesn't fall foul of money laundering regulations or whatever. So again, something which potentially might seem relatively straightforward, but if there were no DQ processing, it would directly increase the chances of a financial organization processing illegal trades or transactions, or opening accounts with terrorists or criminals, which they shouldn't have.
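The standardize-then-match flow just described can be sketched in a few lines. This is a minimal illustration only: the field names, the abbreviation table, and the sanctions entries are all invented, matching here is exact rather than fuzzy, and a real pipeline would use a proper SWIFT parser and a commercial matching engine rather than hand-rolled code.

```python
# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"st": "street", "rd": "road", "wm": "william"}

def standardize(text: str) -> str:
    """Lower-case, strip punctuation, and expand known abbreviations."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in text.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in cleaned.split())

# Hypothetical required fields (the "red" missing cells in the example).
REQUIRED_FIELDS = ("name", "address", "postcode", "bic")

def completeness_issues(record: dict) -> list:
    """Return the names of required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

def matches_sanctions(record: dict, sanctions: list) -> bool:
    """Exact match on the standardized name; real systems use fuzzy scoring."""
    name = standardize(record.get("name", ""))
    return any(standardize(entry) == name for entry in sanctions)

# Made-up example: raw bank data versus a sanctions-list entry.
bank_record = {"name": "Wm. Smith", "address": "12 High St", "postcode": "", "bic": "ABCDGB2L"}
sanctions_list = ["William Smith"]

print(completeness_issues(bank_record))            # ['postcode']
print(matches_sanctions(bank_record, sanctions_list))  # True
```

Without the standardization step, "Wm. Smith" and "William Smith" would never line up, which is exactly the false-negative risk described above.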
So again, DQ is fundamental to AML regulation, and without it, it simply would fail. So again, data quality has strengthened the governance for that particular regulation. So we've looked a little bit here at some of the different challenges, some of the areas where data quality is really informing data governance strategy. But how do we begin to look at this from a sort of execution strategy? We've got a lot of information coming in. We've got multiple versions of the truth, we've got data challenges. Sometimes we don't even know what those data challenges are. Sometimes we have very inflexible organizations where it's simply a challenge to try and move anything forward. One of the things that I've consistently backed over the years is starting small: picking something where you can begin to get some demonstrable results is really critical. And this is one of the areas where I see data quality as really central to data governance practice. Now data governance encompasses a lot of aspects in terms of process and people, as well as tools, capabilities, and communication strategy. There's a lot that's going to go into a data governance strategy. But there are areas where you can really begin to target, something like these compliance initiatives, particular data quality areas. These are things where you can really target some business value. You can get some quick identification of key business objectives. And this may be in terms of: I'm trying to increase revenue here, minimize risk, decrease cost, whichever particular value axis I'm trying to address. Somebody in my organization is experiencing pain in one of these areas. This is an area to really target from that data governance perspective. What is that pain? What are the policies involved? And what's going to tell us where we are and where we want to be? If we're looking at statistics that say 40% of executives say they don't trust the information they access.
That's a huge pain point. How do we target something like that? How do we help validate, or maybe show, that the data is complete? This is where small projects, agile projects, really come into play. How do we focus on aligning to those objectives, adapting to what we need to do, and then getting some real quick wins? Because we need to be able to make progress. This is really central to data quality, central to data governance, and it's where data quality can really help provide ongoing value. And sometimes it may not even be in areas where you think about sort of high-level compliance. It may also just be in areas around your data governance tooling. If you launch a pretty good initiative around terminology in the business glossary, how do you know where you are with that? You want to take some measurements around that particular process as well. That may be a matter of basic data quality measurement applied to your data governance tools. Where am I in this particular process? How many terms do I have out here? How many of these things have gotten to an improved state? Let me get some quick assessment on some of these things. Use that as a way to establish an effective way of thinking about how data quality metrics, the measurements, the monitoring of information, give you ongoing insight, and allow you to really start from that core perspective and be able to expand out. Secondly, we really need to make sure that we're collaborating in this particular process. What are the keys that allow us to understand, across the business, what we are addressing? Governance practices have become very critical here, but it's also very critical just from the standpoint of the data. We think about a lot of the discussions around data literacy going on these days.
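The idea of applying basic data quality measurement to your governance tooling can be made concrete with a small sketch. The term records and status values here are purely hypothetical; the point is simply that a business glossary can itself be measured like any other data set.

```python
# Hypothetical glossary records: names, definitions, and statuses are invented.
glossary = [
    {"term": "customer", "definition": "A party that buys goods", "status": "approved"},
    {"term": "churn", "definition": "", "status": "draft"},
    {"term": "account", "definition": "A billing relationship", "status": "in_review"},
]

def glossary_metrics(terms):
    """Quick assessment: how many terms exist, how many reached an improved
    (approved) state, and how many actually have a definition filled in."""
    total = len(terms)
    approved = sum(1 for t in terms if t["status"] == "approved")
    defined = sum(1 for t in terms if t["definition"].strip())
    return {
        "total_terms": total,
        "pct_approved": round(100 * approved / total, 1),
        "pct_defined": round(100 * defined / total, 1),
    }

print(glossary_metrics(glossary))
# {'total_terms': 3, 'pct_approved': 33.3, 'pct_defined': 66.7}
```

Tracked over time, numbers like these are the quick, demonstrable wins the speakers describe: a simple answer to "where am I in this particular process?"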
How do we help ensure that everyone in our organization is talking about the same things, and doing it in a way that they understand how they're approaching looking at data, getting across some of those organizational barriers and silos, and engaging people in that particular process? Policies and standards are very central to that: being able to communicate out broadly what people need to be thinking about and looking at. As we get some of those early wins, those quick wins, and get that information collected, this is something that allows us to get broader buy-in. We can point back to our successes, or point back to the things that are particular challenges and barriers, and continue to build out a structure and culture and ownership around the process and the data that's going to help drive the overall business value. Now (go back to the slide, please) it's central in this particular process to really be able to enrich the information that we have. We need to be able to discover what we don't know. It goes back to the point I made very early on: what you don't know can hurt you. This can encompass everything from issues in terms of bias in the data to samples that simply are not valid for what we're trying to accomplish. It's really critical to be able to gain some insights into the data that we're looking at, to be able to inform the broader organization about what we've found, where the challenges are, where the issues may be in terms of trust, to be able to annotate that to increase that overall insight, and then be able to share that information out regularly. Where are the wins that we're getting? What are we demonstrating out of this particular process? How do we find these particular outcomes that are going to have that ongoing value, so that people understand what they can do, and how they can basically get empowered to continue to drive these particular practices?
And to complete that particular cycle, it's a matter of really quantifying information. We need to understand hidden activities: where we're getting resource waste, where we're not seeing transparency, where we're not seeing trust, where there are disconnects in this particular process. We can begin to see these as we work through some of these practices. We can begin to establish key baselines. We want to keep that focus and continue to drive out metrics that are meaningful. And we may find over time that certain things have value and certain things don't. Being able to adjust and refine that is really critical in the process. Which things are going to highlight that business value? Which highlight our ability to document compliance, or help transform that particular culture? Which things do we need to continuously review? Which things should we stop doing, because they're not providing any particular value? As we're focusing on the things that we need to, the information's getting surfaced to address and resolve key issues. Look for those particular root causes and continue to quantify those impact changes. It's really critical to be able to consistently finish these initiatives, drive value, capture those metrics, do that repeatedly, and tie that into the overall process framework that we're putting in place from a data governance perspective. And it becomes a sort of ongoing informative channel that everyone can begin to really participate in. So, just kind of summarizing the key points. Next slide, please. I think the message we really want to convey here is that the accuracy of information, the quality of data, directly impacts the downstream activity. Some of that may be bigger compliance concerns. And that's obviously an area that data governance really came out of: being able to help ensure that we're addressing these compliance needs, and that we're reducing risk in our organizations.
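The baseline-and-review loop described above can be sketched in a few lines. The metric names, values, and thresholds here are invented for illustration; the shape of the loop (set a baseline, measure, then decide which metrics are earning their keep) is the point.

```python
# Hypothetical metrics: which direction counts as "better" for each one.
HIGHER_IS_BETTER = {"address_completeness": True, "duplicate_rate": False}

# Invented baseline and current readings (percentages).
baseline = {"address_completeness": 82.0, "duplicate_rate": 6.5}
current = {"address_completeness": 91.5, "duplicate_rate": 6.4}

def review(baseline, current, min_improvement=1.0):
    """Classify each metric against its baseline: improving, regressing,
    or flat (a candidate to refine or stop tracking)."""
    verdicts = {}
    for name, base in baseline.items():
        delta = current[name] - base
        if not HIGHER_IS_BETTER[name]:
            delta = -delta  # for rates where lower is better, flip the sign
        if delta >= min_improvement:
            verdicts[name] = "improving"
        elif delta <= -min_improvement:
            verdicts[name] = "regressing"
        else:
            verdicts[name] = "flat"
    return verdicts

print(review(baseline, current))
# {'address_completeness': 'improving', 'duplicate_rate': 'flat'}
```

A "flat" verdict is exactly the prompt the speakers describe: review whether that metric still provides value, refine it, or stop tracking it.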
We're doing that with consistent, valid reporting and dashboarding in these particular areas. But it extends broadly out into the organization. It's about customer care. It's about dealing with business initiatives and business directives. All of these particular things are areas where data quality continues to not only strengthen data governance, strengthen compliance, and strengthen our view and trust of information; it's also about being able to drive out sensible business decisions. As Divinity said, this is really a symbiotic relationship. This is about data quality helping to facilitate what we want to accomplish in data governance, and data governance providing that ongoing framework, processes, and people approach that helps inform how we put the data quality pieces into practice, giving us ongoing value and wins in that critical process. So with that, I would like to, I think, open it up for questions. Divinity and Harold, thank you so much for this great presentation. If you have questions, feel free to submit them in the bottom right-hand corner for our great speakers. And just to answer the most commonly asked question, just a reminder: I will send a follow-up email with links to the slides and links to the recording of this presentation by end of day Friday. So diving in here: can functional, localized data quality activities be successful without an enterprise-level data governance framework, or are they doomed to fail? Well, I don't think it's doomed to fail. Now, there are barriers there. I mean, I think this is where it's so critical to be looking at an organization, and for senior-level executives to have an approach in terms of data where they're setting that particular value. But we've seen this growing, particularly over the last decade, with the advent of the Chief Data Officer, data governance councils, data governance programs. This is certainly infiltrating a lot of organizations. It's not necessarily universal, but it's certainly infiltrating.
That is certainly going to help the overall piece. Without that, is it doomed to fail? No, I don't think it's doomed to fail, but there are barriers there that are going to be more challenging to overcome. It may be that we're simply able to do it within a particular practice area. That may be enough to begin to get business interest, business buy-in, but this is an important piece. It's something that's beyond just compliance, beyond regulation, beyond sort of red tape that gets in the way. We've got to get people thinking about this as a core business driver. This is something that is actually going to allow you to reduce risk and reduce costs, but also to help drive ongoing revenue value, because we're able to make better business decisions. Divinity, anything you want to add to that? No, I think Harold summed it up really well. I think, as you pointed out, with the rise of the CDO office in the last decade we've seen a fundamental shift, whereas before, let's say 10 years ago, data sort of fell into the cracks, if you like, between the IT departments, who would maintain that data and look after the systems that ran it, and the lines-of-business users and managers and directors, who had to use that data in some way. It sort of felt that the data was falling into the gap in terms of who actually owns probably the most important corporate asset they have. And certainly in the last 10 years, and certainly in the last couple of years with the rise of regulations such as GDPR, businesses have really come to realise that data is so fundamental and so important that there is this renewed emphasis on looking after that data well, and that's a really good thing to see. But I think DQ can only do so much; as we've tried to allude to today, it's still got to fit into a framework which makes the DQ relevant to the business.
It's not simply about cleaning data for the sake of it; it's got to be relevant and pertinent to actually facilitate beneficial use to the business. And along those same lines, this next question is: what are your recommendations when there are no regulations to rely on regarding data quality efforts? From personal experience, in that case it's like herding cats. And in almost every webinar that we do here, we always get that question of how do we get executive buy-in, so this kind of tacks onto that very first question there. I always think the fun thing, and I've got a perverse interest in data quality, is that data quality is such a tangible thing to show benefits from, whether it be reporting being more accurate, or stopping false positives, to making better decisions, to better targeting, to better ROI models, to better campaigning, to less risk. I think data quality is such a tangible thing to be able to present the benefits of. I've never really struggled to show a business, even if they don't sit in a typically regulated industry like healthcare or finance or whatever. I've never once struggled to show: well, this is what you're doing now. Even if it's a delivery company sending out a million packets a year, if we can show that they've got X percent duplication, that they're sending out packets too many times or sending the drivers out inefficiently, and that by applying simple data quality techniques we can improve their efficiency and reduce their churn and reduce their wastage, I think data quality is such a relatively easy thing to prove, because it is so tangible. Just adding on to that briefly: one of the things that I think people often struggle with, and why it's often difficult to get that buy-in, has to do with the terminology. Being able to talk to the business in a way that is meaningful for the business is really a central piece.
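The delivery-company duplication example can be made concrete with a small sketch: estimate what share of a mailing file is duplicated once names and postcodes are normalized. The records and the match key here are made up, and real duplicate detection would use fuzzy matching rather than exact keys.

```python
def match_key(record):
    """Build a crude match key from a normalized name plus postcode.
    A real DQ engine would standardize and fuzzy-match far more carefully."""
    name = " ".join(record["name"].lower().replace(".", " ").split())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)

# Invented mailing-file records: the first two are the same person.
records = [
    {"name": "J. Jones", "postcode": "AB1 2CD"},
    {"name": "j jones", "postcode": "ab12cd"},
    {"name": "A. Patel", "postcode": "ZZ9 9ZZ"},
]

seen, duplicates = set(), 0
for r in records:
    k = match_key(r)
    if k in seen:
        duplicates += 1  # this parcel would be sent out twice
    else:
        seen.add(k)

print(f"{100 * duplicates / len(records):.1f}% duplication")  # 33.3% duplication
```

Numbers like this, scaled to a million deliveries a year, are precisely the tangible benefit case Divinity describes making to an unregulated business.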
I think coming at it and saying, well, we're seeing X amount of records which aren't complete, the questions are going to be there: what does this mean for me at this point? Being able to talk about the business, the business problem: this means this in terms of our organization; we're going to see this impacting customer churn; this is impacting these particular areas; this is impacting our risk in the organization, bringing us over a level of risk. Having that terminology is really one of the central things I see in terms of the strategy around this: being able to talk that language. This is a key part of the data literacy conversation: being able to understand the business language and how data quality fits into that discussion in the organization. Just finishing up from that as well: I think with wide-reaching regulations such as GDPR, it's woken up the entire industry. IT or finance or marketing, whatever: anybody that holds any customer data, whether it be personal data or sensitive data, everyone sharpened their pencils last year, because now everyone knows that if you hold customer data you've got to look after it properly. We have lots of great questions coming in; keep them coming, and we'll get to as many as we can before the top of the hour here in just a few minutes. I don't want to let this question pass you all by: what tools do you recommend, and have implemented, to address data quality and governance? I'm not sure I should go straight into sales mode for that, but of course we have a number of tools which do everything we've referred to today, from understanding and profiling and analyzing and auditing data, to understand the nuances and the peculiarities and the outlying data and the poor data.
We've got software which of course monitors data quality from a business perspective, to see if it's complying with the regulatory and data governance business rules that would have been set up as part of that framework. We have of course software that works in batch, real-time, and big data spaces, and that can cleanse huge volumes of data, anything from sub-second real-time through to hundreds of millions of records. I've lived with very large, complex data sets, but I'm trying to stay very much away from going straight into sales mode. That's great. No, it's really good. It's very helpful. I'm going to try to squeeze in one last question here. For any data governance initiative, what is the sequence of activities from a data quality perspective? My number one answer is: what's the question? This is really sort of a fundamental piece. If I don't know what the question is, it's very hard to say what's the next thing that I need to know. I want to be able to understand: what's the question? What's the problem? This goes back to what Divinity pointed out in terms of fitness of use, fitness for purpose. What's my purpose here? What am I trying to achieve? What are the questions that I need to ask about the data? When I have an understanding of that question, I can begin to frame: well, here are the particular requirements that I have for the data, here are the particular things I need to be able to understand. Now I'm at a point where I can make steps and decisions about what those next activities are. It may be different in each case. I may simply have some of that answer already, or maybe I need to go back and profile the information, or begin to look at cross-relationships across the data, but I have to understand what I'm trying to accomplish. What's the question? Sorry, just to finish up from a data perspective: that's absolutely right. From a business perspective, of course, you need to know what the right questions are in terms of what the business problem is.
From a data perspective, always start with understanding: always start with profiling and understanding the data, diving into that data. How can you make the right decisions about addressing quality issues, how can you go about remediating that data, or making it relevant to the business, if you don't even understand what's wrong with it to start with? Everything always, always, always starts with understanding that data. Well, that brings us right to the top of the hour. Thank you both so much for this fantastic presentation, and thanks to our attendees for being so engaged in everything we do, and for all the great questions coming in. Just again, a reminder: I will send a follow-up email by end of day Friday for this webinar, with links to the slides and links to the recording. And again, thanks to Syncsort for today's sponsorship, helping to make all of our webinars happen. Divinity and Harold, thank you so much. Thank you very much. Have a great morning, afternoon, evening, wherever you are. Love it. Thanks everybody. Have a great day. Thank you. Bye now.