We're going to move right along to how to govern data like a strategic asset. To talk us through that we have Arthur Bellost, a principal business consultant with Mega International, focusing on enterprise architecture. He supports customers in their EA transformation journey and delivers application projects for customers around the Asia Pacific region. He has extensive experience in the public sector and financial services domains, and holds a TOGAF certification. In this presentation, Arthur will provide an overview of how data governance can help you accelerate towards digital transformation and why it's important. So a warm welcome from The Open Group, please, for Arthur Bellost.

Hi everyone, I'm Arthur from Mega International. Thank you so much for attending this presentation. It's a great pleasure and an honor for me to speak to you today. Before we start, I'd like to apologize for not being able to attend this event live. I'm based in Singapore, so it's the middle of the night right now for us. In the interest of everyone, I recorded the video, and I hope to hold your attention for the next 20 minutes.

In today's presentation, I'm going to talk about data governance. Data has become a strategic asset for organizations, so it is now critical to set up the appropriate governance on data. Based on my experience as a consultant and an expert in the HOPEX information architecture solution, I will show you in this presentation our outcome-driven methodology for data governance.

First of all, what is the market telling us? Forrester has identified two trends in its latest report on data governance. First, the demand for data is growing. Decision makers need to expand their access to data sources beyond their enterprise. We can easily understand why, in a world where the value of data is ever increasing: getting access to more data means getting more opportunities. With that need comes a question.
How can we trust and understand all this data? Data coming from various sources means that we need to know more about data origin, data access and data usage. The data supply chain must be under control. That's what we're going to look at in this presentation.

We have identified three key challenges linked to data management, three big calls for data governance. The first is data integrity. Can the data be trusted? If you don't know where the data comes from, how are you supposed to trust it? Is my data accurate? As you know, garbage in, garbage out. No decision making can be based on dubious data. This challenge of data integrity is particularly relevant now that data sources have expanded beyond the organization, thanks to the use of external APIs. The sources are heterogeneous, which may lead to data conflicts. One out of four companies says it has no single source of truth for its data.

The second challenge is data usability, which is also amplified by the multiplicity of data sources. If data is not structured or documented, it cannot be used by business users. They won't be able to search for data or even understand its meaning. Only 3% of employees are able to get the data to answer their questions in seconds, according to the latest poll. As a consequence, business users may develop different understandings of the data. If teams are not sharing the same terms or data definitions, this will lead to poor cooperation between them. So who owns the data? Who can provide information and documentation on the data?

The last challenge is regulatory. There are more and more regulations on data. In Europe, for example, we can cite GDPR, the General Data Protection Regulation, which requires organizations to control personal data and threatens companies with heavy regulatory fines if they fail to comply: up to 4% of global annual turnover. These regulations require sophisticated monitoring and policing.
Data governance can address these three challenges. Thanks to data governance, we can achieve the following objectives. First, increase confidence in data and improve business decisions. We need trusted data for efficient decision making. As you know, with false premises, you might expect false conclusions. So we need to build these good premises, this good and trusted data. That's why we're going to set up a data governance methodology and solutions to maximize data integrity through ownership, life cycle management, and data indicators.

The second objective is to get visibility into the usage of the data. You want to build the data map and understand the data across the company. This will help us understand the company as well. With the help of solutions like HOPEX, you can build your data repository. Building a shared glossary and making your data visible to all stakeholders in the organization will bring value to your company.

The third objective is quite simple: avoid fines. You need to comply with regulations on data. To achieve that objective, we are going to measure the regulatory compliance of our data.

So let's start with the basics on data. It is often said that data is the new oil of the 21st century, not only because everyone is fighting for access to it, but also because it needs to be refined to be usable. Companies only analyze 12% of the data they have. So you got it: 88% of data goes unanalyzed. Data comes first in its raw form, usually an event recorded in a system. From that raw form, we need to build a more structured form that constitutes information, because the data is now analyzed in relationship with other data. When we add context to that information, we create knowledge. And finally, based on this knowledge, we can take informed and relevant actions.
As you can see from this pyramid, data governance ensures data integrity, which enables better decision making and gives competitive advantage.

So how do you establish a data governance program? First of all, it's a journey, not a one-time effort. We recommend three simple activities to achieve the objectives stated earlier of data integrity, data visibility and data compliance.

First, you need to architect your data. We cannot govern what we don't know, which is why we need to start by identifying and structuring the data used in the company. The data dictionary and business glossary must be populated. To speed up the process, they can also be populated automatically through data discovery methods. As much as possible, we need to understand the three levels of the pyramid. In data architecture, we identify three levels of data: the technical level, what's in your database; the logical level, what's used by your applications or programs and understood by your application owners; and the conceptual level, what is used by your business. We want to model the three layers and connect them together, from technical to logical and from logical to conceptual, linking our technical data to the business data.

The second activity is to trace the data. Data is not static. It is dynamic. It is created, updated, used, archived and sometimes deleted. So we need to know what happens to the data. Data is used by applications and by processes. By linking the data to these applications and business assets, we can also establish criteria to assess the quality of the data. Knowing who is transforming the data will enable us to assess its criticality, for example when a critical application or a critical process is updating it.

The third activity is to govern the data. To effectively govern data, we need to establish leadership and ownership. Who is responsible for the architecture and the tracing of the data? Is someone responsible for checking consistency?
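The three levels described in the architect activity (technical, logical, conceptual) and the links between them can be sketched as a small data model. This is only an illustration under assumed names: the classes, the `CRM_CUST` table and its columns are hypothetical and do not come from the talk or from any HOPEX API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TechnicalColumn:
    """Technical level: what is physically in the database."""
    table: str
    column: str


@dataclass
class LogicalAttribute:
    """Logical level: what an application uses, linked to technical columns."""
    name: str
    columns: List[TechnicalColumn] = field(default_factory=list)


@dataclass
class BusinessConcept:
    """Conceptual level: what the business talks about."""
    term: str
    definition: str
    attributes: List[LogicalAttribute] = field(default_factory=list)


def columns_for_concept(concept: BusinessConcept) -> List[TechnicalColumn]:
    """Walk conceptual -> logical -> technical and collect the columns."""
    return [col for attr in concept.attributes for col in attr.columns]


# Hypothetical example: one business concept traced down to two columns.
name = LogicalAttribute("Customer.name", [TechnicalColumn("CRM_CUST", "FULL_NAME")])
email = LogicalAttribute("Customer.email", [TechnicalColumn("CRM_CUST", "EMAIL_ADDR")])
customer = BusinessConcept("Customer", "A party that buys our services.", [name, email])

for col in columns_for_concept(customer):
    print(f"{customer.term} -> {col.table}.{col.column}")
```

Keeping the links explicit in both directions of the hierarchy is what later makes it possible to answer "which database columns sit behind this business term?" without reading application code.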
So we need to identify data designers, data owners, data scientists or chief data officers. We need to establish these roles. Some of them are responsible for the modelling of the data. Others will capture other information on the data, and some roles will be responsible for validating the data or the transformation steps. We can implement a collaborative workflow to ensure the validity of the data, with a simple maker-checker workflow.

A second aspect of governance concerns regulations. We need to describe the rules that are required by the regulatory bodies. We'll tie these rules to the processes and ensure that they are enforced, through compliance campaigns for example. Remember, the three activities can and must be done concurrently.

Let's start with the first step, architect data. The process is simple here. We need to collect the data and structure the data. First of all, we need to multiply the sources. We can use a bottom-up approach, trying to gather as much technical data as possible, or a top-down approach, capturing the information as understood by the business. Here, it's important to know that we are collecting the metadata, not the actual data with their values. For example, we want to know that a customer is identified by a name, an email address and a phone number. To build the data architecture, we don't need to collect the actual names and addresses of our customers, only the structure of the data, the metadata.

The structure is captured in a data dictionary. Data models can be designed with the help of a solution like HOPEX. This will give better visibility and understanding of the data structures deployed in existing databases and used by your users. From the data dictionary, we can build the data glossary. It's the list of concepts used by the business, with the terms and the definitions, but also the synonyms and the relationships between the concepts.
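The distinction between metadata and actual values, and the step from data dictionary to glossary, can be illustrated with a short sketch. The dictionary entries below reuse the customer example (name, email address, phone number); the column types, the `draft_glossary` function and the glossary layout are assumptions made for illustration, not the way any particular tool does it.

```python
# A data dictionary holds metadata only: structure, no customer values.
# The types below are hypothetical.
data_dictionary = [
    {"entity": "customer", "attribute": "name", "type": "VARCHAR(100)"},
    {"entity": "customer", "attribute": "email_address", "type": "VARCHAR(255)"},
    {"entity": "customer", "attribute": "phone_number", "type": "VARCHAR(30)"},
]


def draft_glossary(dictionary):
    """Derive a draft glossary: one concept per entity, with empty
    definitions and synonyms left for the business to fill in."""
    glossary = {}
    for entry in dictionary:
        term = entry["entity"].replace("_", " ").title()
        concept = glossary.setdefault(
            term, {"definition": "", "synonyms": [], "attributes": []}
        )
        concept["attributes"].append(entry["attribute"].replace("_", " "))
    return glossary


glossary = draft_glossary(data_dictionary)
for term, concept in glossary.items():
    print(term, concept["attributes"])
```

The draft deliberately leaves definitions blank: the structure can be derived mechanically, but the meaning of each term still has to come from the people who own and use the data.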
With a solution like HOPEX, the glossary can be auto-populated from the data dictionary. That glossary must be published to ensure a shared understanding between all the stakeholders. We recommend building a map of your glossary to make that information easier to understand. Remember that one picture is often worth a thousand words.

The next step is to trace the data. Understanding the structure of the data is not enough. We need to understand its journey, its lifecycle in the organization. That's the goal of this activity, trace the data. We're going to use maps to understand the data lineage and the transformations that happen to the data. You need to identify who is responsible for each transformation. What are the systems impacting the data? What are the processes impacting the data?

For this activity, there is a preliminary step. You need to capture and manage the repository of your assets, the systems and processes that are impacting the data. With a solution like HOPEX, you can leverage the existing repository populated in other modules. So you need to build your application inventory and your process inventory, and you can leverage this structure. This will allow you to link your data to your assets and generate impact analysis reports to answer questions such as: What system is reading the data? What system is updating or deleting the data? What data is used by a given system? What data is required for my processes? With reports like the one shown in the slide, you'll be able to answer these questions. Here we're getting the big picture, moving from the static view to the dynamic view.

But there's a third activity. Architecting and tracing data are of course two key activities for understanding the data, but they are not enough. How can we guarantee that the data is under control, that the descriptions are correctly populated and that the data is compliant with regulations? For that we need to set up the governance.
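The impact-analysis questions raised in the tracing step (which system reads, updates or deletes a piece of data, and which data a given system uses) can be sketched as simple queries over a usage inventory. The systems, concepts and function names below are hypothetical examples, and a real repository would hold far richer links; this only shows the shape of the idea.

```python
# Each record links a system to a data concept and the action it performs.
# All names here are made up for illustration.
usages = [
    ("CRM", "Customer", "create"),
    ("CRM", "Customer", "update"),
    ("Billing", "Customer", "read"),
    ("Billing", "Invoice", "create"),
    ("Archive", "Invoice", "delete"),
]


def systems_touching(concept, actions):
    """Which systems perform any of the given actions on a concept?"""
    return sorted({s for s, c, a in usages if c == concept and a in actions})


def data_used_by(system):
    """Which concepts does a given system touch, in any way?"""
    return sorted({c for s, c, _ in usages if s == system})


print("Reads Customer:", systems_touching("Customer", {"read"}))
print("Updates or deletes Customer:", systems_touching("Customer", {"update", "delete"}))
print("Data used by Billing:", data_used_by("Billing"))
```

Once the links between data and assets are captured, each question in the report becomes a filter over the same inventory, which is why building the application and process inventories first pays off.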
This is the last key activity. Remember that the activities can and must be carried out concurrently.

First we define the roles and responsibilities on data. Typically we need to know, for each data item or data domain, who is the data architect, data scientist, or data steward providing the description of the data. But we also need to know who the business users are, because they can provide valuable information on data usage. You can identify your data governance committee, which can guarantee a clear and comprehensive description. They can supervise the documentation of the data. Then you can implement a maker-checker workflow to help ensure the validity of all the data descriptions. Data quality should also be assessed against simple criteria, such as the accuracy, timeliness and consistency of the data.

On the regulatory side, you need to describe the regulatory rules in detail, identifying the set of rules, usually organized by chapters and articles. Each rule is tied to a process, and we need to ensure the compliance of the process against the rule. This can be done through compliance assessment campaigns, and these campaigns can be automated with the help of a solution like HOPEX. Finally, when a non-conformity is identified, controls and audit procedures must be set up in order to remediate the risk of non-compliance.

So we have identified the three steps: architect data, trace data, and govern data. But now, in practice, how do you deploy this methodology? Here are some best practices based on our experience, some tips.

The first one is to establish your data governance program. Remember, it's a journey, not a one-time big-bang effort that's going to solve all your data problems. So you need to plan the activities carefully, step by step. For each step, set realistic, time-bound and measurable goals. For example, improve the data quality of my product domain by 50% by next year.
The second tip is to get a solution like HOPEX to support your data governance activities. With a solution, you'll get data discovery functionalities and data modeling functionalities. You can have a data portal to easily share the information with your stakeholders. And you can also benefit from automated workflows to collaborate better. This will help you greatly.

The third tip is to start with quick wins. Typically, we start by building the business glossary and sharing it across the organization. This will allow you to quickly demonstrate the value to your organization.

The next recommendation is to be transparent. Really. Continuously share the progress of your data governance initiative. You need to build trust, including trust in the data. This is very critical when it comes to data.

The next recommendation is to set the roles. Really. We have three types of roles: the data governance committee, the data owners and the business users. The data governance committee is responsible for the overarching activity and for validating the consistency of your data. Data owners are responsible for the descriptions of each of your data items. And the business users are the ones actually using the data. Note that business users can also be data owners in some cases.

The last of our best practice recommendations is to train and educate users. It is a journey. It will take time. You need to train your users, for example on data modeling techniques to help them improve their data descriptions. Usually you also train them on the data governance process, to raise awareness and help them understand the value of having good data governance.

I would like to end this presentation by sharing with you the results of a data governance project that we implemented a few years ago. The customer was a large company in the transportation sector. So what did we do, and what were the direct benefits?
We followed the same philosophy as the methodology that I just explained to you, focusing mostly on three simple steps. First we architected the data by building the common glossary. We did it in less than six months, consolidating the definitions of 900 concepts with a unique designation for each. Then we implemented an intranet data portal to share this glossary with all the stakeholders, both business and IT. Finally, we traced the data. We mapped the data, the concepts, to the processes, describing how processes affect the data, who is impacting the data and how data drives processes.

So what were the direct benefits from this initiative? We managed to tremendously improve the collaboration between experts. To cite an example, a big transformation management system program that involved stakeholders in more than 100 countries benefited directly from this shared glossary. They were finally able to understand each other without any ambiguity.

It is now time to conclude this presentation. To summarize, we generate a lot of data. Two quintillion bytes of data are generated by companies each day. But we only analyze the tip of that data iceberg, that data galaxy. Analyzing the data cannot be done without the proper governance. At MEGA, we believe in an outcome-driven approach. With our simple methodology of architecting, tracing and governing the data, you'll be able to create and share value in your organization: increasing data integrity, building trust, increasing visibility, sharing the data glossary and all the data models, and avoiding fines.

Thank you so much for your attention during this presentation. We wish you all a great conference and a great day ahead. Bye.

Thank you very much, Arthur. Thank you for taking the time to record that.
And if any of you had challenges with either of the videos, you will certainly be able to see them in the post-meeting documentation and materials that we'll make available. But it's a great summary, and the topic of data and its increasing importance is obviously really, really important.