Hello and welcome, my name is Shannon Kemp and I'm the Chief Digital Manager of DataVersity. We'd like to thank you for joining this DataVersity webinar. Today we have "The Dark Side of Data Governance," sponsored by IDERA. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them by the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions by Twitter using hashtag DataVersity. As always, we send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me introduce to you our speaker for today, Ron Huizenga. Ron has over 30 years of business and IT experience across many different industries, including manufacturing, retail, healthcare, and transportation. His hands-on consulting experience with large-scale data development engagements provides practical real-world insights to enterprise data architecture, business architecture, and governance initiatives. And with that, I will turn the floor over to Ron to get today's webinar started. Hello and welcome.

Thank you, Shannon, and welcome everyone. Hopefully, we're going to have a little bit of fun with today's topic. As you're aware, we're talking about data governance. And the backdrop for this discussion is some very significant things that are happening on May 25th of this particular year. For those of you that are Star Wars fans, yes, there will be another movie hitting the theaters that many people are looking forward to. But it also happens to be the day that the GDPR data regulation is coming into play, which has implications far beyond the borders of the European Union countries. It actually is a regulation that extends worldwide.
So while both of those aspects, the movie and the regulation, have a dark side that we have to look at and be able to address, we thought now would be kind of an interesting point in time to be able to address that. And incidentally, I also remembered that it is also my mother-in-law's birthday, but we'll actually leave that out of the conversation as we continue to go forward in the webinar itself. So again, in the spirit of our friends doing the Star Wars movie and everything else, we'll start this out by just looking at a little bit of the background of why we're actually here today. So in a galaxy not so far away, which happens to be right here, there are several things we're going to be looking at. When we look at it in general, really this is in some respects a dark time for all citizens of the galaxy that we're looking at. The data we generate is growing more quickly than our ability to manage and control it. And yet that's against the backdrop of having an unquenchable thirst for even more data as we go forward in our organizations. The dark forces continue to strike. And what we're talking about here is, of course, breaches and misuse of data. And that's the ongoing threat to privacy and even our well-being in society as a whole as these data breaches continue to go on, and in even larger proportions than they have in the past. The bad side of this, of course, is that the criminals pursue and steal identities, and obviously their intent for doing that is often very malicious. But we have other breaches as well. And those breaches simply occur through errors or through lack of awareness: individuals or even the companies that have this data don't realize what they're doing and that the data is actually accessible outside of their walls or their cloud data stores or those types of things. In response to this, what we're seeing with regulations like GDPR and others is that the lawmakers are really trying to regulate this in an attempt to control it.
So the deterrent, of course, is financial penalties, which penalize the offenders, or at least the companies from which the data comes, but not even necessarily the people that are going after the data and causing the breaches. But it still comes up short in terms of protecting the innocent people whose information is shared out into the rest of the world. So what we need to be able to do, of course, is maximize the knowledge at our disposal. So in this context, we'll call that channeling the force. And what we really want to get to is a data culture. Governance isn't a project or just a program that addresses a specific regulation. It's much more widespread than that. And we really have to look at it in terms of establishing a culture in our organizations of data awareness, a philosophy of prevention. And it needs to be part of the fabric of how we conduct ourselves each and every day in our organizations. So those are some of the things that we're going to talk about. I'm going to talk about this in the context of not just GDPR. I'll talk about some examples with GDPR. But this applies to governance as a whole, whether it's GDPR or other regulations and that type of thing. So what we're going to cover today is we'll just talk about data security generally and some of the privacy regulations out there. There are hundreds and thousands of them, depending on which countries you're in and which ones come into play for your particular organization. What are some of the implications of the regulations as a whole? And what do I perceive as the dark side that we're facing and how do we address it? And then how do we actually channel the force and really be able to react and manage things from a governance perspective correctly? So I'm going to be talking about things like enterprise architecture, the use of models, integrated repositories, and a collaborative environment to really help to address those types of concerns.
I'll walk through some examples, maybe on the governance and some of the metadata side of things, just to give an idea of some of the things I'm talking about. And then we'll have a quick summary and a Q&A after that. So if we look at data security and privacy in general, there are, like I say, many different regulations that come to mind. But there are only a few that I'll list on this particular slide. GDPR is first and foremost on the mind of most people, particularly in the EU. But the recognition is starting to kick in worldwide. Something that anyone in healthcare in the U.S. has been familiar with for many years is of course the Health Insurance Portability and Accountability Act, which is HIPAA for short. And of course, Sarbanes-Oxley is also something that's been prevalent for a number of years that most people are familiar with. What we want to talk about though on these regulations is that every single one of them is geared at protecting basically personal or private information. But there are also other things that go on as well. Like I say, in an effort to respond, some of the regulators introduce monetary penalties. And one of the reasons we're talking about GDPR in particular today is that the penalties associated with noncompliance or data breaches and those types of things can be huge. Depending on the type of breach that occurs or what happens, basically the initial fines can be up to 10 million euros or 2% of your annual worldwide turnover or revenue from the previous year. And in other instances, it can be even higher than that: up to 20 million euros or 4% of the annual worldwide turnover for the previous year. So the fines are very significant.
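To make the two fine tiers concrete, here is a minimal sketch of the arithmetic. It assumes the regulation's "whichever is higher" rule for undertakings, which the talk does not spell out, and it only computes the ceiling of a fine, not what a regulator would actually impose. The tier labels and the function name `max_fine_eur` are made up for illustration.

```python
from decimal import Decimal

# Caps as quoted in the talk: (fixed amount in EUR, share of annual turnover).
# The tier labels "lower" and "upper" are illustrative names only.
FINE_TIERS = {
    "lower": (Decimal("10000000"), Decimal("0.02")),
    "upper": (Decimal("20000000"), Decimal("0.04")),
}


def max_fine_eur(annual_turnover_eur, tier):
    """Return the ceiling of a fine for the given tier.

    Assumption: the ceiling is the higher of the fixed amount and the
    percentage of the previous year's worldwide turnover.
    """
    fixed_cap, turnover_share = FINE_TIERS[tier]
    turnover_cap = Decimal(annual_turnover_eur) * turnover_share
    return max(fixed_cap, turnover_cap)


for turnover in (100_000_000, 2_000_000_000):
    for tier in ("lower", "upper"):
        print(turnover, tier, max_fine_eur(turnover, tier))
```

For a company with 100 million euros of turnover the fixed amounts dominate, while for one with 2 billion euros the percentage does, which is why the exposure grows with the size of the organization.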
But even against the backdrop of the fines, a lot of organizations find themselves not well-prepared at all to be able to adapt to and handle the regulation as it's coming down, because there are a lot of different complex aspects of managing your data that come into play just to be able to address GDPR and, frankly, a number of the other privacy regulations sufficiently. So again, that May 25th due date is looming. It's less than three months away right now. So for organizations that haven't started, you're in big trouble. You're not going to be able to adapt and be ready for GDPR by that deadline. So there's a lot of scrambling that organizations are already doing to be able to address that. I've talked about the huge fines. And GDPR as a regulation applies globally. It's not just for the EU countries. It's any and all organizations worldwide that have any EU citizens' data. Not necessarily just customers. If you're a global organization and you have EU employees or customers or anything in between, the regulation applies to you. And when you think about that, if we even take a step back and talk about North America for a moment, the first reaction might be, well, it doesn't apply to us. But when you really start looking at it, if you're in the hospitality industry, you've probably got tourists and business travelers coming in from all over the world. So you will have EU citizens' data. If you're in healthcare or those types of things, you may be doing clinical trials and everything else that are worldwide. So there again, you may have implications not only from something like GDPR, but from HIPAA and that type of thing, depending on where your focus groups and everything else are. So a lot of these regulations, if you take a step back and look at them, do have implications for your organizations. And like I said, there are many different regulations, and these are just a couple of examples, as well as Sarbanes-Oxley.
Under GDPR, of course, a very interesting statement, and we'll talk about that a little bit, is the law requires privacy by design and default. So what that means is the way that you are designing, architecting, and building your systems and your databases should have the idea of individual privacy at the forefront and a consideration in everything that you're designing, as opposed to an afterthought that we're trying to manage based on a regulation coming in at the same time. And I actually agree in this particular philosophy, because when I look at regulations like this, quite often when I look at governance trying to scramble to meet compliance for a particular regulation, you end up asking yourself the question as like, shouldn't we be doing this anyway even without a regulation? And in most cases, the answer is definitely yes. In terms of GDPR, there's a couple of different categories of the personal data. There's the standard personal data, such as your typical names, addresses, web audit trails, cookies, and those types of things that occur. And then there are things that they classify as specialized personal data. In other words, ways that you can really identify the individual. So that's basically private identifiers like, for instance, if in the U.S. it would have been social security number or whatever the EU equivalent is of the SSN, credit card information, bank account information, healthcare numbers, biometrics, genetic information, racial, ethnic information, and those types of things. So again, a lot of different things come into play and there are a lot of different pieces of information that you have to protect. And in some cases, maybe a person isn't identifiable with just one piece of information, but it's often through a combination of different pieces of information that it then becomes a privacy concern. So again, there are a lot of layers to peel back when we start looking at this. 
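The point about combinations of fields can be illustrated with a small sketch. The records and field names below are invented for illustration and come from no real data set. The function measures what share of rows is singled out by a given set of fields, that is, rows whose combination of values occurs exactly once.

```python
from collections import Counter

# Invented sample records; none of these fields is a direct identifier.
records = [
    {"postal_code": "1012", "birth_year": 1980, "gender": "F"},
    {"postal_code": "1012", "birth_year": 1980, "gender": "M"},
    {"postal_code": "1012", "birth_year": 1975, "gender": "F"},
    {"postal_code": "2011", "birth_year": 1980, "gender": "F"},
    {"postal_code": "2011", "birth_year": 1980, "gender": "F"},
]


def unique_fraction(rows, fields):
    """Share of rows whose values for `fields` occur exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    singled_out = sum(1 for key in keys if counts[key] == 1)
    return singled_out / len(rows)


print(unique_fraction(records, ["postal_code"]))
print(unique_fraction(records, ["postal_code", "birth_year", "gender"]))
```

In this toy data no single postal code identifies anyone, yet three of the five rows become unique once postal code, birth year, and gender are combined. That is the kind of layered exposure a classification exercise has to look for.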
These are things that we really need to look at in terms of trying to build this into our processes. And again, this is a very simplistic process overview that I'm showing here. But we need to think about how we are collecting data and specifically which data we are collecting. So whatever our process flow happens to be, we need to understand where that data first enters our organization, where it's stored, and also what we do with it in our organization. Quite often we have one or more collection points for any given type of information, but it can permeate throughout our organization and end up in a number of different databases and surface in a number of different applications. That's just a fact of life of how we work in our organizations today. With something like GDPR, we need to be able to produce this data and actually be able to identify where this information is stored if it's requested by a private citizen. And that means that we need to have an inventory and accountability of everywhere that it is. We need to be able to show them what it looks like and also allow them to correct that information if it is incorrect. So if you don't know what it is or where it is or how it's stored, the chances of doing that are slim to none. So there's a lot of work that needs to be done to be able to enable that.

Let's look at some of the privacy implications. And I kind of call this the Death Star of governance because a lot of organizations are unable to answer these types of questions. Like I talked about, where is the data? Well, you may know where the initial source is or maybe copies of the data, but if you can't identify where all the instances of that private data are, then you have a problem on your hands to be able to figure out where it is. The other part is what specifically is it? So when I'm looking at a particular piece of information, which privacy laws could affect it? GDPR, HIPAA, SOX, other ones?
So for any piece of information that we're looking at in our organizations, we need to know which privacy laws really do come into play. The only way we have a chance at doing that is being able to inventory all of our data assets, properly catalog and classify them, and then start associating that information back to the different policies and regulatory rules that apply to those particular pieces of data. And we'll talk about this a little later, but that's where we really need the repository of the information to tie all of this together in the metadata in our organization. Another thing: who has access to the information? There are specific access requirements dependent on how private the data is and what the nature of the data is. For example, there could be physical restrictions as well. I remember when working on HIPAA initiatives many years ago, it's not just the access to the information as it's stored in your systems, but there are regulations in terms of even things like privacy screens on computer screens and that type of thing. So if somebody is working with sensitive personal health data, somebody that just happens to be casually walking by or in the background cannot actually see that information on the screen. So there are physical access requirements as well as computer access requirements, if you will. Also, there are specific permissions that are granted for use of the data. If we're utilizing the data in other places or exchanging it with other partners, there are certain types of data masking that need to be done as well. For instance, let's take this scenario: you have a production environment, but you're making enhancements to your systems and you want to be able to test them. With this private data, you can't just take a copy of that data, take it out of your production systems, put it into your QA systems and do testing with it.
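As a rough illustration of preparing a production copy for a QA environment, the sketch below replaces sensitive values with keyed-hash pseudonyms while leaving other fields alone. The key, the set of sensitive field names, and the row layout are all hypothetical. A keyed hash is only one possible masking technique, and whether it is sufficient depends on the data and the regulation in question.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept outside the QA environment.
MASKING_KEY = b"replace-with-a-secret-kept-outside-qa"

# Hypothetical classification of which fields are sensitive.
SENSITIVE_FIELDS = {"name", "email", "ssn"}


def mask_value(field, value):
    """Replace a sensitive value with a stable, non-reversible placeholder."""
    message = f"{field}:{value}".encode("utf-8")
    digest = hmac.new(MASKING_KEY, message, hashlib.sha256).hexdigest()
    return f"{field}_{digest[:12]}"


def mask_row(row):
    """Mask only the sensitive fields of a row; keep everything else as-is."""
    return {
        key: mask_value(key, value) if key in SENSITIVE_FIELDS else value
        for key, value in row.items()
    }


production_row = {
    "customer_id": 42,
    "name": "Ada Example",
    "email": "ada@example.com",
    "order_total": 19.99,
}
print(mask_row(production_row))
```

Because the same input always maps to the same placeholder, joins between masked tables still line up in QA, while testers without access to the key cannot read the original values.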
You actually have to mask it or alter it so that the people that are working with it that don't have the permission can't tell who the people are that you're actually working with. So let's take our friend Yoda and gain some insight into the dark side in general. For those of you that are Star Wars fans, you might remember from the Phantom Menace Yoda said, fear is the path to the dark side. Fear leads to anger, anger leads to hate, and hate leads to suffering. Well, quite often that's the reaction that some of the organizations and the individuals have to these privacy regulations. It's fear of the unknown and not really knowing how to address it, and if it's left unchecked and not managed correctly, it really starts to get worse and worse because companies just do not know how to comply or actually handle the regulations themselves. When we look at something like GDPR, Yoda would say something like, do not pay huge fines, you will. And we've talked about that. The percentage of annual revenue that companies could be paying if they're found in noncompliance are huge, so that's causing a lot of organizations to sit up and pay attention. Again, if you look at something from the Empire Strikes Back, a word of wisdom that can apply to many different things here. Do or do not, there is no try. Again, that applies hugely to data governance. We really need to make sure that we're putting in full data governance programs in our organizations. We can't do this haphazardly and we can't do this lightheartedly. We really need to make sure that we're delving deep and making sure that we're covering it correctly. And then, again, how do we address this? Well, again, from the Empire Strikes Back, Yoda said, and it's actually a subset of a quote, the Empire Strikes Back quote is, a Jedi uses the force for knowledge and defense never for attack. 
I left the last part off because what we really want to do when it comes to things like compliance and privacy and everything else, we want to use the force for knowledge and defense. And of course, in the context of the force, I'm talking about things like enterprise architecture, data architecture, and those types of things. I'll get to those momentarily, but there are also some things that we need to address on our own internal dark side or our organizational dark side. And these are different types of behavior or things that we need to be able to address. Being unaware, like I talked about earlier, companies, especially those in North America, may think that the regulation like GDPR doesn't apply to them. You need to be extremely sure about that. In some cases it doesn't, but there are a lot of cases where you may think it doesn't apply, but it really does. So you really want to make sure for that regulation and other ones that you're very clear on to what the boundaries of the regulation are and the types of business you're conducting and whether it applies to your organization or not. Procrastination. I think human nature is actually geared towards procrastination in the first place. It goes right back to the days when in school and you've got assignments or term papers due, there's the last minute scramble to get it ready and hand in the assignment just before the due date. Well, in instances like this, nobody really seems to be ready, so they'll push the compliance date back might be the train of thought. Well, in reality, there are a lot of organizations that aren't ready for GDPR, but from everything I've seen, they are not going to push that compliance date back. It is going to go into force on May 25th, so organizations need to be as ready as they can be. 
We also need to look at terms of amendments to regulations because with HIPAA and other regulations as well that have come into force over the years, quite often there's an initial regulation and then based on it being in place over time, there are amendments that some of them may relax restrictions a little bit, other ones often are more stringent, so we also need to be aware of all the different amendments that maybe flow through to those regulations as well. And of course, a lack of a full understanding. Quite often people will take kind of the easy approach to go, oh yeah, it's just another compliance regulation. We've got some safeguards in place, so we're good, but unless you actually look at the regulation in detail, look at the different types of things that implies you may not actually be ready. If we look at GDPR as an example, there are a few different things that come into play here. The right to be forgotten. So what that means is if you're tracking information about me, if I was an EU citizen, I can actually ask that I be forgotten and not recorded in your systems anymore. Well, those of us that are data architects and everything else say, well, what does that do to our referential integrity and our databases and everything else? So what you need to be able to do is you need to be able to find a way to tokenize or find placeholders for who the real person was, but still represent the fact that there was a transaction or something that occurred in your organization because you just can't delete that information because then you have an incomplete data set. So you need to find ways that you're going to address things like the right to be forgotten. I talked about this one earlier as well. A person's right to full disclosure and review of information that's being tracked about them. If you don't know where it's being tracked, you're unable to give them that full disclosure. 
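One way to picture the placeholder idea for the right to be forgotten is the following sketch, using an in-memory SQLite database. The table and column names are invented, and the approach shown (overwriting identifying columns with a random token while keeping the row and its key) is an assumption for illustration, not a statement of what any regulation requires or of how a particular product does it.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT,
        email TEXT,
        forgotten INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount REAL NOT NULL
    );
    INSERT INTO customer (id, name, email)
        VALUES (1, 'Ada Example', 'ada@example.com');
    INSERT INTO purchase (customer_id, amount) VALUES (1, 19.99), (1, 5.00);
    """
)


def forget_customer(connection, customer_id):
    """Overwrite identifying columns with a random token; keep the row itself."""
    token = f"forgotten-{uuid.uuid4().hex}"
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "UPDATE customer SET name = ?, email = NULL, forgotten = 1 WHERE id = ?",
            (token, customer_id),
        )
    return token


forget_customer(conn, 1)
print(conn.execute("SELECT name, email, forgotten FROM customer").fetchall())
print(conn.execute("SELECT customer_id, amount FROM purchase").fetchall())
```

The purchases still reference customer 1, so referential integrity and transaction totals are preserved, but the customer row no longer says who that person was.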
And something that comes into play here as well is one of the other things we talked about right at the intro, which was us wanting more and more data all the time. So we need to look at this in terms of the architectures and everything that we're putting into play as well. Quite often we're harvesting a lot of information and we're doing things like dumping it into a data lake. We don't know what it is, but we're putting it in there and we want to analyze it and deal with it later. If there happens to be private information for, for instance, EU citizens in that particular data lake, just the fact that you don't know that it's there isn't an excuse for not governing it properly. So you need to be very sure about what's in there, and be able to produce it, correct it, and honor the right to be forgotten and actually alter it as well. So again, it becomes very complex as you start to peel this back. Many regulatory requirements, but not all of them, typically shouldn't come as a surprise. And like I said earlier, a lot of them represent practices that we really should be following anyway in our organizations even without the regulations. In some instances, you could make the argument that the reason the regulations are coming into play is that organizations aren't doing the types of things that they should have been doing, so that's what makes the regulation come up. I'd like to draw an analogy to something different. It's just like seat belts in cars. The laws to wear seat belts on their own don't make us safer. It's the wearing of the seat belt that makes us safer, and that occurs whether there's a law to back it up or not. So that's kind of the approach we need to take in terms of governance in our organizations: we need to ask what is the way that we should be safeguarding information whether there's a regulation or not.
And as part and parcel of that, rather than being reactive to governance policies and regulations, we really need to have a proactive culture of establishing positive governance if you want to call it that and making sure that we put it in place throughout our organizations. So how do we do that? Well, we need to channel the force, and we need the tools to fight back and be able to deal with the breaches and the privacy and everything else, and the way that we do that is through enterprise architecture. Our perspective of our enterprise architecture is summarized basically by this particular building that we're showing here. You need to have a very solid foundation in data architecture. Data architecture is where the enterprise architecture discipline grew out of in the first place, and then of course the other aspects of business architecture, application architecture, and technical architecture grew out of that as we started dealing beyond just the realm of data into other aspects of our organizations. The data architecture needs to be there as that solid foundation to support those other three pillars, not only to be able to support them themselves, but also of course to manage the metadata associated with all of those other different parts of the enterprise architecture as well. By doing that and having a balanced enterprise architecture approach is where you're really starting to drive enterprise enablement and capabilities and the ability to respond to different market conditions, whether it be from competitors, just moving into marketplaces as a whole, customer requirements, and those types of things. But more importantly for this discussion is you need to have that in place as a solid foundation for governance. You need to know what the data is, where it is, how it's used, how it's accessed, the business processes that act against it, how it makes the journey through the organization, and how it's transformed along the way. 
You can't do that without a solid enterprise architecture to be able to build a governance program on top of it. So again, central to that is modeling. Data modeling, process modeling, component modeling, all those different types of things. But when we talk about the modeling in particular, what are some of the questions that using modeling as part of that enterprise architecture can answer for us? It helps us to understand the organizational data. What's important? Where is it? Because of course it can be in many different places and basically be altered into different formats as it makes its journey through the organization as well. Where did it come from? How is it being used in our different business processes in the organization? What's the chain of custody? In other words, how is that moving through our organization? What's happening to the data on its way through? And what are the business rules that we're implementing that act upon that data as well? It's virtually impossible to do this and understand it without models, because a picture really is worth a thousand words. The models themselves give us a lot of context about how this information and the processes interact in our organizations. When we look at things like governance, we're looking at it in the context of modeling and the management of the metadata that goes along with all those models. How do I do certain things? How do I identify what's private information? Other characteristics come in, like how long should I be keeping the information? How do I classify it from a master data management perspective? And of course a lot of this privacy information that we're talking about is used in transactions. But generally we're keeping master data records about these particular individuals and that type of thing. So just being able to classify the data helps a lot. And of course data quality. Is the data that we're actually using for a particular purpose fit for that purpose? Or was it intended for another purpose?
And as an audit trail, when we're changing information or even if we're changing our data structures and our databases and everything else, what are we changing and what's the reason for changing it? So change management is a huge component of this as well. So there are many different approaches in the industry. But I'm going to talk about two in particular. A lot of organizations may not have mature data modeling or business process modeling practices. They may have some metadata, obtained basically by extracting information from the databases themselves, or some rudimentary modeling capabilities. So what will quite often happen is that metadata will be exported from whatever those systems are and then imported into a metadata repository. Or think of it as a metadata catalog that's really not integrated with modeling and that type of thing. So they can actually find things by name. They can do text searches and lookups and those types of things to try to wrestle with all the different pieces of metadata out there for these different types of things that we're trying to govern and manage. I personally look at that approach as something like a flat earth society, in that it's kind of in denial: you're only recognizing part of the problem. You can find things with text search and lookup and that type of thing. You can try to correlate different pieces of information together, but without that map or those models to actually tell you how all these different pieces fit together, it's extremely difficult to get a handle on this. An analogy that I like to use when I describe this is to imagine that you're trying to find somebody's address, somewhere you've never been, and they live across the country or a continent or that type of thing. What would you rather do? Would you rather use something like a phone book and look up the address and then look at some paper maps to try to figure it out from there?
Or would you rather use something like Google Earth to be able to home in on exactly where it is and then be able to get the directions of how to get from here to there? Those are really the types of things that we're talking about here. One way to do that is with those fully integrated metadata and visual models that we want to pull together. By being able to have these integrated models with the metadata and also building more metadata around them, it gives us that integrated and global perspective. Data models, business process models, that visual data lineage or traceability of the information as it makes its journey through the organization, and all kinds of metadata: not just the metadata about the data structures themselves but the metadata about the business processes, the metadata about the policies, business glossaries, and the reference data that we use in standardizing across our organizations to tie all these pieces of information together. What does that look like? Let's start with the modeling. Again, this is very simplified, but we start by looking at some of the things that we're used to thinking about, such as conceptual, logical, and physical data models, also an adaptation of our models for the dimensional side, such as conceptual, logical, and physical models for data warehouses and BI environments, and our business process models, all linked together so that one also tells a story about the other. So we want to be able to link business process models to the actual data structures that we're updating, so that we now understand what we're doing in our daily business processes and tie it together with our different data models. Then there's the data lineage in terms of mapping where the information is traveling and the transformations that are happening along the way, being able to model and map that visually as well as being able to derive it from things like ETL processes and that type of thing. And a very central component that's essential is the concept of enterprise data dictionaries.
I'm not just talking about the typical things that data architects think about in terms of domains, reference values, and those types of things, which are all extremely important, but I'm also talking about characteristics or metadata extensions that we can define and apply across our organization. Things like business value, master data management classifications that I talked about. And from our compliance perspective, which we're talking about today, being able to define compliance policies and properties and those types of things and associate them right at the design level in the models, which is then consumed and available throughout all the other metadata that we're tracking about these different types of things in our organization. As we start to build around that, there's a lot of other things that we want to be able to do with this integrated metadata model repository to really collaborate as an organization to work with this information together. So I've talked about business glossaries briefly. We need to be able to categorize the information and we need to be able to define it. We need to speak in the vocabulary of our business. But business glossaries extend well beyond the concept of specific business terms and their definitions. And when we think about that, that's the first thing that people think about in terms of business glossaries. It's like a dictionary, but there's a lot more that goes to it. Business glossaries are also places where we can define things like business concepts. What are the different policies? What are the different regulations that have those policies in them? What about reference data sets? How can we tie that in so if we have reference data sets for different types of information, how do we know which of those reference data sets should apply to which entities that we're dealing with across all of our different databases? 
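To show the kind of linkage being described, here is a minimal sketch of a glossary that ties business terms to regulations and to the physical columns that implement them. Every name in it (the classes, the data sources, the terms, and the mapping of terms to regulations) is hypothetical and stands in for whatever a real metadata repository would hold. It is not a description of any particular product's data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    data_source: str
    table: str
    name: str


@dataclass
class GlossaryTerm:
    name: str
    definition: str
    regulations: set = field(default_factory=set)
    columns: list = field(default_factory=list)


# Hypothetical glossary content.
email_crm = Column("crm_db", "contact", "email_addr")
email_dw = Column("warehouse", "dim_customer", "email")
diagnosis = Column("clinic_db", "visit", "diagnosis_code")

glossary = {
    "Email Address": GlossaryTerm(
        "Email Address",
        "Address used to contact a person electronically.",
        {"GDPR"},
        [email_crm, email_dw],
    ),
    "Diagnosis": GlossaryTerm(
        "Diagnosis",
        "Clinical diagnosis recorded for a patient visit.",
        {"GDPR", "HIPAA"},
        [diagnosis],
    ),
}


def locations_of(term_name):
    """Answer 'where is it?' for a business term."""
    return glossary[term_name].columns


def regulations_for(column):
    """Answer 'which regulations could affect it?' for a physical column."""
    found = set()
    for term in glossary.values():
        if column in term.columns:
            found |= term.regulations
    return sorted(found)


print(locations_of("Email Address"))
print(regulations_for(diagnosis))
```

The same structure can be walked in both directions, from a term to every place it is stored and from a column back to the policies that govern it, which is what a plain name-based catalog search cannot do on its own.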
Security, of course, is not only the security parameters that we're designing across all of our systems, but also, in the context of today's discussion, how we secure the information itself. Where are things sensitive? What kinds of alerts and notifications can we set up so that people working with sensitive information can actually see that as they work day to day or look at the metadata associated with it? What are the different data sources that we pull this information in from? In other words, how are we cataloging all of our different databases, or data stores if they're not databases? They could be spreadsheets, they could be NoSQL data stores; all of those tie in. And then, of course, from an enterprise architecture perspective, we want to round out our architecture in general. What are the goals and strategies of our organization, and how do we tie them back to the things we're producing and modeling in our environment, the way we're changing our business processes, and the way we utilize our data going forward? What are the business rules that come into play? What are the different business units within our organization that collaborate to drive this forward? What are the applications that contain this data? In other words, we have these back-end data stores, but which applications are surfacing and utilizing this information? That ties back into our business processes and everything else. And from the governance perspective, who are the stewards that we've assigned as responsible for these different types of information in our organization?
And if there are things that I'm interested in, do I have the ability to follow them so I'm notified, or can see, when changes are being made? Say a regulation changes: what do I need to do to respond to that and effect change in the organization? And of course, from a collaboration perspective, we want to be able to tie this all together through things like discussion threads. So if any questions come up, you can take any of the concepts in the models, or the other associated metadata represented in this context diagram, and start a discussion internally about it, with people contributing and with an audit trail of what that discussion was, to arrive at better decisions. Those are all things that we want to be looking at. So just a quick aside. There's a lot there, but to put it into perspective, here is how we do it in ER/Studio Enterprise Team Edition. I'm going to go through this rather quickly because I've talked about a lot of it in the context of the previous slide, but we're talking about different things that support, and are intertwined with, the models. We can link all of the concepts I talked about together through the metadata repository. Glossaries and terms in particular: how do we classify things? We've got the business glossaries and terms. We can define policies and rules. We can have catalogs of, and links to, our reference data sets, whether they're defined internally or externally to our organization. We can nest these things into a glossary hierarchy so we can decompose it into different areas.
And I'll show you this in a moment with a couple of examples, like catalogs of different regulations and then policies within those regulations, just like we can manage and catalog the data artifacts that we have in our organizations as well. And again, the ability to do end-to-end associations, in other words, associations to the other instances, whether they're terms, policies, and that type of things, to the specific model or data elements and even process elements that they apply to. And that includes things like custom attributes, characteristics, and other things that we may have defined as well. Our data dictionary. I talked about that concept of the centralized data dictionary. What we can do with that centralized data dictionary is it allows us to take those same concepts, define them once in an enterprise data dictionary, and then associate them to the model attributes across all things. The beauty that that gives us is by doing that, we also have, through those bindings or those links between those different concepts, we can also see where everything is used across models and that type of thing. So all those different instances or databases that represent the same concept, we can actually look back the other way. And I can see things like even more general questions like where's all my master data, where's all my reference data, and those types of things. Collaboration. Again, I talked about the ability to have online discussions and streams to follow and see changes, assigning responsibilities for stewardship and who actually is able to manage the information in the repositories, and of course that ties into the permissions that are granted as well. And for privacy and security in general, which is a lot of what we're talking about today, is being able to define the policies, associate them to the data elements that they apply to. We can do that through modeling constructs that we call attachments. We can do it through security properties. 
And if we do that, we can have notifications throughout the suite as well. And of course, we want to be able to tie all that metadata back as we're navigating through that network of information in the repository: whether we're starting at a piece of metadata and chaining back to see a visualization of where it is in one or more data models or process models in context, or starting from a model and drilling down to the underlying metadata to see the associated policies and everything else. What we want is a network that we can navigate to find and manage the information that we're dealing with. So here are a couple of examples. I talked about being able to take a business glossary and represent different types of governance policies. In this one, I've got a catalog of a number of different regulations. It's the top-level glossary in this hierarchy. As you look at the screen, for each one I've got the name and a brief definition of the policies: GDPR, HIPAA, Payment Card Industry, PIPEDA, which is a Canadian regulation, Sarbanes-Oxley, and more as we go down. We could literally have hundreds of different governance policies or regulations to deal with, depending on how large our organization is and its geographic footprint. As I go down further, I can start to look at the different pieces. If I drill down into the GDPR policies, what I've got is a place where I can put information about GDPR in general, and then I can link underneath that all the different policy statements that apply to GDPR.
But there are other things that come into play here as well, which you can see as well as, of course, the ability to assign stewards and that type of thing. So you can actually see who the stewards are for this particular GDPR catalog and if I have the authorization, I also have the ability to go in and manage who I can assign the authority to or who can help work with it as well. And I have navigation capabilities to add new terms or new policies and those types of things to that glossary as well. As I take that down to that next level, what we've got here is a representation of just, again, a very small part of GDPR, but being able to take the different aspects or policy statements from GDPR and each one of those is basically given what the statement is and then a more detailed, qualifying statement of what that full policy statement is. By being able to take this type of information, I can break down these complex regulations into their individual parts and by virtue of doing that, I now have the ability, you're seeing a View Term tab here, but I can relate those to the particular pieces of information in my organization that they apply to. So again, you really have a lot of capabilities and then you have that traceability of which regulations apply to the different types of information in your organization itself. Same type of thing. Like I said, this isn't just about GDPR. It's about any particular regulation. I've done the same type of thing here for the top level where I have HIPAA data policies and as I drill down into them, I can see the specific HIPAA data policies again broken down into the different brief statements and then the full policy statement for each one of these different basic classifications. 
And then, by being able to do that, I now have something granular enough that I can associate these different principles or policies back to the metadata, not only for the data but for the processes and other things they apply to in our organization, provided I have all of this in my metadata repository as it originated from my models. And there are some very interesting things here. If you look at the third one down, de-identified health information, HIPAA has provisions about taking that information and masking or de-identifying it, so that if you're exchanging information with other providers or bodies that don't need to know who the individual is, you're able to do that. So there are very specific things that you need to be able to do as a process in your organization as well. Now we're looking at something from a slightly different perspective. I'm looking at the metadata that came out of a model for patient admission and the type of information that we might be recording. Of course, I'm dealing with this primarily under HIPAA. Right away we have a banner on the screen that says, hey, this patient admission information is extremely sensitive personal information. So you know that you have to be very careful with it. That's the result of an alert from security properties that have been attached to that information right down at the underlying model level, and it's then available throughout the entire system. Again, this is partial information, but you can see some of the attributes that we're tracking for it in this particular model.
But as we look further down, we see things called attachments and security properties. Here we're seeing things like data retention, the master data class that it belongs to, and its business value from a data quality perspective, whether that's high, medium, or low. Whatever you want to define as these classifications, you can put into this metadata repository and then track that information against whatever it applies to within the repository. Flipping that over, I can now look back the other way. I'm dealing with this particular entity. If I go to this other view, I'm still seeing that alert at the top saying that it is sensitive information that I need to worry about. But now I'm starting to see all the different policies that kick into play for this type of information, regardless of which regulation they came from. So I'm talking about de-identified health information, disclosure accounting, that type of thing. Again, I can link all these different pieces together. And of course, as I said, this originated in the models, but we also want people who don't have the modeling tools themselves, who aren't the ones creating the models, to be able to get back to the model it came from as they navigate through this information. So this is a snippet of a very simplified model for this healthcare example. It shows the patient admission information, in the middle left there, and it also gives us the context of the other entities that are tied into it in this particular model. And right there on the diagram surface, we're able to visualize things like some of the compliance mapping and other security properties.
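The two-way navigation described here, from an entity to its alert, classifications, and policies, and from a policy back to the entities it governs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `sensitivity` property, and the sample values are hypothetical, and none of this is the ER/Studio or Team Server data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    regulation: str  # e.g. "HIPAA"
    name: str        # brief policy statement


@dataclass
class Entity:
    name: str
    sensitivity: str = "none"                        # hypothetical security property
    attachments: dict = field(default_factory=dict)  # free-form classifications
    policies: list = field(default_factory=list)     # policies linked to this entity


class Repository:
    """Toy in-memory stand-in for an integrated metadata repository."""

    def __init__(self):
        self.entities = {}

    def add(self, entity):
        self.entities[entity.name] = entity

    def alert_for(self, entity_name):
        """Return a warning string for highly sensitive entities, else None."""
        entity = self.entities[entity_name]
        if entity.sensitivity == "high":
            return f"{entity.name} is extremely sensitive personal information"
        return None

    def entities_under(self, policy_name):
        """Look back the other way: which entities does a policy apply to?"""
        return sorted(
            e.name
            for e in self.entities.values()
            if any(p.name == policy_name for p in e.policies)
        )


# Illustrative data only, loosely following the healthcare example.
deidentified = Policy("HIPAA", "De-identified health information")
disclosure = Policy("HIPAA", "Disclosure accounting")

repo = Repository()
repo.add(Entity(
    "Patient Admission",
    sensitivity="high",
    attachments={"business_value": "high", "master_data_class": "master"},
    policies=[deidentified, disclosure],
))
repo.add(Entity("Country Code", attachments={"master_data_class": "reference"}))

print(repo.alert_for("Patient Admission"))
print(repo.entities_under("Disclosure accounting"))
```

The design point is that the classifications and policy links are stored once, on the entity, so the alert and the "where does this policy apply" view are both derived from the same metadata rather than maintained separately.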
Let's talk reference data. Another very good use of a business glossary and a metadata repository is as a focal point for our overall ecosystem and how we work with it. And I'm not just talking about our modeling tools and the metadata repository itself. We may have different things that we're utilizing in our organizations. We may have some reference data that's in spreadsheets; that's just the reality of life. We may have some organizations that use a master data management hub. We may have some data sets that are defined externally: for instance, zip codes or, in the more international sense, postal codes, which are available for us to consume from outside the organization. So what we want to do is set up a high-level catalog of the different reference data sets and be able to link out to where we can find that information. We are able to do that in this example. I've taken a top level where I have a place to catalog all of the different reference data sets in my organization. From there, I can go further, and now I've got a list of the individual data sets within that catalog. So I've got things like country codes and currency codes. This is a healthcare example, so there are things like access out to the national drug codes, state and province codes, and that type of thing. As I go further, I can click through. This is an example of something that I might have stored in a Google Sheet. It might have been an Excel spreadsheet or a SharePoint site. It might have been an MDM hub. Wherever it is, I at least have that link to get out and consume it at source.
And the nice thing about that is, because I have this internalized in the metadata repository, I can link a specific reference data set to the entities that need to use it. So it's a very powerful capability at your fingertips to get an overall picture of what the data looks like and how we can manage it in our organizations. That was a lot that we covered, so I just want to summarize in a few specific points. What we really need to do as organizations is establish governance to address multiple data privacy regulations of varying complexity and impact. We really need to conquer the dark side by overcoming the fear and the complexity of these regulations. And the way to do that is through integrated enterprise architecture with data modeling, integrated process modeling, data lineage, and metadata collaboration. And in process modeling particularly, it's not just about modeling the existing business processes we have today. We should be modeling and establishing our compliance processes as well, utilizing the tools at our disposal to define and really harness our approach to compliance. When you look at it, you really want to channel your inner data Jedi: be proactive and establish that overarching data culture in your organization, so that it becomes woven into the fabric of the organization rather than an exception that you have to deal with all the time. That culture includes awareness and a preventive attitude. Conduct yourself each and every day with the design principle in mind that security and privacy need to be built in by design from the start rather than as an afterthought.
And the best way to conclude this is with the words of our friend: do or do not, there is no try. You really need to embrace data governance and establish a solid foundation for it in your organizations. And with that, we're at the conclusion of the formal part of the presentation, so I would like to turn it back over to Shannon and then we'll take any questions.
Ron, thank you so much for another great presentation. That was fantastic; I love all the references. I'll just answer the most commonly asked question first. Just a reminder, I will send a follow-up email by end of day Thursday for this webinar with links to the slides, links to the recording, and anything else requested throughout. So diving right in here, Ron: the U.S. doesn't have a federal regulation requiring companies to communicate data breaches to the public. What are GDPR's requirements with regard to breaches?
I don't consider myself to be an absolute expert on GDPR, but if you look in the regulation itself, that nice, brief, little 500-and-some-odd-page regulation, I believe you have to report to the authorities within a 24-hour period if you have a data breach. I'm not 100% sure, but I'm pretty sure that's it. Somebody just flashed up 72 hours, so maybe that's correct. I thought it was 24, but 72 actually does sound a little more reasonable.
Sure, yeah. So how do you model or catalog unstructured data, like Word documents, Excel, videos, et cetera, data that doesn't fit nicely into the database?
What I've tended to do, whether my data was in spreadsheets or that type of thing, is create a data model to represent it, because it doesn't necessarily have to be implementation-specific to at least capture the concept of it.
So I would typically start with the logical data model. Say it was a spreadsheet or something like that: I would model the construct in a data model of some sort, and then through my repository I would link it back to the spreadsheet that it ties into. In fact, in our modeling tool you can link to where that document is stored, so you always have the ability, through an attachment, to go right back to the source. For non-structured data, maybe there are JSON document stores, like a MongoDB store or that type of thing. We support reverse engineering and bringing those types of models right into the environment, so you can deal with them right alongside, and in conjunction with, your relational models for the same concepts. Because you may have customer data in all these different types of data stores, or patient data if you're in healthcare. Part of the power of bringing it all together into the repository is not only the ability to bring it in, but to link those instances of that information together so you know where it exists in the organization.
We've got a wide variety of questions here, Ron. I just love it. There's a lot of thought going on here. So, with regard to consent for data, particularly with regard to a customer base: if a product or service is a renewal of a subscription, is there a need to require an added update aside from auto-renew?
Again, that's getting into a deep interpretation of whichever specific regulation applies. Now, I know that quite often, under a lot of different regulations, even if there's an auto-renew period, you still need to give consent to allow them to continue.
And a lot of us have probably even seen this type of thing in our e-mail where there's an auto-renew, but you'll get a message that says something to the effect of, yes, this is about to renew, but we do require your consent to continue to keep you subscribed to this or that type of thing. So you'll quite often see a principle like that come into play. Basically, it's customers or companies having that audit trail of not only initial consent, but making sure that they have that ongoing consent from individuals to keep that information. So how challenging has it been or is it to provide confirmation that requested data has been fully deleted from all data sources? Thoughts on how EU regulators will approach compliance in this regard? Well, it's really interesting because you could take a perspective that it's virtually impossible. I mean, that particular part of the regulation has far-reaching implications, and you can't possibly say that you've deleted all instances of that data unless you're in the small percentage of the companies worldwide that can account for every piece of data that they have about every individual and know how to access it virtually instantly because we're dealing not only with things like our online and live systems here. Let's talk about other things. Where have you archived it? Where are your backups that have this information and everything else? So really, you need to be able to say, you know, do I know everywhere that this piece of information is in my organization? Now, I would say even with a lot of the things that we talked about today, you're going to have a much better shot at it, but being able to do that 100% all of the time is going to be extremely difficult, but it's something that we need to strive for. 
So it's definitely a target, but it may be an unattainable target. I could be wrong, I've been wrong frequently in my life, but I think as we see organizations grappling more and more with that particular regulation, we might actually see an amendment that eases it a little bit but still keeps it reasonable.
And, you know, we just can't get through a webinar without this question. It's so key, so important, and so hot right now: what about the metadata? How do we handle the metadata through this experience, Ron? Are you adding on software? How should we be handling all that?
Well, the entire premise of what I showed you with the screen samples is that we have our models, but those models aren't standalone. Everything is integrated into a repository. So all of the metadata from the models themselves goes into a repository, and then it's augmented with those other concepts that I showed, all those yellow bubbles, which are other metadata and business glossary content that we build. That's all in one integrated metadata repository, so we can link anything to anything to get access to the information. The metadata is the key to get at all of this.
I love it. So how do the business applications access this data so users can actually use the basic data entity? This question holds true for lots of MDM, metadata, reference data, glossary data, et cetera.
What we do is, of course, when we look at collaboration across an organization, there are a number of different constituents or stakeholders. We have people that are using the modeling tools on a day-to-day basis, like our data architects and data modelers.
If we're talking about process diagrams, we're talking about our business architects and so on, but not everybody is living in those tools on any given day. So the information that they're working with goes into the repository itself, but then there's this whole other layer outside of it that we call Team Server, which is fully browser-based. Individuals that have the authority can be given a profile, and we can designate which areas they're able to contribute to and what they're able to see. So you can break it up by who's allowed to see what within the organization, but then they can fully collaborate on these different aspects. Quite often a lot of them will become observers; others will become stewards responsible for different glossaries, or for identifying policies and keeping them up to date. So it really is divide and conquer, all using that web-based portal, if you will, to get at the information as a whole.
So is there a standardized API or web service or something like that, then? Are there feedback loops, so that new MDM reference data is fed back to the MDM reference data tool?
It depends what you want to do. What I showed today was being able to link out, but I also talked about this being the heart of your ecosystem, able to drive this type of thing out. From the modeling tools, for instance, we have a lot of export and import capabilities for different areas using metadata bridges and that type of thing. And the collaboration platform in general, which is Team Server, is all built on RESTful APIs as well. So we can exchange information with other platforms, and organizations can build their own integrations using those RESTful APIs.
And we get this question a lot as well, Ron. What's your take on this? How can we get buy-in from the business on data governance if regulation does not apply?
Well, I would look at that two ways. First, particularly when we talk about privacy, I would be surprised if there's any organization out there to which the privacy regulations don't apply. Having said that, it's interesting when you read something like the GDPR that some of the bodies, I won't call them organizations, that are exempt from some of its provisions are the governments themselves, which is kind of interesting, but you can understand it at the same time. So what I would do is make sure you don't just look at one or two regulations. Take a look and ask, what are the regulations out there that we could be subject to? You'll probably find a case for regulations that you should be complying with anyway. Second, if I was trying to make an argument to a CEO, maybe I would approach it a slightly different way. I would say: we just found out that Person X just made some purchases with your personal credit card information, and they happen to have your social security number and all that other personally identifiable information. Do you find it important to control that type of information so it doesn't happen again? And then I would make the point that as an organization, or as organizations as a whole, we have a social responsibility to protect the people that we conduct business with, whether they're employees, customers, vendors, or whoever. I would start with that philosophical approach.
And I think we have time for one more question here. So how does a company without strong central data management get the data into the repository?
There are many different ways to do that. If you have databases or data stores out there, the quickest way is to take something like ER/Studio Data Architect, which is very high powered, and reverse engineer your existing data artifacts or databases into the tool itself. Now when you do that, you're probably going to find all manner of sins committed over the years or decades that those databases have been out there, such as cryptic physical naming standards for tables, columns, and those types of things. So the first thing you'll want to do after that is start to decipher it and tie it to a corresponding logical model where you apply more meaning, in other words more business-like terms, to categorize those pieces of information in your business. And we have things called naming standards templates that can do that, which you can build based on the abbreviations and conventions that you're seeing in your organization. People usually think of that the other way around: if I have business names for things and I'm going to generate a physical database, I specify the keywords or abbreviations to use when generating the DDL for the database. We can apply those naming standards in reverse: take the physical names, or the components of those names, and assemble them back together to come up with phrases or names, in English or whatever language you're using, that properly categorize and explain what that piece of information is. And then from there, you can start linking your constructs across all those models together. So if you have 75 different instances of a customer in different places, or 50 instances of an employee across different systems, you can start linking those concepts together.
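Applying naming standards in reverse, and then linking columns that resolve to the same concept, can be sketched roughly as follows. This is a minimal illustration and not the naming standards template feature itself: the abbreviation table, the underscore-separated naming convention, and the sample system and column names are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be built from the
# conventions actually observed in the organization's databases.
ABBREVIATIONS = {
    "cust": "customer",
    "emp": "employee",
    "addr": "address",
    "nbr": "number",
    "dt": "date",
}


def logical_name(physical_name, abbreviations=ABBREVIATIONS):
    """Expand an underscore-separated physical name into a business-like name."""
    parts = [p for p in physical_name.lower().split("_") if p]
    words = [abbreviations.get(part, part) for part in parts]
    return " ".join(word.capitalize() for word in words)


def link_by_concept(columns):
    """Group (system, physical_name) pairs by their derived logical name."""
    linked = defaultdict(list)
    for system, physical_name in columns:
        linked[logical_name(physical_name)].append(system)
    return dict(linked)


# Illustrative columns from three made-up systems.
columns = [
    ("crm", "CUST_NBR"),
    ("billing", "cust_nbr"),
    ("hr", "EMP_HIRE_DT"),
]

for concept, systems in link_by_concept(columns).items():
    print(concept, "->", systems)
```

Tokens that are not in the table pass through unchanged, so an unrecognized abbreviation stays visible in the derived name rather than being silently guessed at.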
So then you can get a handle on where all the information about employees or customers or that type of thing exists across your different systems.
Well, Ron, thank you again for another fantastic presentation. I just love it. And thanks to all of our attendees for being so engaged in everything we do and for all the great questions. Unfortunately, that is all the time we have for today. Again, I will send a follow-up email by end of day Thursday with links to the slides and links to the recording, and you've got Ron's contact information right there, too. We'll get all that to you. Thanks, everybody, for attending, and have a great day. Ron, again, thanks so much.
And thank you. And thank you, everyone, for attending. If you're facing challenges, I'd love to hear from you about how you're trying to handle them in your organization.
You're welcome. Thank you. Have a great day, everyone.