Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of DATAVERSITY. We would like to thank you for joining this DATAVERSITY webinar, Getting Started with Data Governance Using Process Models, sponsored today by IDERA. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DATAVERSITY. As always, we will send a follow-up email within two business days containing links to the slides, the recording of this session, and additional information requested throughout the webinar.
Now let me introduce to you our speaker for today, Kim Brashever. Kim graduated from the University of Texas with a BBA in Management Information Systems. Out of college, she went to work for Ernst & Young as a business consultant and Java developer for one of the largest Java projects in the United States at the time. She then came back to Austin to work as a Java developer for a local startup called Hire.com. Kim has worked in a variety of different technical and customer-facing roles, and she now is a senior product manager for IDERA working on the ER Studio Business Architect product, as well as a few internal products. And with that, I will give the floor to Kim to get today's webinar started. Hello and welcome.
Hi, there. Thank you so much for having me. I'm really excited to talk about this topic. Business process models are one of the things that excite me at the moment. So the topic is getting started with data governance using business process models. I'm going to cover a few highlights about what data governance is, and then we'll go into the process models themselves. So what is data governance?
This is the official definition by the Data Governance Institute. Data governance is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions, with what information, and when, under what circumstances, using what methods. That's quite a mouthful and quite a large quote. So to break it down, I'm going to take it piece by piece. It's a system of decision rights and accountabilities, which means that you should be looking at your data and who gets to make decisions in regards to your data and who's accountable for the decisions that are made. For information-related processes: that's a fancy way of saying your data processes. Executed according to agreed-upon models: so the models that you've got of your data; that one's pretty self-explanatory. Who can take what actions: you should be defining who can take what kinds of actions, who can create, who can read, who can update, who can delete items within your data. With what information: different groups may have different access to different information. And when: there is some very time-sensitive information; maybe it's when you're going to archive off information, or maybe it's the way that the information flows through a process. Under what circumstances, and using what methods: your data governance should also define what methods people should use in order to take these kinds of actions.
So after giving that big giant quote, I'm going to be throwing in a couple of little fun data facts as I go through the presentation in order to keep it a little lively. And for those of you on the East Coast or in the central United States, a little bit of a wake-up after lunch. So over 90% of all of the data in the world was created in the past two years, according to IBM.
And around 100 hours of video are uploaded to YouTube every minute, and would take you 15 years to watch every video uploaded that day by users. So we've got lots and lots of data that's being processed, that's being handled, and that's being collected. And we need to be thinking about what are we going to do with it, how are we going to be responsible with it, you know, and how do we make it useful. So why is data governance important? These are some of the things that you should be thinking about. So the first one is regulatory standards influence data governance. And it's essential for companies who are working in some of these highly regulated industries that's leading kind of the forefront into data governance. Now if you're not in this industry, it's still really important, but obviously there are some people in the forefront that because of regulations need to be able to show this kind of information about their data, how it's being accessed, how it's being used, how people are getting into it. So according to a Forbes article published in March 2016, they came up with five types of data. The first one, a lot of us are focusing on, there's a lot of buzz around big data right now, which is your predictive analytics. So if you have, you know, millions of users, how can I find a pattern of behavior and be able to do something with that information? An additional type of data is fast data. So this is information that can quickly be analyzed. If you have someone who has a rewards card at a grocery store, then as soon as you buy dog food, it can say, oh, hey, this person just bought dog food, let's give them a coupon. Then you've got dark data, which is information that you can't easily access. This could be, for example, in videos, you can't just type in a search term unless the transcript has been transcribed to be able to get at that information. But there's a lot of really good valuable data that is considered dark data. 
There's also lost data, which is information that's collected but never reviewed. So your systems may be collecting information about page clicks or page turns or how long people are staying and viewing a page. And if you never review it, then you don't even know that you have that kind of good information available at your fingertips. And then the other type is new data. And this is information that you could have but you aren't harvesting. You aren't asking for this information. You aren't collecting it. You aren't storing it anywhere. You aren't logging it. And so there's a lot of information in your organization that you might be able to collect now but perhaps haven't defined that you need to do so. So other company data sets to consider when you're considering data governance and how you want to start building out these processes. The first one, marketing analytics and demographics. We talked about that a second ago with big data. The next one, product information. So that could be your page turns. That could be who's using it. That could be a variety of information on product. Any kind of regulated information should be considered as part of your data governance process. Operational data. So if you're trying to understand how your manufacturing processes work or how do you collect information on how many widgets have been produced. Financial data is another big aspect of data sets that you should be reviewing and considering. And then HR data, which is information about the people who are working for you and how are they doing and what kind of performance are they having and how can you make their lives better and can you see habits and patterns of behavior based on the people that you're working with. And then the last one being legal data. So looking and reviewing information that perhaps might be used in a court case or information that is very sensitive, patent information. 
So these are the kinds of data sets that you should be considering when you're looking at implementing your data governance program. Some of the benefits to consider in regards to data governance. You can increase the consistency and confidence in your decision making in regards to your data. You can decrease the risk of any kind of regulatory fines that might come about because you can't accurately prove how your information is being used. You can improve your data security by thinking ahead of time how you're going to implement threat behavior and how you're going to react to information if it gets corrupted or if it gets accessed by malicious people. You can maximize the revenue generation potential of your data. So that goes back to if you haven't accurately defined the information that you're gathering, there may be a lot more information that you could be gathering that could help drive more revenue in your organization. Designating accountability for information quality. A lot of organizations have very low data quality because they don't have anyone that's responsible for it. There's no metrics that are being measured. There's no information that's being gathered in regards to the quality of your data. And so if you don't know what your current data quality is, then you don't know how you can improve it. It also allows for better data planning. So if you have large storage and you can know that every December, because you're a retail organization, you're going to get a massive amount of data flooding through your system, you can better account for those transactions and that information taking place. And then reducing data redundancy. So a lot of places people have the same data in multiple different locations and that information can either become redundant or it can become out of sync. So these are all really good reasons to implement a data governance program. 
And I do see some of the questions that are coming in, and we'll answer those all at the end, because if I break my stride, then I may not be able to cover all the great information today. So I will definitely get to them at the end of the call.
So as far as what poor data governance can result in: certainly it can result in a variety of different kinds of lawsuits. It can result in regulatory fines, which we've definitely talked about already. It can result in security breaches. It can result in data-related risks that can be expensive and damaging to a company's reputation. So if you send out that information or you access it in an unusual way or you do something to harm people because of your data, it can definitely be really expensive and it can be really damaging. Sometimes you can't come back from it. And for legal discovery, if you know what data is available, in what places, and how it's accessed when companies are coming in on legal agreements with you, it allows you to reduce the information that you're handing over, so that you just hand over the relevant information and don't have to go through and weed out all of the information that may not be relevant to the legal system.
So some challenges when data is not understood. Data can be thrown into a data lake waiting to be used one day, and that's a really big challenge because that data lake will just continue to grow and grow. You'll just accumulate it, but it grows out of control, and you can't really manipulate it or access it or even know what's in there if you're just throwing it in and you don't have a program in place to know what to do with it. It can also be thrown out, and then later you discover that you need it. So if you're trying to reduce your data and you say, oh, I don't need all of this information, let's go ahead and get rid of it, it may be that later your company changes direction and decides, oh, wow, I really do want all of my clients' birthdays now.
If you've thrown that out as information that wasn't relevant before, then you've lost all that information. And so you really need to get a good handle around what is your data and why is it important and why are you capturing it in the first place. So if you don't have data governance, increasing the scope and the scale of your data will just breed confusion. It's one thing if you have a minimum number of transactions, if your organization is starting and it's beginning to grow and maybe you can keep your hands on it, but once it starts to grow and get really busy, then you can lose track of all of that information and it can get out of hand really, really quickly. So why is there so much angst involved in implementing a data governance solution? So one of the things is that outside regulations don't provide guidance on how to handle the data. It allows companies to figure it out for themselves and I'm sure that's why many of you are on the call today is because you want to try to figure out some ways to be able to start to be able to get a hold of data governance and implement it within your organization. Most companies compensate by archiving all of their data on central file servers without understanding what they have or they need. This can leave them open to risk and it also, as we were saying in the previous slide, it can allow that data storage to grow completely out of control. So this can become a real challenge for an organization. Companies tend to ignore data points that live outside their firewalls. So if you are starting to implement your data lineage and trying to understand where the data is going, if you haven't started thinking about that, you may have additional data points that are available to you, but they're not being managed within your software system. Most organizations' data quality is siloed and it's poor to begin with and we'll address more about data quality here going forward. 
Data can become fragmented, it can become inconsistent, it can become redundant. So as you gather data and then later you change your system and you start gathering different data, you can have different pieces fall out and you don't know where they go. Or let's say you have a buyer who's buying both on the website and in a retail store, and you may be gathering different information in those different places, which allows the information to become inconsistent. And again, the information may become redundant because you have it in a variety of places. So freeing all of these things up with a good data governance program can really help and take a lot of the heartache out of the process.
So I'm going to cover a couple of the fun little facts again before I transition to the next part of this presentation. Google alone processes on average over 40,000 search queries a second, which makes it almost four billion searches in a single day. And if you think about that mass quantity of information, if you are gathering it correctly, you can do some very interesting things with those Google trends, versus if you aren't doing anything, then you're losing out on some valuable information. The number of bits of information stored in the digital universe is thought to have exceeded the number of stars in the physical universe as of 2007. And this information is just going to keep growing and getting out of hand until we start to get a handle on it and put some good processes in place.
So I'm going to cover now the pillars of data governance, and these are the foundation for a good data governance program. The five different pillars of data governance that sit on top of your data architecture are data quality, data definitions, data lineage, data modeling, and data access. And in the following slides I will go through a variety of different questions that you should be asking in regards to these different pillars.
So we'll start with data quality. Different data quality questions that you should be considering in your organization when implementing this are, how can you improve and maintain the quality of your data? How do you measure the quality of your data? What is the current condition of your data? Is it organized? Is it focused? Is it all over the place? Do you even have the data model behind it to understand, you know, where your systems are? How trustworthy is your data? Do you believe that the data that's being gathered is accurate or do you have bugs in your system that may be giving you inaccurate information? How accurate does your data even need to be in the first place? How well does the data align with your corporate and regulatory policies? And it's really important that before you start to enact the data governance program to really understand what those policies are so that you can align your data better to them. Sorry, I clicked and then it didn't go. How do you identify issues with your data? So if you have problems that are coming up, do they come in through a bug? Do they come in through a support ticket? Do you have people who are responsible for data quality in your organization? How do you fix your data once you've determined that it's broken? And this is really great to think about in advance so that as soon as you realize that something is broken, you can put the processes and steps in place in order to implement it so that it doesn't become more broken and it doesn't bring your systems down. And then the last question for data quality questions is, how do we develop strong data quality parameters that are consistent and repeatable? The next pillar is data definitions. In this case, the kinds of questions you should be considering are, how do you define your data in the first place? How is your data mapped? 
A lot of times we have multiple different systems that are involved and are talking to each other, and they don't always have the same table structure underneath, so you really need to understand how your data is mapped and what information in one system maps to the information in another system. What does your data mean? So again, this has to do with that data lake that you're thinking about. I mean, how do you define that data to be able to make it meaningful? Do you have consistent definitions across your organization? Do you have shared lexicons? Oh, that's my next question. Are you in alignment with your terms and your lexicons? So, you know, does everybody call a customer the same thing? Does everybody call an order the same thing? Does everybody consider a purchase the same? And so it's really important that when you're dealing with your data, everyone's on the same page and they're using the same terms in describing it. And then the last question is, how do you find the right elements to interact with? So if you're looking at all of your different data, how do you know that this piece of data should be interacting with another piece of data?
The next pillar is data lineage. And so the questions that you should be considering in regards to data lineage are: what happens to your data over time? Because data can definitely change based on where someone is in a process or what's going on. If you think of an inventory system, obviously the data is constantly changing based on when people are buying things and new inventory becomes available. So how is your data used? And this is important as far as the systems that are using it as well as the people that are using it. You know, if you think of a marketing team, they want tons of information available at their fingertips, and you want to make sure that they're using it in the appropriate ways. What can the data be used for? Where can the data be used?
So can it be used by, for example, your sales force integration or can it be used by your financial systems? You know, being able to kind of lock that down so that you don't give that information to systems that don't necessarily require it. What does the data produce? What information gets produced whenever that data becomes available? And what does the data consume? So you may be looking at how that data is bringing in information from other places. What rules do the data follow? And what associations are there between different points of data within your system? The next pillar is data modeling. And there certainly are lots of data modeling tools out there. IDERA has one that's called Data Architect. But in this case, I'm really kind of looking at instead of the term data modeling, looking at how do you actually model that data. So what does your data look like? What controls and audits are put in place to ensure compliance? What metadata needs to be captured? Are there places that you can reduce redundancy within your data by looking at your data models and understanding where information is being used and what tables are accessing it? And is your data consistent? The last pillar of the data governance pillars is the data access pillar. The questions that you should be asking yourself in regards to access, that's a little bit of a tongue twister. Who can access your data? How is your data protected? How is your data stored? How is your data managed? And who can influence your data? And this is really important so that you can keep your information locked and inaccessible to outside systems that might want to do it harm while actually being able to give that information to the people who need it and need to make decisions based on it. So that's the end of this section and I will again go with some of my fun quotes, fun facts. So if you burn all of the data created in just one day onto DVDs, you could stack them on top of each other and reach the moon twice. 
That's a lot of information that's being gathered, and that information is just going to continue to increase as we get more Internet of Things devices and places that are accessing that information, and as systems continue to want to gather it. This year, there will be over 1.2 billion smartphones in the world, which are stuffed full of sensors and data collection features, and the growth is predicted to continue. I previously worked at an Internet of Things company, and just the geocache data about where somebody is going and how that information is being tracked over time is excessive, as is trying to figure out how often you should be collecting that information. So there are a lot of things that you should be considering in regards to your data.
So now I'm going to get to the crux of how process models can help you start your data governance process and really define for you some different elements. If you think of business process models, think of the fact that you are speaking to the business users. They like things that are very visual. They like things that are very high level. I know that this audience is very data engineering oriented or data architect oriented, and so you spend a lot of time thinking about the data itself, while the business side is really thinking about how to run the business. So business process models really help you to come into alignment with the organization, so that everybody is on the same page and the data team is making the right decisions based on the information that the business team is providing to them.
So first I'm going to give you a quick little primer on some of the notations that you may see in some of these process models. These process models were built using IDERA's Business Architect product, and so these are some of our extra notations.
For the reference objects, we have stewards, which can be actual data stewards or can just represent a person icon within your organization. We also have reference objects for business units, business rules, and business elements, as well as applications that may interact in your system. We also have external data objects that we refer to, and this information can be pulled in, in some forms, from our Data Architect product: they're your entities, your tables, your views, your data stores, your data feeds, your reports, and your flat files. And then there are the BPMN data objects, which are available via most BPMN business process modeling tools: we have the data object, the input, the output, and the two different collections. You guys are asking some good questions coming in.
So I'm going to show you some process models for who can access which data, who can make decisions, who's accountable for which information, who can act on that data, when they can take these actions, and what info you can use. Now all of my business process models are very, very simplistic and very high level. Within your organization, obviously, they would be far more complicated and they would have a lot more detail. But if I were to show you an actual in-progress model for an organization, we could spend the whole day analyzing all of the different pieces of it. So this is really just to get your mind thinking in that direction as far as what information really can be gathered and how you can start to draw that out.
So for who can access which data, in this case we have our data stewards as well as our business unit, and we're identifying the types of data that can be accessed. I've hung a few tables off the bottom to represent what kinds of tables are interacting with that information, as well as the different people that are involved and why they have access to that information. The IDERA mascot is the rubber ducky.
So you'll see a lot of my models are referring to something duck oriented. Then there's who can make decisions. So in this case you may want to diagram out a decision tree in a decision flow so that everybody from the data side to the business side understands when information is created versus when it's updated or any decisions you really are making in regards to your data. But that's the example that I used in this case. So in this circumstance you will ask this train of information to be able to check the data and correlate it to allow you to decide if you're going to create a new customer or if you're going to update that existing customer. Who is accountable for which information? Oh, and I did throw in this quick slide here on accountability as something that you should be considering in your organization. Who is responsible for how accurate the data you're talking about is? Who is responsible for defining who can use that information? Who is responsible for defining how consistent that information needs to be? Who can tell you how complete information is as well as updating and cleansing the information? Who has the accountability and the access to be able to update it, archive it, throw it in a data warehouse? So you should really be considering this kind of information for when you're drawing out business process models. So in this example I've shown some subject areas which I use from our conceptual models and then identified my stewards that are involved in it so that you can take this information and you can give it to either your data team or you might want to give it to the entire organization and say, hey, if I have a question in regards to marketing analytics and demographics, Katie McDuck is the only person that can give me the information that I need to know about that or at least give me the definitive answer. 
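As a rough illustration of the steward diagram just described, the subject-area-to-steward assignments can be written down as a simple lookup. This is only a sketch, not something produced by the modeling tool: the two names are the ones used in the talk's example slides, the subject areas are the data sets listed earlier in the talk, and the function names `steward_for` and `unassigned_areas` are made up for this example.

```python
from typing import List, Optional

# Subject areas listed earlier in the talk (assumed spelling).
SUBJECT_AREAS = [
    "marketing analytics and demographics",
    "product information",
    "regulated information",
    "operational data",
    "financial data",
    "hr data",
    "legal data",
]

# Only these two assignments appear in the talk's example slides.
# Every other subject area is deliberately left unassigned here.
STEWARDS = {
    "marketing analytics and demographics": "Katie McDuck",
    "product information": "John Mallard",
}


def steward_for(subject_area: str) -> Optional[str]:
    """Return the accountable steward, or None if nobody is assigned yet."""
    return STEWARDS.get(subject_area.strip().lower())


def unassigned_areas() -> List[str]:
    """List the subject areas that still have no accountable steward."""
    return [area for area in SUBJECT_AREAS if steward_for(area) is None]


if __name__ == "__main__":
    print(steward_for("Product Information"))
    print(unassigned_areas())
```

Listing the unassigned areas is the useful part of a sketch like this, because each one is a data set that nobody is currently accountable for.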
So it's really good to be able to let the organization as a whole know these are your touch points and these are the people who you can go to when you have these questions. And for a data architect this is really important as you're drawing out your models or you're implementing new projects so that you understand, okay, if I'm dealing with product information on this project I need to go to John Mallard when I run up against questions in regards to what I'm doing with my data and what my information is. So the next one is who can act on the data? And so in this example I've taken a sales path and a marketing path and I've identified how there are some different inputs into the CRM solution. So marketing may be gathering contact information as people are coming and visiting a website and that gets fed into the CRM solution but the sales team may be the ones who are identifying who is actually a customer because not all contacts and not all leads actually convert to customers. And so you really need to kind of understand that you don't, in this example you wouldn't want marketing updating the customer because they don't really know what defines a customer because the sales team is the one that implements the business rules in regards to that. When can they take these actions? So in this case I've done a real quick example of archiving off-order inventory and so I have showed several different data stores that are involved in this and when are you able to take that information and send it over to the data warehouse because at various different times you may still want to be able to have that information in your base system rather than having to track back and figure out where was my order six months ago. 
And also if you've got customer service people that are on the line that are helping to track an order in this case, they would know, hey, if somebody's coming to me with something that's greater than six months old then I'm not going to be able to access inventory or if it's after 90 days I won't be able to access their order unless I can go and access the data warehouse system. And obviously these are just numbers that are for an example. So the next one is, just taking a little slow, looking at it again. Which information can you use? And so in this case I've taken a customer and I've taken a variety of different tasks in regards to the customer. And certainly you could drop tables in here and start to show that this is interacting with different tables and features. But in this case we've defined that in these task behaviors when I'm dealing with my order data store, you know, in what cases can they read it and what cases can they update it? What cases can they create it? And in this case we don't have a delete defined. But it's really good to know this information especially from the data engineering perspective so that you can understand who your business says that they want to have access and at what different points in time so that you can lock down the data so that it doesn't get manipulated from places where it shouldn't be. So again, another fun fact. Every two days we create as much information as we did from the beginning of time until 2013. So we, and I expect that number to continue to increase and it's possible that since I pulled that statistic from TechCrunch that that information may already be out of date. It may be even larger than that. But this is, we are just collecting and gathering so much information at great depths because storage costs have reduced and we can actually, we have tools that allow us to get at that information. 
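Going back to the order data store example, the create, read, and update rules per task can be captured as a small permission table that the data engineering side can check against. This is a minimal sketch under stated assumptions: the task names (`place_order`, `track_order`, `change_order`) are hypothetical and do not come from the slides. The only details taken from the talk are that create, read, and update are defined and that no delete is defined.

```python
from enum import Flag, auto


class Action(Flag):
    CREATE = auto()
    READ = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical customer tasks against the order data store.
# No task is granted DELETE, matching the example in the talk.
ORDER_STORE_RULES = {
    "place_order": Action.CREATE,
    "track_order": Action.READ,
    "change_order": Action.READ | Action.UPDATE,
}


def is_allowed(task: str, action: Action) -> bool:
    """Return True only if the task was explicitly granted the action."""
    granted = ORDER_STORE_RULES.get(task)
    # Unknown tasks are denied by default.
    return granted is not None and action in granted


if __name__ == "__main__":
    print(is_allowed("track_order", Action.READ))
    print(is_allowed("track_order", Action.UPDATE))
```

Denying anything that is not listed is a design choice in this sketch. It mirrors the goal of locking the data down so it is not changed from places where it shouldn't be.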
So we've got a lot of stuff that's going on, which makes data governance more and more compelling and important as we go forward.
So I'm going to throw in a few more data governance process models (I love data process diagrams) to show you guys a couple more examples of different ways that you can define your data. We're going to go through how data is stored, how it's mapped, how it's archived, how it's backed up, and how it might be protected from mishaps, theft, or attack.
So in regards to how data is stored: in this example I've shown a customer who comes into the website, accesses their account, and places an order, and then it's shipped. And then I've got a lot of data stores that may be involved with those tasks at hand. If you've got this information from your business side, then it allows you to better understand at what points in the system you need to access different data repositories.
And if you look at how data is mapped: now before I get into this, there is a lot more robust data mapping that's available via data architecture tools like our Data Architect. There are entire ETL systems. But I just wanted to look at how data is mapped and represent it in a way that your business users can look at and really quickly summarize how information is mapped between two different systems. So in this case, we're talking about a web system and a retail system. In the web system, it may be defined as an order, and in the retail system it's a purchase. An account may be a customer, a shipment may equal inventory. And this is a really great place to get started when you're talking to your business teams, so that you can get these lexicons and these terms correct and make sure that everybody's talking about the same thing.
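The web-to-retail term mapping from this example is small enough to write down directly. The sketch below uses only the three term pairs mentioned in the talk. The function name `translate` and the system labels `"web"` and `"retail"` are assumptions made for illustration, and real column-level mapping would belong in a dedicated mapping or ETL tool.

```python
# Web-system term -> retail-system term, as given in the talk's example.
WEB_TO_RETAIL = {
    "order": "purchase",
    "account": "customer",
    "shipment": "inventory",
}

# Reverse lookup, built from the same pairs so the two never drift apart.
RETAIL_TO_WEB = {retail: web for web, retail in WEB_TO_RETAIL.items()}


def translate(term: str, source_system: str) -> str:
    """Translate a business term from one system's vocabulary to the other's."""
    tables = {"web": WEB_TO_RETAIL, "retail": RETAIL_TO_WEB}
    if source_system not in tables:
        raise ValueError(f"unknown system: {source_system!r}")
    key = term.strip().lower()
    if key not in tables[source_system]:
        # An unmapped term is a prompt for a conversation with the business team.
        raise KeyError(f"no agreed mapping for {term!r} in the {source_system} system")
    return tables[source_system][key]


if __name__ == "__main__":
    print(translate("Order", "web"))
    print(translate("customer", "retail"))
```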
And you may even want to, on the line that connects and links them, say what the overall term is that we're using in our organization for this, so that everybody's on the same page, so that the same information is gathered, and so that there's not as much redundant data. Then obviously you get into those more in-depth tools that do the more detailed mapping, especially at the column level, to be able to use your APIs and cross that information back and forth. So processes need to be defined concerning how data is archived. In this case I've done an example diagram where I have my data stores that are in production and my data stores that hold archived information. This allows you to do a kind of systems diagram to start the conversation with a business user and say, okay, I've got all these applications that are using this different information, and I have users that are using this information. So at what point do we want to archive the data? At what point do we need to bring the data back out of the archive to retrieve it, and what are our different processes? Obviously from there you can draw out a task flow that shows, okay, in this case we're going to do that. But this is the high-level, 10,000-foot view to get the conversation started, so that people understand where data is even being placed in the first place. The next one is a process model to show how data is backed up. In this case, you may receive a request to back up the data because you are on limited space. You may have to look and see whether the space is available, and if it is, then I may need to get it approved.
I mean, if it's not, I may need to get it approved, and if it is, then maybe I've got a priority, so I may have two different data store levels of priority to help the business side understand when I do need to do a backup. It may be an every-30-days kind of thing, it may be an every-night kind of thing, and you want to start that conversation so that your business side is in alignment with the data side so that nothing happens. It may be that you're doing it every millisecond, just in case, to do duality of your systems. But if the business side is aware of how frequently that information is backed up, then if something happens within the business, they know that they've got access to that information, and they may need to know how long backups are retained. So this is all good information to discuss with them and put in business processes. The last one is how data is protected from mishaps, theft, or attack. In this case, I've gone through a kind of threat assessment process. An instance of a threat may be reported. You gather the details. You confiscate the hardware. You investigate the theft. And then if there was data theft, you want to notify the management team, and they need to know what they're going to do with those breach procedures. Or if there wasn't, you may want to file a report and start monitoring to see if people are starting to get access to that information. Obviously there are complex processes within processes on doing this, but this is the higher-level view so that you can start to have the conversation with the business team, so that they can be thinking in advance about what you are going to do if something happens, and you can react quickly in the moment that it happens.
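The threat assessment flow above has a single decision point, so it can be sketched as one short function. The step wording follows the talk; the function name and the idea of returning the steps as a list are assumptions for illustration, and a real process would have sub-processes behind each step.

```python
def threat_response_steps(data_was_stolen):
    """Return the ordered high-level steps for a reported threat.

    A simplified sketch of the flow described in the talk, not a
    complete incident-response procedure.
    """
    steps = [
        "report the threat",
        "gather the details",
        "confiscate the hardware",
        "investigate the theft",
    ]
    if data_was_stolen:
        # Data theft confirmed: escalate and follow breach procedures.
        steps += ["notify the management team", "start breach procedures"]
    else:
        # No data theft found: document it and keep watching.
        steps += ["file a report", "monitor access to the data"]
    return steps
```

Writing the branch down this plainly makes it easy for the business team to confirm, ahead of time, which path applies and who is responsible for each step.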
And everyone can look at the process and know exactly what they need to do, because the faster you can report it, the faster you see it happening, the faster you can correct it. With a lot of organizations, even a matter of milliseconds could be detrimental to a company. So you need to know, the moment that something starts happening, what you're going to do, and you need to have your business users agree with what that process is. So I'm going to add another quote here. Big data has been used to predict crimes before they happen. It's kind of like, I don't know if any of you saw Demolition Man, with Sylvester Stallone and Sandra Bullock, and how they knew, all the precogs knew the stuff was happening beforehand. But what was that? I think that was, yeah, it was Demolition Man. Anyway, big data has been used to predict crimes before they happen. A predictive policing trial in California was able to identify areas where crime will occur three times more accurately than existing methods of forecasting, because they're able to take all of that data, do something interesting with it, and see that there are precursors that take place before the crime even starts. Oh, it's Minority Report. Thank you, Christopher and Bradley, I appreciate that. I was struggling a little bit. Afternoon sleepiness. So, summing up a little bit of what we've talked about today: data governance is already a necessity for regulated industries. A lot of these organizations are already doing this, and one of the values you can get from that is that you can go and talk to a lot of these organizations about how they're doing it and how to do it better within your own organization. Data governance will become more essential as data continues to grow in organizations. This is not going to go away.
This is going to become a more permanent foothold, so the sooner you can start to get these processes and policies on board and figure out what you want to do within your own organization, the further ahead of the curve you'll get and the less work you'll have to do later. Implementing good data governance processes isn't easy, but business process models can definitely help you get started. They can open the door and open the conversation so that the things the data team is thinking about are in alignment with the things the business team is thinking about. So it's expected that by 2020 the amount of digital information in existence will have grown from 3.2 zettabytes today to 40 zettabytes. And please excuse my typo there. We are amassing data at a crazy speed, and 2020 is just three years away from us now. The NSA is thought to analyze 1.6 percent of all global internet traffic. That's around 30 petabytes, 30 million gigabytes, every single day. So if you think about monitoring that information, gathering it, and making it useful, you really do need to have some good definitions in place to allow you to get a handle on this massive amount of information. So my final slide is my question slide, and at this point I'm more than happy to bring up any of the questions that have come through. I know you guys have been asking some good questions as we've been going along. I'm happy to go back and flip to some previous slides as well so that I can help answer your questions.
Kim, thank you so much for this fantastic presentation. It's just great. I love it. And to answer the most commonly asked questions, I will be sending a follow-up email by end of day Thursday with links to the slides and the recording of this presentation, along with any additional information requested throughout. So, diving right into those great questions.
Do you think governmental institutions encounter the same kind of problems as the private sector?
Yeah, I think government at the moment is more tightly regulated, so they have a lot of that structure already in place. They already have a lot of that reporting in place, which may give them a little bit of a leg up in being able to define their processes. But absolutely, it's just as essential to the corporate sector and the public sector as it is to governmental agencies, if not in some cases more so.
And so, hot on everybody's mind right now is GDPR. Can you please address tightening privacy rules, particularly the GDPR in Europe?
Yeah, I could do an entire other webinar on GDPR and the aspects that people need to be looking at and concerned with, so I'm not going to waste the last 13 minutes going into those details. But certainly, while GDPR is being implemented in Europe, we are an international economy, and most large organizations are working with companies that are in Europe. So they need to be able to respond just as much as the European companies do, to accurately protect people's data and siphon it off so that the people who should have access to that information have it and the rest of the people interacting with the system do not. So data governance definitely spends a lot of time looking at what that accessibility is and who should be able to use it.
Perfect. And can you walk through the data lineage and data modeling slides just one more time, please, and maybe add a little, I don't know, maybe there's a more detailed question from the questioner, or if you want to add some of that in.
Yeah, let me figure out what slide number that is so I can quickly jump to it here. So I'm going to go to the one that includes all of the questions in regards to data management.
So when you're tracking your data through time, you should know who's interacting with it and who has access to it as it comes into your system, or even as it comes into some supporting systems before it gets to your system. You should know who has accessibility to it, who can interact with it, who can change it, and who can create it, as well as what that data is even used for, so that you can make sure that the lineage you are creating is going to the right places. And then, did another question come up in regards to data lineage, or should I jump to the data modeling questions?
Yeah, the one after that is, there was another question: could you please show the slide again quickly. It was just to view the slide again.
So let me go to, you said the data lineage and definition, or the modeling?
For the second questioner it's the definition and the lineage, and for the first questioner it's the lineage and the modeling.
Let's jump back to definition real quick here. So these are the data definition questions that you should be considering. This is not an all-inclusive list. It's just something to get your brain started and get you thinking about, in this case, how you define your data in the first place. Most good data modelers and data architects know the answers to these questions, but those answers may not be in sync with what the business side thinks it knows. So it's really good to get that dialogue going and get the insights to make sure that the information is being defined in the same way across the system. And then the data modeling slide is number 83. So these are some of the questions that I threw out there in regards to what you need to be thinking about from a data modeling perspective.
At least in regards to when you're talking to your business users. Obviously there is an entire dance, and there are entire departments that are devoted to data modeling, and I certainly don't want to make this oversimplified. But as you're sitting there and talking to your business users, you should be considering these kinds of questions.
I love it. And as you mentioned, I mean, there are so many. We could spend hours on just each section here.
Absolutely.
So, are the process models required in BPMN, and if so, are they installed into an automated workflow?
So BPMN stands for Business Process Modeling Notation. Obviously if you're talking about BPMN, you're talking about the business process models. But as far as whether they can be fed into an automation effort: currently, the Business Architect product for ER/Studio does not support BPEL, which is an automated language that lets you talk directly to your automation programs. But certainly you can use business process models to define your workflows and then take those to a team that can implement them and move them forward. The pictures themselves are invaluable to organizations for understanding what's going on with the processes. And we have Fortune 100 customers with massive, massive business process models that really make a notation of anything and everything that happens in the system.
Perfect. So where do you typically see a data modeler in an organization? Are they generally separate from IT and the functional domain?
So I've actually done a video and a couple of webinars talking about why data modelers should care about business process models. And I think in a lot of organizations they are siloed. In a lot of organizations they are thought of last in a project. People do project planning and they lay out all the different things that they want to implement.
And then at the end of the day they have to go to the data team, and the data team implements that information so that the systems can access it. And I think that's really the wrong way to do it. Your data experts are invaluable to your process, and they should be included at the beginning of the conversation. They should be talking to the business and understanding these things, because it could change the way a solution is implemented, based on the data that is available and what needs to be available. So a lot of organizations do have data modelers as a separate entity, and certainly they are a separate entity, but they have them isolated away in a separate department. I think they absolutely should be involved in any conversation, and business process models could help you define the moment where you start to get a data modeler involved, so that you can ask the right kinds of questions as to when to pull them into your conversation.
Sure, absolutely. So could you please provide more info about data quality? How can we validate the quality of the data, and which tool do we use to measure the quality of the data? Basically, how do we answer the questions you have asked in your slide?
I'll go back to that slide real quick here. So these are the different questions, and I think there certainly are tools out there that can help you with data quality. It depends on the size of your organization and how much you'd like to spend on that kind of information, but in smaller organizations it certainly can be sitting around and talking to your organization about this and defining it. It's just like when you're going to talk about the availability of your systems. So you say, I want it to be 99.999% available, where it's only down two hours total in the course of a year.
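The availability target mentioned above can be turned into a concrete downtime budget with a little arithmetic. The sketch below assumes the figure is meant as "five nines" (99.999%) and uses a 365-day year; the function names are made up for illustration. By this arithmetic, five nines allows a little over five minutes of downtime a year, while two hours a year corresponds to roughly 99.98% availability.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Convert an availability target into allowed downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def availability_for_downtime(minutes_down_per_year):
    """Convert yearly downtime in minutes into an availability percentage."""
    return 100 * (1 - minutes_down_per_year / MINUTES_PER_YEAR)


five_nines_budget = downtime_minutes_per_year(99.999)  # about 5.26 minutes
two_hours_target = availability_for_downtime(120)      # about 99.977 percent
```

The same habit carries over to data quality: pick a metric, state the target as a number, and work out what that number actually permits before agreeing to it.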
You should also be defining your data and understanding: what is my data quality, how do I watch for it, how do I know if it starts to become separate, how can I monitor that, and how do I set those metrics in place? We certainly have data monitoring tools here at IDERA, but you really need to look and see where your organization is lacking in regards to data quality and what kinds of tools you need to implement in order to bring that up to the level that you'd like.
Perfect. And could you recommend some references to dive deeper into these concepts?
I mean, you can do searches on data governance and there are a million of them out there. There is the Data Governance Institute, which does talk about data governance at a very medium level, but certainly data governance is a hot topic right now and lots of people are talking about it. So it's really not difficult to bend your ear and start hearing about these things, and obviously I'm going to be a lot more vocal about this going forward. We have white papers within our organization, but I think the first thing is to get a handle on and an understanding of what data governance means to you at a high level, and then you can start to do your research to find out where people can help you answer the questions you need answered in your organization.
Well, Kim, thank you so much for this fantastic presentation. It's really been great. I really appreciate it, but that does bring us close to the top of the hour here. Just a reminder, I will be sending out a follow-up email by end of day Thursday with links to the slides and the recording of the presentation, and we'll get you some additional information. Again, thanks to all of our attendees for being so engaged in everything we do. We just love all the great questions that have come in today and your participation in the webinar. Kim, thanks so much, and thanks to everybody. Hope everyone has a great day.
Thank you so much. Thank you.