All right. Hello everyone, and welcome to our next EDW session, "DAMA-DMBOK: The Who and the What," by Raymond McGirt, who is a computer scientist at Data Expeditions. All audience members are muted during these sessions, so please submit your questions in the Q&A window on the right of the screen, and our speaker will respond to as many questions as possible at the end of the talk. So let's begin our presentation now. Thank you, and welcome, Raymond.

Good day, everybody. Welcome to this session. I hope the conference is going well for you so far. What I am presenting is essentially a critique of the DAMA-DMBOK, looking at the context diagrams. If you're not sure what they are, I will explain what those are and the sections that we're looking at in particular.

This is a little bit about me. The main things I want to highlight here: I've been working in data since 1992, and I took some data classes in college as well. I've done various things with data through the years, as highlighted there. I am a Certified Data Management Professional from DAMA, and I also hold a data modeling certificate from the Data Modeling Institute. I've been a DAMA member since 2012 with the Indiana chapter, which is based in Indianapolis.

Here are my primary sources for most of the information that I'm using. Of course there is the DAMA-DMBOK; I relied heavily on it, specifically the context diagrams. There is also the DAMA Dictionary of Data Management.
I use that for a lot of definitions, and I do use a lot of definitions throughout this presentation, because I think it's important that we all be on the same page about what we're talking about: what a data architect is, what a business requirement is, things like that. I want you to understand at least the viewpoint that I'm presenting as a definition, so that you can either agree or disagree based on what your own interactions with these people are. I realize different businesses may have a data architect who does a slightly different thing than what the definition says, and that's okay. Sometimes the data architect is completely different from what the definition says, and that's okay too. I at least want to present what I view as the defined role for a data architect. And these are the clip art sources I used. Most of them said you can use the clip art as long as you give credit where credit is due, so here are all of those.

We're going to look at the context diagrams, explain what they are, and explain the different sections that are used for this presentation. We're going to look at the inputs and deliverables for the what: the items that are used to perform the activities and what comes out of those activities. We're going to look at the who, which are the suppliers, the people who provide information for the activities; the participants, the people who actually execute the activities; and the consumers, the people who use the things that come out of the activity. And I have some final thoughts at the end.

Now we're going to look at the DMBOK and specifically the context diagram. As I state there, each knowledge area has a context diagram. This is a sample one from the data governance and stewardship knowledge area, and this is the basic format. Each diagram will have definitions and goals, technical drivers, activities, and business drivers, and then the others I will explain here shortly. In particular, what we're going to look at for the what
are the inputs, which are highlighted here, and the deliverables, which are highlighted here. These are, I think, fairly straightforward. The inputs are the documentation and data items that are used by the activity to perform the activity. The deliverables are the things that come out of that, that are produced by the activity and hopefully used later on.

The context diagram also has suppliers. These are the people who perform tasks ahead of the activity to prepare whatever is needed for the activity. The participants are the ones who are actually executing the activity for the knowledge area; they basically have the hands-on work for this particular knowledge area. The consumers are the ones that are using the information that comes out of the activity in some fashion: to make decisions, to use in later knowledge areas or other knowledge areas, or to perform tasks that they need to do as part of what is determined by the activities of this knowledge area.
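For readers who want to work with these sections programmatically, here is a minimal sketch of one context diagram as a data structure. It is only an illustration: the class name `ContextDiagram`, its field names, and the helper methods are hypothetical, and the sample entries are a handful of items the talk associates with data governance and stewardship, not the full diagram from the DMBOK.

```python
from dataclasses import dataclass, field


@dataclass
class ContextDiagram:
    """Hypothetical model of the five sections discussed in the talk."""
    knowledge_area: str
    inputs: set[str] = field(default_factory=set)        # the "what" going in
    deliverables: set[str] = field(default_factory=set)  # the "what" coming out
    suppliers: set[str] = field(default_factory=set)     # prepare what is needed
    participants: set[str] = field(default_factory=set)  # execute the activities
    consumers: set[str] = field(default_factory=set)     # use what comes out

    def what(self) -> set[str]:
        """All documentation items, inputs and deliverables combined."""
        return self.inputs | self.deliverables

    def who(self) -> set[str]:
        """All roles, regardless of how they are involved."""
        return self.suppliers | self.participants | self.consumers


# Partial, illustrative entries only (assumption: not the complete diagram).
governance = ContextDiagram(
    knowledge_area="Data Governance and Stewardship",
    inputs={"Regulatory requirements", "Data strategies"},
    deliverables={"Data strategies"},
    consumers={"Project manager"},
)

if __name__ == "__main__":
    print(sorted(governance.what()))
    print(sorted(governance.who()))
```

Sets are used here so that an item appearing as both an input and a deliverable, as data strategies does for this knowledge area, is counted once when the two are combined.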
My hypothesis for this presentation is that the whats and the whos are used in several DMBOK knowledge areas. What I was hoping to find was that the inputs and deliverables are used in several knowledge areas, not just used once and then done, and also that the whos are spread out between the knowledge areas: that they have their hands in several knowledge areas and are not focused on just one knowledge area all the time.

So we're going to look at the whats, the inputs and deliverables. Here's a tree map that I've generated that has each of the knowledge areas with the documentation that is input to and delivered by that knowledge area. If you look at data governance and stewardship, about two-thirds of the way over there's a thicker line than the rest of them. The ones on the left are the inputs; the ones on the right are the deliverables. This is similar for all the other knowledge areas.

Looking at it another way, this is all the documentation, both inputs and deliverables, that is used throughout all the knowledge areas, and within each documentation item are the knowledge areas that it is used in. You'll notice that there are a lot of documents, inputs and deliverables. Some of them appear not to be used very often, but we shall see.

The first set of documentation is regulatory requirements. These are the things that a government, a licensing agency, or an oversight agency will place on a company or organization in order to manage and regulate, as the term alludes to, the products and the practices of the organization. This is one of those definitions that was not in the dictionary, so I note the website that I got it from. This will happen a few other times as well. I'll give you a moment to read
through that. Okay, moving on. Regulatory requirements are inputs for these three knowledge areas: data governance and stewardship, data security, and data integration and interoperability. I feel these are noted here because the regulatory requirements play a key role in how things are set up, how things are managed, and how things are kept secure. The GDPR, I believe it is, and the California privacy law play a lot into security and governance, as well as into how data is used in the integration and interoperability area.

IT strategies. These are action plans for achieving a goal, and they're long term, as the definition there alludes to. These are basically plans that are set in motion to determine how the IT structure of the corporation or organization is going to be managed. These are inputs in data governance and stewardship again, document and content management, and data warehousing and business intelligence. IT strategies are key in these areas because they determine how infrastructure is set up, how data is managed throughout the life cycle, and how it is used, especially in business intelligence. The idea is that you have to have the infrastructure and the data in the right place, with the right settings, to really get meaningful data out for business intelligence to work with.

Data strategies. These are similar to IT strategies, except they are more focused on the data itself. It's more along the lines of, like I said, how the data is managed instead of just the IT. Data strategies are again an input for data governance and stewardship and also data architecture, and they are also a deliverable for data governance and stewardship. As inputs, data strategies play a role in these knowledge areas because they determine how things are set up to allow access to and usage of the data that's being generated and placed in the data storage
areas. It's a deliverable out of data governance because it sets in motion the principles that will be used further down the road to determine how things are set up.

Data standards. This definition comes from the EPA, and it's all about how the data is going to be managed across the organization. There has to be a higher-level, overarching set of standards, but each business unit can set its own standards as well, I believe, for its own domain of data, as long as they're consistent with the overarching data standards. I'll give you a moment to read that.

Data standards are an input for data modeling and design, data integration and interoperability, and data quality. Data standards go a long way in these knowledge areas because they help define how the data is going to be arranged, and by defining how it's going to be arranged, they play a role in how it's going to be used as well and who has access to it. Basically, what the definition of the data is will be key in these knowledge areas.

Business requirements. These are overarching requirements, as it says, for the performing organization, corporation, whatever. They become the objectives for fulfilling a project. What do you hope to gain by initiating this project? As the project goes on, what is the final goal, and what are the parameters that exist between the start and the finish?
Business requirements are inputs for data warehousing and business intelligence, metadata management, and data quality. To be honest, I'm kind of surprised that they're not inputs for a couple of other knowledge areas as well, in particular data governance and stewardship. Business requirements become important here, especially with data warehousing and business intelligence, because they help you focus on the end goal and the parameters in between. As you're mining the data, using the data, analyzing the data, you get a sense of what it is you're looking for, and you can determine: are you finding what you're looking for, or is there a problem? Is there something else that needs to be done to arrive at the solution?

Business requirements become a data quality concern as well, because it's the business requirements that determine what level of quality is expected. Again, they give you a sense of what is expected and allow the data quality personnel to understand what is good, bad, or indifferent in the data that you're seeing and in the quality of that data.

Business goals. A simple, straightforward definition: what a company expects or hopes to accomplish over a specific period. Business goals become, as it says, the expectations of the company or organization, and the period could be six months, a year, five years, ten years, however long the company decides that period will be. I also think the "s" is important here, business goals, in that there is usually more than one goal that a company is striving for, and there may be competing goals. There needs to be an organizational, definitive way for the business to determine how those competing goals will be managed.

Business goals are an input for data governance and stewardship, data security, and data integration and interoperability. I feel that the business goals become the key
drivers in these areas because they set the agenda for what the deliverables of these knowledge areas will be. The business goals also set parameters, much like the business requirements, and will drive how the activities within the knowledge areas are carried out.

Business strategies. Business strategies, much like IT strategies and data strategies, are plans and principles that are set forward. Again, the strategies set parameters, and the activity functions within those parameters. Business strategies are an input for data governance and stewardship, data security, data integration and interoperability, and document and content management. Again, they become important here, much like the other inputs, in that they help define, well, not really the activities themselves, but how those activities will be performed. They set parameters, they set target goals, and they set expectations for what the end product will be for each of these knowledge areas.

And now we're going to look at the who: the suppliers, the participants, and the consumers. Here's a tree map with the knowledge areas colored, with the suppliers, participants, and consumers behind each one. You'll notice in the top left that data governance and stewardship has a lot of people involved in that knowledge area. The way it works is that top left is usually the most active, and then it goes down and then to the right, to where data storage and operations at the bottom right has fewer people but is still an important knowledge area.

Looking at it a different way, these are the different personnel, the suppliers, participants, and consumers, with the knowledge areas that they are used in, whether as a supplier, participant, or consumer. Again, top left is the one that has the most knowledge areas; bottom right
has the fewest. I'm going to look at some of these specifically. We don't have time to go through all of them, so I'll highlight the more common ones.

We have the project manager, who has the overall responsibility for leading a project. As the definition says, which you can read, this involves several aspects within the project that need to be done. The project manager is a participant in data integration and interoperability, data warehousing and business intelligence, and metadata management. The project manager plays an active role in these knowledge areas, largely because he or she is looking at the information that goes in and comes out, and at how it goes from an input to an output within the knowledge area. The project manager is a consumer of the data governance and stewardship and data architecture knowledge areas. That is, the project manager will take what these people decide and incorporate it into the plans of the project.

A database administrator. A very simple, straightforward definition: an IT professional role responsible for database administration. Database administration is the overarching management of the database, everything from setting it up initially, to fine-tuning and tweaking it to optimize it, to resolving errors, faults, problems, and crashes. The database administrator is a supplier to data modeling and design, largely because the database administrator will have certain tools available, and will have to keep in touch with the modelers and the designers to make sure that the limitations of those tools are not exceeded.

The database administrator is a participant in data storage and operations and in data quality. Data storage and operations, especially the operations part, involves the database administrator because they're going to actually make sure the database is functioning and operating as it should be. And they're involved in data quality because they're
constantly looking at the data with the data quality personnel to ensure that it doesn't fall out of sync and out of the expected parameters of the data quality knowledge area. The database administrator is a consumer of data architecture and data modeling and design. We'll talk about data architects and data modelers later, but essentially the architect will design the infrastructure that the database administrator is expected to implement, and the data modeler and designer will develop the actual design schema, through a physical model, that the database administrator is also expected to implement as the database itself.

Subject matter expert. Again a simple definition here: a person with specific experience or knowledge of a given topic or function. This is a wide-ranging skill set, in that the knowledge of a given topic or function could be anything from the data itself, to how the data is used in the real world, to how the data is received and how it's processed. It's just a wide variety of expertise across a wide variety of topics or functions.

The subject matter expert is a supplier in these knowledge areas. I won't read them all to you; I'll highlight a few. In data architecture, the subject matter expert will have specific information about what types of architecture structures are available in general and what is best used for the organization.
In data modeling and design, the subject matter expert, as I mentioned earlier, may have specific knowledge about how the data is used by the organization and will interact with the modeler and designer to incorporate those uses into the model. In data warehousing and business intelligence, again, the subject matter expert will have specific knowledge of what the data means, and will impart that to the business intelligence and warehousing folks to make sure that it is stored in a reasonable manner, and also that when it comes out of storage it is used in a way that is consistent with how the business is using the data. It basically prevents tunnel vision, and my favorite phrase for tunnel vision is working in a vacuum.

Data modeler. There are a couple of sub-definitions here, because the dictionary uses one definition within another. But basically, the data modeler takes the business requirements and, through a series of iterations, develops a schema that the database administrator can implement as a storage method for the desired data. I'll give you a moment to read through that.

The data modeler is a supplier, a participant, and a consumer. The obvious one is participant in the data modeling and design knowledge area: the data modeler is going to be developing the data models and the design of the database itself. The data modeler is also a consumer of data modeling and design, and this is primarily because sometimes they inherit a data model that needs to be reworked or redone in order to accommodate new requirements. They're also involved in data storage and operations because they're working with the database administrator. Sometimes there's only so much a database administrator can do to make the database efficient, and you need to go back to the modeler to rework the model in order to get those efficiencies.

Data architects. Data architects are, in my mind, responsible for the infrastructure of the data storage mechanisms.
They're a supplier, participant, and consumer in these knowledge areas. In data modeling and design, the data modeler has to consider what the data architect has designed in order to generate an efficient data model. In data storage and operations, they're working with the database administrator to help set up the infrastructure, make sure it's running as efficiently as possible, and possibly make changes to the architecture to accommodate unexpected findings.

A data steward. Data stewards are business leaders or subject matter experts in certain areas, and there are six different areas listed there from the dictionary. In my mind, they're primarily the people who are most knowledgeable about the data and how it's used. They are the ones who are most interested in and care most about a specific aspect of the data, and they're the ones that spend the most time managing it in some fashion.

Data stewards are suppliers, participants, and consumers in these knowledge areas. I'll highlight a few. In data governance and stewardship, they play a key role in providing understanding to the data governance council on how the data is used, how it's stored, and how it's managed throughout the system. They are key to understanding the usage of data and are very valuable to a governance council in getting a clear picture of how the data is used and managed. In data quality, they work with data quality personnel. A data steward can usually understand when data is bad and when it's good, and can set the criteria for that determination, working with the data quality personnel to ensure that the data remains at peak performance. They're also a consumer of the data modeling and design knowledge area, as they are looking at how their data is stored and arranged, making sure that it fits with the actual usage of the data.

Conclusions. The whats, the inputs and the outputs, are not
widely used across knowledge areas. You'll notice, if you remember, that a lot of the inputs were used as inputs in one place, but the deliverables were not used in another place.

The whos, however, do work in several knowledge areas, and I think this is important to prevent working in a vacuum. Being spread across several knowledge areas allows for diversity, so that they don't get too focused on one thing. They can look at the knowledge area activities from a holistic approach and try to determine what's best for the overall database management system, and not just what's best for that knowledge area.

Final thoughts. I feel that the whos, the suppliers, participants, and consumers, are widely represented across the knowledge areas, and I think this is a good thing. Like I said, it's beneficial to have a vast array of personnel looking at the various knowledge area activities, so that you get a diverse solution rather than a tunnel-focused solution. I am concerned with the deliverables and the inputs, because they seem not to be widely used. They seem to be used in a few knowledge areas, three at most, and then they're not used again. I think this is a problem, because it allows some information that has been generated in one knowledge area to be lost in other knowledge areas, because there's no supporting documentation or that documentation is not considered by that knowledge area.

Here are some additional final thoughts. There's a lot of information flying around.
There are 167 different documents that I catalogued in the context diagrams, and 101 different roles or groups of people, not individual people, involved in different tasks. It takes a lot of coordination to manage those people and those documents. I feel there needs to be understanding of more than just one single knowledge area; I think two or three is a good number. You may focus on one knowledge area, but you also need to have an understanding of some of the knowledge areas that are working around you. I am by trade a data modeler. I've also gotten into data governance and data quality, because I think those are important to my data modeling skills. To be able to produce a quality data model, I need to work with those people, in order to produce data models that can be used across the board and not just within one focus area.

Here's my contact information. I will admit I do have a Twitter account; I do not use it very often. I will respond to anything sent to me directly, and I will most likely respond to email. LinkedIn is a good way as well; if you want to connect with me on a regular basis, LinkedIn might be a good way.

And with that, that seems to be my presentation. I'm going to drop out of my presentation and look for questions and comments. Let's see. I see mostly comments. I see one comment that said business requirements should be an input to reference and master data as well. I agree with that. One comment: "I think business requirements should be an input to all knowledge areas. Everybody needs to know something about why they are doing the data work." I agree with that.

I see an actual question: "Do you believe that this type of mapping should be included within the next DMBOK?" I think he's referring to the tree maps. I think they should be considered, but whether they should actually be included in the DMBOK, I'm not sure.
I don't think they bring a lot of extra value. Maybe some mapping about how one document from a key knowledge area leads to another; I can see that being helpful, but not the tree map itself.

And that's all the questions I see. There are several other comments as well; I'll read through those as I have time. I think we finished a bit early, so if anybody has any other questions, go ahead and post them. Otherwise, I think I'm done.

All right. Thank you so much, Raymond, for this great presentation, and thanks to our attendees for tuning in. Please complete your conference session survey on the page for this session. The next session will start in about 15 minutes. Thanks, everybody, and let's give Raymond some claps. Have a great day.

Good. Thank you all. Have a good day.
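The kind of tally behind the talk's comparison of the whats and the whos can be sketched in a few lines. The example below is a rough illustration, not the speaker's actual analysis: the dictionary names and functions are hypothetical, and the data covers only the inputs and the two roles whose knowledge areas are spelled out in the talk, not the 167 documents and 101 roles the speaker catalogued.

```python
# Knowledge area names as used in the talk.
DG = "Data Governance and Stewardship"
DA = "Data Architecture"
DMD = "Data Modeling and Design"
DSO = "Data Storage and Operations"
DS = "Data Security"
DII = "Data Integration and Interoperability"
DCM = "Document and Content Management"
DWBI = "Data Warehousing and Business Intelligence"
MD = "Metadata Management"
DQ = "Data Quality"

# Inputs and the knowledge areas the talk lists them under (partial data).
INPUT_AREAS = {
    "Regulatory requirements": [DG, DS, DII],
    "IT strategies": [DG, DCM, DWBI],
    "Data strategies": [DG, DA],
    "Data standards": [DMD, DII, DQ],
    "Business requirements": [DWBI, MD, DQ],
    "Business goals": [DG, DS, DII],
    "Business strategies": [DG, DS, DII, DCM],
}

# Two roles whose supplier/participant/consumer areas the talk enumerates.
ROLE_AREAS = {
    "Project manager": {
        "participant": [DII, DWBI, MD],
        "consumer": [DG, DA],
    },
    "Database administrator": {
        "supplier": [DMD],
        "participant": [DSO, DQ],
        "consumer": [DA, DMD],
    },
}


def areas_per_input(input_areas):
    """Count the distinct knowledge areas each input appears in."""
    return {item: len(set(areas)) for item, areas in input_areas.items()}


def areas_per_role(role_areas):
    """Count the distinct knowledge areas each role touches in any capacity."""
    return {
        role: len({area for areas in by_kind.values() for area in areas})
        for role, by_kind in role_areas.items()
    }


if __name__ == "__main__":
    for item, count in areas_per_input(INPUT_AREAS).items():
        print(f"{item}: {count} knowledge area(s)")
    for role, count in areas_per_role(ROLE_AREAS).items():
        print(f"{role}: {count} knowledge area(s)")
```

Counting distinct areas per role, rather than summing across supplier, participant, and consumer, avoids double-counting a role that appears in the same knowledge area in more than one capacity.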