Hello and welcome. My name is Monica. I am part of the digital team here at DATAVERSITY. I'm stepping in today for Shannon Kemp, our Chief Digital Manager. We would like to thank you for joining this month's webinar, Strategic Imperative: The Enterprise Data Model. It's part of the monthly webinar series sponsored by IDERA. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you'll be muted during the webinar. For questions, we'll be collecting them via the Q&A at the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DATAVERSITY. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Ron Huizenga, who is the Senior Product Manager of Enterprise Architecture and Modeling at IDERA. Ron has over 30 years of business and IT experience as an executive and consultant spanning a diverse range of industries. His hands-on experience in large-scale enterprise initiatives includes enterprise and data architecture, business transformation, and software development. His background provides practical, real-world insights into enterprise data architecture, business architecture, and governance initiatives. And with that, I would like to give the floor to Ron to get today's webinar started. Hello and welcome. Thank you, Monica, and thank you everybody for joining me. The Enterprise Data Model is a very important topic that we want to talk about. What I want to do here now is walk through and tell you why the Enterprise Data Model is important. I'll go into the next slide here. Basically, one of the things that organizations have often looked at is what the importance of an Enterprise Data Model is. 
It provides major benefits. It's very high value to an organization. And very importantly, it provides context and consistency for all of the organizational data assets in the company as well. And it gives you a classification framework for data governance. Our environments today are getting much more complex and we really need to make sure that we can actually handle this. And just a quick question, Monica. Is there a way that we can silence the beeps from everybody joining? We're working on it at the moment. Shannon is in the background. Time to get that fixed. Okay, thank you. I apologize for the inconvenience. All right, I'll keep going here, but I just wanted to make sure. Okay, so here's what we're going to talk about today. First of all, I'm going to take a look back in terms of what the information capability is in most organizations today. We're not as good as we think we are and there's a lot of room for improvement. And we'll talk about some specific studies and surveys and that type of thing to really help out with that. In terms of data environment complexity, as I alluded to, our data environment is becoming more and more complex all the time rather than simpler. So we really need a framework to be able to deal with this. Hence the enterprise data model. It's something that really is important. It gives us context for all the data assets in the organization. I'll also talk about what the approach is in terms of creating an enterprise data model and how it can become the focal point for all of your data architecture in your organization. We also want to talk about how can we actually tie together all the related implementation models back to the enterprise model as well to really tie that in and give that actual context and the traceability back to that enterprise model as well. Then I want to talk about how we actually deal with this in implementation in our organization. 
So I'm going to talk about service-oriented architecture and some of the techniques there as well. Some of you may have heard of things like the canonical model, which we'll talk about and which is really a derivative of the enterprise model. We'll talk about the data governance considerations that come into play. And then we'll wrap up and summarize all the things that we've talked about. So in terms of information capability, what are we doing as an industry, or as industries across the board? When we look at that, based on a study that was done a few years ago by PwC and Iron Mountain called Seizing the Information Advantage, one of the major conclusions was that very few organizations are really utilizing information to its full potential. And I just wanted to make sure that everyone else can still hear me because somebody just said they lost the speaker. I can hear you, Ron. Okay, awesome. Okay, just want to make sure they're all right. What we see out of this study is that they trace it to some basic things: deficiencies in technical capability, sometimes deficiencies in skills, and quite often the lack of a data culture in the organization as a whole. What we also see is a lack of investment in value-driven information strategies. A lot of organizations are now very expense-oriented and they're not really thinking of information, and the cost of information, as an investment in what we're doing. The other thing is that very few understand how to derive maximum value from the information that they have, and that really erodes your corporate value if it's not corrected. So generally speaking, when we look at this study, most organizations are really failing in terms of how they're managing information and building the capability to utilize that information, not only to make important strategic decisions, but also in the day-to-day operations of the organization as well. 
So let's look at this information management disparity that came out of that particular study. They had a few different classifications, but I'm going to talk about the two extremes. On the one hand, they talked about the misguided majority, which was 76% of the organizations they looked at in this study. Generally speaking, those organizations were informed but constrained in how they could utilize information, or they were uninformed and ill-equipped to utilize the information. Either way, it is still a problem. Another symptom was that data was often seen as a byproduct or taken for granted in those organizations, and there was a really low comprehension of the commercial benefits that can be gained by actually managing and understanding your data and ensuring consistency and data quality. Often, they were constrained by legacy approaches and by regulations, which hampered them because they were focusing on those more than on other things. And of course, weak analytic capabilities were one of the things coming in as well. Or sometimes they had a strong analytic capability, but they were lacking the focus on the things that really drove value in those organizations. Also, those organizations could be overwhelmed by data volume. And here's one that's very dangerous at times: data was often viewed as the domain of data architects. There was a lack of ownership of information in the business itself. Data belongs to the business and it's owned by the business. The data architects help to facilitate that, so that's something that organizations do need to learn. Another symptom is that, generally speaking, the data initiatives were IT-led rather than business-led, and that was another issue. And it was characterized by things like spreadsheet hell. 
In other words, not well-managed data information systems, but management by spreadsheet and a proliferation of different types of information that weren't unified and tied together. Now let's look at the other extreme, which is the information elite, the top 4% that they looked at in the study. Generally speaking, these organizations were very proactive in their actions and in how they utilized data. They used data to diversify and expand their business models. They used the data on a day-to-day basis to improve operating efficiency. And they also used the data to identify and implement new market opportunities for expansion of their businesses. They also looked at tangible data value, they looked at data quality, and they linked the data in their organization to organizational key performance indicators and managed by those key performance indicators as well. So what they were really doing is exploiting that data to give them a competitive advantage over other companies in their particular industry. And that's very important because it means that you're really utilizing the information in your company effectively. Now of course, they also needed to recognize that data has security considerations, but they had a very balanced approach between security and value extraction, giving access to the data where it was needed and still carefully administering that, but making sure that they were giving it to the users in the organization that really need that information to make decisions. And they had a very holistic approach. Governance was just a normal part of the business. It's not something that was thought of as an add-on or a special program that needed to be set up or anything like that. It was just woven into the fabric of those particular businesses. They had a well-defined information strategy that was tied into their business objectives. 
And interestingly enough, when they looked at that information elite, they actually found that the majority of them were in healthcare, manufacturing, and engineering. Now, one of the reasons for that is likely because those particular sectors have always had a very strong discipline in terms of needing to rely on information and ensuring information accuracy. In healthcare, obviously, there are patient records, medication records, those types of things. In manufacturing, things like bills of material, manufacturing lead times, and machine times are extremely important. And of course, engineering is always dealing in very detailed technical specifications as well. So here's an important perspective about information. This first quote is from Tom Peters, one of the co-authors of In Search of Excellence many years ago. He says that organizations that don't understand the overwhelming importance of managing data and information as tangible assets in the new economy will not survive. Another complementary quote is from Peter Aiken, somebody that most of you probably know, who's a regular presenter here on DATAVERSITY as well as at the conferences. He says, your organization is all about data until it's not about data. Let's take another look at a survey that we commissioned from Dimensional Research a few years ago. This is really looking at how business stakeholders are using information or data in their organizations. One question was, do you suspect that business stakeholders in your organization are interpreting the data incorrectly? 67% said it was happening occasionally. Only 9% were confident that it never occurred. 10% didn't know, and 14% said it occurred frequently. 
So the next part of that question, which is actually even a little more dangerous, is how many suspected that the business stakeholders were actually making decisions using the wrong data in the organization. And the results were remarkably consistent here. Slightly lower on the occasionally front, 13% on the never, so slightly higher there, and 11% saying frequently. But when you look at that, with never making incorrect decisions at 13%, that still leaves 87% who either suspected that people were making decisions using the wrong data or couldn't rule it out. And that's extremely dangerous. We then took that a step further and wanted to see if there was some kind of correlation or tie-in to the organization's approach to data modeling. And some very interesting results came out of this. Only 18% had a data modeling team that was responsible for the data models. 31% said that developers developed their own data models. I get very concerned about that. Not to disparage developers, but developers will generally design data or data stores so that it's easier for them to build the particular application that they're working on, rather than having that enterprise point of view about how that data should be structured to benefit the organization as a whole. So that can be extremely dangerous. Some had DBAs that were doing their own data modeling. And then there were others where there was a mix, where they had a data team that did most of the data models, but developers also built them as needed. 3% were other. And then the 13% that's very disturbing are the companies that weren't using data models at all to understand the data in their organization. Then, only for those that do use data models, we asked a follow-up question: how well does the organization's technology leadership team understand the value of using data models? 
60% only understood somewhat. 17% didn't understand at all. There's a 3% "I don't know" gap. But again, only 20% completely understood the value of using data models. So when I look at these types of results, as well as that previous study that we talked about, there's a direct correlation here. I think what's really happening is that people and organizations are not truly understanding the value of information in organizations and leveraging it correctly. And a big part of that is because they're not approaching and managing it correctly through data models and other techniques that are really necessary to harness the information in that organization. And here's why. As I said, the data in our organizations is becoming extremely complex. I used to refer to it as a data landscape, but I now refer to it as a data ecosystem because over the past years it's become more complex. It's shifting and growing organically as we continue to grow. So that draws the parallel to what an ecosystem is really about. And we see many different things going on here. We may be trying to ingest information from outside the organization that could be in NoSQL stores. It could be things from relational databases, feeds from social media, internet of things, sensor data, and all kinds of different types of data that we're trying to ingest and utilize in the organization itself. When we're doing that, we tend to land that in data stores that could be anything from NoSQL stores to relational databases, but we're really dealing with raw transient data when it comes into the organization. And this is also where we have things like the sandboxes where our data scientists are spending a lot of their time just trying to see if there's anything useful in that outside information that they can utilize and draw decisions from. 
Interestingly, in one of the keynotes at the Enterprise Data World conference, a very interesting statistic was that data scientists spend 90% of their time trying to find the right types of information to see if they can make decisions. Then, of the remaining 10%, they spend 90% making sure that the data is clean and accurate enough to be able to do that. So out of all their time, they're actually only making decisions on that data 1% of the time. That's a very low level of throughput, especially with those types of raw information. As we're moving from left to right in this diagram, what we're really doing is building higher levels of quality and trust in the data so that we can actually utilize it in our organizations. So once we get past those raw transient data stores, we end up with approved raw data that we feel can be useful to move onward and utilize in our organization in some fashion. Then we get to trusted data stores, and again, this can be on virtually any platform or storage mechanism: master data management stores, relational databases, NoSQL databases, those types of things coming into play. And ultimately, we're refining that information for decision making using ETL and those types of things, and getting to the point where we can actually utilize that refined data, which we're typically storing in data warehouses, data lakes, those types of things. Again, varying technologies come into play there, both on-premises and cloud technologies, ultimately driving things like self-service analytics and reporting. Now, that's taking it from left to right all the way through to the analytics and reporting. But of course, when you look at the center, the trusted data stores and everything like that, we're also using that information as part of our normal operating procedures in our organization. 
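The data scientist time split quoted from the keynote can be checked with a few lines of arithmetic. This is only a sketch of the calculation as stated in the talk; the variable names are illustrative and the 90% figures are the speaker's, not independently verified.

```python
# Share of total time spent finding the right information (figure quoted in the talk).
finding_share = 0.90
# Share of the REMAINING time spent cleansing and validating (also from the talk).
cleaning_share_of_remainder = 0.90

# Time left once the search for data is done.
remaining_after_finding = 1.0 - finding_share
# Time left for decisions once cleansing has taken its share of the remainder.
decision_share = remaining_after_finding * (1.0 - cleaning_share_of_remainder)

print(f"Time left for decisions: {decision_share:.0%}")
```

In other words, 10% of 10% is what remains, which is where the 1% figure comes from.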
All that transactional data, all that master data coming into play is as we're actually executing our business functions in the company as well. We need to understand this. So how can we actually map this data? And the way that we can actually map and understand this data is through the use of data models, enterprise models, enterprise data dictionaries, and linking our implementation models back to those things. So what we're looking at is conceptual models, logical models, physical models for those physical implementations, obviously dimension models for things like data warehouses, data cubes, and that type of thing. And very importantly, the enterprise model and its derivative, the canonical model, which I'll be talking about in more detail here. We also want to understand what the data lineage is. So modeling visual data lineage and the traceability of information through the organization is something I'm going to talk about in conjunction with this, as well as enterprise data dictionaries, which are extremely important. They do things like give us naming standards, something that we call attachments, which are metadata extensions to help classify the data in your organization, in addition to the typical types of things that you think about in data dictionaries, such as domains, data types, and those types of things. And then tying that in to really bring it back full circle to understand the information and context in the organization is you want to have behind us a metadata repository and the ability to tie all this back into business glossaries to really provide that business meaning for not only the enterprise model, but the other models that you're utilizing in the organization as well. If I zoom in on that one part, what we're really talking about here is I'm going to work bottom up here. We have implementation models that we want to create for our databases that we're utilizing in the organization. 
So think about our operational data stores, our transactional databases, even our data warehouses, and knowing and understanding that data by mapping it through your data models. We tie those into the enterprise data dictionaries, but very importantly as well, we want to have that overarching enterprise model that defines what the information is in our organization that's important to us and how all that information is conceptually linked together across the organization. That becomes our frame of reference, and then we tie it back to all these various implementations that we actually have spread throughout our organization. So let's talk about some things about enterprise data models, because there are some myths out there that we need to dispel. I'm going to talk about what enterprise data models are and are not. An enterprise data model is a long-term focused initiative. It is not a short-term project. It never ends, and the reason it never ends is because your business is continually adapting, so it will only end if your business ends. You need to have this fueling your understanding of the data that you're utilizing in the organization as you continue to go forward. It's an investment. If people look at data modeling as an operational expense, that can be very dangerous. You really need to understand the value of the information, and utilizing an enterprise data model to give you that clarification is really an investment in your organization and the data in your organization as well. It really is a basis for communication. It's that Rosetta Stone that translates what that information is about in your organization and what it's for. Yes, the data model itself will have data model diagrams. 
The pictures that we have, because there will be multiple pictures in multiple subject areas, really help us to understand how the data is related together, and it's also a statement of business rules about how that data interoperates with the other pieces of data in the organization. It's also a foundation for standardization. And interestingly enough, some people say that enterprise models and agile are incompatible, because the model is a longer-term approach versus the short-term cyclic approach of agile. Exactly the opposite is true. If you have that enterprise data model, it becomes the frame of reference that actually helps you to drive more value out of your agile projects, because you know what you're delivering against and you know how what you're delivering fits into the overall picture of the data in the organization. The enterprise model, again, is continually evolving; it's an incremental evolution. It's not an ivory tower artifact. You will fail if you try to go off to the side and create an enterprise model before you actually try to do anything with it. What you really need to do is come up with that baseline or critical mass of what your enterprise model is, but you're continually evolving it, and that means adding entities, attributes, definitions, and everything else. You really have to be able to drive this forward by refining it as you go, based on the changing business needs that you have. Again, I've said this in a couple of other slides. That enterprise model really establishes the framework for all enterprise data assets. When you're looking at any enterprise data asset or any enterprise data store, you should be asking yourself the question, how does the information or data that I have here fit into this overall larger picture? Your enterprise model should be able to tell you that. 
An enterprise model is not specific to an application or a particular business area. It spans the entire organization or business. Now, people may start creating it beginning with a particular business area and expanding it from there, but its value is in spanning the business rather than being isolated to a particular part of the business. Very importantly, your enterprise data model is business focused. It is not for IT use only; it is an enterprise-wide resource. It really needs to encompass those business rules about how that information works together in your organization. It is truly a strategic imperative. If you want to drive the value out to give you competitive advantage, you need to have an enterprise model to give you that context. It is not a waste of time. You cannot understand, govern, or ensure the data quality of anything unless you fully understand the data that you're working with, and the enterprise data model does that for you. In terms of preparing it, creating an enterprise data model requires both business and data modeling expertise, so it's not something that can be undertaken and done by developers or junior modelers. You really need to have that point of view. In fact, quite often we see data modelers as part of the IT organization. Personally, I believe that data modelers should be part of the business organization. If the organization has something like a chief data officer function, the data modelers should actually be part of that CDO group, really driving things like data governance and being recognized as part of the business leadership in the organization. An enterprise data model also is a rationalized, intelligent design of that data. If you're documenting your data after the fact, you're really losing sight of what the benefit is. You really need to be able to design it. Of course, an enterprise model is fully platform independent. It is not for a specific database. 
It's really looking at what the information is regardless of how it's deployed in the organization. When we look back at that previous picture I drew, the specific databases are those implementation models that I was talking about, but they should tie back to and reflect what's happening in the enterprise data model as well. And again, the enterprise data model is common sense and practical. It is not an academic exercise. I've seen modelers in the past have painstaking academic debates about how something should be structured. You're going to be wrong one way or another until you actually get more information to pull it together. So what you really want to do is capture the best that you know at that point in time. Because when we go back to that earlier point of it being an incremental type of design, as you get more business rules, you will start to adjust things accordingly, just like you would under agile or anything else. It's not a static picture. It's a living and breathing organism or model that reflects the way that your business works. So in terms of creating it, the way I like to do it is by aligning the enterprise model to the different business areas. Obviously an enterprise model could be extremely large. You're not necessarily going to get it on one page because you could have hundreds or thousands of entities depending on the size of organization that you have. You could break it down along business lines using a technique that's been around for many years called business decomposition. If you understand the way your business is structured, this is also a very good way to organize the different sub models or subject areas within that enterprise model. And you're also doing it in a manner where, when you're talking to the business participants in those different business areas, it's information that they can relate to when you're actually discussing the requirements with them. 
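As a rough illustration of organizing subject areas by business decomposition, the sketch below models business areas as a small tree and lists one subject area per node. The class name, the area names, and the entity assignments are hypothetical examples, not taken from any particular modeling tool.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class BusinessArea:
    """One node in a business decomposition (hypothetical structure)."""
    name: str
    entities: List[str] = field(default_factory=list)
    children: List["BusinessArea"] = field(default_factory=list)

    def subject_areas(self, prefix: str = "") -> Iterator[Tuple[str, List[str]]]:
        """Yield (path, entities) for this area and every area beneath it."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        yield path, self.entities
        for child in self.children:
            yield from child.subject_areas(path)


# Illustrative decomposition only; a real one follows how the business is structured.
enterprise = BusinessArea(
    "Enterprise",
    children=[
        BusinessArea("Procurement", ["Supplier", "Purchase Order"]),
        BusinessArea("Sales", ["Customer", "Customer Order"]),
    ],
)

for path, entities in enterprise.subject_areas():
    print(path, entities)
```

Each path corresponds to a sub model that can be reviewed with the business participants from that area.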
Again, having that framework and the enterprise model is really helping us with the need for common understanding. If people look at certain things, depending on their perspective, they may see it differently. So part of that enterprise model is to be able to figure out what the organizational perspective is and to resolve the difference that we see. In this particular example, there are a couple of different perspectives. And generally in an organization, depending on your perspective, it can mean many different things. So part of what the enterprise model does as well as things like business glossaries is it allows us to actually reconcile these different perspectives and come up with an organizational perspective of what this really means to us. So how do we comprehend an enterprise data model? We look at the different layers. So the enterprise data model itself is going to have many things. It's going to have what we call business data objects, which is really, again, how often business users think or even developers think in certain respects in terms of, they may think of something like a customer order, but they don't necessarily think about how that customer order is decomposed into different pieces like order headers, details, and those types of things. So in our enterprise model, we can actually satisfy both perspectives in terms of what those higher order objects are, as well as the business entities that comprise those higher order objects. Very importantly, obviously, is understanding the relationships between those different pieces of data because those relationships are also a statement of business rules. And the derivative from the enterprise model that we'll talk about in a moment is the canonical model that really helps us in terms of how we actually implement and utilize this in our organizations for initiatives like service-oriented architecture and those types of things. The enterprise data dictionary is also extremely important. 
It includes things like classification metadata to help us classify our entities, and I'll talk about that in a moment. In addition to things like data types, domains, naming standards, and security properties that come into play in the organization as well. And then, of course, supplemented by business glossaries, which gives us things like terms and definitions. They can also be organized by a business area. We can also have technical glossaries in addition to business glossaries that certify things further. And it's not limited to just words and their definitions. It's also a good place to actually catalog governance policies as well as a source to link out to reference and master data sources as well. So you have a data catalog tied into your enterprise model that's pulling everything together in one place for you in that metadata repository. This doesn't happen by accident. It takes a disciplined approach, which I think was probably very evident about the different types of organizations and how successful they were in utilizing data. What you really need to do is you really need to have the accountability. So when we look at this accountability, we're looking at the different types of things. You need to designate teams and members of those teams. First of all, who's actually responsible for it? Supplemented by who's accountable for the information, not only in the enterprise model, but of course the business glossaries who are the stakeholders that you consult as you're creating these things and then of course, who are you informing about it as well? This is kind of known as RACI, which has been a term that's been around in project management for a long time, but just like any other initiative, you need to have these areas taken care of in terms of responsibility, accountability and consulting and informing stakeholders within the organization appropriately for the different business areas within that enterprise model as well. 
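The RACI assignment per business area could be recorded in something as simple as the structure below. This is a minimal sketch; the class, the team names, and the gap check are illustrative assumptions rather than a prescribed governance tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RaciAssignment:
    """RACI roles for one business area of the enterprise model (hypothetical)."""
    business_area: str
    responsible: List[str] = field(default_factory=list)
    accountable: List[str] = field(default_factory=list)
    consulted: List[str] = field(default_factory=list)
    informed: List[str] = field(default_factory=list)

    def gaps(self) -> List[str]:
        """Return the RACI roles that have nobody assigned yet."""
        roles = {
            "responsible": self.responsible,
            "accountable": self.accountable,
            "consulted": self.consulted,
            "informed": self.informed,
        }
        return [role for role, people in roles.items() if not people]


# Example assignment with made-up team names; "informed" is deliberately left empty.
procurement = RaciAssignment(
    business_area="Procurement",
    responsible=["Data modeling team"],
    accountable=["Procurement director"],
    consulted=["Purchasing analysts"],
)

print(procurement.gaps())
```

A check like `gaps()` makes it easy to spot business areas where one of the four roles has not been taken care of.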
This is an oversimplified example, depicting a very simple, small area of an enterprise model. We're looking at things like purchase orders and how they're comprised of headers and details, and things like suppliers, customers, and items or products that we utilize in the organization and how they're decomposed. But very importantly, we're also seeing definitions for them. Don't worry if you can't read this slide; it's just to give you a perspective. When we look at the metadata that we want, it's not just attributes, it's not just entity names, it's all kinds of things that we need to use to classify the data. For example, what's the master data class of the information? In other words, is it master, reference, or transactional data? Then there's business value. Not all information in the organization has the same relative value, so there are important entities and there are some that are less important. Having some type of categorization of the relative business value is very important, because that also allows you to prioritize things like your data quality and data cleansing efforts to make sure you have the consistency there as well. High value information is going to receive the highest level of attention to ensure that it's accurate. Something that people often forget about is retention period, how long you need to retain the data, which really ties into the data life cycle that I'm going to talk about in a moment. And also, where does the data originate? The classification I use is whether the data originated internal to the organization or external to the organization, and sometimes the answer can actually be both. Volatility is very important as well from a data quality perspective. 
The more volatile the data is, the harder it is to manage, so the more effort it takes to govern and manage. There's also assigning who the responsible stewards are, and in addition to that, it's things like privacy levels and security impacts, all those types of things that always come into play with a lot of the regulatory policies. These are things that we should be doing anyway, and things that we should all be classifying in terms of our enterprise data model, attributing the information with those types of characteristics. Now let's look at how we tie this enterprise model into the different implementation models that we have in our organization as well. Again, this is a very simplified view so that it will fit on a slide. I have a number of concepts represented here by the different colors, but what you'll see is that they're often called different things in different implementation models. Some of the databases that we have in our organization are from ERP solutions, or they've been acquired, or things have changed over the years, so there's quite often an inconsistency in terminology in the way that we name these things. We want to make sure that the proper enterprise name for things is represented in the enterprise data model, but then we actually tie this together with the like or related entity instances in those other operational models. So as an example, if I take something like supplier, my enterprise entity is called Supplier. If I look in these two different implementation models, I have it called Vendor in one place and Suppliers in the other; in other words, a slightly different naming convention, plural rather than singular, in the other model. Same thing when I'm talking about something that I call Item: in one model it's called Product, in another one it's called Part.
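A minimal sketch of this kind of linking, assuming two implementation models with the hypothetical names `ERP_A` and `Legacy_B`: each local entity name points back to its enterprise entity, and a reverse lookup lists every implementation of a given enterprise entity. This is only an illustration of the idea, not how any particular repository stores its links.

```python
from collections import defaultdict

# (implementation model, local entity name) -> enterprise entity name.
# The model names "ERP_A" and "Legacy_B" are hypothetical.
ENTITY_LINKS = {
    ("ERP_A", "Vendor"): "Supplier",
    ("Legacy_B", "Suppliers"): "Supplier",
    ("ERP_A", "Product"): "Item",
    ("Legacy_B", "Part"): "Item",
}


def build_reverse_index(links):
    """Group implementation entities under their enterprise entity name."""
    index = defaultdict(list)
    for (model, local_name), enterprise_name in links.items():
        index[enterprise_name].append((model, local_name))
    return dict(index)


def implementations_of(links, enterprise_name):
    """List every (model, local name) pair linked to an enterprise entity."""
    return build_reverse_index(links).get(enterprise_name, [])
```

With these sample links, `implementations_of(ENTITY_LINKS, "Supplier")` returns the `Vendor` and `Suppliers` entries, so the differently named tables resolve to one enterprise concept.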
It's all a representation of the same type of information, so building those ties back to your enterprise model is very important so that you know where these implementations are. Same thing with customer: some places it's called Client, some places it's called Customers. It even applies to reference data, where we might have something called State or Province, which has more of an international flavor. In one database it might be called Province and in another one it might be called State, and it's a matter of reconciling those differences and tying those things together. In our particular product we call these universal mappings, and we actually store them in that metadata repository, which gives us that linking between all the models and all the items in that repository as well. The benefit of that, and here's an example, is linking those things right in the entities in the model itself. When we look at the Where Used tab, it gives us an automatic byproduct that says here are all the links that you've got. So for any of the universal mappings that I have created from my enterprise model down to my other models, it shows me what those different things are. In a lot of places you're going to see links to other logical models, and you're going to see links to physical models, but it's a way to pull that together at a focal point. Generally speaking, you want to focus at the entity or table level first. But for key or critical data elements, sometimes what you actually want to do is tie those important attributes together as well. For things like social security number, or those types of things that span multiple places, you may want to link them together regardless of where they're actually stored. Now let's talk about utilizing information for a typical analytics or BI flow.
This is depicting a very simple data lineage model where we're looking at things like information in different staging files and the way they're being transformed through the loads into a dimensional model, in a particular table that we have here, which is a dimension. What we're really seeing here is things like source, transformation, and target, which is a very important thing to capture because it allows us to see how this information is being utilized in our organization. Quite often when people think of data lineage they're always thinking in terms of business intelligence and those types of things, but lineage is much more than that. It really ties into the overall life cycle in an organization as well. So when you look at the data life cycle, you really want to be able to understand everything about how every element is created, read, updated, and deleted through the organization, especially for those critical enterprise items that we're talking about. So you'll understand the data creation or collection. You want to be able to classify it, and to know how and where it's stored, how that data is modified, how it's used in the organization, and how you share that information throughout the organization. I talked about retention policies; in other words, you really need to understand how you retain or archive that information and ultimately how you destroy that information as well. That's the full information life cycle that we need to understand about our current enterprise data elements. There are many factors that come into play. You're looking at the business rules, you're looking at the business processes, you're looking at the applications that are actually implementing these things and how these are occurring, and you're also looking at all kinds of other things. And very importantly, there may be more than one way a particular data element is created.
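The source, transformation, and target view of lineage mentioned above can be sketched as a list of steps plus a function that walks backward from a target to everything that feeds it. All object and transformation names below are hypothetical stand-ins for the staging files and dimension table on the slide.

```python
# Each lineage step is (source, transformation, target).
# All names are hypothetical placeholders.
LINEAGE = [
    ("customer_file", "load_customer_stage", "stg_customer"),
    ("address_file", "load_address_stage", "stg_address"),
    ("stg_customer", "build_customer_dim", "dim_customer"),
    ("stg_address", "build_customer_dim", "dim_customer"),
]


def upstream_sources(target, lineage):
    """Return every object that feeds `target`, directly or indirectly."""
    found = []
    stack = [target]
    while stack:
        current = stack.pop()
        for source, _transformation, step_target in lineage:
            if step_target == current and source not in found:
                found.append(source)
                stack.append(source)
    return found


def transformations_into(target, lineage):
    """Return the distinct transformations that load directly into `target`."""
    return sorted({t for _s, t, step_target in lineage if step_target == target})
```

Asking for `upstream_sources("dim_customer", LINEAGE)` yields both staging tables and both original files, which is the kind of traceability question lineage metadata is meant to answer.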
Often people talk about the golden record or the single source of truth as an example, but the origin of that data may actually have come out of multiple places. For instance, you may be using an HR recruiting system that has some type of information about prospects that you're looking at interviewing and that type of thing, but they don't actually show up in your HR systems until you've actually hired them. Whereas in other cases, depending on how your organization is set up, the HR system may be the first place that information is captured. So you need to be able to pull all of this out and understand the various ways and the various business cycles that actually act upon that data. With data lineage, we also think of data flow, and we think of the integration of the information across our organization. That includes the ETL, or extract, transform, and load, for data warehouses, data marts, and staging areas, but it's not limited to that. That's only part of the overall data life cycle. So let's look at a different example now. This is an example where we're still looking at data flow and lineage, but we're really looking at it in terms of an organization that may have built point-to-point interfaces between different applications. In this example they basically have all of their information about prospects and customers and that type of thing in something like Salesforce, but until they actually become customers and purchase something, they're not going to show up in the accounting systems. So when that happens, they actually move that information and transform it by extracting it from Salesforce, mapping it through a transformation, and loading it into PeopleSoft to bring that particular information across.
Obviously there are a lot of attributes and things that come into play, and there are a lot of ways that we want to actually document and define exactly what's happening in those transformations, but this is a very high-level diagram of that basic process. This changes when we do something like service-oriented architecture, because one of the things that we're trying to get rid of is these point-to-point interfaces where we have all these inconsistencies, inconsistent naming conventions, and everything else. What we're dealing with in this type of diagram is almost like one-off mappings for a particular situation, and that really turns into what we call the integration hairball when you start trying to tie information together across a whole bunch of systems with all these different point-to-point interfaces. When we look at service-oriented architecture, we're looking at a design pattern to standardize this. It's really a design pattern based on distinct components, where components provide services to other components that are utilized in our overall architecture. It's not only application functionality; it's also a consistent application and utilization of data services in the organization as well. Service-oriented architecture is independent of any specific vendor, product, or technology. It's really a set of techniques, and the way that you approach the problem is what we really want to concentrate on here. The way we do this is through utilizing our canonical model, which is basically a design pattern that uses our standardized data model across all data services. I've talked about our enterprise data model so far, and quite often when we look at our enterprise model we're going to have the English terminology. So if we have something like customer order, purchase order, those types of things, it's going to be spelled out, and it's going to have the proper spacing between the words just like we were reading in a sentence.
Your canonical model is a derivative of that, so what it's going to do is things like resolve to your physical naming conventions. So what I typically do, as an example, is if I'm using normal English in my enterprise model, my canonical model is a reflection of that, but I use what we call camel case, where I've taken out the spaces in between and just pulled the words together. Other organizations use things like underscores as word separators, but be consistent in how you actually do it. What happens is your enterprise model drives down to your canonical model, so everything in your canonical model is derived from the enterprise model. The way that you're actually moving the data through your organization is done through something called canonical messages for your data services, where you're packaging up pieces of information, and all the naming conventions that are utilized in those are based on that canonical schema, so your terminology is always exactly the same for that particular information construct that you're dealing with. Again, what this means is you have standardized and consistent data elements throughout, regardless of which system is sending information to which other system, and I'll talk about how we do that in a second as well. The standardized messages are extremely important. The payload is standardized, and the transformations are all based on the canonical schema. In other words, we don't do that direct source-to-target mapping anymore. We map from our source to our canonical schema, then we map from our canonical schema to the recipient or target, or what we also call the subscribers to that information in our organization.
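To make the naming derivation concrete, here is a small Python sketch that turns an English enterprise name into a camel case or underscore-separated canonical name. The function names are hypothetical, and the choice of upper camel case (first word capitalized too) is an assumption, since the talk does not say which variant is used.

```python
import re


def split_words(enterprise_name):
    """Split an English enterprise name into its alphanumeric words."""
    return re.findall(r"[A-Za-z0-9]+", enterprise_name)


def to_camel_case(enterprise_name):
    """Join words with no separator, capitalizing each word (assumed variant)."""
    return "".join(w[:1].upper() + w[1:].lower() for w in split_words(enterprise_name))


def to_snake_case(enterprise_name):
    """Join lowercase words with underscores as the word separator."""
    return "_".join(w.lower() for w in split_words(enterprise_name))
```

For example, `to_camel_case("Customer Order")` gives `CustomerOrder` and `to_snake_case("Purchase Order")` gives `purchase_order`. Whichever convention is chosen, deriving the canonical name by rule rather than by hand is one way to keep it consistent with the enterprise model.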
Service-oriented architecture can be quite complex. I'm just looking at the data services part of this right now, and this is really where this ties in. What we're really looking at is integration services and ETL, data replication, even things like distributed queries, and all the enterprise information integration in your organization, driven off this canonical model and those messages to apply that standardization throughout your organization. Let's take that same example that we talked about before, where we're moving things from Salesforce to PeopleSoft. Like any typical organization, in this example I might actually have multiple ERP solutions that come into play, because I have different divisions that I've acquired over the years that had their own ERP solutions, so quite often we need to be able to map this to multiple places. It's the same type of idea, but now when we're taking information out of Salesforce on the left, we're publishing it. So we're extracting it through a publish process and mapping it to structures that are in our canonical model, with those naming conventions and everything else. Once we've done that, we ask what our subscribers to that are. If Dynamics AX happens to be a subscriber, here are our mappings from our canonical structure into Dynamics AX. If we're using PeopleSoft, we have the same type of thing, where we're still utilizing that same common canonical structure and creating the mapping for how we actually push it into PeopleSoft, and we could have multiple systems that come into play here.
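The publish and subscribe mapping described above can be sketched as two renaming hops: source fields to canonical names, then canonical names to each subscriber's fields. Every field name below is invented for illustration and does not reproduce the real Salesforce, Dynamics AX, or PeopleSoft schemas.

```python
# Hypothetical field names throughout; no real vendor schema is reproduced.
SOURCE_TO_CANONICAL = {
    "AccountName": "CustomerName",
    "BillingState": "StateOrProvince",
}

CANONICAL_TO_SUBSCRIBER = {
    "DynamicsAX": {"CustomerName": "ax_customer_name", "StateOrProvince": "ax_state"},
    "PeopleSoft": {"CustomerName": "ps_customer_nm", "StateOrProvince": "ps_state_cd"},
}


def rename(record, mapping):
    """Rename the keys of `record` using `mapping`, dropping unmapped keys."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def to_canonical(source_record):
    """First hop: express a source record in canonical terms."""
    return rename(source_record, SOURCE_TO_CANONICAL)


def publish(source_record):
    """Second hop: produce one subscriber-specific record per subscriber."""
    canonical = to_canonical(source_record)
    return {
        subscriber: rename(canonical, mapping)
        for subscriber, mapping in CANONICAL_TO_SUBSCRIBER.items()
    }
```

Under this arrangement, adding another subscriber means adding one more canonical-to-subscriber mapping, while the source-to-canonical mapping stays untouched. That is the main contrast with maintaining a separate point-to-point mapping for every pair of systems.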
The idea is that when we add information or change information in a source system like Salesforce, we would take that and publish it out to an enterprise service bus. Those applications or areas that actually need to consume it are known as subscribers, so they pull that information off the enterprise service bus and update their own data stores accordingly, all in a very common and very standardized format that everybody can understand. What we want to accomplish is information in the right place at the right time. The enterprise model is your focal point to be able to do that. It gives you the comprehension of the information in context, which gives you effective communication across the rest of your organization. An enterprise model is a major investment, but it also yields major returns, and all these types of things come into play: data rationalization, the integration of data across your organization, and a focus for data quality. It ties in heavily to data governance and metadata management, and of course that enterprise model is an implementation of business concepts and rules in your organization.
We'll take governance a little bit further. Most of you have probably seen this. This is the DMBOK wheel from DMBOK 2, and I'm not going to go through every one of these, but to me one of the most important things that came out of the differences between DMBOK 1 and DMBOK 2 is that data modeling and design has properly been split out now as its own area of interest. It used to be buried in data architecture but is now standing on its own because it is that important. You need to understand, model, and design that data. Again, if you can't understand it, you can't manage it, you can't govern it, and you cannot make appropriate business decisions with it. Again, from a governance perspective, we want to tie this back into our business glossaries that supplement our enterprise model. It's the same type of approach. I structure my business glossaries and break them along a business decomposition with the different business areas, but I also do things that are more than just the terms and their definitions. I have governance policy catalogs for the different types of regulations that come into play, and within those, the policy statements, so I can link them back to the particular pieces of information that those particular policies relate to. In the same way, you can utilize your business glossary as a jump-off point to link all of your master data management or reference data stores back together. You may have an MDM system, you may have some information in spreadsheets or other places, but you can use it as a focal point to tie all of that back together as well. Here's an example of a set of glossaries. I'm going to go through these very quickly. If we look at a typical definition of a term, it's things like the name of the term, how it's classified, and the definition of the term, and you can actually get very complex. Maybe there are formulas or other things involved in terms of how you calculate that. You can capture all
of that, and should capture all of that, in some type of a business glossary to give you that standardization in your organization. In terms of the data governance policies, it's the same type of thing. I would typically have a higher-order placeholder that ties all my governance glossaries together, and then I would break that apart into the different ones that come into play and ultimately what is comprised in each of those. So now I've gone down to HIPAA, which was one of my glossaries, and now all of my policy statements I've broken out in here, because these different policy statements apply to different types or pieces of information in the organization. So now you can link them together, which makes it a very rich metadata repository that ties all of this information together. Other important things, of course: in addition to the policy statements, you're also seeing things like that warning on top of the screen. So we can actually tie in things like our data security considerations and even add warnings about whether the information is highly confidential, so that people are aware, even when looking at the underlying metadata, that the actual data content for these things is very important and needs to be secured as well. Again, this is an example of that with a slightly different perspective that's showing all the information and the data structures that came from our data models as they got published in as well. The last one I'm going to talk about here is the reference data sets. It's the same type of thing, a place to tie them together, and then ultimately you can have these different reference data sets that you can define in there, or master data management sets, and expose them and tie them together. This was a lot of information. Things that we really need to think about in terms of the enterprise data model benefits: the enterprise data model allows you to achieve strategic business alignment. It improves the understanding of your data and information in the business, which improves your
communication in the business, which is extremely important. It really helps you to manage the complexity. You've got all those organizational data assets out there, a plethora of databases. I mean, a lot of organizations that are very large may have hundreds or thousands of different databases and data stores. You need something like that Rosetta Stone, your enterprise model, to say here's the type of information that's important to our organization. You need to have that defined so that you can actually tie it back to where the implementations of this information are in the organization, so you can pull it together, manage and govern it, and make sure that you have consistent data across those data assets and the data quality to actually drive your decisions. Some people think of it as a lot of work. What it actually does, done correctly, because you've done the homework to have this frame of reference, is lower implementation costs overall, and it reduces rework, because everybody's dealing with that standardized implementation of what they're looking at rather than everybody springing up their own projects, doing their own thing, and then trying to reconcile it after the fact. This is something that is used all the way through, and again it becomes that framework for the service-oriented architectures that we talked about. The canonical model is derived directly from the enterprise model. It gives you that standardized naming, terminology, and types of mapping that we use in the data services as we're doing things like publishing information through data services and then subscribing to that information and pulling it into the data stores or applications that need it in the organization. And again, it's a foundation for enterprise data governance: the framework to understand the data and how it's related, tied together with your business definitions and your business glossary, to give you that complete enterprise data governance set of capabilities in your
organization. That's all I've got in terms of the presentation itself, so I'll turn it back over to Monica and we'll open it up for questions. Great, thank you, Ron. We do have a few questions. The example of an EDM often looks complicated. Is there a need to go to that level of detail in an enterprise data model? It seems to me that it would be a real challenge to maintain currency for all of those details. Again, it's an evolutionary process. You need to understand the details somewhere, and your enterprise model is the way to do that. So in your enterprise model, the first step is you're going to talk about what your business entities are that are really of interest to you, and that's where a lot of people will start. So they may just have the entities and those types of things and their definitions. But what you also want to do is take it down to what the important data elements are in there. So obviously with something like customer, you're going to have things like customer names and those types of things, so you want to at least have that type of information well represented, that common information, in your enterprise data model. You do, however, want to have that classification as well. In every model that I do, I classify it. At a minimum I would use a classification so I understand what my reference, master, and transactional data stores are, because that's a method of prioritizing data quality efforts.
Your master data, as an example, is used in virtually every transaction that you conduct across the business somewhere, so making sure that you have taken care of that information first is extremely important. When you have that enterprise model and you've built that classification in, you can generate and easily see what those subsets of data stores are, so then you can say, okay, here's my master data, here's my reference data. Then you go further and you start tying that together through your implementation models. You not only see that, but you can say here are the things that are important, and by the way, here are all the implementations of those things that are important. You have to be able to reconcile and tie that together to make sure that you're using consistent data for your organization. Does that make sense? Yes, sounds great. Next question. Our solution architects are advocating that we shouldn't use a canonical model because it becomes too complicated and delayed; however, they're in favor of domain models. Do you see these as two separate things? What you're really looking for is achieving standardization, and I actually think that the canonical model is the way to achieve that standardization. Depending on who you ask, some people will actually use the terms interchangeably. There are subtle differences between them, but that basically varies by perspective as well. Really what you're looking at, though, is that your canonical model is an overarching enterprise model, whereas if you're looking at a particular business domain, it's really a subset of what the canonical model is, as the enterprise applies to that particular business domain area. Okay. Next question. I have found challenges in making a canonical data model a practical reality. I have found that this is an ideal solution on paper; however, taking an iterative approach project by project is very complex. Do you have a real-world example that is not greenfield?
Yes. In fact, when I was consulting before I became product manager for ER/Studio, that's what the majority of the consulting engagements were about. I would walk into organizations that were not necessarily the information elite, but they weren't always the misguided majority either. What you really need to do is go and find out what the information is that's in an organization. The way I would approach it, I would go in and reverse engineer the databases that they were using on a common basis. It's usually fairly simple to figure out what the production databases are that are being used on a day-to-day basis. I would reverse engineer all of those into data models, and then I would synthesize out of that what the important entities were to the organization, as an example. I would start to drive that into actually creating the enterprise model, and then I would start linking back to these things. But I would do it iteratively. I would prioritize the databases, I would prioritize the types of information I was looking at, and I would just gradually start linking this information together. And like I said, it's not a once and done. It's something that always goes on. I see another comment there, somebody saying, reverse engineer thousands of production systems? The answer is yes. You need to understand where that information is. It doesn't mean that you're going to use every single attribute out of all of those things, but you want to be able to pull the important information out and link it together. So part of the exercise is saying, okay, now that I have all this metadata that's come back, which parts of the metadata are important? That's what you actually represent in your enterprise model. It's not absolutely everything. So you are not going to tie absolutely every entity in every database back to an enterprise table. It's the ones that are important that you're going to tie together. Okay. Ron, would you please take a
minute to talk about ways to engage management folks who don't get, and don't want to include, data governance? One of the better ways is this: generally speaking, when you look at any organization, you can find a bad decision, or a loss of revenue, or an expense overrun, or those types of things that are rooted in bad data. Sometimes they're a disaster, sometimes they're a little more insidious. Look for real examples where better information and better quality information would have improved those decisions, figure out what those were, and actually attach a dollar value to it. So put together a business case in terms of what it actually cost the organization by not having quality data, either in terms of a real cost that occurred because of it, or an opportunity cost of not having been able to realize an opportunity that they could have if they had the quality information. Now obviously that does involve making certain assumptions, so document those assumptions as well. But the best way to get the buy-in is to actually tie it to business objectives. So if there are things that are important to the business strategy, or that the business measures its success on, try to tie it into those types of KPIs to show what the difference was. Once you have converted a few people on the business side to understand this, you're going to have some champions that will actually help you in your cause as well. That's all the time we have for today. I want to again apologize for all our technical issues and the beeps today. It was a problem with the setup of the webinar, and we'll not do that again in the future. Ron, thank you again for a great presentation and the Q&A, and for your patience and your fortitude through all this. Just to remind everybody, we will be posting the recorded webinar within the next two business days, and we'll send out an email to let you know the links and other requested information. And thank you again, IDERA,
for sponsoring today's webinar. As always, thank you for attending today's webinar. I hope everyone has a great day, and again, thank you, Ron. Thank you. Thanks, everybody, for joining us.