Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor of DataVersity. We'd like to thank you for joining this DataVersity webinar, "A New Way of Thinking About MDM," sponsored today by MarkLogic. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A panel in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag DataVersity. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar.

Now let me introduce our speaker for today, Michael Dohm. Michael has worked at MarkLogic for eight years and currently serves as Health Care Solutions Director for Federal, State, and Local Government Health Care and Health Services. His recent focus and expertise is in solving big data challenges and in groundbreaking work to find solutions to difficult enterprise integration problems by taking new and different approaches to those problems using MarkLogic. Mike holds a Bachelor of Science degree in Mechanical Engineering from the Massachusetts Institute of Technology, aka MIT. And with that, we'll give the floor to Mike to get today's webinar started.

Hello, and thank you, Shannon. Good afternoon, and thank you for taking the time to join this DataVersity webinar today to hear about a new way of thinking about MDM. I hope it's going to be educational for you; I think it will be. As Shannon said, my name is Mike Dohm. I'm a Director at MarkLogic responsible for developing solutions that better solve the pains and challenges of our customers. I've been with MarkLogic for more than eight years, and I've been working for companies dedicated to bringing together disparate data for nearly two decades. Master Data Management is all about finding or creating a consistent and accurate view of the data most important to an enterprise. And if you don't know it already, after this webinar you'll better understand how busting data silos and integrating them plays an important role in increasing the effectiveness of MDM.

I'm currently focused on the healthcare domain, but I've spent time with many other domains, including finance, law enforcement, defense, and the intelligence community. And we're seeing similar MDM opportunities in all of these domains where you integrate a lot of data. That makes sense: there's a lot of overlapping data, and you've got to master it. In healthcare, we're seeing Master Data Management more than I would have expected, and more frequently, in more of our accounts. We're seeing it in master person indexes for hospitals and insurance companies, and in state health and human services departments, to integrate the many disparate and overlapping benefits eligibility systems.

Okay, so what is Master Data Management? There are many definitions because it's a broad domain, and it's practiced in many different ways. I've listed just a few of the definitions from some of the well-known consultants and consulting companies, but there are plenty more that would have been equally appropriate to use. Master Data Management is many different things to many different people, so it's impossible to define in a sentence or two.
Most implementations of it are different, and they don't all require a full-blown solution with every aspect implemented. So choosing an MDM strategy is personal. It's not entirely unlike choosing a vehicle. The vehicle choices out there are many and varied: they don't all have the same number of seats, the sizes of their engines vary tremendously, and their purposes are different. Your spouse may well have a different car from you, because it serves a different purpose. MDM is no different. What matters to you is what should guide you. So before you set out on an MDM project, please carefully consider what your goals and requirements are.

Okay, so in addition to your MDM choice being personal, it's also complex. And not only is it complicated, it's usually removed from the actual business of an organization and performed as an IT back-office function, where those who do the MDM implementation are often insulated from the pace of change. They're often completely unaware of the actual changes taking place in the business units that the Master Data Management they're implementing is intended to benefit.

On this slide, the items around the center are implemented to varying degrees in different implementations, but there's usually some aspect of each in a successfully managed MDM project. Now, this isn't a comprehensive list, but these are the more common elements.

We start with auditing. It's important to know who did what and when for any create, update, delete, merge, or unmerge operation, basically any change against the data that you're mastering. So auditing automated and user actions is critical, and you want to be aware of any of these changes to data regardless of where they're made within the system. (I'll sketch what an audit entry might capture at the end of this passage.)

Data flows, and processes and workflows: I'll take these two at once. You have source system updates and structure changes, ETL pipelines to move the data into the MDM repository, rules to dictate what can be merged, and workflows to manually handle outliers that require human decisions. Now, this is obviously a complex set of processes that requires a great deal of effort to set up, and it requires whole teams to coordinate things. On top of this, every transformation usually leaves some data behind, and it has the potential to be a point of data loss or data corruption if the transformation isn't perfect. Often the lineage of the data is lost. Where did this attribute of this entity come from? When was it extracted? What transform was run on it? What did it look like before? If all of that gets lost in an ETL process, can you fix mistakes or figure out where in the process things went wrong? You really should be able to answer all of these questions.

Now, data governance refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. If you can't track all the provenance and lineage metadata, can you confirm its integrity? Can you confirm where it came from? Are this metadata and data secured against changes from unauthorized users? Can you tell if an authorized user made an unauthorized or highly suspect change?

Data stewardship looks over all of this. It involves management and oversight of all the data assets, to help provide business users with high-quality data that's easily accessible in a consistent way.
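To tie the auditing and lineage points together, here is a minimal sketch, assuming a JSON document model, of what an audit entry for a merge operation might capture. Every field name here is illustrative, not any product's actual schema.

```javascript
// Illustrative only: a minimal audit entry for a merge operation.
// Field names are hypothetical, not any vendor's actual schema.
function makeAuditEntry(user, action, affectedUris, details) {
  return {
    timestamp: new Date().toISOString(), // when it happened
    user: user,                          // who did it ("system" for automated actions)
    action: action,                      // "create" | "update" | "delete" | "merge" | "unmerge"
    affected: affectedUris,              // which records were touched
    details: details                     // e.g. the rule or manual decision behind a merge
  };
}

const entry = makeAuditEntry(
  "jsmith",
  "merge",
  ["/person/1001.json", "/person/2042.json"],
  { rule: "exact SSN plus fuzzy name match", survivor: "/person/1001.json" }
);
```

Keeping entries like this for every change, automated or manual, is what lets you answer the "who did what, when, and why" questions later.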
Now, many MDM solutions are a whole host of products all integrated together, with inherent integration code and different things you have to learn. So wouldn't data stewardship be a little bit easier if you had a solution that was all in one unified product?

Next, the traditional, lengthy life cycle of RDBMS-based MDM. With all this complexity, it's really not a surprise that it takes a long time, with many different steps, especially in a relational-based approach. The activities listed here are fairly extensive, but most implementations actually require a significant number of these, if not all of them. The whole process starts out with simply identifying all the source systems and then the data in those source systems. That data needs to be well understood, so you create a data dictionary. You want to narrow this down to a list of unique data elements, as many of the systems contain information on the same entities, hence the need to master them. So you create a canonical list of entities and their attributes, and you use it to create a canonical, ideally perfect, data model. Next, you map the data sources and figure out what it will take to get the data out of the source systems and how often you need to do that, transform it to fit the canonical model, and load it into the MDM repository. This whole process is better known as ETL. (I'll sketch what one of those mapping transforms looks like at the end of this passage.) Before you even run your ETL scripts, you need to test them. So these are just the steps required before you're even moving data into the MDM repository to begin the actual work of mastering the data. You still have the loading, testing, data disambiguation, cleansing, governance, more testing operations, et cetera, et cetera.

So you can see, with all these tasks, this takes a really long time. Long enough that some things have probably changed from when you started. For example, your champions could have been promoted, moved on to another organization, left the organization, or simply lost interest along the way, while you're still getting to the point where you can do some actual mastering work. So this lengthy development time creates issues, and yet it's only one of the issues facing traditional MDM. There are many other strong forces working against it; these are the headwinds referred to in the title. We just discussed the lengthy upfront modeling and ETL design processes, which are slow to show any progress. And because they're slow to show progress while the organization continues to move forward, the business goals and data often change while this is occurring, and the relational database approach is too brittle and slow to adapt to these changes. As I mentioned earlier, MDM is usually owned by IT and disconnected from the impact on the enterprise's profits, losses, and goals, and this makes it even slower to react to outside changes. Most traditional MDM projects fail, or only achieve part of their goals, for these reasons.

Now, there are some MDM successes, and I'd be lying to you if I denied that. There are some excellent practitioners of MDM. They're not common, but they're able to manage all the moving parts and put in place processes to be aware of and accommodate ongoing changes. And although these people are rare, they can navigate the waters. One thing that is always true is that when they are successful, they're never successful for a low price. It's never cheap to do a successful enterprise MDM project.
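As promised, here is a rough sketch of what one of those source-to-canonical mapping transforms looks like. Every field name is invented for illustration; a real ETL tool would generate something far more elaborate, but the shape of the work is the same.

```javascript
// Hypothetical mapping from one source system's field names to a canonical model.
// All names here are invented for illustration.
const sourceToCanonical = {
  cust_fname:   "firstName",
  cust_lname:   "lastName",
  dob_yyyymmdd: "dateOfBirth"
};

function toCanonical(sourceRecord) {
  const canonical = {};
  for (const [srcField, canonField] of Object.entries(sourceToCanonical)) {
    if (srcField in sourceRecord) canonical[canonField] = sourceRecord[srcField];
  }
  // Note: any source field not in the mapping is silently left behind,
  // which is exactly the data-loss risk discussed above.
  return canonical;
}
```

Multiply this by every source system and every structure change, and you can see why the mapping and testing steps dominate the timeline.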
Now, there are also some custom solutions done within one part of an organization, where they have control over their MDM and their data models, they can make schema changes to the database at will, and they can generally be more nimble. The trouble with these is that they're not enterprise-wide, and often they only meet the needs of the current project or campaign, and that's not true enterprise MDM.

So I see one of the biggest problems with traditional MDM as the chase for the perfect model. Custom MDM solutions succeed when they're focused and not chained to rigid schemas, but the results live in project silos and don't benefit the enterprise as a whole. They have a smaller model to work with and to perfect. So there's the core problem with chasing perfection: MDM's goal is usually a truth-based model, with a very rigid golden definition of entities and their attributes, and in a large organization, agreeing on a single definition is not easy. I think a more reasonable and achievable goal is a trust-based model. So remember, truth-based versus trust-based. In a trust-based model, you still include the golden portion, but that's only a subset of the attributes you have, and those are the ones you manage very closely. You include it along with a flexible portion that captures and keeps all the data. You expose all of this to your end users and let them decide which of the other data they want to use. (I'll sketch what such a record might look like at the end of this passage.)

Okay, so let's delve a little more into why the relational approach is problematic. Relational databases commonly require upfront data modeling with a very precise definition and near-unanimous agreement before users can get access to any of the data. That upfront data modeling is expensive and time-consuming, and it requires intimate knowledge of all data sources and data usage, plus detailed ETL mapping, coding, and testing against that precise data model. Query patterns, the questions that users need to ask, have to be known in advance so that appropriate indexing can be set up to deliver acceptable performance. And even with all this, you still lose things like full-text search on unstructured data. Each of these steps is sequential; the waterfall model shown here takes a long time to field. By the time you deliver on the original business requirements, has your business moved on?

Because of this, it's often easier to build new systems than to modify existing ones, and this contributes to further data sprawl. So instead of slight modifications to existing systems, you end up with yet another database with different data definitions and a lot of data duplication. The relational approach is difficult to change, and as a result, the speed of IT can never quite match the speed of your business. This model assumes a static environment where data sources and access patterns don't change; that's how it works best. If the model is incorrect, if a data source was unaccounted for, or if its structure changes, the process needs to be revisited. Hence step number six: restart the process. If users ask questions of the data that the DBA didn't anticipate, you might need to add additional data structures or additional indexes. In many cases, this means further schema changes and a repeat of the modeling and ETL cycle. Now, I want you to think back to the slide with the Gantt-like chart of steps in the MDM life cycle. Do we really have to repeat all of those again? Unfortunately, when things change, the answer is usually yes.
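Here, as promised, is a minimal sketch of a trust-based record: a small, tightly governed golden portion plus a flexible portion that keeps everything else. All field and system names are invented for illustration.

```javascript
// Illustrative trust-based record; every name here is hypothetical.
const personRecord = {
  golden: {
    // The small, rigidly defined subset everyone agreed on
    // and that is managed very closely.
    personId: "P-1001",
    lastName: "Smith",
    dateOfBirth: "1980-04-12"
  },
  flexible: {
    // Everything else, kept as-is per source system;
    // end users decide which of this data they trust and use.
    crmSystem:     { fname: "Jon",      nickname: "Jack",  phone: "555-0100" },
    billingSystem: { first_name: "Jonathan", addr1: "12 Main St" }
  }
};
```

The golden portion is the truth-based core; the flexible portion is what turns it into a trust-based model, because nothing is thrown away and users can see where each value came from.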
Those business changes, new data sources, and existing data source changes, coupled with relational databases, require a repeat. It should go quite a bit faster than the first time, but each of the activities will be impacted, and this cycle needs to be repeated over and over again. It's brittle, it's slow, and it costs a lot of money.

All right, so if one ETL pipeline for a relational system is complicated, how about when there are nine separate ETL pipelines, as depicted in this graphic? Or in a real enterprise system, where there are many, many more than that? If you consider that each one of these will require you to rinse and repeat at some point in time, you're pretty much guaranteed that you'll always be working on data loading. So think of Sisyphus from Greek mythology, pushing the rock up the hill every day and having it roll back every night in an endless cycle. It's basically the same here: repeat things over and over, knowing you may never quite get to where you want to be. It's frustrating to those who are implementing it, it's frustrating to those who are paying for it, and it's frustrating for the business users, because MDM doesn't quite deliver on its promises. Even limited data consolidation projects require enormous planning and careful execution, with multi-year timelines and multimillion-dollar price tags.

We've talked a lot about the ETL problems going from relational to relational, but the source systems could be mainframes rather than modern architectures. Newer technologies like Hadoop MapReduce might be able to help with some sifting of data sets, but these generally don't have enterprise security and reliability controls, so the trust level associated with them is likely to be quite a bit lower, and people might not even use the results. Mainframes might store the data hierarchically, and usually will, so the ETL to fit it into the canonical MDM model will probably render it completely different or leave a whole lot of it behind. So the takeaway here is that the real world is messy, and this makes data integration, which is at the heart of traditional MDM, very difficult.

And one more time: traditional data integration using relational databases is at the core of most MDM systems, and it's rife with problems. It's too limited in scope. It takes too long to deploy. It's too expensive. It's too difficult to change. It doesn't deliver on modern user expectations of getting things fast, and traditional data integration is an expensive model that doesn't deliver within an acceptable cost and timeframe to your users. It's complex, because its fixed, schema-centric nature necessitates additional products in the stack. It's slow; I think you all get by now why it's slow. It's brittle, because the interdependencies of the numerous components make it difficult to make changes: a change somewhere will break something else. And it's expensive, because of the time required to model, integrate, and maintain these components over time.

Since so few MDM projects are successful, there's got to be a better way, a new approach, maybe with a more modern architecture. So you want a new way to think about it, a new way to do MDM, a better way to go about it. You want to stop boiling the ocean. You want to overcome the headwinds. Stop chasing perfection. Align the solution more closely with your business. Adjust more quickly and achieve progress incrementally, progress that's actually tied to business drivers and events. And measure this progress in weeks and months, not years.
Avoid creating duplicate data in the first place, thereby increasing data quality and minimizing the need for fuzzy matching to find potential matches. Adjust more rapidly to changes. And why not also save all the data, all the breadcrumbs from the source data, including the lineage and provenance? What if MDM projects could also be business-outcome driven? Instead of project success and progress being measured in terms of technical milestones that the IT department supposedly met, while a couple of years later the system still doesn't do what people actually want, you'd actually get what the enterprise needs. Believe it or not, it's possible to deliver on all of these, and all of them are possible by fixing the architecture. Here at MarkLogic we do this with something called an operational data hub.

An operational data hub enables you to take a new approach to MDM. So what is an operational data hub? This might be the first time you've ever heard of it. An operational data hub is a hub-and-spoke approach to data integration, where data is physically moved and re-indexed into MarkLogic. It's really not that different from what you're doing when you move data into an MDM repository. But to be a data hub, as opposed to something a bit more common, a data lake, the system must support discovery, indexing, and analytics. What makes it operational is real-time access, support for two-way data flows, and support for transactions, which allow it to serve as a system of record for some of the applications that interface with it. In some cases, the operational data hub takes over one of the source systems' functions, allowing, for example, a mainframe to be retired. With an ODH, an operational data hub, you get 360-degree views of important entities. Transactional applications can directly integrate with it. And it has all the advantages of a data lake, in that it handles the three Vs of big data: volume, velocity, and variety.

An operational data hub is also characterized by a few other important things. In an operational data hub, data can be loaded as-is, without requiring it to adhere to a strict schema. A proper hub still harmonizes data, which has the same goal as mastering data; they're nearly synonymous. Data arrives in different formats, with different field names, in the various styles of the various source systems, but when it's ultimately served up, the records need to look at least similar enough to be processed as a group. So all MDM approaches must do some form of harmonizing, and we're not immune in an MDM implemented with an ODH. At the end of the day, well-structured, documented APIs and data exports must return coherent data. However, you can take a different approach with an operational data hub: you can make that harmonization agile and progressive. Typically, key requirements are identified early, and the associated data elements are designated critical data elements; those are the ones harmonized first, in what we call partial harmonization. Over time, additional requirements drive harmonization of more and more data, and this process of increasing the harmonized subset over time is progressive harmonization, or in this context, progressive mastering. And we particularly shine here, because MarkLogic indexes the structure of the data from the source systems in both harmonized and raw forms. So some search and processing can be done data lake-style, but critical data is accessed and indexed in harmonized form.
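Here is a minimal sketch, under assumed field names, of what partial, progressive harmonization might look like in code: only the critical elements are promoted into a harmonized section, and the raw source payload rides along untouched.

```javascript
// Illustrative progressive harmonization: promote only the critical
// elements into a harmonized section; keep the raw source intact.
// All field names and aliases are hypothetical.
const criticalElements = [
  { canonical: "lastName",    aliases: ["lname", "last_name", "surname"] },
  { canonical: "dateOfBirth", aliases: ["dob", "birth_date"] }
];

function harmonize(rawRecord, sourceName) {
  const harmonized = {};
  for (const el of criticalElements) {
    const hit = el.aliases.find(a => a in rawRecord);
    if (hit !== undefined) harmonized[el.canonical] = rawRecord[hit];
  }
  return {
    harmonized,                                   // grows as more elements are promoted
    source: { name: sourceName, raw: rawRecord }  // nothing is thrown away
  };
}
```

Progressive mastering is then largely a matter of appending new entries to the critical-element list and re-running the transform in place, with no schema migration required.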
So I've spent a lot of time talking about the problems associated with traditional MDM, and I've introduced you to what may be a new architectural concept, the operational data hub. And the operational data hub is at the core of a new approach to MDM, the whole subject of today's webinar, that we're calling streamlined master data management. Now, you still have varied source data systems, and they're still constantly evolving. New data sources continually get added to the enterprise, only now they can be easily added to the streamlined MDM system, where the data gets harmonized and mastered in place. For each of the source systems on the left side of the graphic, you see two-way arrows. They're often one-way into the hub, but it's also possible for there to be two-way communication, with the hub updating the source system. On the right-hand side of the graphic, you see the same two-way arrows, but there you see the streamlined MDM hub being used at the point of customer engagement, not just as a back-office function. For example, you see a customer service representative speaking with a customer who just called them. Or you might see a benefits eligibility worker seeing a customer for the first time, wanting to check whether they're receiving, or have received in the past, other benefits, benefits found in other systems.

One of our other customers, a U.S. criminal history system, is built on a MarkLogic ODH and includes message documents that come from police departments, counties, agencies, and court systems across the state. These include new booking documents, court appearance documents, and others. They use the MarkLogic system to collate, or match and group, these documents, which can often be quite difficult because the data elements in the documents don't always match up. For example, take first name: even in fields with the same column name, one might contain John, one might contain Jonathan, one might contain Jack, and you want algorithms with fuzzy matching that allow you to find those. Or the date of birth field might be off in the month, the day, or the year by a single digit, perhaps just due to data entry errors. (I'll sketch this kind of matching at the end of this passage.) So after we've merged things, the collated record includes all of the original source information, data source counts for each matching data element, and a history of the changes to support unmerge operations. So they're doing master data management while keeping all the data and adding a whole lot of metadata, and they're doing all of this inside MarkLogic. It's basically MDM integrated with the business and aligned with its goals.

Now I'm going to move into some of the features that allow us to do some of this. To use this new approach, you need some features that are unique to, or at least unavailable in, relational database systems. First up, we have being schema agnostic, so-called load as-is. Because of it, you can minimize extract-transform-load and replace it with ELT: extract, load, and then transform, or harmonize. Only now you can partially harmonize, to incrementally deliver results early on and maintain buy-in. You don't need to create a perfect data model. One of our customers actually made changes to their data model daily, and even in production they're still doing it on a quarterly basis. And that's possible because of the schema-agnostic nature of the database.
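As a hedged illustration of that kind of matching, and not the actual algorithm this customer runs, here is a tiny field-weighted scorer that tolerates nickname variants and a single-digit slip in the date of birth. The nickname table and weights are invented for this sketch.

```javascript
// Illustrative field-weighted matcher; not a production algorithm.
const nicknames = { jack: "john", jon: "john", jonathan: "john" };
const norm = s => nicknames[s.toLowerCase()] || s.toLowerCase();

// Count differing characters between two equal-length strings, e.g. dates.
const charDiffs = (a, b) =>
  [...a].reduce((n, ch, i) => n + (ch !== b[i] ? 1 : 0), 0);

function matchScore(a, b) {
  let score = 0;
  if (norm(a.firstName) === norm(b.firstName)) score += 0.3;   // weaker signal
  if (a.lastName.toLowerCase() === b.lastName.toLowerCase()) score += 0.3;
  const diffs = charDiffs(a.dob, b.dob);
  if (diffs === 0) score += 0.4;        // exact DOB: strong signal
  else if (diffs === 1) score += 0.25;  // single-digit slip still counts, at lower weight
  return score; // compare to a threshold; borderline cases go to a human
}

matchScore(
  { firstName: "Jack",     lastName: "Smith", dob: "1980-04-12" },
  { firstName: "Jonathan", lastName: "Smith", dob: "1980-04-13" }
); // 0.85: likely the same person, flagged for review rather than auto-merged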
Now, you still need to identify the data sources, but if you forget one, or you need to add one later because it was just created, you can easily add it because of the flexibility of this data model. And despite being schema agnostic, MarkLogic is actually structure-aware. It's got this universal index that indexes the words, values, and structure of everything that it sees, and it does this for all the loaded data without knowing that structure in advance. This allows you to run queries that can find anything, even if the thing you're looking for happens to be in a data field where you didn't expect to find it. So search searches across every data field, while query lets you ask more specifically, within specific data fields. (There's a short sketch of the difference at the end of this passage.) Being schema agnostic is a very key requirement. It's central to success, as it drastically reduces the upfront modeling time. It really does make it easy to handle new data sources and accommodate changes to existing ones. It doesn't take all the work out, but it does let you do the work faster, in a manner that lets you keep up with the pace of change, so you're not always far behind.

Streamlined MDM also needs to be something called contextual, and we talked a little bit about this. For example, when a call center customer service rep is helping a caller with their new insurance application and discovers there's already a match in the system, they can manually merge the two or abandon the new one. This sort of contextual or operational MDM allows some of the important data quality improvement activities to be done in the context of the business activities that the entity supports, and it prevents the need for downstream, disconnected MDM to clean things up later. Well, there's still some of that, but it minimizes it. If you never make a mess, you don't have to clean it up. Same here: with the apps interacting directly with the streamlined MDM database, you're avoiding the creation of data problems that would otherwise need to be cleaned up later. To accomplish this, we use things like field-weighted fuzzy-matching algorithms along with data merging capabilities that all run in MarkLogic. I'll touch on a few other use cases soon, after I've covered some of the other features.

All the data. So what does this mean? It means we don't get rid of potentially valuable data. Storage is cheap, so saving money on storage costs is not a huge issue. Relational databases were designed when disk space was extremely expensive, so they went to great lengths to avoid using it: unless the data was absolutely needed, don't store it, and don't store it twice for any reason. That constraint doesn't apply now, or at least nowhere near to the extent it did when that technology was created. In streamlined MDM, we recommend you master a subset of each entity's data elements and attributes, but keep the rest. Keep all the source data. Keep it in its original format. Index the contents. Keep everything, and allow it all to be queried. Just let your users decide what data they want to include. You can even keep data that doesn't match the golden master data, just in case. This allows you to fix mistakes, to reverse changes, and to unmerge things that never should have been merged in the first place. And it supports the more complete trust-based model that isn't quite perfect but provides more capabilities to more different users and lets them make their own decisions.
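To illustrate the search-versus-query distinction, here is a minimal sketch using MarkLogic's Server-Side JavaScript cts API. Take it as the shape of the idea rather than a tuned production query; the property name is invented for illustration.

```javascript
// Sketch of search vs. query in MarkLogic Server-Side JavaScript.
// The property name "firstName" is illustrative.

// Search: find "Jonathan" anywhere, in any field of any document,
// courtesy of the universal index. No schema knowledge required.
const anywhere = cts.search(cts.wordQuery("Jonathan"));

// Query: find documents where the JSON property "firstName"
// specifically has the value "Jonathan".
const scoped = cts.search(
  cts.jsonPropertyValueQuery("firstName", "Jonathan")
);
```

The first finds the value even in a field where you didn't expect it; the second asks the precise, field-scoped question once you know where to look.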
Next up: unlimited metadata. You want an MDM system that doesn't limit the metadata you can store about an entity. Being able to store virtually unlimited metadata for every data element, and keeping a copy of all the source data in its original state and format so that data cleansing and merging errors can be corrected, would be impossible with relational-based MDM. Further, maintaining the source information allows system and application users to see all the data and decide what's closest to the true copy. This extra data is stored as metadata. If every element or attribute of the same entity came from a different source system, or from multiple source systems, you want to be able to capture that in metadata. You may have bi-temporal data that you want to keep, which allows you to answer queries like: what did we know at some point in the past? To do this, you've got specialized indexes, which are essentially metadata, over four different timestamp values: the valid start date for a piece of data, the valid end date, the system start date, and the system end date. More and more metadata. You also want to make sure that no one can change data they shouldn't, so you want element-level data security, with the security stored as metadata. Hopefully I've beaten the dead horse enough that you realize unlimited metadata certainly has some benefits.

Simplified architecture: the so-called KISS principle applies equally well here. Don't underestimate the value of a simplified architecture. Traditional MDM solutions are built on a wide range of technologies, including relational databases, ETL tools, search engines, app servers, content processing frameworks, and even data virtualization technologies. This makes for not only a complex architecture, with multiple products and the people to manage them, but it also makes the data integration tasks inherent in traditional MDM expensive, brittle, and inflexible. Things like achieving a consistent security model across all of these are difficult. Managing backups, achieving high availability, and overall administration are difficult and expensive to implement and maintain. And having all these products can make auditing nearly impossible, since many of the things happening to data occur in the different products, and not all of that is captured or transferred to the central MDM database. Streamlined MDM can be integrated into an operational data hub built on MarkLogic, which is a single integrated data platform with database, search, application services, security, and more, all in one quality-assured platform. Getting everything in a single quality-assured platform means that someone else wrote and maintains the code and tested it to make sure it all works together. That makes for a more stable product, and it lowers your TCO, your total cost of ownership.

Okay, you want a data model that's as close to the real world as possible. Streamlined MDM benefits from a logical data model where your entities are modeled as hierarchical documents, instead of being normalized, or shredded, and forced to fit into the columns and rows of multiple separate tables as in a relational system. Documents are ideal for handling varied and complex data. They're human readable, they closely map to the concepts and business models of the data, and they avoid the impedance mismatch problem that relational databases have with applications, where a whole lot of data transformation has to go on before the data can be used.
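Here is a small sketch of that logical data model: one hierarchical document per entity, close to how the business talks about it, instead of rows shredded across several tables. The shape and names are illustrative.

```javascript
// One customer as a single hierarchical document (illustrative shape),
// rather than rows shredded across customer, address, and order tables.
const customer = {
  customerId: "C-7731",
  name: { first: "Jonathan", last: "Smith" },
  addresses: [
    { type: "home",    street: "12 Main St", city: "Annapolis" },
    { type: "billing", street: "99 Oak Ave", city: "Baltimore" }
  ],
  orders: [
    { orderId: "O-1", productId: "PR-55", status: "delivered" }
  ]
};
// Human readable, maps directly to the business concept, and needs no
// joins or object-relational mapping layer before an application can use it.
```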
When it makes sense for an entity to be split across more than one document, to so-called normalize it, or to be related to another entity, semantic triples can store that relationship, and they can do so with greater context, or even multiple contexts. Let me talk a little about what that means. In a relational system, a customer is related to a product through a many-to-many relationship, but the fact that there is a relationship doesn't tell you anything about how that customer and that product are actually related beyond the cardinality, and cardinality is really not something that's important to your end users. Now, the customer may have bought the product. They could have returned the product because they didn't like it. They could have returned it because it was defective. Or they could have just been confused as to how it works and called customer support to get answers. Now, these are just four of the many contexts between these two entities that I came up with off the top of my head while preparing for this webinar, and there are, of course, many, many more. All these situations can be captured through semantic triples that have the same subject and object but different predicates: the same customer and product, but different interactions between them. (There's a sketch of this at the end of this passage.) The same goes for a patient and a provider, or any two interacting entities in any domain. Each interaction has contextual differences, which can be captured with semantic triples.

Similarly, semantics also gives you the flexibility to define master data beyond an object and to compose relevant domains based on attributes such as behavior. Forrester calls this approach contextual MDM, and it allows you to support multiple domain models on one MDM system, even with specific entities having mastered portions. Semantics can also enable you to use a hybrid registry approach: hub-and-spoke coupled with a federated approach, a hub-and-spoke-plus-registry hybrid style of MDM. So the data can be left in the source systems, at least some of the source systems, and linked through semantic triples. Documents and triples combined, as is possible in MarkLogic, is a very powerful combination.

Now, you need security. If you're going to bring all of your enterprise's most important data assets into one place, then you'd better make sure it's secure, for that reason alone. If you also want to allow applications to interact with that data in an operational environment, you're going to need robust data access controls that support all your data governance requirements. Ideally, you want element-level data security, which happens to be extremely rare in a flexible, schema-agnostic database, but it's not nonexistent; I know somebody who has it.

Okay, full auditing; we've touched on it. Lastly, and closely related to security, streamlined MDM requires a system with full auditing: who did what, when, and why to any piece of data at any time during its life cycle, even if that someone was an automated process or was trying to cover their actions. As I mentioned earlier, in a system of different integrated products, it can be quite difficult to trace what happened when changes were made outside the database. With MarkLogic's single unified data platform, the features I mentioned as being critical to streamlined master data management are all closely related. Full auditing requires a single unified platform, and it requires metadata if you want to be able to easily query the actions and changes to your entities.
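Before moving on, here is the promised sketch of the customer-and-product relationships expressed as subject-predicate-object triples. The identifiers and predicate names are invented for illustration, not a real ontology.

```javascript
// The same customer and product, related in four different contexts.
// All identifiers and predicates are hypothetical.
const triples = [
  { subject: "customer/C-7731", predicate: "bought",             object: "product/PR-55" },
  { subject: "customer/C-7731", predicate: "returnedDisliked",   object: "product/PR-55" },
  { subject: "customer/C-7731", predicate: "returnedDefective",  object: "product/PR-55" },
  { subject: "customer/C-7731", predicate: "calledSupportAbout", object: "product/PR-55" }
];
// Same subject and object throughout; only the predicate changes,
// so each interaction carries its own context.
```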
Storing unlimited metadata that's easily searched is also closely tied to a document model that's schema agnostic. Being able to search, in addition to query, requires something with search engine internals. So these are all related and important features, and combined they deliver the capabilities you need for streamlined MDM.

So let's cover a few real-world examples. I'm just going to cover two, since I know I've run over because of the mistake at the beginning. In each of the examples, you're going to see operational mastering, with fuzzy matching, fuzzy search, field-weighted algorithms, and data merging to support the search for duplicates and eliminate them. This example is no different. On the right-hand side you have various applications that perform actions against the MDM operational hub, and on the left side, the source systems with one- or two-way access. This customer is an eastern U.S. state. It's engaged MarkLogic to create a single view of beneficiaries in the state system, and they're doing this by integrating beneficiary data from their children's social services system; their resource eligibility system for Temporary Assistance for Needy Families and the Supplemental Nutrition Assistance Program, among other benefits; and the juvenile services system, into one master record per beneficiary in a MarkLogic operational data hub. Initially the source systems are only feeding data into the hub through a REST API, but after it's gone to production, more will get added. So this is just yet another example of the flexible data model accommodating progressive, incremental results delivered to your end users. This solution allows all benefits programs to search and select existing person records; lets you avoid additional duplication and the creation of already-existing people; includes data merging features that, of course, maintain enough information to be able to unmerge later; supports unlimited search; provides a 360-degree view of beneficiaries; and helps prevent people from being enrolled in overlapping benefits programs, as commonly occurs in siloed benefits systems. This is kind of important: budgets are smaller, and this allows them to spend a little bit less money.

HealthCare.gov, a little operational mastering. MarkLogic is the primary product used for HealthCare.gov, which is run by CMS, the Centers for Medicare and Medicaid Services. And I'm going to blitz through a little of this because we're really close on time. You've got applications and information about consumers existing in many forms, coming in from multiple sources. People submit multiple applications. Family members might submit applications with the same people on them. And Medicaid may transfer a person into the system from a state system. As a result, a person may be listed on many applications. So the system is able, while live, to find all the different matches that it's 100% sure of, and it's able to do this under a load of as many as 300,000 concurrent users. It's able to direct people to their existing data and applications as they log in, and to merge applications that it's 100% certain about. Of course, it also presents potential matches, where it's not 100% certain, to a person for them to make a manual decision on merges. We talked about it being used by customer service representatives, and there are 10,000 of them, so that's one more example of doing operational mastering at scale. Now, of course, mistakes can and will happen.
So, just like everywhere else, we keep the original information for the infrequent but necessary unmerge operations. I hope you've learned from what I've shared and can see that there are alternatives to the traditional approach to MDM. MarkLogic provides one: streamlined master data management. To quickly summarize the high-level reasons why MarkLogic is ideal for such an approach: it makes it easy to get any and all of your data in and have it be fully indexed. It's immediately queryable, using queries of any complexity, without needing to know those queries in advance. This, combined with enterprise-ready features such as being 100% ACID-compliant, Common Criteria-certified enterprise-grade security that's good enough for the intelligence community, and out-of-the-box continuity-of-operations features, makes it perfect for meeting your MDM needs or complementing your existing MDM tooling. So that's all I've got. I hope you learned something today, and if you have any questions, I can take them now.

So, the first question, regarding metadata: is the metadata stored in the same database as the data instances, and can this metadata database be integrated with other commercially available metadata repositories? It is stored in the same database, and it's all queryable together, so the data and the metadata are both first-class citizens in MarkLogic. There are certain kinds of metadata, document properties, for example, that need to be queried slightly differently, but it's all stored in MarkLogic. And as for languages, you can access MarkLogic through JavaScript, through the REST API, or through XQuery: XQuery with XML, or JavaScript with JSON. Both of them allow you to easily export data sets, so you could export just the metadata, say, to an external system. This is one way we might integrate with your existing suite of MDM tools. I didn't really touch on it, but if you already have an investment, we can work closely with it and take advantage of some of the tooling associated with it.

And I know we typically end right at the top of the hour, but let's get a couple more questions into the recording if you have time, Mike, if that's okay. Sure. Can you explain the idea of golden portions again? Is this a core subset agreed on by all, or different for each project? Okay, so there are various patterns used in how you do your data modeling, and we have one that we call the envelope pattern. The envelope pattern is an all-containing document where, up at the top level, you have the stuff that everybody agreed upon. So it is exactly what you're asking: it's the core subset agreed on by all. It's not really different for each project. The whole point of the bottom section of the envelope pattern, which contains all the source data and the previous versions of each of the data elements, is that it gives you the flexibility to accommodate the different users.

And just one last question I'll sneak in here: how is harmonization different from imposing a schema, as in a traditional relational database? The beauty of harmonization is that you can load the data, all of the data, into the streamlined MDM system without having to do any of the modeling up front. You can incrementally identify the most important data elements. You don't have to get your model perfect. So really, in harmonizing, you're essentially modeling.
You're moving toward a canonical model, but you're not held back by changes to that model. You don't have to deploy a new schema in order to transform some of the data in place inside the database. You're basically just adding to the golden portion of the different records, and when you do so, you can continually deliver incremental results to your users. It's really quite powerful. As I said in the talk, we do not eliminate harmonization or ETL entirely, although you really don't do ETL so much as extract, load, play with it a little bit, and then do your transformations, your harmonization. Does that make sense? It makes sense to me. Well, Mike, thank you so much. It really was a fantastic presentation, and I really appreciate you taking the time to speak with us today. And thanks to all our attendees for being so engaged in everything we do, for your understanding and patience today, and for asking so many great questions; I will send those over to Mike. And again, I will get the follow-up email out to you by end of day Thursday, with links to the slides and the recording, so you can peruse them at your leisure. Mike, thank you so much again for speaking. Thank you, Shannon, and thank you for sponsoring. I hope everyone has a great day. Thanks. Thank you all.