Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of DATAVERSITY. We'd like to thank you for joining today's DM Radio webinar, "Where's the Data? Let Your Data Catalog Find It," sponsored today by Collibra. It is a deep dive and continuing conversation from a live DM Radio broadcast a few weeks ago, which, if you missed it, you can listen to on demand at dmradio.biz under podcasts. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the upper right-hand corner for that feature. For questions, we'll be collecting them via the Q&A section in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DMRadio. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me turn the webinar over to Eric Kavanagh, the host of DM Radio, to introduce today's webinar and speaker. Eric, hello and welcome.
Hello and welcome right back at you, Shannon. Thank you so much. We're excited today to go deeper into the information architectures of today's businesses and try to figure out how to get a handle on all that stuff. It's getting more complex by the day. So the topic today: where is the data? Let your data catalog find it. I love data catalogs. I think they're going to save us from ourselves, quite frankly. But we'll get into that during the course of this hour. So I'm hosting today with Paul Brunet, VP of Product Marketing at Collibra. He's calling in from the greater New York City area. And like I said, it's one of my favorite topics. So this is one of my favorite stories of being in the business of information management.
Years ago, I had the opportunity to do some consulting with the Chief Information Officer's office for the Department of Defense of the U.S. of A. A senior-level guy over there was talking with me about what's going on with them and what their challenges are. And we got off on various topics, and I started talking about work I had done back in 2005 to evangelize transparency in federal spending. Well, amazing things happened. About a year later or so, Congress passed a bill, the Senate passed a bill, and the President signed the Federal Funding Accountability and Transparency Act of 2006. And it just blew my mind. So I was telling this story to this gentleman and he laughed and he said, "Transparency? Oh, so it's your house where I should send that next Predator drone strike. You're out of your mind. I don't even know how many Oracle licenses we have around here." Now, this is the Department of Defense CIO's office, mind you. So it was just kind of a wake-up call. And his point was that it's very difficult to know what's out there. So if you don't know how many licenses you have, imagine what the data landscape looks like, especially for an organization that's been around for a long time. You have all manner of legacy systems. You have all kinds of policies and rules and regulations that determine what can and can't be done. And as anyone who has dealt with that kind of situation knows, the more regulations, the harder life gets, and sometimes they even contradict each other, which makes life really interesting. So let's talk about information topographies. They are just wildly diverse these days, and it's getting more complex by the day. So how do you deal with that? Well, frankly, in the past we've had fairly manual efforts at trying to handle that kind of stuff. And now with mergers and acquisitions, things get more complex. There are all these regulations.
And even though GDPR really is for the EU and for citizens of the EU, you can pretty much rest assured that something along those lines is coming down the pike here in the U.S. of A. Of course we do have the CAN-SPAM law, for example. That means you can spam as long as you follow certain processes, like a removal list and so forth. But there are other regulations as well. HIPAA, of course, comes into play here. The Affordable Care Act matters if you're in the healthcare industry, and I have an example of health data in a moment here. All these regulations combine to make the situation, I'd say, borderline urgent. If you don't have some way to see what data you have, and then be able to map that data to business processes and know where things are going, you're going to be in trouble, just for your business's sake but also in terms of regulation. So data lakes, what have these things done? Theoretically they were supposed to simplify where you host your data and how you host your data, and make things less expensive. In reality I think they've made things more complex, because now we have a lot of these large data lakes that are relatively ungoverned, and that's going to make life even more challenging. Well, the bottom line here is that manual efforts to address this are just not going to work. We need automation. We need tools that can basically scan your information landscape, figure out what you have, and then allow you to piece together that picture as it makes sense for your company, for your organization. And that's where data catalogs come into play. So this is not a drill, this is an actual schematic of a healthcare operation's information landscape. You can see a whole slew of different systems there, and all those arrows represent data that's going in various directions.
Well because of some of the regulations they have really had to bend over backwards to get to a situation where they could understand who's who and how much do we send the bill for to which companies and which individuals. This is a serious deal. This is actually for revenue cycle management which obviously is what keeps the fires burning at large healthcare organizations. Again you can imagine with a landscape this complex knowing where the data is on any given patient, for example, is a real challenge and being able to identify, they actually figured out that something like $17-$18 million are lost each year by this organization just in not being able to bill efficiently. It takes days, weeks even sometimes to reconcile individual patients and figure out who owes how much money. That got more complex just from my perspective after the Affordable Care Act in part because it's a whole bunch of new regulations. I don't know if you've noticed this but a lot of times you go to the doctor these days and you try to pay them and they won't take your money. They say no, we'll send you a bill in the mail. Well, for someone like me, that's bad news because I'm really bad about the post office. But anyway, long story short, this is a very complex information landscape. It's going to be very difficult to really get your head around these things and governance, I promise you, is going to cause a whole lot more headaches for people who don't have their act together. Well, that's why we're talking about data catalogs today as a new way to help you incrementally build out a clear picture of what your information landscape looks like. This is one of my favorite slides. You can see the data lake evaporated into the cloud. We started with this whole concept of data lakes being the sort of be-all-end-all, single repository, large and infinitely scalable repository for your data. Well, that died pretty quickly. 
You can see the major vendors in that space have not only dropped using the word Hadoop, but they've started using terms like data fabric or backplane, or DataPlane, for example, if it's Hortonworks. The lesson here, frankly, is that they've realized this dream of one massive single-location repository for data is just not going to happen. And cloud, in fact, as Tony Baer, an analyst for Ovum, pointed out, is becoming the de facto data lake. So we're going to have these very diverse, very far-reaching environments with data across different clouds, on premises, through partner channels and so forth, and that means the process of getting that strategic view of which data lives where and who gets access to it is all the more important. And it's going to happen, I promise you. If things have to happen in business, they do happen. So this is one of my favorite slides of late. Any effort is going to require communication across departments. Key stakeholders need to be in the know, and because it's so serious, you really need to make sure that you keep people apprised of what you're doing and who's involved. This is all very important stuff, so don't keep your plans to yourself. That has been a tendency over the years for various reasons, but the vector of movement is definitely in the direction of collaboration, of openness, of transparency wherever it's required, and of course, obfuscation in areas like GDPR when you have to hide the data. And so with that, I'm going to hand it over to Paul Brunet, who is going to take us through what Collibra has done and how data catalogs in particular are going to help us find data and catalog data and organize this stuff. So with that, Paul, take it away.
Thanks a lot, Eric. I appreciate the setup, because I think it really is a nice lead-in, too. It's funny, you did a good job of laying out the issues, and now what we're seeing is a lot of people say, I need a data catalog. What is it?
Because we're hearing this, and we know a lot of the challenges that are out there, but one of the key issues and questions is, so what is it and what's the best way that I should be thinking about a catalog? It's going to solve all these different problems going on out there. But what I usually like to do is I just like to start out and just say, why? What is the motivational factor for us? As much as we've been hearing for the last five, seven, eight, ten years, is the idea about digital disruption. But it's still happening, and it's still happening at a greater degree to a broader degree across the different businesses. And this is an interesting thought that came out from McKinsey where they basically said, especially for established companies or the incumbents, that newer digital companies more so based on the digital model have already siphoned off 40% of an incumbent's revenue growth and 25% of their profitability growth. So therefore, what they really feel is that only 8% feel that their business model will remain viable going forward. That's a very alarming number. And you can take a look across all the different industries, travel agencies, ride sharing services, banking loans, cloud services, retail insurance, just pick your industry. And you can really see. So we've always held up those ideas. Well, I got to use my data. It's all about my data and my information and the like. And so what we started seeing is the idea as compared to locking down the idea of data is how do we start freeing it? How do we start thinking about using our data much more from an offense approach as compared to a defense approach? And this was Tom Davenport in the HBR along with the CDO from AIG really came up with this idea that if you think about it between an offense and defense structure and then the idea is that I need control and I need freedom around the data, that there is this data sweet spot. And the idea of it is how do I get there? 
How do I make sure I need to really free up more of this data so I can have people use it in different ways? But because of many things that Eric mentioned earlier, whether it be from a regulation perspective or some of these other ideas of privacy and the like that are coming up, how do I really think differently about this? And so we've been trying different ways. We had our centralized reporting. Well, we knew who that was. That was really about control and being very defensive about it. Then we started saying self-service analytics, that's going to free us up. That's really going to get us to where we want to. But it still had to be very, very data savvy and the tools and the problems were very complex. And the idea, they never really got to the masses. So then as Eric mentioned, we went to this idea of data lakes. Just put the data out there. Give it to everyone and let it be free. So you're really going on this idea of freedom. But what we did is we started creating other problems for ourselves. There is redundancy in the data. There is diversity in the data. It's being manipulated and publishing the same thing whether it being modified but no one really knows where did it come from. And so what we found and worked with many of our clients and very much represented from this idea that just came up from MIT is we've really created this engagement gap. And MIT does this yearly survey. And you can see on the blue is that the idea of it is we've made the data. You know, it gets out there. We've made it like the data available. But the idea of that we can apply it and gain insights from it is actually going down. So much so that over these years we've actually doubled the engagement gap. Our users don't know how to really take full advantage of all this data and we'll get into what is the definition even of data going to go forward. And so we call this the data engagement gap. How do I really engage with it so I can really think about applying it? 
And this is the problem that I think organizations are really trying to solve. So why do you need a catalog? Well, you really want to solve this. You want to make sure that data is available, but that it's used in a manner that's applicable to the individual who wants to use the data. So, great, we have this idea of data catalogs, right? It goes from a data lake: put the data lake into the catalog and we're going to make it really usable, and we can then apply all these different types of analytics to it. But one of the issues we created for ourselves when we started creating these data lakes is that we just took all the data and dumped it in there. We didn't put a lot of color around it. We didn't understand the context of it. And we have lots of redundancy. So when we just take this data lake and put it into a catalog, yes, there's some structure, and the ability to find things becomes simpler. But we have information in there that is wrong. We can't really judge its quality to the levels that we need to. Or we may be putting data in there that shouldn't be in there, especially, as we talked about, with GDPR and privacy, so that people have access to data that maybe they shouldn't. And so it's really impeding our ability to think about it from an overall value perspective, the value that we were really trying to drive. So one of the things that we've seen is that those who were at the forefront said, just let me go put it into a catalog, and they started seeing some of these issues. What we saw next is, let me think about these two things together, governance and catalog. And the reason for it is that it also allows me to think about, I could have the data lake, but why was the data put in there? Who put it in there? What was the reason for it? Not being overly complex about it, but let's make sure we understand the context of it so we can describe it a little bit better.
We can make sure it's aligned to some kind of business initiative. And once again, as Eric also called out, is the idea of integrating across these different pillars. So we can also reduce the amount of data we're maintaining because remember now, with GDPR and privacy, there is a liability with having this data around. So what you want to do is you want to start reducing the amount of data you have because you also want to reduce some of the liability, making sure you focus around that which is really important. And so you can think about this idea of just applying some basic governance to it which reduces the number and is really a way of getting more color around your data. And that's what we really refer to as the idea of understand. Then what a catalog can really think through is then the idea of putting structure on it. You can really find what's relevant to me and you really want to think through the experience of it. Like somebody like a data scientist who really just wants to access to raw data in a very simplified fashion, has a different type of experience than someone in marketing who's really just trying to find the core analytics or a core model that they can apply within their marketing campaigns. So you got to think about that experience in the same way we think about commerce. A catalog is just that repository, but you wrap the experience around it based upon the personalization and what that individual is really trying to do within a commerce experience. And then the last key piece of this is the idea of trust. And what we mean by trust is reducing the visibility. So if I'm not supposed to have access to the data, don't show it to me. Is it the right data? Is the data itself right and start applying this whole idea of trust around it, which once again increases the focus and increases the relevancy of the data as well as the little green representation in there is all about bringing in outside data sources. 
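To make the "if I'm not supposed to have access to the data, don't show it to me" point concrete, here is a minimal Python sketch of trust-based visibility in a catalog search. Everything in it is an illustrative assumption rather than anything described in the talk or taken from a product: the `CatalogEntry` class, the `required_role` and `certified` fields, and the role names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    required_role: str  # hypothetical: role a user must hold to see this entry
    certified: bool     # hypothetical: whether the entry has been certified


def visible_entries(entries, user_roles):
    """Return only the entries this user is entitled to see.

    Entries the user may not access are filtered out entirely,
    so they never show up in that user's view of the catalog.
    """
    return [e for e in entries if e.required_role in user_roles]


# Hypothetical catalog contents, for illustration only.
catalog = [
    CatalogEntry("campaign_results", required_role="marketing", certified=True),
    CatalogEntry("patient_billing_raw", required_role="privacy_cleared", certified=False),
    CatalogEntry("sales_dashboard", required_role="marketing", certified=True),
]

marketing_view = visible_entries(catalog, user_roles={"marketing"})
print([e.name for e in marketing_view])
```

The design choice here is to hide restricted entries completely rather than list them with a lock icon, which matches the "reducing the visibility" framing; a real catalog might instead show that an asset exists and let the user request access.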
If I have this idea of a broad understanding and the ability of cataloging, it then frees me up to bring in more data from outside or taking this data because now I have a very broad sensibility around the trust of it. I can really start publishing it externally. And this really allows us to drive much stronger degree of value across the data. And that's what we're really seeing with many of our clients. And so what I wanted to really focus on is what are some of the best practices, what are some of the things we're seeing putting out there? Are there tools and techniques that an individual is using to get full advantage of a catalog? And so I break down into really four key areas. I look at it as approach. Once again, keeping it back into, I've compared to focusing on the technology. What is the process? What's the model I'm trying to implement? What is the approach I want to take? Where does it start and where does it really end to? The second thing is reach. Who is it I'm trying to reach? And do I really understand their experience in trying to get to what they want to get through around that? And how are they thinking through what is it that they really need? Is it raw data? Is it a data set? Is it an analytics? Is it a dashboard? Is it a model? Is it an algorithm? All of these things are data. And then what's really going to help us through is then drive through the idea of value. There's a lot of different ways that people are now starting to think about how do I measure the value? A CEO says, I've heard all about this data. Organizations, whether it be the CDO, the CAO, or a line of business, can you tell me what value my data is having? Because then if I have a good understanding of the value internally, is there a different way now I can take putting an external value on the data I have? And really it then comes back into the idea of trust. And I want to take a little different model on trust with you as we're going to go forward. 
So the first one is the idea of what is the model? What is the approach that I can take? And once again, I use this one that's up here. But in the end, if our objective is to get the data in the hands of the individual when they need it, how they need it, what are some of the key steps we need to go through? And as you're doing this, don't focus on the technology, but think about the process. And here's a simple one. You marry it against your data lake model, where you've got your raw zone, your discovery zone, and your optimization zone. But what are you trying to do? I'm trying to capture. How do I make sure I understand everything that's going into the raw zone? And so there's a request and acquire step, all the way through to the idea that I can onboard it. Then there is the refinement: how I set up the profiles, the whole idea of cleansing, and the ability to do a little bit more transforming of the data. So it's a little bit more around the data sets that I've been looking to utilize. But that's only applicable to a subset of users. Then you start going through the idea of curation. Here's where I really think about the tight alignment with what the business is doing, as well as my EA alignment, my enterprise architecture alignment. And then I start thinking about who's the owner of the data? How can I really drive the stewardship of that? Because now I'm trying to think a little bit more broadly around the reach. So I need someone who's going to own it and make sure it's utilized, and how it could be utilized. Now I start thinking about enrichment. How do people start adding things, combining data? I start thinking about my analytics processes. I start thinking about, if I request data, how do I really simplify going from the permissioning of it to the provisioning of that data?
I've now set my centralized governance, making sure not only that the data's right, but that it's the right data. And then I start thinking about a very popular approach, which is crowdsourcing. As compared to centrally managing this, how do I get the community as a whole to start adding insights around quality, or even thinking about different ways of governing, or what should be in the environments that people can access? Or, others like me have looked for this data. Or, hey, I know so-and-so down the hall was looking for something similar. Let me see, is there a way I can connect them? So the idea of bringing in the crowd is very important. Then you go into the idea of certification. And this is where you start thinking about the idea of trust. But certification works two ways. It could be a centralized certification, where a company says, this is the data you must use. Or it can be the certification of the crowd: if 10 out of 10 people said this is a really good data source for what it is you're trying to do, that's a different way of building that idea of trust and certification. All the way through to thinking about how I am measuring the impact on my business of how we're utilizing the data. Wow, this is really being looked for. This is really being utilized. Let me understand a little bit more. And maybe there's a way we can better refine it to make it even more applicable going forward. That's the idea of measuring impact. And all the way on the right-hand side, which I really think is key, is this idea of broad consumption. And I think this extends even beyond the catalog, where you're really thinking about different experiences, personalization, people coming in through a full life cycle: I requested a data source, I've accessed it, I no longer have access to it. So think about the whole order management system around data.
All the way thinking through that, we're going to get to a point where it's self-service governance. I don't need these centralized teams anymore. That the community as a whole can kind of self-govern itself. So here's just an approach that people use. Because once you start thinking about an approach, then you're thinking, then I start thinking about what is the asset that I'm really talking about? What is the definition of data? And I think this is another key point, which we see in catalogs that a catalog in the processes and the expertise that we're building can be used for many different things. Not only around the data and the datasets, but we can use it for algorithms and models and notebooks. And then as you start thinking going forward, what about all the other things, the glossaries and dictionaries around my company's definitions, as well as then the different metrics definitions that we're putting out there. Now we start thinking about analytics and dashboards and different types of visualizations that are coming forward. So now as you start pushing even more so to the right, you think about usage analytics, to now being able to take my data and publishing in a different marketplace where I can even drive more value and more business for the company going forward. When I think about data, and I love the debates, is it data? Is it information? Is it analytics? Is it insights? It's all of the above. Pick your favorite term and go with it. But if you really think about a very clear understanding of your approach and you start aligning the different assets that your organization needs and is looking for going to go forward, you'll see that there's many different ways you can approach it and allow you to figure out, okay, where's the place I can deliver the most value as it is going to go forward? 
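The approach walked through above, from capture to broad consumption, with crowd certification along the way, can be sketched as a small data model. This is only an illustration under stated assumptions: the stage names paraphrase the steps mentioned in the talk, while the `Asset` class, its fields, and the `min_votes` and `min_share` thresholds are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Process steps paraphrased from the talk, in left-to-right order."""
    CAPTURE = 1
    REFINE = 2
    CURATE = 3
    ENRICH = 4
    CROWDSOURCE = 5
    CERTIFY = 6
    MEASURE_IMPACT = 7
    BROAD_CONSUMPTION = 8


@dataclass
class Asset:
    name: str
    kind: str  # e.g. "dataset", "model", "notebook", "dashboard", "glossary"
    stage: Stage = Stage.CAPTURE
    votes: list = field(default_factory=list)  # True means "good source for my purpose"

    def advance(self):
        """Move the asset one step to the right, stopping at the last stage."""
        if self.stage < Stage.BROAD_CONSUMPTION:
            self.stage = Stage(self.stage + 1)

    def crowd_certified(self, min_votes=10, min_share=0.9):
        """Hypothetical rule: enough votes, and a high enough share of positive ones."""
        if len(self.votes) < min_votes:
            return False
        return sum(self.votes) / len(self.votes) >= min_share


# A model is treated as a catalog asset just like a dataset would be.
churn_model = Asset("churn_model", kind="model")
churn_model.advance()                 # CAPTURE -> REFINE
churn_model.votes.extend([True] * 10)  # "10 out of 10 people said this is good"
print(churn_model.stage.name, churn_model.crowd_certified())
```

Keeping `kind` as a free-form label reflects the point that datasets, algorithms, notebooks, glossaries, and dashboards can all move through the same process; which stage a project starts from would depend on the initiative.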
So then, now that you've got your approach and you're thinking a little bit more broadly about what the assets are, now you're going to align it with where can I get started? Where can I get focused? Am I really going to go pick a project around AI and machine learning, where it's with a data scientist who's trying to get access to that raw data? Well, that tends to be on the left-hand side. As compared to, I really want to make sure my analytics team isn't spending too much time trying to find the data, or they're producing all these analytics and no one seems to find them, so they keep doing the same analytics over and over again? Or is it that we're creating the same analytics in each of the different pillars of the organization, where if they could see what other groups are doing, they could more easily build upon each other? Or am I really trying to take a look at, I've got these people who really are not very data savvy, but I really want to go to this idea of broad consumption, and you can really go all the way to the right. So around this first area of approach, the first thing is be prescriptive in your approach. Understand a process; don't think about it in regards to technology. That will come second, once you understand where to focus. The second thing is think about what data really means, and that will also help you to understand, in regards to your approach, what you should be focusing on. And then think about, okay, how can I map this into the various initiatives I have going forward? So that's the first step that I really wanted to go through as we're thinking about this idea of how do I make catalogs really useful, because a catalog is a key capability in here, but I really want to be sure what key area of focus I want to pursue. The second key area is reach.
If we take a look at some of the ways we've been really thinking about data, even going back to that data sweet spot, we initially thought about the scientists, we've got the data engineers, and we've got the business analysts. That's really where our focus has been. But if you think about it, in a broader model of our organizations, there are thousands of other users out there that we have yet to really reach, and that's when I mean this idea of broad consumption. It's the person in accounts payable. It's the person in procurement. It's the individual that's working the warehouse. Are there different ways that they can be, the ways that they are looking about and thinking about data? All the way through, if I really extend that even beyond my organization to thinking about at the early outset, organizations that can provide more data into me, or I can think downstream, my channel partners, imagine if I could take this data and I could share it with them in a secure, trusted fashion, imagine what we could be able to really, what type of business we could do. I can get preferential treatment from some of my partners because I deliver them such a great experience that they love doing business with me, that they're willing to share more customers coming into this as we're going to go forward. And so there's this unmet value as we start thinking about this idea of reach. And so today, one of the ways we encourage this, it simply is, oh my God, my head is spinning. Is there a simple way you can really think about this? Well, there's three core roles as you think about data. There's a data consumer, a data provider, and then what we refer to as a data enabler. So the data consumer is the one who's trying to use and apply the quote-unquote data. There's the provider, and many times a consumer provider can be the same thing. 
And then you have all the teams that really are helping enabling, making this simpler, making it so that we can actually achieve some of the goals that we are going to try to go forward. This can be your IT organization, your data stewards, your data governance teams, your data privacy teams, the ones who are ensuring that when we make this data available, that it's used in the right way, that it's the right data, and that we're providing the different technologies to really help facilitate this as we are going to go forward. Now most people can really get this down. I really get the data side of house, but when you really want to think about broad consumption of data, really think it through about this idea of the business operations person or the business user. Are we talking about key roles, like around business management, HR, customer service and support, your sales teams? Once again, people who aren't really super data savvy, but there's different ways and different ways than what they want to use the data coming forward, or even opportunities that they never realize is possible. How are we thinking about, in regards to our approach to making sure it's accessible to them? And so if you think about this idea of reach and you start applying it against your model, you really start seeing, okay, I got my project, I got my approach. I know what I'm trying to get through. I'm starting to get through my initiative. Now I'm thinking differently about which step in my approach is most applicable to the person I'm trying to reach. If I'm trying to work through the CAO and CDO, once again, they have this broad scope. They're trying to take a look at many times mission to try to solve this kind of end-to-end process. Whereas the business team, many times where the funding for our projects are coming from, they're not as interested in the front side of it. That's the means to the end. I'm really interested in that end-side of the business. 
Or you've got your data scientist, your analyst, or your data governance team. Who's providing it? Who's the initial impetus for trying to solve this? And really think about where they fit into your approach. And once again, with the idea of reach, all the way through, IT has a very strong role across all the different facets of this approach as you go forward. So the second key thing is the focus around reach. So you have your approach, and now you have reach. Now I'm starting to get an understanding of, okay, how can a data catalog think about solving this? What is it I'm trying to achieve? What is the value? What's the problem I'm trying to really get to? And so that's the idea of thinking about value. But before we do, here are some of the key actions you want to take: understand and map out your citizens, and then re-examine the different expectations of those roles. And think about what is the value that each one is trying to get to? Because this is the one where I think we hit the biggest stumbling block. I get it, I can conceptualize it, but now I've got to go find the funding. Who's got the money to make this real? Logically, it makes sense to me. But how can I get there? And this is the key area. This is really where it becomes difficult. Today, we know, and there are enough different pieces of research out there that say, that 80% of the time is spent trying to find the data and only 20% of the time working on the data. Well, great. Now imagine if I could flip that on its head. I could say 80% of my time is spent working on the data and only 20% of the time finding it. Because that data catalog is just there. I can go get it. Well, the key question is, from a business impact, try selling that to your CFO. Look at all this time I gave back to the company. Where's the dollars? What's it coming from?
How is this really helping me to solve some of the broader issues? And some of those initiatives are within marketing. The idea is that 75% of marketing functions today report marginal return from their digital investments. Well, that's not a good thing. Why? Because we made all these digital investments. We created all this data, but they don't know how to apply it. The next thing is one I really think comes across. Think about your HR department. 70% are increasing their investments in talent analytics, but only 12% are getting results. Once again, we're creating all this data, all these insights, all these analytics, but we don't know how to get it into the hands of people so they can apply it back into the models. And this goes all the way through procurement, to your assurance functions, all the way through to different sales areas. What is it that these organizations can do with that time? And one of the things we just recently completed with IDC, working with a number of our clients, asks: if I've got governance and a catalog, what type of return are organizations getting from it? I get it, I want to avoid regulations, but is there a way I can think about this a little bit more from an offensive posture? What am I doing to try to go get new business? One of the things that was very telling is that, and many of these projects are small out of the gate, it's driving all-new revenue opportunities, like 19 million on average. From an ROI perspective, 510% three-year ROI, with payback in seven months. The idea of payback around that stuff. That's pretty impressive. It was funny, Eric, that one of the things you mentioned was the idea of billing.
One of the participants in the study basically said that because they were able to do this, and had better visibility into the data and the quality of it, they were able to spend more time going in there, and they found out there were thousands of customers that they weren't billing. And that translated into millions of dollars of unforeseen revenue. Now that's a value statement. That's a different way that we can really think about catalogs and getting access to data and the value that's really coming forward. And the other thing which was really interesting, as you think about ROI and combining it with this idea of reach, is that the benefits go pretty broadly. You can see on there, all the way through, the productivity of your business intelligence and analytics teams. Why? Because they have data and it's trusted. The idea of productivity around your governance teams, once again using some of the automation tools, and this idea of changing the role of the steward, so they can focus not on locking down the data but on looking for new opportunities to share it and get it into the hands of the people who need it. All the way through to GDPR and other privacy areas, the idea of increasing the productivity of your regulatory compliance. The one result that I found most intriguing was in the area of the quality of data. I'm being very careful not to say data quality because we know that there are tools out there that look at data quality. But the idea of the quality of data. Obviously, on the left-hand side, quality of data was about making it easier to find, and that it was the data that I was looking for. And you can see some of those metrics. But the other point was when there was an issue in some of the quality, 42% less time to resolve those errors. So I'm not spending all the time trying to trace back because there are key elements in there. There's lineage. The idea of where's the data coming from.
The idea of I can understand who's the expert, who's the owner of the data in a much more rapid fashion. I can fix the errors as I'm trying to use it in my analytics or I'm trying to take a look at as I'm trying to create a new model that's associated with the data and it's not working out the way. Let me go get my hands on the access to the expert at this. All the way through to the right-hand side which was less frequent data related errors. I'm being preemptive of it. I'm fixing the problem before it becomes an issue. What a concept. The reason why we're able to do this is once again I think one of the key areas is in the area of veracity. The fact that we have data and we have multiple versions of the data but each one's been slightly manipulated and the idea of it is I used the wrong source because somebody called it revenue and it wasn't really revenue, it was a subset of revenue database but the idea of how can I avoid these things before going forward. So much so that one organization says because of this idea of the quality of data that there was millions of dollars of new opportunity created for the marketing team because they didn't have to worry about it. It was right out of the gate. They were able to increase their segmentation. They were able to create more marketing models associated with it. They were able to drive more campaigns out. This is directly from customers, clients of ours that are going to go forward. So what is the key option out of this? Make sure you think about how you want to measure it and I think measurements have got to go beyond just the idea of I can save you time. You've got to track it all the way back into the line of business and into the business functions that we're trying to implement against and it comes in many ways. So we've done approach, reach and value. The last area I want to really talk to you about is trust and there's so many different ways people talk about trust. Can I trust the data? Is it right data? 
One of the things I think is really coming a little bit more to the forefront, especially as we're seeing more organizations trying to think about whether they can put a financial value on the data they have, because now my organization is using it much more readily as we go forward and I'm trying to think about different ways. Is my company undervalued without accounting for all this data? The one thing we have to really remember is that data goes two ways. I have a high degree of trust, people like it and I get more data. What happens if that trust level goes down? Well, what we find out is people stop sharing data. So we have to think, if we want to start putting a broader value on our data, we have to really make sure that trust really becomes even more so in the forefront. And some great research, I thought, came out of SAP, specifically around some of the global insights. The fact that consumers are willing to share data. We know that. And as I mentioned with GDPR, especially with the Mars, if I don't want it, I can want to see how you're doing it and the like. But the number one reason why they will stop doing business with a brand is if they lose that sense of trust. And even more so, you can see that once that trust is lost, they're looking at reducing by over a third, or even more, the business that they do with an organization. And why? Because there's a contractual relationship here, especially if you think about it from a B to C perspective. As a business, I'm a data consumer. I consume the data that comes in from my consumers. They are the data providers. They expect three things from me. They expect that I'm going to get some sort of value: if I share this data with you, I expect some level of value. Is it a discount? Is it a better shopping experience? Some level of value. And this is also true from a B to B sense. The second thing is they expect transparency.
How are you going to use this data? What is it you're going to do? And the third key piece of that is privacy. That you're going to keep my data private. So this informal or formal contractual relationship, and I think as we're seeing more and more, it's going to become much more of a formal relationship, is key as we're thinking about using the data and how we have to think about catalogs, because it's not just about making sure people can find it. I need to make sure that we're very clear about how we're using the data. Because the last thing I want to do is break that trust. Because with that trust, not only will they share more data with me, but they're going to share deeper levels of data. Is it initially just my name and my address, my telephone number is next, then it becomes my email address? Well, then I can start doing location-based services. Why? Because I see a value with it, and you're doing a good job of using it the way you said. And I like that relationship. I like more and more coming in. So what you find out is, and this is a really interesting piece of work that came out from IBM, what they showed is a very strong correlation. I think they've done it for telco, banking, and financial markets: as trust goes up, so does the willingness to share. A very direct correlation. So what we want to do is we really want to build that trust because I want more data, because data has a value. I've now got ways of applying that data going forward. And I want a deeper level of data coming in. Now, as I mentioned, we've got to be careful. The idea is, if I have more data, what can I do with it? And it's not saying you always have to go all the way to the right, but I can create new products and services. I can improve my marketing and sales. I can take a look at new services I want to introduce. But if I don't do a good job with it, it goes the other direction. Trust is bidirectional. If I lose it, I can go further.
I can go from right to left, right? And so therefore, whether it be improper usage or lack of transparency or the idea that was failed to protect it. So trust, as you're thinking about, in the context of catalogs is very important. So now we've got this idea, okay, I've gotten a clear view of what is it I'm trying to do from my approach. I'm going to go through and take a look at who's going to be impacted by this. What is it I'm trying to achieve? I've got an idea of what is the value that I want to try. What is the end result I really want to get to? Is it really just time, or is it really trying to do these other areas? And I do have at the forefront of my thinking is, what is the idea of trust? Can I really get that trust? So now in that model, let's talk capabilities. What are the core capabilities? So earlier in my presentation, I said it comes down to three core areas. The idea of you have your core catalog, and you can see the core capabilities make it up there. You've got governance, which provides that level of trust. And I break out, I think it's really important that we think about behaviors and experience differently. Differently than just the core catalog. If you just think about all the best e-commerce sites you've ever gone to, was the catalog really good, or was their experience what really set them apart? And I think it's the same way we think about with data as we're trying to make this an area. Is it how they search browse? How do they really think about finding that data? Do you have crowdsourcing discussion groups? Others like me have searched for something. So the idea of really applying a lot of that different consumer-based experiences so you can draw into that broader reach of individuals that you want to look for. As compared to the core elements of a catalog, it is including the same thing. It's tying it to your glossary and dictionary, so it's got the broad business alignment. It's got the linkage into the metadata. 
It's got a broader understanding, so you can really look at the idea of quality. So from a lineage perspective, you're tying into the data quality tools that you currently have inside, thinking about profiling and sampling, and you really want to think about some of the workflows that you want to tie it into. Why is it there? Who said that this should go into the catalog? That makes sure that we keep it manageable and, once again, we don't increase those unforeseen liabilities as we go forward. All the way through to the right-hand side, where you think about it from a trust perspective. Is it aligned to my company policies? Am I linked into my stewards properly managing and looking at it as we go forward? To the idea of now being able to offer certification. So it's not that I have to put five different data sources out there. It's the one that matters, and I can really help them to build that idea of trust. And because these things are all connected together, and it is a process that we're trying to affect, you need to have broad degrees of integration if you're thinking about different ways of solving it, whether it be for specific industry areas like BCBS, or something we mentioned before, the European regulations like GDPR, that really is having an impact on a global basis. So that's what I think about a catalog. A catalog is the cornerstone, but we have to think about experience, and we have to think about trust as we go forward. So how do we really bring these things together? If we look at capabilities, what should that experience look like? Well, the thing that everyone's really thinking about is I just want to go in there and find. I want to find my data. Well, is that finding of data and analytics? Or is it data that's not only from my organization, but am I bringing in external data sources like a Nielsen?
And then the ability of being able to, if something's not in there, the idea to collaborate with the community is the whole thing, hey, I'm looking for this, or here's a data source I want to bring in. How do I get it brought into the catalog so that it really makes sure it's not only myself, but others can take full advantage of it. So that's what we think about find. What is that experience of find as you go forward? The next thing is really is making sure you have understanding. And this comes back into what is the data? What does it really mean? Is it a level of quality? Where does it come from? What's the lineage of it? Not only from the sourcing of it, but then if I change something from a source perspective, what are all the downstream implications? And this could be data, or it could be how we're calculating, like customer lifetime value. Hey, we're changing the definition of lifetime value. We're changing the calculation. Let me notify everyone who's used this definition as it goes forward, because we have now an understanding of this. And let me notify them how that is going to impact your analytics, it's going to impact your dashboard. So once again, being pre-emptive on some of those quality issues that can arise later on as we go forward. And then lastly then is you come back into trust. And so the idea of trust is making sure it's in agreement with policies that we have going to go forward, and that we have very clear workflows on how things get handed off so we can better automate this. And we can really think about very ways of really scaling this up, always thinking that we have the trust. And that feeds back into the find, because now I have trust. You can certify data. You can make sure it's very easily brought in. I have data sharing agreements, so it makes it much easier for me to bring in data. And now I even think about data sharing agreements now. And now I can easily share it downstream with my channel partners. 
Maybe it's a series of agents that are reselling a portfolio product, not only mine, but my competitors. And now I can share the more data about customer behavior, because I've been able to ensure the privacy. I've been able to mask the identities so that they can do it. And now they understand, wow, it's a better experience. I'm more willing to push more business to them, because they're able to give me this better experience. And once again, my data becomes that much more of a competitive opportunity as it is going to go forward. So that's a little bit way of how you can think about these core capabilities of a catalog, which is really about finding and understanding that experience that really helps bring it all together. And the idea of that a catalog and governance is necessary. It's the idea of you get the offense and the defense. It's the idea that you have some level of control, but you're really freeing up the data to go forward. That's what a catalog is about. So when someone says, I need a catalog, what is it? That's the way I kind of talk to them about the way of thinking about a catalog. What is it you're trying to achieve? What is the processes and who's all the individuals that are going to be impacted by it? And how do we think about it going to go forward? And we've seen different individuals come at it from different ways. We've seen a top-down perspective. They're coming, driving it from a BI analytics perspective, and a top-down. We have Adobe, who's a great client of ours. And once again, they've got all this customer behavior data, and they want to get out into what it would be from their marketing campaigns to try to think about recommendation of new products and services to even how they think about designing the product. So this idea of how do I take all these insights and really drive it from a top-down, from an analytics, I'm going to go down perspective. All the way through to Dell EMC, which really is the idea of coming up. 
I've got the data lake. I've got my master data management solutions. My data warehouses. Use your choice of flavor. Oh, my God. No one's using it. How do I really make sure individuals are really taking full advantage of this? Is there a way I can give them a better understanding and a better experience so they're more willing to see the value of this data that we're now making available to them? All the way through to the very specific regulatory compliance areas that we have going forward. So there are different ways that people are coming into this problem, depending upon the use case. So let me just bring this together. So if I was to say the key takeaways, especially as we're thinking about this idea of bringing some order to the chaos of this data landscape and how a data catalog can really help you to find your data, where's my data. The three key areas, once again, coming back into what is the core challenge. It's the idea that we've got to become much more digital and data is going to be the cornerstone of that. But it's not just about making the data available. It's the engagement around the data that's key. The second key piece is, as I mentioned, bringing order to this chaos. A catalog and governance is the right solution. But before you think about it, make sure you focus around your approach, your reach, the value, and your trust as you go forward. And then lastly is how do you get started, where do you want to come in from? Your journey will vary. But it's going to require you to focus around different types of skills, as well as new capabilities, whether it be capabilities within the catalog or how I think differently, all the way through to really ramping up how I'm thinking about governance, because it is going to need to be a cornerstone of what I'm doing going forward. So with that in mind, who's Collibra?
Collibra is a company that can really help around this. We really do focus around this idea of find, understand, and trust, always in support of the idea of making sure that you are looking at it from a data privacy perspective. We are that middle layer in the center that really allows you to reach between all the things you want to do from a BI, analytics, data scientist layer to your data management. It's the idea of being able to provide the core catalog, supported by governance, but really thinking about it from a data experience perspective. What makes this unique is the fact that this is how we came into the market, maniacally focused on this area. We don't come from a data management sphere trying to grow up or from a top-down perspective. It really is business user-driven. That collaboration between the end user and the IT-enabling organization and all the other supporting roles around it, that's what we're focused on. We are truly driven by the process of making sure that this data asset becomes a value to the organization. Take a look at your favorite, like Magic Quadrant, Wave, whatever it is. We're a leader in all of them. Why? Because we have that combined view. We come from the origins of the problem that we're trying to solve. But it's not only around the technology, but the idea of the community that we bring together. We have an offer: come to our website. We have a community of over 4,000 practitioners, like yourselves, that are trying to figure out how to solve these problems. Come to the community, pose a question. See what they're saying. Find out what discussions are going on. We also make online classes available. If you're not sure where to start, start there. Learn. Learn what others are doing, what other clients are doing, and you can read other customer case studies. All the way through to one of the things we always get asked: how do I get started?
What's the best practice? I'm not really looking for services, but for someone who can coach me. We offer a series of different degrees of coaching services. All the way through to the ecosystem of partners. In order to provide this middle layer and try to think about all these different roles, we have a very expansive partner network that can really help to go in and solve the various problems. With that in mind, that's a little bit more about how we really see it, and hopefully where's the data, let your data catalog find it, but really it's about helping you to find it and utilizing a data catalog and governance to get there. With that in mind, let me hand it back to Shannon and Eric for some questions. We do have some good questions here too. Great presentation, by the way, Paul. I like that you kind of walked us through how we got here and what's really going on and what those key pillars are. I always like viewing complex issues through some kind of a matrix. You need some kind of a management environment, if you will, just for your thoughts so you can understand what's going on, and you have to map these things back to your key pillars and to your objectives, right, what you're trying to accomplish. One of these questions is pretty specific and it's a good one. The attendee is asking, given the examples at the top of the webinar about the Department of Defense and organizations that have been around for a long time, how does Collibra work with legacy mainframe systems that use JCL, COBOL, ATA-based, stuff like that? Well, anything. I mean, the nice part of it is that we're not really dealing with the data, we're dealing with the metadata. And we do have a very strong connect layer, based upon what we call Collibra Connect, which is based upon MuleSoft. So we have an integration layer and we have opened up a number of our APIs.
So the idea is there will be some that we custom develop that are come out of the box, AWS, Tableau, and some of the others that we're looking at building on. Or we have a partner network that would take a look at bringing in those different data sources. And so that would be the way we would take a look at it. We have to take a look at the specific data source and where the metadata is being stored and what's the best way of bringing it in. But we do have a pretty open infrastructure to be able to address the needs of almost any sort of data that's coming in. But typically we would do this through a business partner. We don't have initially a direct link into some of those mainframe sources. I see, okay, well, that's fine. And in terms of the value here and its comparison to other technologies, right, we've had metadata management tools for decades now. But this strikes me as a sort of supercharged, business user-driven navigation technology to help understand and synthesize disparate information systems. Is that a fair assessment? I would definitely say, and I think it's about driving, and I think it's the way the measurement's going to be measured as compared to, as you said, management, I think is one as compared to, I think this is going to be focused around consumption. And I think that's how we have to think about looking at it and it just doesn't stop at a catalog, which I think is just another way of managing things. But how do we then take it and drive consumption and what do we measure around it? Well, it makes a lot of sense. Okay, good. So let's kind of walk through some other questions here. One of the attendees is asking, kind of walk us through an implementation. Let's say someone decides, yes, I need one of these data catalogs. What does the implementation process look like? How long does it take? Who needs to be at the table? Walk us through how that works. 
That's pretty broad. Typically where you start is, what is the problem you're trying to solve? And once again, I think many times, especially when it sounds like we're coming from a very, very open space, it comes back to the business definitions and the glossary and dictionaries. It's really where we see a lot of it start, because you want that industry alignment. Why is it that I'm focused in on this area? What is the alignment back into the business? What are the key data sources that I need that are associated with it? Let me start there. I think that's the best way. As compared to I've got all this data, let me just put it in there and I'll figure out why I need it later on, because what we see with a lot of people that don't start from that reason is you get a lot of noise in there and you're already driving up the complexity. If your focus is on consumption, think about the impact that you're trying to have, the process that you're trying to solve, and what the core elements of it are. Because many times what you want, especially when you get started, is to start with 100 assets. Start with one key pillar. Start with 100 assets and start with your key processes. Map those in. And with a workflow, now that you've got it and you've proven it out, the next step, as you mentioned before, Eric, is to go work two or three other pillars. Well, you've already got the process defined in your workflow. You just have to extend it. Now I can very quickly go from 100 assets to a thousand assets. And then you think about how that can scale from there. But don't think that you have to go solve the entire organization. It's much different. Start with the alignment into the business, and we typically see that through a glossary and dictionaries because that will get you your business definitions.
And you focus on the prioritization of what assets are important to that, and you use that to then define your initial processes, and you define them into the workflows, and then you really think about some of the initial integrations that you need to go through there. And then it's much easier to scale from that perspective. Interesting. That's a good answer. And let me throw this one at you. An attendee is pointing out data models and data modeling were not mentioned. Are those not part of the entire data catalog framework, like metadata and glossary? You just mentioned business dictionaries. So that's part of it. But what about data models? Can you ingest those or how does that work? To me, it's the same way. I mean, a data model could be something that I put in there and it's no different. I need to have who put it in there, why it's put in there, what are all the key elements that I want to manage around it. How do I judge the quality of it, and basically how can people find it? It's very applicable to the different ways you can think about it going forward. And there's no reason why it can't be included in there. Okay, good. Here's another really good question. An attendee is asking, does Collibra help highlight data source design issues, i.e. transactional systems that didn't have proper design to ensure data integrity, causing data anomalies, issues with quality of data, and so forth. Can you give some visibility as you roll out into source systems maybe that have some flaws? Wow, that's an interesting one. I think what you find out is, to your point, once again, we're very much focused around the metadata and therefore not going into the actual data itself, but I think there are different ways you can do it. One is, if you make it available in there, think about the crowdsourcing.
Your community, the first one that uses it, will publish out there saying, there's a problem with this. The second thing here is that now you know who the owner is that's associated with it, and there's a quality associated with it. You've also included in there who's the owner of the data, who's the steward of that. Is there a way I can immediately get in contact with them if I find out that there are issues going forward? Additionally too, because you also include the idea of reference data and sample data, and people being able to see how it's being utilized and other things it's being utilized with or not being utilized with, that may be another indicator as you go forward. I think there are different implicit ways that the system itself and your approach can really help you to take a look at quality of data as compared to data quality itself, because there are enough tools out there around data quality and that's not an area of focus for us. But the idea that you really want to take a look at its usage, the individuals around it, others that are thinking about utilizing it, and really trying to establish that from a much more collaborative approach, I think will be a little bit more of a core area for Collibra. Yeah, that makes a lot of sense. And I love this concept of data citizens that you talk about. It seems to me this is a great development, because if you look at the data management environment, especially over the last 20 years or so, and even oddly enough, increasingly over the last few years, because of this whole Hadoop movement, which required Java programmers to help you even query the data, you had the tendency for environments to require deep technical expertise with data structures, data management protocols, technologies, and so forth, in order to get somewhere.
And that made data usage pretty esoteric, or at least high-powered data management kind of rare, like it was the power user who could do all that stuff. And it seems to me that Collibra is one of these companies that's really pioneering the role of traditional business users, non-technical necessarily, though they could understand some technical stuff. But business users really being able to wrap their heads around the data that they have, how it connects to the enterprise, where they can get additional data to share, and so on and so forth. So I like this concept. Are you finding that companies are really resonating with this when they hear this concept of data citizen? I think it's what organizations are really striving for, because it's not only the management of the data, but you really want to understand the ecosystem that you're building up. How is it being utilized? Are there other areas that I can expand into from there? If I find something that's widely being utilized, let me go focus there, as compared to a scattershot approach where I'm going to put it all in there and see what sticks. I think the idea of netting it down and helping to get that initial focus is useful, but then letting user behavior drive where we go from here. We've got a number of organizations, like Cox Automotive, which once again is growing a lot through different acquisitions. And one of the key challenges is how do I share data across all these different businesses, so I can really take a look at an omni-channel problem that we're trying to solve, and the idea of really being able to utilize data citizens across multiple businesses and the like, to really take advantage now that it's become a part of a collective whole, truly breaking down those pillars and those silos of data. I think that's really what they're trying to get to.
Once again, it comes from this idea of data citizens and what's necessary for them to drive the consumption of the data.
Okay, I know you talked a little bit about this, but maybe you could expound a little on the roles that bring your technology in and that really get it. Chief data officer comes to mind. That's an increasingly popular role. What about chief analytics officer or chief operating officer? Is there a tendency for one or the other to take the lead here?
What we see is that it depends upon the industry. If you're an organization that is already very digitally driven, you'll find that many times it will come in through the chief analytics officer, or even from a marketing team, because they're very comfortable with the broad usage of data, and the problem they're trying to solve may be a customer 360 challenge. For those from more of a regulated industry, more of a custom development shop like banking and financial markets, you'll see it come in through more of a CDO type of audience. Now they're trying to look at it across different parts of the business, always with an eye on privacy and compliance or some other regulation that initially brought the focus to it, and now they're looking for a way of expanding it. So it's very industry specific. It will vary by industry and also by function, and that's why it really depends upon who the champion is. I think most of the time, because of who has that purview, if we come back to the idea of the approach, it tends to be the CAO or the CDO. From other parts of the organization we usually see different parts of it, different steps within that overall approach, coming forward.
Of course, IT is really a key support function, but even more so now, where you're seeing the IT organization trying to think about how to build privacy into its data ops process. I think you're going to see a much stronger role for the CIO or CTO in this as they try to marry this idea of privacy with DevOps, building privacy into DevOps.
We're just about out of time, but I'm going to throw one last question over to you. It's kind of an interesting one; let's call it a curve ball to end the program. One attendee is asking: what about when incoming data is from different sources for the same data? Validation has been performed on the data at this point, and a new additional data source is found. How do you best include yet differentiate that source in the trust portion to avoid chaos? This kind of gets back to the single version of the customer, right? Like multi-channel type stuff. Can you talk about that very quickly?
I think the idea is that there's no single version of the truth; there are multiple versions, and that's fine as long as they're trusted. As long as each one is clearly defined, and the person looking for it understands what's been done to it, with a broader understanding provided through tagging or other information around it, that's okay. That's one of the things that we're seeing. It's okay to put it out there if there's a need for it and it's broadly understood. Then, if I'm using it, I have a very clear set of expectations around what happened to the data and what I should be using it for. This idea that there's only one version of the truth, that's not going to work. There are multiple versions of the truth, but as long as I can trust it, there's a very clear understanding of why I can trust it, and I know where I can go to trace its source if I have any questions, that works, and that's what we're seeing happen more and more.
I think the idea is: don't focus on consolidating it all down, but make sure that if there are different versions, you understand why. That comes back to lineage, what was done to the data, and why it is being put in there, so business alignment is critical as it goes forward.
That's great stuff. Let me hand it back to Shannon. We went just over an hour, but great stuff, great answers, and a lot of great questions. Thank you to the audience for those wonderful questions, keeping both of us on our toes. I love it.
Yes, thank you, Eric. Thank you, Paul, for this great presentation and information, and as Eric mentioned, thanks to all of our attendees for being so engaged in everything we do. Just a reminder, I will send a follow-up email by end of day Friday for this presentation to all registrants with links to the slides, links to the recording of this session, and Paul's and Calibra's information. Thanks, everybody. I hope you all have a great day. Thanks, Paul. Thanks, Eric.
Thank you. Thanks, everybody. Take care, folks.