Hi everyone and welcome to Data Citizens. Thank you for making the time to join me and the over 5,000 Data Citizens like you that are looking to become united by data. My name is Jim Cushman. I serve as the Chief Product Officer at Collibra. I have the benefit of sharing with you the product vision and strategy of Collibra. There are several sections to this presentation and I can't wait to share them with you. The first is a story of how we're taking a business user and making it possible for him or her to find data, use data and gain benefit and insight from that data without relying on anyone in the organization to write code or do the work for them. Next, I'll share with you how Collibra will make it possible to manage metadata that scales into the billions of assets and, again, load this into our software without writing any code. Third, I will demonstrate the integration we have already achieved with our newest product release: data quality powered by machine learning. And finally, you're gonna hear about how Collibra has become the most universally available solution in the market. Now, we all know that data is a critical asset that can make or break an organization. Yet organizations struggle to capture the power of their data, and many remain afraid of how their data could be misused or abused. We also observe that the understanding of, and access to, data remains in the hands of just a small few, and an astounding three out of every four companies continue to struggle to use data to drive meaningful insights. All forward-looking companies are looking for an advantage, a differentiator that will set them apart from their peers and competitors. What if you could improve your organization's productivity by just 5%? Even a modest 5% productivity improvement, compounded over a five-year period, will make your organization 28% more productive. That will leave you with an overwhelming advantage over your competition.
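That compounding claim is easy to verify; here's a quick back-of-the-envelope check in Python, using the 5% rate and five-year horizon from the talk:

```python
# A 5% annual productivity gain, compounded over five years.
rate = 0.05
years = 5

gain = (1 + rate) ** years - 1
print(f"Cumulative improvement: {gain:.1%}")  # 27.6%, i.e. roughly 28%
```

So the "28% more productive" figure is just 1.05 to the fifth power, rounded up slightly.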
And uniting your data-literate employees with data is the key to your success. And, dare I say, survival. To unlock this potential for increased productivity and a huge competitive advantage, organizations need to enable self-service access to data for every data-literate knowledge worker. Our ultimate goal at Collibra has always been to enable this self-service for our customers, to empower every knowledge worker to access the data they need when they need it, but with the peace of mind that your data is governed and secure. Just imagine if you had a single integrated solution that could deliver a seamless, governed, no-code user experience, delivering the right data to the right person at the right time, just as simply as ordering a pair of shoes online. That would be quite a magic trick, and one that will place you and your organization on the fast track for success. Let me introduce you to our character here, Cliff. Cliff is that business analyst. He doesn't write code. He doesn't know Julia or R or SQL, but Cliff is data-literate. When Cliff is presented with data of high quality, and can actually find that data of high quality, Cliff knows what to do with it. Well, we're gonna expose Cliff to our software and see how he can find the best data to solve his problem of the day, which is customer churn. Cliff is gonna go out and find this information, bring it back, and analyze it in his favorite BI reporting tool, Tableau. Of course, that could be Looker, could be Power BI, or any other of your favorites. But let's go ahead and get started and see how Cliff can do this without any help from anyone in the organization. So Cliff is gonna log into Collibra and, being a business user, the first thing he's gonna do is look for a business term. He looks for customer churn rate.
Now, when he brings back churn rate, it shows him the definition of churn rate and various other things that have been attributed to it, such as data domains like product and customer and order. Now, Cliff says, okay, customer's really important. So let me click on that and see what makes up the customer definition. Cliff will scroll through customer and find the various data concepts, the attributes, that make up the definition of customer. And Cliff knows that customer identifier is a really important aspect of this. It helps link all the data together. And so Cliff is gonna wanna make sure that whatever source he brings in actually has customer identifier in it and that it's of high quality. Cliff is also interested in things such as email address and credit activity and credit card. But he's now gonna say, okay, which data sets actually have customer as a data domain? And by the way, while I'm at it, what else has product and order information that's, again, relevant to the concept of customer churn? Now, he goes on, and he can actually filter down, because there are a lot of different results that could potentially come back. And again, customer identifier was very important to Cliff. So Cliff further filters on customer identifier, and he does the same on customer churn rate as well. This results in two different data sets that are available to Cliff for selection. Which one to use? Well, he's first presented with some data quality information. You can see that customer analytics has a data quality score of 76. You can see that the sales data enrichment data set has a data quality score of 68. That's something he can see right on the front of the box, as it were. But let's dig in deeper, because the contents really matter. So we see again the score of 76, but we actually have a chance to find out that this data set is certified.
This is something that has a check mark, and so he knows someone he trusts has actually certified this as a data set. You'll see that there are 91 columns that make up this data set. And rather than sifting through all of that information, Cliff is gonna go ahead and say, well, okay, customer identifier is very important to me. Let me search through and see if I can find what its data quality score is. Very quickly, using a fuzzy search, he finds it and sees, wow, that's a really high data quality score of 98. Well, what's the alternative? Well, that data set only scores 68 overall, but how about its customer identifier? And quickly he discovers that the data quality for that is actually only 70. So all things being equal, customer analytics is the better data set for what Cliff needs to achieve. But now he wants to look and say, other people have used this, what have they had to say about it? And you can see there are various reviews, four different reviews from peers of his in the organization, that have given it five stars. So this builds Cliff's confidence that this is a great data set to use. Now, Cliff wants to look in a little bit more detail before he finally commits to using this data set. Cliff has the opportunity to look at it in the broader context. What else can I learn about customer analytics, such as what is it related to? Who else uses it? Where did it come from? Where does it go? And what actually happens to it? And so within our graph of information, we're able to show you a diagram. You can see that customer analytics actually comes from the CRM cloud system. And from there, you can inherit some wonderful information. We know exactly what CRM cloud is about as an overall system. It's related to other logical models. And here you're actually seeing that it's related to a policy, a policy about PII, or personally identifiable information.
This gives Cliff almost immediate knowledge that there's gonna be some customer information in this, PII information, that he's not gonna be able to see given his user role in the organization. But Cliff says, hey, that's okay. I actually don't need to see somebody's name and social security number to do my work. I can work with other information in the data file that'll actually help me understand why customers are churning and what I can actually do about it. If we dig in deeper, we can see which personally identifiable information could actually cause issues. And as we scroll down, let's take a little bit of a focus on what you'll see here as customer phone, because we'll show that to you a little bit later. These show the various fields that, once Cliff actually has the data fulfilled and delivered to him, he will see are masked and/or redacted from his use. Now, Cliff might dive in deeper and see more information. And he says, you know what? Another piece that's important to me in my analysis is something called is churned. This is basically a flag indicating whether a customer has actually churned. It's an important flag, of course, because that's the analysis he's performing. Cliff sees that the score is a mere 65. That's not exactly a great data quality score. But Cliff is kind of in a hurry. His boss has come back and said, we need to have this information so we can take action. So he's not gonna wait around for some long data quality project before he proceeds. But, working at the speed of thought, he is gonna create a suggestion, an issue. He's gonna submit this as a work queue item that informs others who are responsible for the quality of data that there's an opportunity for improvement to this data set, one that is highly reviewed but maybe has room for improvement.
As Cliff is typing in the explanation that he'll pass along, we can also see that the data quality is made up of multiple components, such as integrity, duplication, accuracy, consistency, and conformity. We see that we can submit this issue and pass it through. And this will go to somebody else who can actually work on it. We'll show that to you a little bit later. But back to Cliff. Cliff says, okay, I'd like to work with this data set. So he adds it to his data basket. And just like when he's shopping online, Cliff wants that kind of ability to just click once and be done with it. Now, it is data, and there's some sensitivity about it. And again, there's an owner of this data who you need to get permission from. So Cliff is gonna provide information to the owner to say, here's why I need this data, and how long I need this data for, starting on a certain date and ending on a certain date, and ultimately, what purpose I'm going to use this data for. Now, there are other things that Cliff can choose during this. One is, how do you want this data delivered to you? You'll see down below there are three options: one is borrow, another is lease, and the third is buy. What does that mean? Well, borrow is this idea of, I don't wanna have the data that's currently in this CRM cloud database moved anywhere. I don't want it to be persisted anywhere else. I just wanna borrow it very short term to use in my Tableau report and then, poof, be gone, because I don't wanna create any problems in my organization. Now, you'll also see lease. Lease is a situation where you actually do need to take possession of the data, but only for a time-boxed period. You don't need it for an indefinite amount of time. And ultimately, buy is your ability to take possession of the data and have it in perpetuity. So we're gonna go forward with our borrow use case, and Cliff is gonna submit this, and all the fun starts there.
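The three delivery models can be thought of as an access request with a mode attached. Here's a minimal sketch in Python; all of the names are hypothetical (this is not Collibra's API, just an illustration of the borrow/lease/buy distinction):

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class DeliveryMode(Enum):
    BORROW = "borrow"  # virtualized, short-term access; nothing is persisted elsewhere
    LEASE = "lease"    # take possession, but only for a time-boxed period
    BUY = "buy"        # take possession in perpetuity


@dataclass
class DataAccessRequest:
    dataset: str
    requester: str
    purpose: str
    start: date
    end: Optional[date]  # an open-ended request really only makes sense for BUY
    mode: DeliveryMode

    def is_time_boxed(self) -> bool:
        # Borrow and lease both expire; buy does not.
        return self.mode in (DeliveryMode.BORROW, DeliveryMode.LEASE)


# Cliff's request: borrow the dataset for a month for churn analysis.
request = DataAccessRequest(
    dataset="customer analytics",
    requester="Cliff",
    purpose="customer churn analysis",
    start=date(2022, 1, 10),
    end=date(2022, 2, 10),
    mode=DeliveryMode.BORROW,
)
print(request.is_time_boxed())  # True
```

The key design point is that borrow never moves the data at all, which is why the owner's approval and the expiry dates travel with the request itself.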
So Cliff has actually submitted the order, and the owner, Joanna, is gonna receive the request for the order. Joanna opens up her task queue and sees there's work to perform. Now, Joanna has the ability to automate this using the incorporated workflow that we have in Collibra. But for this situation, she's gonna manually review that Cliff wants to borrow a specific dataset for a certain period of time and that he actually wants to use it in a Tableau context. So she reviews it, makes an approval and submits it. This in turn flips it back to Cliff, who says, okay, what obligations did I just take on in order to work with this data? And he reviews each of these data sharing agreements that you as an organization would set up, asking, what are my restrictions for using this dataset? As Cliff accepts these notices, he has now triggered the process of what we would call fulfillment, or a service broker. And in this situation, we're doing virtualization access for the borrow use case. Cliff selects Tableau as his preferred BI and reporting tool. Again, you can see the various options that are available, from Power BI and Looker to Sisense and ThoughtSpot. There are others that could be added over time. And from there, Cliff will be alerted the minute this data is available to him. So now we're running out and doing a distributed query to get the information. And you see it returns a raw view. Now, what's really interesting is you'll see the customer phone has a bunch of X's in it. If you remember, that's PII. So it's actually being masked so Cliff can't see the raw data. Now, Cliff also wants to look at it in a Tableau report, and he can see the visualization layer. But you'll also see an incorporation of something we call Collibra on the go. Not only do we bring the data to the report, but then we tell you, the reader, how to interpret the report.
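Masking versus redaction, as shown in that raw view, can be sketched in a few lines of Python. The function names and the all-X masking style are assumptions for illustration, echoing how customer phone appears in the demo:

```python
import re


def mask_pii(value: str) -> str:
    # Mask: every digit becomes an 'X', so the shape of the value stays visible,
    # the way customer phone shows up as a bunch of X's in the raw view.
    return re.sub(r"\d", "X", value)


def redact(_: str) -> str:
    # Redact: the value is withheld entirely.
    return "[REDACTED]"


def apply_policy(value: str, user_can_see_pii: bool) -> str:
    # A user whose role grants PII access sees the raw value;
    # everyone else gets the masked version.
    return value if user_can_see_pii else mask_pii(value)


print(apply_policy("(415) 555-0173", user_can_see_pii=False))  # (XXX) XXX-XXXX
```

The point of masking rather than redacting is exactly what Cliff relies on: he can still see that a phone number exists and is well-formed, even though the actual digits are hidden by the PII policy.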
It could be that there's someone else who wants to use the very same report that Cliff helped create, but they don't understand exactly all the things that Cliff went through. So now they have the ability to get a full interpretation: what was the data that was used? Where did it come from? And how do I actually interpret some of the fields that I see on this report? It's really a clever combination of bringing the data to you and showing you how to use it. Cliff can also see this as a registered asset within Collibra. So the next shopper who comes through might, instead of shopping for the data set, actually shop for the report itself. And the report is connected with the data set he used. So now they have a full bill of materials to run a customer churn report and schedule it anytime they want. So now we've turned Cliff into a creator of data assets. And this is where intelligence begets more intelligence. And that's really what we call data intelligence. So let's go back through that magic trick that we just did with Cliff. Cliff went into the software not knowing if the source of data that he was looking for, for customer product sales, was even available to him. He went in, very quickly searched and found his data set, used facets to filter down to exactly what was available, compared and contrasted the options that were there, made an observation that there actually wasn't enough data quality around a certain thing that was important to him, created an idea, basically a suggestion for somebody to follow up on, was able to put it into his shopping basket, check out, and have it delivered to his front door. I mean, that's a bit of a magic trick, right? So Cliff was successful in finding the data that he wanted and having it delivered to him. And then, in his preferred tool, Tableau, he was able to look at it. All right, so let's talk about how we're gonna make this vision a reality.
So our first section here is about performance and scale, but it's also about codeless database registration. How did we get all that stuff into the data catalog and available for Cliff to find? So allow us to introduce you to what we call the asset lifecycle. Some of the largest organizations in the world might have upwards of a billion data assets. These are columns and tables, reports, APIs, algorithms, et cetera. These are very high volume and quite technical, and far more information than a business user like Cliff might wanna be engaged with. Those very same really large organizations may have upwards of, say, 20 to 25 million critical data sources and data assets, things that they do need to highly curate and make available. But within that is a bit of a distillation, a lifecycle of different things you might wanna do along the way. And so we're gonna share with you how you can automatically register these sources, deal with these very large volumes at speed and at scale, and actually make them available with just the level of information you need to govern and protect, but also make them available for opportunistic use cases such as the one we presented with Cliff. So as you recall, when Cliff was trying to look for his dataset, he identified that the is churned data attribute was of low quality. So he passed this over to Eliza, who's a data steward, and she receives this work queue item in a collaborative fashion and has to review what the request is. If you recall, this was the request to improve the data quality for is churned. Now she needs to familiarize herself with what Cliff was observing during his shopping experience. So she digs in and wants to look at the quality that he was observing. And sure enough, as she goes down and looks at is churned, she sees that it was a low 65% and now understands exactly what Cliff was referring to. She says, aha, okay, I need to get help.
I need to decide whether I have a data quality project to fix the data, or whether I should see if there's another dataset in the organization that has better data for this. And so she creates a task that can go over to one of her colleagues who really focuses on data quality. She submits this request and it goes over to her colleague, John, who's really familiar with data quality. So John receives the request from Eliza, and you'll see a task showing up in his queue. He opens up the request and finds out that Eliza is asking if there's another source out there that actually has good is churned data available. Now, he knows quite a bit about stewarding the quality of this information. So he goes into the data quality console and does a quick look for a dataset that he's familiar with called customer product sales. He quickly scrolls down and finds the one that's actually been published. That's the one he was looking for. And he opens it up to find out more information: what columns are actually in there? And he goes down to find that is churned is in fact one of the attributes in there. It actually does have active rules associated with it to manage its quality. And so he says, well, let's look in more detail and find out what the quality of this dataset is. Oh, it's 86. This is a dramatic improvement over what was seen before. And we can see it's trended quite nicely over time; it hasn't degraded in performance from day to day. So he responds back to Eliza to say, this dataset is the one you wanna bring in, it really will improve things. And you'll see that he refers to the refined database within the CRM cloud solution. Once he submits this, it goes back to Eliza and she's able to continue her work. Now, when Eliza brings this back open, she's able to very quickly go into the database registration process.
For her part, she very quickly goes into the CRM cloud and selects the community to which she wants to register this dataset, the schemas community. CRM cloud is the system that she wants to load it into, and refined is the database that John told her she should bring in. After a quick description, she's able to click register. And this triggers that automatic, codeless process of going out to the dataset and bringing back its metadata. Now, metadata is great, but it's not the end-all; there are a lot of other values that she really cares about. As she's registering this dataset and synchronizing the metadata, she's also asked, would you like to bring in quality information? And so she'll go out and say, yes, of course, I want to enable the quality information from CRM refined. I also want to bring back lineage information to associate with this metadata. And I also wanna select profiling and classification information. Now, when she actually selects it, she can also say how often she wants to synchronize this. Is this a daily, weekly, or monthly kind of update? That's part of the change data capture process. Again, all automated without the requirement of actually writing code. So she's run this process. Now, after this loads in, she can open up this newly registered dataset and look and see if it has actually solved the problem that Cliff set her out on, which was improved data quality. Looking into the data quality for the is churned attribute shows her that she has fantastic quality. It's at 100. It's exactly what she was looking for. So she can, with confidence, suggest that it's done. But she did notice something, something she wants to tell John, which is that a couple of data quality checks seem to be missing from this dataset. So again, in a collaborative fashion, she can pass that information along, for validity and completeness, to say, you know what, check for nulls and empties, and send that back.
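The registration choices Eliza makes can be pictured as a simple configuration object. This is a hypothetical sketch in Python, not Collibra's actual registration payload; the field names are assumptions that just mirror the steps in the demo:

```python
# Hypothetical, simplified registration request mirroring the demo's steps.
registration = {
    "community": "schemas",
    "system": "CRM cloud",
    "database": "refined",
    "description": "Refined CRM data with a high-quality is churned flag",
    "synchronize": {
        "metadata": True,
        "quality": True,    # pull in data quality scores
        "lineage": True,    # pull in lineage information
        "profiling": True,  # profiling and classification
        "schedule": "daily",  # change-data-capture cadence: daily/weekly/monthly
    },
}


def summarize(reg: dict) -> str:
    # List only the sync options that were switched on, plus the cadence.
    enabled = [k for k, v in reg["synchronize"].items() if v is True]
    return (f"{reg['system']}/{reg['database']}: "
            f"sync {', '.join(enabled)} ({reg['synchronize']['schedule']})")


print(summarize(registration))
# CRM cloud/refined: sync metadata, quality, lineage, profiling (daily)
```

The useful idea is that everything Eliza clicked through is declarative: one request object drives the whole codeless ingestion, and the schedule turns it into a recurring sync rather than a one-off load.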
So she submits this on to John to work on, and John now has a work item in his task queue. But remember, she's been working on this task for Cliff, and because she has actually added a much better source for is churned information, she's gonna update the task that was sent to her to notify Cliff that the work has actually been done and that she has a really good dataset in there. In fact, as you recall, it was 100% in terms of its data quality. So this will really make life a lot easier for Cliff once he receives that data and processes the churn report analysis next time. So let's talk about these audacious performance goals that we have in mind. Now, today we actually have really strong performance and amazing usability. Our customers continue to tell us how great our usability is, but they keep asking for more. Well, we've decided to present to you something that you can start to bank on. This is the performance you can expect from us on the highly curated assets that are available for business users, as well as the technical and lineage assets that are more for developer uses and for things that are more warehouse-based. You'll see that in Q1 or Q2 of this year, we're making available 5 million curated assets. Now, you might be out there saying, hey, I'm already using the software and I've got over 20 million already. That's fair, we do. We have customers that are actually well over 20 million in terms of assets they're managing. But we wanted to present this to you with zero conditions, no limitations; we wouldn't talk about, well, it depends, et cetera. This is without any conditions. That's what we can offer you without fail. And yes, it can go higher and higher. We're also talking about the speed with which you can ingest the data. Right now we're ingesting somewhere around 50,000 to 100,000 records per hour. And of course, yes, you've probably seen it go quite a bit faster. But we are assuring you that that's the baseline you can count on.
But what's really impressive is that right now we can also help you manage 250 million technical assets, and we can load them at a speed of 25 million per hour. And you can see how, over the next 18 months, about every two quarters, we show you dramatic improvements, more than a doubling for most of them. Leading up to the end of 2022, we're actually handling over a billion technical lineage assets and loading at 100 million per hour. That sets the mark for the industry. Earlier this year, we announced a recent acquisition, OwlDQ. OwlDQ brought to us machine learning-based data quality. We're now able to introduce to you Collibra Data Quality, the first integrated approach to OwlDQ and Collibra. We've got a demo to follow. I'm really excited to share it with you. Let's get started. So Eliza submitted a task for John to work on. Remember, it was to add checks for null and for empty. So John picks up this task very quickly and looks to see what the request is. And from there says, aha, yes, we do have a quality check issue when we look at is churned. So he jumps over to the data quality console and says, I need to create a new data quality test. So John is able to go into the solution and set up quick rules, automated rules. He can inherit rules from other things, but it starts with first identifying the data source that he needs to connect to to perform this. And so he chooses the CRM refined data set that was most recently registered by Eliza. You'll see the same score of 86 that was the quality score for the data set. And you'll also see there are four rules associated underneath this. Now, there are various checks that John can establish on this. But remember, this is a fairly easy request that he received from Eliza. So he's gonna go in and choose the actual field, is churned, and from there identify quick rules for an empty check. And that quickly sets up the rule for him. And also the null check, equally fast.
This one's established and analyzes all the data in there. And this sets up the baseline of data quality. This data, once it's captured, is then periodically brought back to the catalog so it's available not only to Eliza, but also to Cliff the next time he shops in the environment. As we look through the rules that were created through that very simple user experience, you can see the ones for is empty and is null that were set up. Now, there are various styles of rules that can be set up either manually, through machine learning, or, again, through inheritance. But the key is to track the creation of these rules and the metrics that are generated from them, so they can be brought back to the catalog and used in a meaningful context by someone who's shopping. And the confidence that this data has neither empty nor null fields, or at least that most rows don't, will now give confidence as you go forward. And as you can see, those checks have now been entered, and you can see that it's a 100% quality score for the null check. So with confidence, John can now respond back to Eliza and say, I've inserted them, they're up and running, and you're in good status. So that was pretty amazing integration, right? In four months after our acquisition, we've already brought that level of integration between the Collibra Data Intelligence Cloud and data quality. Now, it doesn't stop there. We have our sights set really high. Early next year, we're gonna introduce a fully immersive experience where customers can work within Collibra and bring the data quality information all the way in, as well as start to manipulate the rules and generate the machine learning rules on top of it. All of that will be a deeply immersive experience. We also have something really clever coming, which we call continuous data profiling, where we bring the power of data quality all the way into the database.
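A null check and an empty check of this kind boil down to counting passing rows. Here's a minimal Python sketch with made-up sample data for the is churned column (the helper names are assumptions for illustration, not Collibra's rule engine):

```python
# Made-up sample values for the "is churned" column; two rows are empty strings.
rows = ["true", "false", "", "true", "false", "true", "", "false", "true", "false"]


def check_not_null(values) -> float:
    # Fraction of rows whose value is present at all (not NULL/None).
    return sum(v is not None for v in values) / len(values)


def check_not_empty(values) -> float:
    # Fraction of rows whose value is neither missing nor an empty string.
    return sum(v not in (None, "") for v in values) / len(values)


print(f"null check:  {check_not_null(rows):.0%}")   # 100%, like John's result
print(f"empty check: {check_not_empty(rows):.0%}")  # 80%
```

Each rule produces a score per run, and it's those per-run scores, trended over time, that flow back to the catalog as the quality numbers Cliff and Eliza were reading.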
So it's continuously running and always making that data available for you. Now, I'd also like to share with you one of the reasons why we are the most universally available software solution in data intelligence. We've already announced that we're available on AWS and Google Cloud, and today we can announce to you that in Q3 we're going to be available on Microsoft Azure as well. Now, it's not just these three cloud providers that we're available on; we've also become available on each of their marketplaces. So if you are buying our software, you can actually make that same purchase from their marketplace and achieve your financial objectives as well. We're very excited about this. These are very important partners for us. Now, I'd also like to introduce to you our system integrators. Without them, there's no way we could achieve our objectives of growing so rapidly and dealing with the demand that you customers have had. Accenture, Deloitte, Infosys, and others have been instrumental in making sure that we can serve your needs when you need them. And so they've been a big part of our growth and will be a continued part of our growth as well. And finally, I'd like to introduce you to our product showcases, where we can go into absolute detail on many of the topics I talked about today, such as data governance with Arco, data privacy with Sergio, data quality with Brian, and finally, catalog with Peter. Again, I'd like to thank you all for joining us and we really look forward to hearing your feedback. Thank you.