From around the globe, it's theCUBE, covering Data Citizens 21, brought to you by Collibra.

Hi everybody, John Walls here on theCUBE, continuing our coverage of Data Citizens 2021. And with us now is Kirk Haslbeck, who's the Vice President of Engineering at Collibra. Kirk joins us from his home. Kirk, good to see you today. Thanks for joining us here on theCUBE.

Well, thanks for having me. I'm excited to be here.

Yeah, no, this is all about data quality, right? That's your world, you know, making sure that you're making the most of this great asset, right? That continues to evolve and mature. And yet I'm wondering, from your perspective, from your side of the fence, I assume data quality has always been a concern, right? Making the most of this asset wherever it is and whenever you can get it.

Yeah, absolutely. I mean, the challenge hasn't slowed down, right? We're looking at more data coming in all the time, laws of large numbers, but you kind of have to wonder: a lot of the large organizations have been trying to solve this for quite some time, right? So what is going on? Why isn't it just easier to get our arms around it? And there are so many reasons, but if I were to list maybe the top one, it's the diminishing value of static rules. And a good example of that might just be something as simple as starting with a gender column. Back in the day, we might have assumed that it had to be an M or an F, male or female. And over the last couple of years, we've actually seen that column evolve into six or seven different types. So just the very act of assuming that we could go in and write rules about our business, and that they're never going to change, and that the data is not evolving. And we start to think about zip codes and addresses that are changing, you know, Google Street View, however you want to think of it: every column and every record is just changing all the time.
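As a rough illustration of the static-rule problem described here, the sketch below hard-codes an allow-list for a gender column and shows how it starts flagging values once the column evolves. This is an editorial sketch, not the product's implementation; the function name and the sample values ("X", "NB", "U") are hypothetical.

```python
# Hypothetical static rule: the allowed values are frozen when the rule is written.
STATIC_ALLOWED = {"M", "F"}


def static_rule_breaks(values, allowed=STATIC_ALLOWED):
    """Return the values that a fixed allow-list rule would flag as breaks."""
    return [v for v in values if v not in allowed]


# Illustrative data only: the column as it looked when the rule was written...
old_batch = ["M", "F", "F", "M"]
# ...and after the business started accepting more categories.
new_batch = ["M", "F", "X", "NB", "U", "F"]

print(static_rule_breaks(old_batch))  # []
print(static_rule_breaks(new_batch))  # ['X', 'NB', 'U']
```

The rule itself never changed; the data did, so values that are now legitimate show up as breaks until someone rewrites the rule by hand.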
And so what many large organizations have done is write maybe 40,000 or 50,000 rules, and they have to continue to manage them. So I think we all try to get our arms around rule creation, and it's not even just about that. It would also be about, if you had all the rules in place, could you even keep up with them on a day-to-day changing basis? And so one of the largest companies in the US sat down with myself and team early on and said, so what am I up against? I'm really either going to continue to hire a mountain of rule writers, as they put it, per department, to get my arms around this, and that'll never end. Or I need to think of a better way, which was the solution that we were ultimately providing at that time. And what that solution really entails is using data mining to learn and observe all the data that's already there and to curate the rules based on the data itself, right? That's where all the information is. And then ultimately we have this concept of adaptive rules, which means all the variants in that column, all the new values that come in every day, the row counts, the sizes are all being managed. It's an automatic program, so the rule is recalibrating itself. And I think this is where most chief data officers sit back and say, if I have to protect the franchise, right, if I have to put a trusted data program in place, what are my options and how does it scale? And they have to take a really hard look at something like this.

You know, the process that you're talking about, too, it just kind of reminds me of a diet, in that nobody wants to go through that pain, right? We all want to eat all we want to eat, but you're really happy when you get there at the end of the day: you like the way you look, like the way you feel, like the way you act, all those things. So to me, it's almost like what you're talking about in terms of this data, in terms of rule setting, right?
Governance and accessibility and all these things, it can be a tough process, but it certainly seems well worth it, because you make your data all the more valuable and essential to your business. Is that about right?

Yeah, that's right. That's right. You know, it's funny you compare it to a diet. Sometimes I think of a patient stress test, you know, almost like a health exam. We're spending so much time testing the analytics or testing the models and looking at accuracy, and can anybody achieve 89 to 90%, but we're probably not spending enough time testing our data assumptions, right? Running that diet or health check against the data itself. And I would say that every Fortune 100 or even Fortune 1000 probably considers themselves a data-driven business at this point in time, which means they're going to make decisions quickly based on data. And if we really pull that thread a little bit, what's the cost of making decisions on incorrect data? And it's terribly scary as we start to unfold that. So you're absolutely right. They're taking it very seriously, and it takes a lot of thought on how to get enough coverage and how to create trust in that type of environment.

Yeah, it's almost, too, like the concept of input bias a little bit here, where if you're assuming that certain data sets are accurate and pertinent, relevant, all those things, and then you're making decisions based on those data sets, you might be looking at kind of an input bias, if I'm hearing you right. Maybe you're not keeping your mind open as to what really should be important or influential in your decision-making in terms of data, and then obviously acting on that appropriately. So you have to decide, maybe on the front side, what data matters, and you help people do that, and then help them make decisions based on good data, basically, right?

Right, that's right. And to be fully transparent and candid, we weren't as strong in the "what data matters" piece of it.
We were very strong early on in giving you broad coverage, meaning we made no assumptions, right? We wanted to go out and attack the whole surface of the problem and then sort of have a consistent scoring methodology. And as we've partnered with and now been acquired by Collibra, which is an exciting path, they are very good at what's called critical data elements and lineage, and at doing graph analysis to identify the assets that are most used, and that's where we see a huge benefit in combining those two powers. So you kind of got there quickly, but ultimately we are combining the forces of total coverage at scale with what is most important to you.

You mentioned "we." How about OwlDQ? You were the founder of that, which was purchased by Collibra. Tell us a little bit about how that came to be: first of all, what you did at OwlDQ, what that was all about, and then how this marriage, if you will, or how this relationship with Collibra evolved, and then you were eventually purchased.

Yeah, absolutely. So, I mean, I had this passion that I couldn't hold back on in the data community. Once you see it this way, where you can use data mining and compute power to curate and manage rules, and then take it much beyond there to predicting and seeing around the corner for tomorrow, you have to go that direction. So that's exactly what myself and team did. And what we started to see with the early adopters of our software was that they were getting a seven-figure return on investment per department, and they were able to replicate this across many departments. So we've had a great lifespan with those customers staying and growing and expanding, but we were getting a little bit of market pressure from the investment community, as well as that same customer community, that they wanted us to integrate with their data catalog, and the data catalog of choice every time in the conversation was Collibra.
And interestingly enough, I ran into the likes of Jim Cushman, and the whole thing unfolds from there. I think they were seeing a little bit of a similar story, saying, don't catalog and lineage belong together with quality? And when we sat together, it was like three market forces suggesting the same answer. And as we laid out the roadmap and the integration, we just couldn't see it any other way. I'll be bold and say there's no way that it goes back the other way, not just for this company, but for the industry. Data governance and data intelligence will absolutely combine quality, lineage, catalog, and all of the above in the future. It is becoming that clear, I think.

Yeah, this is kind of a big-picture question about all that, data quality right now. What's driving this avid interest that organizations are showing? And it's small, medium, enterprise; it's everybody. But in your mind, you've been involved in this for a number of years now. Why now? What is it now? Is it just that we have so much more data available, that so much of it's unused, and we're realizing that what we have is pretty valuable? What's the driver? What's the big push here?

Yeah, it is a tough question, and I have gotten this one before, and it's interesting because it's been around since the 90s, right? So it's a very fair question. There are a couple of things I think that are driving it. One, as we start to see more data in Tableau dashboards, or pick your favorite BI tool, you start to realize the data's not correct.
You look at your house on Zillow or whatever, and you find out it's mislabeled, it doesn't have the right number of bedrooms; maybe humans are entering the listings. And as data's become more available visually, we're more critical of it. And now businesses are becoming more data driven, where humans aren't involved as much and the actions are automatically being taken, and it becomes an embarrassing moment if your data is incorrect, and we can really measure that cost at this point. You do see some other factors, like cloud migration. Well, that adds a risk to your business. Could you possibly port everything, not just the servers, not just the software, but all of your data into another system and think that there would be no errors in that process? So people are kind of creating their next-generation platforms, and then there's probably even a touch of COVID accelerating that cloud migration adoption and even just technology adoption. So for a multitude of reasons, there's just more data, and there are more data quality concerns than ever before.

So if you're talking to a prospective client right now, which you probably are, what do you want to share with them, or what would you encourage them to consider in terms of their data venture, their data journey, if you will, in terms of refining what they have, in terms of mining appropriately, in terms of governing it appropriately, all these things that maybe haven't been given a lot of consideration or deep consideration?

Yeah, I think the two things (although, if you listen to my other talks, I can talk forever about all of those items) are probably these. Maybe just do the napkin math of all the tables, all the files, all the Kafka messages, all the columns and fields and attributes, and kind of just multiply that out and try to figure out how you would get coverage and, if you could, how you could maintain it. And why shouldn't we be trading compute power for domain knowledge and things at that point?
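The napkin math suggested here can be written down in a few lines. This is only a sketch: the function names are hypothetical and every number below is a placeholder to be replaced with your own inventory, not a figure from the interview.

```python
def rules_needed(tables, avg_columns_per_table, checks_per_column):
    """Rough count of hand-written rules needed for full coverage."""
    return tables * avg_columns_per_table * checks_per_column


def writer_years(total_rules, rules_per_writer_per_year):
    """Rough effort to write (not maintain) that many rules by hand."""
    return total_rules / rules_per_writer_per_year


# Placeholder inputs: substitute counts from your own tables, files and topics.
total = rules_needed(tables=2_000, avg_columns_per_table=25, checks_per_column=3)

print(total)                       # 150000
print(writer_years(total, 1_000))  # 150.0
```

Even with modest placeholder inputs the product grows quickly, and it only covers writing the rules once, not keeping them current as the data changes.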
I think that's the first place to start. And probably the second is that the act of writing traditional data quality rules puts you in a binary situation. It basically says you will either have a break record or you will not. So it's a yes/no question. What it never will tell you is what the answer should have been. And if you take a deeper look at the solution that we're providing to the market, we're actually predicting for you what the correct value is. And it's a complete paradigm shift. It obviously is much more scientific, but it's much more powerful to get you to the end answer more quickly instead of just going through break records.

Right, tremendous capability that you just described, and on that, I'm going to thank you for the time. But just think about it, right? We're not only going to help you make more sense of your data, we're also going to help you make better decisions and show you what that path might be, or what you probably should be considering. So it certainly opens up a lot of doors for a lot of companies in that respect. Kirk, thanks for the time. Sorry we didn't have enough time to hear that guitar in the background, but next time I'm going to hold you to it, okay?

Yeah, that sounds good, John. I really appreciate it.

All right, very good. Kirk Haslbeck joining us from Collibra. We continue our coverage here of Data Citizens 21 on theCUBE, and I'm John Walls.
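To make the contrast between a yes/no break record and a suggested correction concrete, here is a minimal sketch. It is not the product's algorithm: it swaps in a simple frequency profile learned from existing data plus string similarity from Python's standard `difflib` module. The function names, the `min_share` and `cutoff` parameters, and the sample state names are all assumptions made for illustration.

```python
from collections import Counter
from difflib import get_close_matches


def learn_profile(history, min_share=0.01):
    """Learn a column's accepted values from data already observed.

    A value is accepted if it makes up at least `min_share` of the history.
    Re-running this on newer data is a crude stand-in for recalibration.
    """
    counts = Counter(history)
    total = len(history)
    return {value for value, n in counts.items() if n / total >= min_share}


def check(value, profile, cutoff=0.6):
    """Return (is_break, suggestion) rather than a bare yes/no."""
    if value in profile:
        return False, value
    # Suggest the most similar known value, if any is similar enough.
    matches = get_close_matches(value, sorted(profile), n=1, cutoff=cutoff)
    return True, (matches[0] if matches else None)


# Illustrative history only.
history = ["New York"] * 50 + ["California"] * 40 + ["Texas"] * 10
profile = learn_profile(history)

print(check("Texas", profile))     # (False, 'Texas')
print(check("New Yrok", profile))  # (True, 'New York')
print(check("Zzzz", profile))      # (True, None)
```

A traditional rule would stop at the first element of each tuple; the second element is the part that points toward what the value probably should have been.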