Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you, welcome.

Thanks for having me, Dave, excited to be here.

Hey, you bet. Okay, we're going to discuss data quality and observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations. And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that.

Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? The standard answer is a bit stale: we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple of other hidden factors at play in why this is becoming so important now. You could break it down simply. Think about it, Dave: if you and I were going to build a new healthcare application that monitors the heartbeat of individuals, imagine if we got that wrong, what the ramifications could be, what those incidents would look like. Or maybe, better yet, we try to build a new trading algorithm with a crossover strategy, where the 50-day average crosses the 10-day. If the data underlying the inputs to that is incorrect, we'll probably have major financial ramifications. So it kind of starts there, with everybody realizing that we're all data companies, and if we're using bad data, we're likely making incorrect business decisions. But I think there are two other things at play. I bought a car not too long ago, and my dad called and asked, how many cylinders does it have?
And I realized in that moment that I might have failed him, because I didn't know. I used to ask those types of questions about antilock brakes and cylinders and whether it's manual or automatic, and I realized I now just buy a car that I hope works. It's so complicated, with all the computer chips, that I really don't know that much about it. And that's what's happening with data. We're loading so much of it, and it's so complex, that the way companies consume it is the IT function brings in a lot of data and then syndicates it out to the business. And it turns out that the individuals loading and consuming all of this data for the company may not actually know that much about the data itself, and that's not even their job anymore. We'll talk more about that in a minute, but that's really what's setting the stage for this observability play and why everybody's so interested: we're becoming less close to the intricacies of the data, and we just expect it to always be there and be correct.

You know, the other thing too about data quality: for years we did the MIT CDOIQ event. We didn't do it last year; COVID messed everything up. But the observation I would make there is that data quality, or information quality as it used to be called, used to be this back-office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened, and people sort of flipped the bit from data as a risk to data as an asset. And now, as we say, we're going to talk about observability, and so the whole quality issue has really become front and center, because data is so fundamental, hasn't it?

Yeah, absolutely. I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and check out the NASDAQ market cap. I really have no idea if that's the correct number.
I know it's a number, it looks large, it's in a numeric field, and that's kind of what's going on. There are so many numbers, and they're coming from all of these different sources and data providers, and they're getting consumed and passed along, but there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we achieved in the early days, even before Collibra, what's been so exciting is that we have these types of observation techniques, these data monitors that can actually track the past performance of every field at scale. Why that's so interesting, and why I think the CDO is listening intently to this topic nowadays, is that maybe we could surface all of these problems with the right data observability solution at the right scale, and then just be alerted on breaking trends. So we're shifting away from this world of "you must write a condition, and when that condition breaks, that's a break record," and asking: what about breaking trends and root cause analysis, and is it possible to do that with less human intervention? I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore.

So how does data observability relate to data quality? Are they two sides of the same coin? Are they cousins? What's your perspective on that?

Yeah, it's super interesting. It's an emerging market, so the language is changing and a lot of the topics and areas are changing. The way I like to break it down, because the lingo is a constantly moving target in this space, is break records versus breaking trends. I could write a condition: when this thing happens it's wrong, and when it doesn't, it's correct. Or I can look for a trend. And I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter?
Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there is no condition you could write that would show you all the good and the bad. So break records were your traditional approach to data quality, but your modern-day concern is that you lost a significant portion of your data, or it did not arrive in time to make that decision accurately. And that's a hidden concern. Some people call this freshness; we call it stale data. But it all points to the same idea: the thing that you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline, and with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day.

So what's the Collibra angle on all this? You made the acquisition, you've got data quality and observability coming together, and you guys have a lot of expertise in this area. But you hear about provenance of data, stale data, which you just talked about, the whole trend toward real time. How is Collibra approaching the problem, and what's unique about your approach?

Well, I think where we're fortunate is our background. Myself and the team lived this problem for a long time, in the Wall Street days about a decade ago, and we saw it from many different angles. What we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there; when most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist. But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time.
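The break-records versus breaking-trends distinction can be sketched in a few lines. This is a hypothetical illustration, not Collibra's implementation: the data and thresholds are invented. A hand-written condition misses a partial load entirely, while a simple trend monitor over historical row counts flags it.

```python
# Hypothetical sketch (not Collibra's implementation): a hand-written
# "break record" condition versus a trend monitor learned from history.
from statistics import mean, stdev

history = [10120, 9980, 10240, 10050, 10170, 9890, 10310]  # past daily row counts
today = 4200                                               # only part of the data arrived

def break_record(rows, minimum=1):
    """Traditional check: a fixed condition someone had to think to write."""
    return rows < minimum  # only fires if nothing arrived at all

def breaking_trend(rows, past, z=3.0):
    """Modern check: flag anything far outside the learned distribution."""
    mu, sigma = mean(past), stdev(past)
    return abs(rows - mu) > z * sigma

print(break_record(today))             # False: the rule can't see the problem
print(breaking_trend(today, history))  # True: the trend monitor flags it
```

The same idea generalizes from row counts to arrival times (freshness) or any per-field statistic tracked over time, which is what makes trend monitoring feasible at scale without a domain expert writing every condition.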
"I have so many things going wrong; just show me the big picture. Help me find the thing that, if I were to fix it today, would make the most impact." So we're really focused on root cause analysis and business impact, connecting it with lineage and catalog metadata, and as that grows you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space.

Well, you mentioned financial services a couple of times, and some examples. Remember the flash crash in 2010? Nobody had any idea what that was; they just said, oh, it's a glitch. They didn't understand the root cause of it. So this is a really interesting topic to me. Now, we know that at Data Citizens '22, your yearly event, you've got to announce new products, right? What's new? Can you give us a sense as to what products are coming out, specifically around data quality and observability?

Absolutely. There's always a next thing on the forefront, and the one right now is these hyperscalers in the cloud. So you have databases like Snowflake, BigQuery, and Databricks's Delta Lake, and SQL pushdown. Ultimately what that means is a lot of people are storing and loading data even faster, in a SaaS-like model. We've started to hook into these databases, and while we've always worked with those same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. Why that is so interesting and powerful now is that everyone's concerned with something called egress. Did my data, which I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it?
And with these native integrations that we're building, and about to unveil, here's kind of a sneak peek for next week at Data Citizens: we're now doing all compute and data operations in databases like Snowflake. What that means is, with no install and no configuration, you could log in to the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage, and compute, what people are realizing is that it's extremely efficient to do it in the way that we're about to release here next week.

So this is interesting, because you just mentioned Snowflake, you mentioned Google, and actually you mentioned Databricks. Snowflake has the Data Cloud; if you put everything in the Data Cloud, okay, you're cool. But then Google's got the open data cloud, if you heard Google Next. And now Databricks doesn't call it the data cloud, but they have, like, the open source data cloud. So you have all these different approaches, and there's really no way, up until now I'm hearing, to really understand the relationships between all of those and have confidence across them. It's like Zhamak Dehghani's idea: you should just be a node on the mesh, and I don't care if it's a data warehouse or a data lake or where it comes from; it's a point on that mesh, and I need tooling to be able to have confidence that my data is governed and has the proper lineage and provenance. And that's what you're bringing to the table. Is that right? Did I get that right?
Yeah, that's right. And for us, it's not that we haven't been working with those great cloud databases; it's the fact that we can send them the instructions now. We can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers back. What that gives you is basically zero network cost, zero egress cost, zero latency. So when you log into BigQuery tomorrow using our tool, or say Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage, and access privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about fewer clicks, faster time to value, faster scale, and eventually lower cost, and we think that this positions us to be the leader there.

I love this example, because everybody talks about how the cloud guys are going to own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value and connect across clouds. Sometimes we call it supercloud, or inter-clouding. All right, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens '22.

Absolutely. Well, I think one big trend is discovery and classification; we're seeing that across the board. People used to know if a field was a zip code, and nowadays, with the amount of data that's out there, they want to know where everything is, where their sensitive data is, whether it's redundant: tell me everything inside of three to five seconds. And with that, they want to know how fast they can get controls and insights out of their tools in all of these hyperscale databases. So I think we're going to see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms.
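The pushdown model described above can be sketched as generating SQL that runs entirely inside the warehouse, so only small aggregates ever cross the network. This is a hypothetical illustration of the idea, not Collibra's actual product code; the table and column names are invented.

```python
# Hypothetical sketch of native database pushdown: instead of pulling rows
# out of the warehouse, build one profiling query that runs in-database and
# returns only aggregates, so no row-level data ever leaves (zero egress).

def profile_query(table, columns):
    """Build a single SQL statement that profiles every column in-database."""
    parts = ["COUNT(*) AS row_count"]
    for col in columns:
        parts.append(f"COUNT(DISTINCT {col}) AS {col}_cardinality")
        parts.append(f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS {col}_nulls")
    return f"SELECT {', '.join(parts)} FROM {table}"

# The warehouse (Snowflake, BigQuery, etc.) does all the compute; the
# client only ships this string and receives one row of statistics back.
print(profile_query("trades", ["ticker", "price"]))
```

The design point is that the client never holds row-level data: network cost, egress exposure, and latency all scale with the size of the one-row result, not the size of the table.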
Excellent. All right, Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens '22. Appreciate it.

Thanks for having me, Dave.

You're welcome. All right, and thank you for watching. Keep it right there for more coverage from theCUBE.