The Carnegie Mellon Quarantine Database Talks are made possible by the Stephen Moy Foundation for Keeping It Real and by contributions from viewers like you. Thank you. We're super excited today to have Dr. Lyublena Antova, who's a Principal Research Scientist at Datometry. She's going to give a talk about the kind of cool things they've been working on at Datometry. Dr. Antova got her PhD under Christoph Koch at Cornell. And prior to that, she did a master's degree at Saarland University in Germany and a bachelor's degree in computer science from Sofia University in Bulgaria. So she's been at Datometry since 2016. 15? 2015. Okay. And again, the way we'll do this, as we do every week: if you have any questions, please unmute yourself, say who you are and where you're coming from, and ask your question, and feel free to interrupt at any time you want. And as always, we want to thank the Stephen Moy Foundation for Keeping It Real for sponsoring this event. Lyublena, the floor is yours, go for it. Just tell me when you want to change slides. Yes. Thanks, Andy, for the introduction, and sorry, everyone, for the delay. We've had some technical difficulties. So again, my name is Lyublena, and I will be talking about the technology we're developing at Datometry. And yes, next slide, please. So the first statement that I'm going to start with is something that's probably pretty obvious to everyone, and that is that all the enterprises are moving to the cloud. And not only that, but they also need to move, next slide, please, their data warehouses with them. And we know that the biggest cloud providers, Amazon, Google, and Microsoft, all have their own cloud offerings for data warehousing. And we've seen a pretty big IPO recently that also shows that cloud warehouses are pretty modern and pretty desirable for the enterprises. Next slide, please. However, it turns out that adopting those cloud warehouses is one of the most challenging problems that IT faces. And next slide. So this is a quote from Gartner, something that, for me as a former academic, was not so obvious: even though we have so many new and better databases, it's actually pretty hard to migrate legacy applications to start using those new databases. And what Gartner estimated is that 60% or more of those migration projects tend to run over schedule, run over budget, and ultimately fail. So the purpose of this talk is going to be to answer why that is the case and also what we can do about it. Next slide, please. So this is a picture, and I don't really want you to look at the individual boxes, because I understand there's plenty of them, but that's how a typical solution architecture looks at one of our customers. So in the middle, you have your data warehouse, drawn as the usual cylinder that we use for databases. And that data warehouse is usually used by two sets of applications. On the left-hand side are the ETL applications that pump and load data into the data warehouse. And on the right-hand side, you have the reporting and analytical applications that read and make sense of that data. So these are the various components that are involved in those applications, and all the boxes painted in gray are the things you need to worry about if you want to migrate that architecture to a new data warehouse, so, essentially, replace the cylinder in the middle. So let's simplify this a bit. Can you move to the next slide?
So again, this is the same picture, just slightly simplified. This is your typical database stack. On the bottom, you have your data warehouse, and here I've shown Teradata as one of the most prominent data warehousing databases out there. And on the top, you have the applications. So those applications, and this will come as no surprise to you, send queries down to the database. These could be read queries, these could be queries that create objects, these could be queries that create and modify objects. And the data warehouse evaluates those queries and returns the results back, and the application does something with those results, whether it's reporting, whether it's visualization, whether it's something else. So if you don't want to worry about migrating the applications, our solution proposes the following, as we can move to the next slide. We introduce a virtualization layer in the middle that intercepts all the traffic that comes from the application, including queries, including loading requests, and so on. And we do real-time translation of those requests down to the database, down to the new database. So the application never needs to know that something changed, that it's actually speaking to a cloud data warehouse. It still thinks it's talking to Teradata, and we take care of actually making those requests compatible. Next slide. So before I tell you how we do it, I just wanted to give you an example of why translating SQL queries is so difficult. And I know it's probably obvious to most of you that have seen at least one or more database engines. But this is a very simple example that is a simplified version of a customer query. And even in these few lines of SQL text, we already have some differences that make this query non-portable to a new system. So if you can click. So yeah, the first example is just a custom keyword. Sorry for the font formatting. But essentially Teradata lets you abbreviate SELECT as SEL. And that's pretty simple. But the next one, next slide please, is a comparison between a date column and an integer. And that's typically not supported. It works on Teradata just because in Teradata dates are represented as integers, so there is a native comparison between the two data values. There are some more differences, next slide. So that is a vector subquery that is standard in SQL but may not be supported on a new and upcoming data warehouse. You compare two values against a set of two other values coming from a subquery. Next slide. And that was supposed to highlight the RANK function, which is a non-standard way to express window functions in Teradata. And if you click to the next slide, there's also a QUALIFY clause, which is a proprietary Teradata clause that lets you put window functions in a predicate, so it's similar to a HAVING clause. So even in this little example, there is a number of things that make the query non-portable. And if you try to rewrite it yourself, if you click, there's going to be a banner: you basically die from a thousand cuts, because there are so many things that you need to worry about. And this is just... Out of curiosity, the previous slide showed that you're feeding in from MicroStrategy, you're feeding in from, whatever, Informatica. Out of curiosity, is there any one application that generates adverse or, like, grotesque Teradata queries, or are they all sort of equally guilty? All of them.
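To make those Teradata-isms concrete, here is a small made-up query roughly in the spirit of the one on the slide; the table and column names are illustrative, not from the talk, and the integer 1200101 is Teradata's encoding of the date 2020-01-01:

    SEL region, channel, amount                            -- SEL abbreviates SELECT
    FROM sales
    WHERE sale_date > 1200101                              -- DATE column compared to an INTEGER
      AND (region, channel) IN (SEL region, channel
                                FROM top_segments)         -- row-value ("vector") subquery
    QUALIFY RANK(amount DESC) <= 10;                       -- legacy RANK syntax inside QUALIFY

Each of these lines runs fine on Teradata but would need a rewrite on most other warehouses.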
So if you move to the next slide, this is the overall architecture of HyperQ. This is our virtualization platform. So here on the left side, you have the legacy applications. They connect through a driver; they speak a protocol that's implemented by a driver. And this could be a standard driver like ODBC or JDBC, but it can also be a proprietary driver of the data warehouse that speaks a custom protocol. And on the right-hand side, instead of the old data warehouse, you have one of the supported data warehouses that you want to move to. So in the box in the middle, we have the various components, and I probably cannot point now that I'm not presenting. But from left to right: it's not just about the SQL text, it's also about understanding the protocol that the application speaks, because we also don't want to have the customer have to change those. So we do have a protocol engine that understands the various messages that are sent from the driver. And once we've understood the message, we unpack the query out of that message, and we give it to the next component, which is the thing in the middle, the translator. So that's the component that I'm actually going to describe in more detail. But the important part here is that the translator takes the input query expressed in the dialect of the source system and translates it into something that can be run on the new data warehouse. We also have other things like a metadata manager, because we need to be able to access metadata in order to resolve the various object references in the query. And another important component is that when we run the query and we get the results back, we also need to package those results and send them in the format that the application expects them to be in. So in some cases, we have to do things like type remappings in case the type system is not exactly the same. And also, we have to package it in the message that the protocol defines for returning results back to the client, because, again, we don't want to change a single thing in the way the application used to run against the old system. So next slide, please. I mean, maybe you'll get into this, but sometimes there's different behavior of built-in functions and things like that. Presumably, you have accounted for all the possible things that people would execute on systems like BigQuery or something on the other side, and you know that. I guess my question is how much of this is hand coded versus how much of this can be trained and learned? Good question. So we hard code the semantics, because that's usually well defined in the documentation of both the source and the target system. So we do make sure that things have the same behavior. We do make sure that things return the same data types, and if not, we add explicit casts to actually make them return the same type. We don't do any learning of what the results should be, because we have to be correct. Sure. Okay. Yeah. Thanks. Yeah. So next slide. So, yeah, again, zooming in on the query translation framework, because to me that's the most interesting part of the product and that's the one that I've worked on most. So again, we said there's a query that comes in, and that query is expressed in the dialect of the source system, in our case Teradata. So we have a parser. We have to parse that query. We have the grammar rules of the source language encoded. And not surprisingly, the parser comes back with an abstract syntax tree, like most parsers do.
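To give a feel for what eventually comes out the other end of this pipeline, here is roughly how the earlier sketch could look after translation, assuming a target that supports standard window functions; the names are still illustrative and the exact output depends on the backend:

    SELECT region, channel, amount
    FROM (
        SELECT region, channel, amount,
               RANK() OVER (ORDER BY amount DESC) AS rnk         -- SEL spelled out, RANK in ANSI form
        FROM sales
        WHERE sale_date > DATE '2020-01-01'                      -- date/integer mismatch made explicit
          AND (region, channel) IN (SELECT region, channel FROM top_segments)
    ) AS ranked
    WHERE rnk <= 10;                                             -- QUALIFY folded into a derived table

When the integer side is not a constant, the date side gets converted to Teradata's integer encoding instead (an example appears later), and on backends without row-value IN the subquery predicate would be rewritten further, for instance into an EXISTS form.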
So the next step, if you can click the next slide, is we actually have to bind this abstract syntax tree into something that is semantically meaningful. So as part of that process, in the binder, we resolve object references. We know what tables are being referenced, what columns those tables have, what types the various columns and expressions are. And the result after this step is very close to the relational algebra that you know from textbooks. So the result is another tree called extra, and it stands for extended relational algebra. But it's essentially a tree consisting of joins, group-bys, and other types of operators. So this is a more or less straightforward translation to an extra tree. And we may have to apply additional transformations to make that query tree executable, or close to executable, on the new system. And I'll show examples of what we do with some constructs. But essentially we have a transformation framework which applies a variety of algebraic rules to make the original join tree, or the original extra tree, closer to something that can be executed on the new system. So the result after the transformation step is again an extra tree. And that extra tree, in the final step, we can feed to a serializer. We can click to the next slide. And what the serializer does is walk down that tree and produce SQL text. So that can now be sent to the backend and executed. So again, if I'm jumping ahead, stop me. So that query rewriting thing, the idea is you know the target is BigQuery, and you know BigQuery can only handle certain kinds of queries and do certain kinds of nesting or have certain patterns. Therefore you have the rewriting specific to the engine you're going to point it towards. That is a very good question. Yes, the correct answer is yes. And I have examples later, so I'll come back to that. All right, next. Next please. Yeah, so this is a framework that is modular and extensible, so you can add new features easily by implementing new parser rules. You can implement transformations that, as Andy suggested, are specific to a particular system. You can also turn off transformations, because those database systems evolve. So in the future, if they add native support for something that you had a rewrite for, you can actually turn it off, and that will make all the queries that use this construct now use the native construct. And if you think about it, if you had to go through the manual translation of queries, even if the database system evolves, you would probably not want to go back and rewrite those queries manually. But here you basically take advantage of it with one step. So I have a few examples to just show you how things look. So here's a simple query. You select from a table T and you have some equality predicate. So after parsing you have an AST that looks like this. So you think you're doing a selection on top of a table T. But when you start binding, well, first of all, you'll resolve T to an actual table in an actual schema. So here DBO is the default schema, so we pull the metadata for that one. And when we start binding the equality predicate, we see that, well, one of the columns comes from T but the other one doesn't. And on most systems that will raise an error, because you forgot to reference S in the FROM clause. But on Teradata, that's actually perfectly valid syntax, and the implicit meaning of it is that you're actually joining the two tables.
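As a sketch of that example, with illustrative column names and the DBO default schema from the slide:

    -- What the application sends: S never appears in the FROM clause,
    -- yet on Teradata this is valid and implicitly means a join.
    SELECT t.a, s.b
    FROM t
    WHERE t.a = s.b;

    -- The shape it is bound to, which the serializer can then emit for other systems:
    SELECT t.a, s.b
    FROM dbo.t AS t
    JOIN dbo.s AS s
      ON t.a = s.b;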
So after binding, instead of having a selection, you will end up with a join between the two tables T and S. And something that I don't show in these trees is that as part of the binding, we also build up the properties at each of these levels. So for relational operators, we will derive things like what the columns are, what the output column types are, and so on, because those are then needed for binding the upper level. And if you've done query optimization or something similar, that's pretty standard. For scalar operators, we will derive things like what the type of the expression is, whether this expression is nullable or not, because that may be handy later on if you want to apply transformations. So next slide, please. This is awful. So you can reference something without declaring it? You can reference something without declaring it. And one other example that I wanted to put in the slides, but I thought I would not have time to go over, is you can also do stuff like select A as B, B as C, and so on. So basically, reference variables or columns that were defined at the same level. So it makes for some, yeah. I mean, they knew their syntax was deviating, so they're trying to get more strict in every version. This is like some shit in the 80s that somebody wrote and they've been living with it for 30 years. And a lot of applications use it. So that's what makes the whole problem so much harder. So this is the next step after binding. We've kind of normalized the query a little bit so that it's closer to relational algebra. And then in the transformation framework, we implement the backend-specific logic for the various systems that we support. If something is not supported, we may have to rewrite it into something that is, and I have examples for that later. We also do some optimizations in that framework. For example, if you have a sequence of insert statements that target the same table, we can batch them into a single insert, which is usually much more efficient to execute, especially on a cloud data warehouse. We do things like common subexpression elimination and subquery unnesting, things that will help the downstream optimizer deal better with the query that we give it. And also, we do type enforcement, and that's to make sure that the various functions and operators return the same type as they did on the original system. Because sometimes, even if the function looks the same, it may have different requirements for the input arguments. So maybe it doesn't work with an integer and a bigint, but it works if you cast one to the other. So we do that in the transformation framework, too. Next slide. So I have a few examples. One is simple. You have a date arithmetic expression, which just subtracts one date from another. And that is sometimes supported natively as an operator. But on other systems, you have to call a built-in function. So if you click to the next slide, on some systems you will have to express this difference by explicitly calling a date diff function. So that transformation will be enabled for those systems, but will not be enabled for the ones that support it natively. So that's pretty simple. So the next example is... if you remember the query that I showed you earlier, where on Teradata it's perfectly fine to use dates and integers interchangeably, on other systems that's not the case. Then you may have to cast the date expression explicitly into an integer. And that happens with... If you click to the next slide, there will be a tree.
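Two quick sketches of the rewrites just mentioned, the insert batching and the date subtraction; the names are illustrative, and the exact target syntax (multi-row VALUES support, the date-difference function name and argument order) varies by system:

    -- Batching: consecutive single-row inserts into the same table ...
    INSERT INTO sales VALUES (1, 'EU', 10.0);
    INSERT INTO sales VALUES (2, 'EU', 12.5);
    INSERT INTO sales VALUES (3, 'US', 7.0);
    -- ... can be combined into one statement on targets that accept multi-row VALUES:
    INSERT INTO sales VALUES (1, 'EU', 10.0), (2, 'EU', 12.5), (3, 'US', 7.0);

    -- Date arithmetic: a native date subtraction on the source ...
    SELECT end_date - start_date FROM projects;
    -- ... becomes an explicit built-in call on targets without that operator:
    SELECT DATEDIFF(DAY, start_date, end_date) FROM projects;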
Basically, you extract the various components of the date and you plug them into this math expression that computes the integer. So once you have this query tree, you can then give it to the serializer and it will produce SQL that looks like this. And that's part of the larger query tree, so it will be a snippet of the SQL that will be sent to the database. Any questions so far? This is crazy. This is awesome. It gets even more awesome. Next slide. So what I've shown you so far is mostly algebraic rewrite rules, where we assume that whatever query and query constructs come in can be expressed with some other shape and form of a query tree. But that's not always the case. In some cases, we'll get a query that we cannot execute directly as is, and we might have to break it up into separate components, run those, and combine the results in the mid-tier. So some examples are the more procedural constructs like macros, stored procedures, recursive queries, and so on. Another famous example is enforcing uniqueness constraints. So if your table had a key or unique constraint defined, those are very rarely portable, because most of the cloud data warehouses decided not to implement uniqueness constraints. And if your application code depends on data being properly deduplicated in the target table, you're running into trouble. So what we do is essentially enforce the constraint on our end: first of all, we keep track in our metadata store that there is a constraint. And then if a DML operation comes in, we would essentially run it in a transaction where we would try to run the DML. If it introduced any duplicates, we would roll back and return an error to the user. If not, we would commit the transaction. So that way we can emulate the same behavior with a set of queries instead of a single query only. So I promised I have an example, and this one covers recursive queries. So this query, you don't have to read the SQL text. What it does is compute all the employees that report either directly or indirectly to a manager with a given ID. So that's a standard SQL construct. So if you click, there will be some little animation. It has this RECURSIVE keyword at the beginning to kind of warn the parser and the optimizer that this is a recursive query. In the base, that's kind of the upper half of the query, we say, well, give me all the employees whose manager has ID 10. And union those employees with everybody who reports to the employees that you just computed. So because this definition of reports uses itself, that's why it's recursive. When the downstream system doesn't support recursion, we'll have to actually evaluate the recursion on our end. So we would step through the stages. So if you click once more, we would first execute the base of the recursion: we'll find the employees that report to manager 10 directly. And then we would iterate until we find no more employees. So we'll basically join that delta table with the original employees table until we don't add any more records to the delta table. And the final result, we'll just have to, if you click once again, we'll just have to click, sorry, yes, read from that final work table, which contains all the entries that we computed by running the recursive query through HyperQ. And we have to also clean up some of the intermediate tables that we've created.
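Two sketches of what this could look like in serialized SQL, with illustrative table names. The first is the date-to-integer conversion just described (Teradata encodes a DATE as (year - 1900) * 10000 + month * 100 + day); the second is the kind of statement sequence HyperQ might drive to emulate the recursive query, where the exact temp-table syntax depends on the target:

    -- Date expression converted to Teradata's integer encoding:
    (EXTRACT(YEAR FROM sale_date) - 1900) * 10000
      + EXTRACT(MONTH FROM sale_date) * 100
      + EXTRACT(DAY FROM sale_date)

    -- Emulating the recursive query as a sequence of statements:
    CREATE TEMPORARY TABLE work_reports AS
        SELECT employee_id, name
        FROM employee
        WHERE manager_id = 10;                   -- base of the recursion

    CREATE TEMPORARY TABLE delta AS
        SELECT * FROM work_reports;

    -- Repeated from the mid-tier until a step adds no new rows:
    CREATE TEMPORARY TABLE delta_next AS
        SELECT e.employee_id, e.name
        FROM employee e
        JOIN delta d ON e.manager_id = d.employee_id
        WHERE e.employee_id NOT IN (SELECT employee_id FROM work_reports);

    INSERT INTO work_reports SELECT * FROM delta_next;
    -- (delta_next then takes the place of delta for the next iteration)

    -- Final result returned to the application, then cleanup:
    SELECT employee_id, name FROM work_reports;
    DROP TABLE work_reports;
    DROP TABLE delta;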
So the HyperQ part holds no data, right? The transaction thing you just mentioned, that would mean you're maintaining the state of the transactions in HyperQ, and then it's responsible for doing the lookup to say, do I have a constraint violation, and whatnot. So we assume that the downstream system gives us transactions. So okay, okay, okay, we try to stay away from implementing a transaction manager, because, well, there are smarter people that have done that. So we assume that we can just do begin transaction. We do send that as a query to the backend. And we do send the various create temp table, create table, insert into, and so on queries. But we kind of handle the control flow. Got it. Okay. Makes sense. All right. Awesome. So I wanted to give you something a bit deeper on the implementation side of things. We've implemented the whole framework in Erlang, which is a functional programming language, which is just kind of awesome when you want to do query rewrites, because you start with a pattern, you transform it, and you produce a new pattern. To give you an idea of the depth and breadth of the implementation and what we support, I thought I would run some statistics. So I did that over the weekend, and that's the current snapshot of our implementation. So for Teradata, we have about 150 different parse node constructs. Those get mapped to about 235 relational operators; beyond the traditional operators, that includes DDL commands, utility commands, and other things that are necessary to support the customer applications. And we currently have almost 250 transformation rules that massage the query to make it work on the backend. Next slide, please. So I'm kind of coming towards the end of my talk, and I wanted to point out a few properties of HyperQ. So one thing that is important is we're now in between the application and the database, and an important aspect of that is that we don't want to introduce overhead. That's a question we commonly get asked: what about overhead? And we've measured it, and we obviously ran experiments for our SIGMOD papers, but most of the time it's negligible, because we delegate the actual data processing to the backend. So we have to handle things like metadata lookups and translations, but those are typically much faster to execute. In the worst case, we've seen something like 2% overhead, which for a virtualization framework is pretty acceptable. So the other important aspect is, well, is there something you don't support? Or how much of the language do you actually support? And that's again an empirical result that we've gotten. The result is based on analyzing a number of customer workloads of millions of queries, and we've discovered that there's about 99.5% support for the Teradata dialect of SQL. And that is important, because our value proposition to customers is that they don't have to change anything. So we can't come up and say, look, we support only 50% of your queries, because that means the other 50% would have to be manually rewritten. And the things that we don't support yet are either on our roadmap or are features that we've decided not to support, because there's little demand for them. So that's kind of a business decision.
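Circling back to the transaction-based constraint emulation discussed a moment ago, here is a sketch of the statement sequence it could amount to, assuming a hypothetical orders table whose order_id key is tracked in HyperQ's metadata but not enforced by the target:

    BEGIN TRANSACTION;

    -- The application's DML is forwarded largely as-is:
    INSERT INTO orders SELECT * FROM staging_orders;

    -- Then check whether the tracked key now has duplicates:
    SELECT order_id
    FROM orders
    GROUP BY order_id
    HAVING COUNT(*) > 1;

    -- Control flow handled in the mid-tier:
    --   rows returned -> ROLLBACK; return a uniqueness-violation error to the application
    --   no rows       -> COMMIT;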
Can you give an example of something that is in that 0.5% you don't support? And actually, out of curiosity, for the ones where you say, we're just not going to support this because there's just low demand for it. I guess the first question is, how much work or effort is it on your part to say, okay, here's this function from Teradata from 1989 that somebody wrote and that they still use; how much effort does it take to figure out what that function is and how to actually convert it? Yeah, I mean, it's sometimes a conscious decision. Well, if it's something that has a direct corresponding feature on the backend, then it's a day's worth of work. We've sometimes implemented custom UDFs that occasionally make it into the built-in functions if we make it a point with our partner warehouses that this is an important feature, which, again, is probably easy. There are other, bigger features, especially the ones that need emulation, things like stored procedures where there's a variety of constructs that we need to support, like error context handling and exceptions and all these things. So for these, usually, and again, I mean, we start with a simple implementation and we add to the backlog things to implement based on what we see in the customer workloads. As for things that we've decided not to support, something off the top of my head is XQuery. So there's some XQuery functionality in Teradata. We said, well, for the 1% of queries that touch this, we'll just say hire an SI to help you rewrite those, because there are more important things. Yes, okay, that's a good response. And yeah, so the third point is something that I didn't talk about, but it's also important if you're an enterprise that's looking to migrate: beyond relational queries, we also support things like ETL tools, and we map the fast loaders of Teradata to the fast loaders of the target system so that the performance is still as expected. So out of curiosity, like, I mean, Teradata is, you know, it's from 1979. There's, I don't know how many different iterations of the drivers for ODBC or JDBC, like how many different versions are out there? Like, did you guys make an effort to say, let's go find every JAR file or every version of JDBC that they support and just try to put that through your whole testing pipeline to see what you support and don't support? Or do you cut it off at some point and say we only support up to this version? Like, how bad was that, actually, supporting all the versions of the Teradata driver? Yeah, so I think, for the Teradata driver, we support two or three major versions, and again, that's based on customer needs. I think the oldest one is 13, so we don't support anything before that, and the majority of stuff is in Teradata 15, and I think there's a couple of new things in Teradata 16, but it's pretty much backward compatible and the same. Got it, okay. The reason I ask is, I remember Mike was telling me how much effort they went through in the early days of Vertica just trying to support all the different versions of the Postgres JDBC driver that all the customers had. It was a huge pain in the ass, a big game of whack-a-mole. My student Matt got our system running with the latest version of the JDBC driver for Postgres, and it seems to work. We don't have customers, so we haven't looked to see how much harder it's going to be to support all the older protocol versions. Yeah, again, it's kind of driven by customer demand. I think we started with one of the versions initially.
We've kind of taken the spec of the protocol and implemented it, and then we saw, oh, there are some customers that are on older versions that differ significantly. So we then had to kind of debug and reverse-engineer that one too. Got it, because they give you money, so you do it. Yeah. Yeah. I mean, now that we've done it, it's applicable to all of them. Sure. Yeah. So yeah, so kind of the summary is that we provide pretty extensive functionality and we don't incur overhead. I guess, next slide. So this is kind of more of a marketing slide. I'm not sure Andy will like it, but again, our value prop is that we do it for the customer. We enable those migrations, and now they run in significantly less time. I think the quote was that the original migrations sometimes tend to run three, four, five years, and now it can be done in a few months. And obviously it costs less money, because you have to pay fewer people to do manual stuff and rewrite queries. And you also don't have to pay the Teradata license while you're migrating, because now the migration is done much faster. So the migration time, though, is getting the data out of Teradata and shoving it into whatever backend thing. Like, your thing is instantaneous, right? Like, other than maybe some functionality you don't even support, in theory they point at Datometry and it just works. So the migration cost really is just moving the data, right? Yes, yes and no. I mean, obviously we're a startup, or a somewhat advanced startup, but the way the projects usually go with customers is that we start with a kind of workload analysis to see if we actually support them or not. And we have a tool that I don't show here, and hopefully we'll be able to write a paper about it, but it essentially runs millions of queries through a preview version of HyperQ and it counts how many things we support. It spits out things that we don't support and that we would need to add to the backlog if we are to support this customer. And that's kind of the first step. And if those numbers are convincing, we go ahead and we propose the implementation plan. And, you know, things that need to be implemented get implemented. The data itself, well, data is not static. So there is this one-time migration of data, but usually you have to point your ETL scripts to pump data into the new warehouse. So what customers typically do when they start a migration project is they run the two warehouses side by side for a few months, and that helps them resolve whether there are any bugs that they see, but also tune performance if they need to. And sometimes that involves us implementing transformations to make better use of the data warehousing features that they now use, but also compare and make sure that the things are correct. So there is still some overhead in doing that, but there's no manual query rewriting that needs to happen unless it falls into that 0.5 percent of features that we don't support. So yeah, there's still some overhead, but it's typically much smaller because we don't do any manual rewrites. Okay, awesome. Next slide. Yeah, so this is a recent case study with one of the UK retailers that migrated, and if somebody's interested there's a link, but essentially they report on how they migrated from Teradata to Azure using Datometry. And kind of coming to the end of my talk, here I presented Datometry's HyperQ product, which is a virtualization framework that is this middleman between the application and the database.
So what it does is it decouples the application from the database, so now you can switch the database and not have to worry about making your applications run against the new system. Next slide. So we've published a lot of that work. My talk was mostly based on our two most recent SIGMOD papers, so if anyone is interested, please take a look at those, and that's all. Awesome, I will clap on behalf of everyone else. All right, anyone have any questions for Lyublena now? Okay, so I guess my question would be, what's harder? Is it harder to support a new backend data warehouse, or is it harder to support a new front end, Exadata instead of Teradata or something like that, or are they equally challenging? Since we've mostly added backends, that is relatively easy, assuming obviously that the backend is kind of functional and has the usual constructs. So if it's something that doesn't let you run DML, then it's probably not that easy to add, but if it's a fairly functional backend, usually adding a new one involves implementing a metadata interface that lets you look up metadata, and again, that can be simple if the database supports something like information schema, because then we don't have to change our queries. The other thing that's different for new backends is the serializer component that actually takes care of producing SQL, and that is also fairly common, because on the front side of things, we have to worry about the various dialects and the various proprietary features, but when we serialize SQL, we can stick to ANSI SQL or something that will be common across all these backends and just worry about the differences. So in terms of code sharing, that's actually pretty compact too. For supporting a new front-end system, that will typically involve more work, because you have to implement the parser, and we have to make sure that the semantics is preserved across all these operators. So actually, I didn't show that, but one of the earliest papers we had... actually, when I started, we were targeting not Teradata but a different database called KDB. So this is a database... Oh, that's not even SQL, that's not SQL. We were mapping everything to SQL. So this internal extra representation, or the origins of it, dates back to the time when we supported KDB, because we wanted to map things into SQL, and we wanted to map things into a more relational kind of target system, but the parser was obviously completely different, because that's not SQL. Yeah, that's crazy. The Teradata market is way larger than KDB, right? Which explains why we moved on from KDB. So the other question I have is the UDF stuff, that translation. So one, that's not parsing SQL, that's parsing PL/SQL or whatever, I don't know what Teradata's is, right? Is it sort of the same steps, or is that a little different, right, because that's like mapping from one DSL to another DSL, but are there semantics about the SQL inside of it that you can take advantage of to rewrite and be more intelligent? Does that make sense or no? Yeah, yeah, that's... so, yeah, so stored procedures in Teradata are essentially a bunch of SQL statements that are connected with control flow, like if-then-else, do that; if there's an exception, or if you get this error from the backend, raise this exception to the user, or execute this function and return the results to the user.
So it's part of the same parser. It's still kind of SQL with control flow, and we do have extra nodes, we do have these operators that express control flow in the extra tree. We don't really make a difference. We don't support truly procedural things like Python and so on; we assume that if Python is a supported language in the new system, those can just be dropped in there. And what about the transaction stuff? That I'm sort of curious about as well, because, like, it's one thing to match SQL, right, but now if there are implicit semantics of transactions, you know, Teradata might do something different than what, you know, BigQuery might do. Again, I realize these are data warehouses and people aren't running, like, hardcore applications, but, like, are there variations in, like, isolation levels that you have to deal with going from one system to the next? You're asking the hard questions. Yeah, so there may be a difference in isolation levels, but we try to stay away from those, in the sense that those are usually not as important to warehousing applications. We obviously want the target system to support transactions, because they do show up and we do need them for some of our emulation. But not at the level where we have to worry about isolation levels. Yeah, okay. All right, does anybody else have a question? I have one last question. Anybody else? Okay. I'll ask you the question I like to ask everyone else. How stupid are your users? Like, when they come to you, do they think Datometry is going to do XYZ and you're like, what are you talking about, you're way off on this? Or are they people that, if they're already sort of managing Teradata, sort of have some intelligence about what the hell they're doing, and they're obviously coming to you because, you know, they're sophisticated enough to realize the path they're going down is not the right thing and this is a better approach? Now I have to be careful, because you're going to post this video online. I have to be careful how I answer this. Yeah, so there's one difficulty, which is that the users, well, those IT departments, the ones that are responsible for managing the Teradata instance, are usually not really the business users that are writing the applications. So sometimes those people don't really know what the business logic in those applications is, and that's part of the deal: we don't want them to worry about the logic, and we don't want them to worry about the implementation, or, if it were going to be a manual rewrite, about the correctness of those rewrites. So that's one. The other one is, since we preserved the Teradata interface, anything that they used to know about Teradata still kind of holds, because they can still write or modify these applications if they want to.
Also, obviously, if somebody is moving to the cloud, they probably want to take advantage of what these cloud warehouses give you, not kind of be stuck in the 70s with Teradata, but develop new stuff, and that's no problem too, because, well, you migrate your stuff, you don't have to worry about that migration project running over time, and once you're done, you can now start developing new applications, or even changing and kind of re-innovating or re-implementing stuff that you wanted to but didn't want to couple with the migration step. Got it. Okay, and then I think, I mean, the marketing slide is useful, because, like, was this the one? I mean, the cost is massive, right? Like, because Teradata is so expensive, right? And going to the cloud is a fraction of that. That's huge. Yeah. I mean, I used to work at a database company, and something that, I mean, I wasn't bothered with so much back then, because I was developing this cool new database technology, was that even though, you know, you can believe your database is better, it is still very hard to convince the customer to move to it because of that migration cost. Most of these database vendors usually target kind of greenfield opportunities where you develop your application from scratch, but don't really go after the full market with the old legacy applications. Yep. It's a reason why IBM still makes bank on IBM apps, right? That fucker's not going anywhere. Yep. Okay. This is awesome. Lyublena, thank you for doing this. Thank you for sticking with us even through the technology issues.