All right, good afternoon. My name is Josh. It's a pleasure to be here, and thank you all for coming. I'm going to talk today about doing continuous integration testing for your database migrations. To be clear, this isn't an OpenStack-specific talk. Rather, it's a talk about how to test your database migrations, and perhaps why you should test your database migrations, so you can hopefully apply this to the projects you're working on, whether they're internal projects, OpenStack projects, or other open source projects.

So yeah, as I said, I'm Josh. I work for Rackspace on OpenStack, which is awesome. This talk was actually originally going to be given by Michael Still. Unfortunately, he was a bit too busy to give it himself, so he asked if I could help. However, he is in the room, so if I'm unable to field any questions, I'm sure he'll clarify. We both worked on this together, and we've applied it to the Nova project, so we have implemented continuous integration for the database migrations there.

First we need to cover some basic terminology to make sure we're on the same page. Most of you probably know this, so I'll go through it quickly.

A schema version is a number that we give to the database in a particular state. We do this to be able to determine how to upgrade people who have already deployed the software to a newer version. So at schema version 200 you might have 10 tables, and then at 201 you might add a new table and have 11, or add a new column, or drop a column, and so on and so forth.

A database migration is the process of moving between these two versions. It's the database migration that will add in that table for you or remove it. A migration can go in both directions, so you can undo an upgrade. There's an argument to be made about whether or not that's useful, but we won't go into that today.

A dataset, for the purposes of this talk, refers to a database copy that we've taken so that we can test against it. We might have a dataset with 100 users, or 100,000 users, and so on, so that we can test how migrations behave at different scales.

SQLAlchemy is the object-relational mapper that Nova uses. It's what actually crafts the SQL for you and then executes it. You can hand SQLAlchemy an object and say, persist this to the database, please, and it will build the query and commit it to whichever engine you're using.

And Zuul is the gating system that the OpenStack infrastructure team have written. It basically launches a bunch of jobs when triggered by various events, and we'll look at that in a bit more detail shortly.

Just to give you an idea of what a migration looks like, this is a very simple one I found in Nova. Basically, the server name column was no longer required on the instances table; it was moved into the metadata. So this migration removes that column altogether when you upgrade, and when you downgrade, of course, it's added back in. This is what a migration looks like, and we want to make sure these are tested well.

And yes, we're talking about testing, and you probably know what continuous integration testing is, but in this case we're basically just saying: it's testing on every single commit, or every single patch, to ensure that it works and to make sure we have confidence in the master state of the code, so that you can do continuous deployment or utilize the code however you choose. So why do we want this extra kind of testing?
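To make that instances-table example concrete before moving on, here is a minimal sketch of what such a migration might look like, in the sqlalchemy-migrate style Nova used at the time. The table and column follow the example above, but the code itself is an illustration, not the actual Nova migration:

```python
# Illustrative sketch of a migration, not the real Nova file. It runs
# under sqlalchemy-migrate, whose changeset extensions provide
# Table.drop_column() and Table.create_column().
from sqlalchemy import Column, MetaData, String, Table


def upgrade(migrate_engine):
    meta = MetaData(bind=migrate_engine)
    instances = Table('instances', meta, autoload=True)
    # server_name now lives in the instance metadata, so drop it here.
    instances.drop_column('server_name')


def downgrade(migrate_engine):
    meta = MetaData(bind=migrate_engine)
    instances = Table('instances', meta, autoload=True)
    # Re-create the column on downgrade; the dropped values themselves
    # are gone, which is exactly the lossiness discussed later.
    instances.create_column(Column('server_name', String(255)))
```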
I mean, OpenStack's very good at the testing that it does. There's a whole heap of tests that the infrastructure team run: style tests, unit tests, scenario tests, and so on. But one thing that wasn't particularly great was testing upgrades against production data.

In the unit tests that Nova has, it would actually test the migration. So there would be a table with, say, ten rows, and a column type would change. Before the migration, the column type was checked: yes, it's currently a varchar. After the column type's changed: yes, it's now a boolean. Okay, that's good, it works. This is fine when you have five rows and it passes the test. But what happens when you have a hundred thousand rows or more? And a hundred thousand's not actually that many. It turns out to be very, very slow, and it can take tens of minutes to do.

And this is an issue because running a public cloud, or any kind of cloud really, requires a bit of downtime, or at least downtime of your APIs, to do this maintenance. So it requires doing this within a maintenance window. And when you do work in a maintenance window, you might be doing more than one migration. When you start adding up three or four migrations, the total gets a bit unwieldy, and it uses up hours of your maintenance window. When you have SLAs to uphold, that can be an issue. So performance is a concern and one of the motivations behind this.

But there's also schema drift. Again, in the example I gave, a table might want to convert a column from a varchar to a boolean, because someone had originally stored true or false as a string for some odd reason. But what if somehow someone had managed to store a different value in there? How do you handle that different value? With real production data, the actual data in the database might not be what's expected. The unit tests that exist in Nova may only have those five rows to test against, and we don't know if the real data looks like that. We actually had a few cases in doing this where people running real OpenStack clouds couldn't upgrade their database, because it had somehow gotten into a state that wasn't compatible with the migration. So this kind of testing makes sure that your new migrations will work on your production data. We want to catch schema drift, and similarly broken downgrades: you might be able to upgrade the database, but can you go backwards? That's mostly a schema drift problem too.

So what are our goals here? What are we trying to do? We want to catch these slow migrations, and we want to catch the migrations that don't work against real datasets. That is to say, we want to ease the pain for operators. We want to make sure that the migrations that land in the code don't only work for the five rows in the unit test, but also for the 500,000 rows that a real deployment will have to upgrade.

But we also want to test existing migrations. What if someone proposes a patch into Nova that affects the way SQLAlchemy is used? They could significantly slow down an existing upgrade. So we actually want to test every single patch in Nova against every single existing migration, because if someone hasn't run that upgrade yet, we don't know how slow or painful it's going to be once a new patch affects it. So we need to also catch these problems early.
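Going back to those unit tests for a moment, they have roughly this shape: seed a handful of rows, run the migration, check the column type. This is a hypothetical sketch; the table, column, and migration_api helper are stand-ins, not Nova's actual test code, and reflection details vary by engine:

```python
# Hypothetical sketch of a migration unit test: a few seeded rows,
# then assert on the column type before and after the upgrade.
import sqlalchemy


def test_upgrade_201(engine, migration_api):
    meta = sqlalchemy.MetaData(bind=engine)
    table = sqlalchemy.Table('instances', meta, autoload=True)
    assert isinstance(table.c.locked.type, sqlalchemy.String)

    # Only five rows -- which is exactly the limitation: this says
    # nothing about how the migration behaves against 500,000 rows.
    engine.execute(table.insert(), [{'locked': 'False'}] * 5)

    migration_api.upgrade(engine, 201)

    meta = sqlalchemy.MetaData(bind=engine)
    table = sqlalchemy.Table('instances', meta, autoload=True)
    # How the boolean reflects back depends on the engine; this is
    # the idealized check.
    assert isinstance(table.c.locked.type, sqlalchemy.Boolean)
```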
Once a migration has landed in the code, there are a lot of people running OpenStack in a continuous deployment environment. That is to say, every time a patch lands, they're merging it in, and we don't know what state their databases are in. If they've run the upgrade as soon as it lands, we can't change that upgrade afterwards, because then we'd end up with two people with two different schemas, the code would have to support both, and that would be terrible to maintain. So it's very important that we catch these problems early. The primary aim, though, all comes back to easing the pain for the operators.

For those who haven't seen it, this is how Nova does its migrations: nova-manage db sync. You can supply it with a version number if you like, but otherwise it will go to the latest available version. So this is the command that can cause operators pain.

So how do we determine if it's going to cause pain? Firstly, we can look at whether it fails to run; that's quite an obvious metric. It could have failed because the schema has drifted, or because the patch was just obviously wrong. But we're also concerned with the timing: how long does it take to move from one version to another? And we can also look at some statistics: how many reads, writes, or deletes did a migration do, and is that acceptable within a given range?

So how do we implement this? This is a very crude illustration of how the OpenStack infrastructure team have built their continuous integration. A human proposes a patch into Gerrit; Gerrit is the code review system most of you have probably seen on review.openstack.org. Zuul, which I mentioned earlier, subscribes to the event feed from Gerrit and notices when a patch is created. Zuul says: oh, a patch has been created, I need someone to launch the PEP 8 jobs, the unit test jobs, and the scenario jobs; can anyone do those for me? And Jenkins says: hey, yeah, I can do the PEP 8 job, or I can do the unit test job, please hand it to me. So Zuul sends the work to Jenkins, Jenkins checks out the patch, runs the tests, and reports back the results. That's the basic CI that runs in OpenStack.

The system we've implemented runs in parallel to this, and there are a number of reasons for that. The primary one is that we didn't want to burden the infrastructure team with running our code while we were developing it, while it might not have been stable. The second reason is that we also want to test against datasets that we don't necessarily want made public; we'll look into security in a bit. So for now, we run this system as a third-party test.

We have our own Zuul that subscribes to the same feed of events from Gerrit. Our Zuul will notice when a patch is uploaded and say: oh, I need somebody to run the database migrations against 100 users, or against 100,000 users, and so on. Then TurboHipster, which is a job runner we've developed, will say: hey, I know how to test database migrations with 100 users or 100,000. So Zuul hands the job to TurboHipster, which runs the migration, checks the metrics, uploads some logs for extra information, and returns the result.
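The metrics it checks start with the two cheapest signals from nova-manage db sync: did it fail, and how long did it take? A minimal sketch of capturing both (the command is real; the wrapper around it is ours):

```python
# Sketch: run the upgrade and record the two most basic pain metrics,
# the exit code and the wall-clock time.
import subprocess
import time


def run_db_sync(version=None):
    cmd = ['nova-manage', 'db', 'sync']
    if version is not None:
        # Stepping one version at a time lets us snapshot the
        # per-migration statistics in between.
        cmd.append(str(version))

    start = time.time()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.time() - start

    # A non-zero exit code means the migration failed outright,
    # e.g. schema drift or a simply broken patch.
    return result.returncode == 0, elapsed
```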
To give you an idea of the scale: I believe in the Icehouse release, in the Nova project alone, so in a six-month window, there were about 260 unique contributors sending around 100 patches a day to Gerrit, and we run about eight different dataset and engine combinations. So that's about 800 tests per day. And we're able to sustain this testing, with a turnaround time of about 30 minutes, which is quicker than Jenkins and the other tests, on only 20 nodes. So it's not a huge burden to do this kind of testing, and frankly we have redundancy in that; we could have fewer nodes and still maintain that time. And if you're not concerned about a fast turnaround and you're happy to wait, you could do this on one or two nodes and wait for the results to be returned.

So let's look at TurboHipster. To be clear, the name was actually kind of random, but it stuck. I never actually thought I would be giving a talk about hipsters, but it turns out I am. What is TurboHipster? As I said, it's a test runner, or a job runner, with a series of plugins, and the plugins know how to run different jobs. It registers with Zuul: it says to Zuul, I know how to run the database job, or even, I know how to run the PEP 8 jobs. And when it receives a job from Zuul, it runs the tasks needed by the plugin, uploads the logs (to Swift, or via SCP, or a copy on disk), and returns the results. So TurboHipster actually sends the results back to Zuul, and Zuul collates these for every single job it requested. It's Zuul that reports back to Gerrit; even though in Gerrit you might see Jenkins or TurboHipster, that's only there for legacy reasons.

So this is what a TurboHipster config looks like. It basically wants to know a bit of information about the Zuul server, and also things like the Gerrit site, so where to get the patch from; the Git origin, where to get the master or stable branches from; and the Gearman host. Gearman is the protocol that Zuul talks to its workers over, so TurboHipster needs to know where that is; Zuul comes with a built-in Gearman server, so it's usually the same host, but you can use a different one. We set up a few directories, like where to store logs and where to check the code out into, and how to publish those logs; we use Swift, of course, but you could use other things there, like copying on disk. And you can extend the config in a conf.d directory, which is where we have our plugins listed.

There are two main parts to a plugin that are required: the name and the function. The function is what is handed to Zuul. So in the first example here it's saying: I know how to do the job real-db-upgrade, Nova, MySQL, devstack, 131007. That is to say: I know how to run the tests for the database migrations, for the Nova project, against the MySQL engine, for the DevStack dataset, where 13-10-07 happens to be the date we took the snapshot. So when Zuul asks somebody to run this job, TurboHipster says yes. The name is how the plugin is loaded, so this will run the real-db-upgrade plugin. The datasets directory for this plugin contains a config file; the most important part is the seed data down the bottom, which is used to load the database before the tests are run. There's some other information here, such as the username and password, and how to set up the database to be tested against, and there's some tuning information as to what is appropriate for tests to pass; we'll return to that in a bit.

So the real-db-upgrade is the part that actually does the testing of the migrations. Up until this point, we've just looked at how you can do some basic CI, and it's a bit of an anti-climax, because it's not that complicated.
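To give a feel for that registration step, here is roughly what a worker like TurboHipster does with the Python gear library that Zuul uses. The host and the job name are illustrative of the naming scheme described above rather than copied from our config:

```python
# Sketch: registering with Zuul's Gearman server and waiting for work,
# using the Python 'gear' library. Host and job name are illustrative.
import gear

worker = gear.Worker('turbo-hipster')
worker.addServer('zuul.example.org', 4730)
worker.waitForServer()

# Advertise which jobs this worker knows how to run.
worker.registerFunction(
    'build:real-db-upgrade_nova_mysql_devstack_131007')

while True:
    job = worker.getJob()     # blocks until Zuul hands us a job
    # ... run the plugin's tasks, upload the logs ...
    job.sendWorkComplete()    # report the result back to Zuul
```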
Now, for the patch set, it will check out the code, of course. It uses gerrit-git-prep for this, which is a script developed by the infrastructure team. It's impressive in how it calculates what it does, but needless to say, all you need to know is that it prepares the Git directory in the state the code is expected to be in once the patch merges. We create a copy of this tree and call it "working".

Then we need to bootstrap the database. This is just loading the seed data into MySQL, or whatever the engine is. Then we upgrade to the current state of trunk. The datasets we're testing against might be at various versions, because we want to see how they upgrade; we've taken snapshots at certain points in time, and they might be at version 170 to begin with, or at version 200. We don't necessarily know, and don't necessarily care. But one thing we do is check out each stable version of Nova, run the migrations against that, move to the next stable version, and then move to trunk. That's what this is doing here; you don't need to read it, it's just checking out the stable branch and then running the db sync. The db sync is the part that calls the nova-manage command we looked at earlier. The only reason we supply the version here is so that between each migration we can pull out the InnoDB statistics; otherwise it would just upgrade straight to the latest version in the currently checked-out branch.

Then we upgrade to the current state of the patch, because the patch might also have some migrations that we want to run. Then we downgrade back down to the first version in the last stable release, so that we can test the timing of the downgrades, and then we upgrade again with the patch set applied. Better visualized, it looks something like this: whatever version you start at, it doesn't really matter, it will go through each release, Havana, Icehouse, master, which will become Juno, then your patch set. Once it's done that, it will go back down to the beginning of Icehouse, or version 216, and then forward through again with the patch set checked out, so we can test against that.

Pass or fail is based on analysis of the logs, the output from nova-manage. The first metric we actually use is whether or not nova-manage failed: did the migrations just not work, because the patch didn't work or because of schema drift? We check the return codes, and if it has failed, we can return false at this point. Otherwise, this is roughly what a log file looks like: the migration is being run, then it says it's done, and it goes through each one like that. To get the timing information, we just take the difference of the timestamps. And in between each of those we get the InnoDB statistics, which look like this, and from them we can pull out things like how many rows were read or how many were updated.

So, is five rows read too many? It's a pretty low example, but what is appropriate? Maybe five rows read is too many for a database where you've only got one user, but if you have 100,000 users, it seems pretty appropriate. So we need to be able to tune our testing on a per-dataset basis.
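The shape of that tuning is a default limit per metric plus per-migration exemptions, which is exactly what the dataset config we're about to look at expresses. As a sketch of the logic (the key names and numbers here are illustrative):

```python
# Sketch: apply one dataset's limits to a migration's metrics.
# A default cap per metric, with per-migration exemptions.
LIMITS = {
    'innodb_rows_read': {
        'default': 100000,      # no migration may read more...
        '215->216': 1000000,    # ...except this known-heavy one
    },
    'migration_time': {
        'default': 60,          # seconds
        '231->230': 120,        # downgrades get their own limits
    },
}


def within_limits(metric, step, value):
    per_metric = LIMITS[metric]
    return value <= per_metric.get(step, per_metric['default'])


# e.g. a patch whose 215->216 step read two million rows fails:
assert not within_limits('innodb_rows_read', '215->216', 2000000)
```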
So, if we come back to the dataset config, we can see we've got configuration items such as InnoDB rows read, and down the bottom you can see the default we specify. We're saying that for this particular dataset we're testing against, no migration may do more than 100,000 reads; if it does more than 100,000 reads, it fails the test. That's the failure metric we're using. However, we have to have exemptions to this rule, partly because some migrations already existed before we implemented this, but also because sometimes you just need to do more than 100,000 reads; it might be unavoidable for a migration. So that's what the other items in this JSON represent. We're saying here that migration 215 to 216 is actually allowed closer to a million reads. We do the same thing for rows changed: no more than 100,000 rows can change for this particular dataset, again with exceptions to the rule. And of course, for timing, we say it can't take any more than 60 seconds, with exceptions. And these work in both directions: migration 231 back down to 230, so a downgrade, is actually allowed 120 seconds as opposed to 60.

So once we've looked at these metrics and determined whether the job has passed or failed, Zuul collects the results into a report that looks something like this. Each one of these is a different dataset that has run on a different TurboHipster node; Zuul's collated the results and reported them back to Gerrit. The blue part is actually a link to a log, and then you get a success or failure message and some extra information, so when developers see this, they can drill into why it might have failed. As an aside, the user 002 dataset that's failing here was failing because of schema drift in one of the downgrades, so we had to backport a fix into Nova before this test passed. It's working now, however. So we actually uncovered a bug there.

I alluded to security before, and security is a bit of a concern; it's a little bit scary. I think this probably applies to everyone running some kind of continuous integration, specifically when doing it in open source or as a public project; maybe less so in your company if it's only an internal project, but it's still a concern. Specifically, we don't necessarily know who is proposing the patches, or whether they have malicious intent. We don't know what that code looks like, and we're executing it on our nodes. And it's perhaps worse for us, because our nodes need to contain copies of the databases we want to test against. So how do we ensure that these databases are secure? I should point out that in our current deployment we're not actually testing against Rackspace's public cloud database, before you go trying to attack it. But we have done some security mitigation.

Firstly, there are only a few people who have access to these nodes. So if you do write the database out to disk, well, it's going to be difficult to get it off. You might say: why not open a port somewhere and just upload the data to one of the servers you're using? Well, we run all the code with networking turned off, so you can't open sockets and you can't upload the database. The only thing that leaves the system is the logs. So what if you dump the database there? Well, we actually filter the logs, and if we notice anything suspicious in them, we won't serve them. Maybe this isn't ideal: someone might write a patch that somehow trips the filter, and then they won't be able to investigate the logs to see why their code failed. But that's not too concerning, because we can go in and have a look manually and say: well, actually, this patch isn't malicious; we can release these logs or provide some extra feedback.
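Purely to illustrate the idea of that filtering, a check like the one below could refuse to publish anything that looks like a data dump. The patterns are invented for the sketch and aren't the rules we actually apply:

```python
# Sketch: refuse to publish logs that look like they contain a dump
# of the database. Patterns are illustrative only.
import re

SUSPICIOUS = [
    re.compile(r'INSERT INTO', re.IGNORECASE),   # SQL dump statements
    re.compile(r'[A-Za-z0-9+/]{200,}={0,2}'),    # long base64 blobs
]


def safe_to_publish(log_text):
    # Anything suspicious is held back for manual review instead.
    return not any(p.search(log_text) for p in SUSPICIOUS)
```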
And we're also working on dataset anonymization, so that if you were somehow able to get a copy, it wouldn't mean anything to you. As part of the anonymization we also want to do a kind of statistical jumbling, for example adding in extra instances for a particular user, so you can't get an indication of what kinds of resources, or how many, a particular user is running.

So how do we do networking turned off? We initially investigated using Linux containers, which looked promising, but it turned out to be very problematic with Python virtual environments. Every patch that merges may contain a change to the Nova project's requirements, so we have to set up a new virtual environment and pip install Nova's requirements on every patch, which means we need networking for that step. And then, once we tried to move the virtual environment into position, it just failed. We probably could have gotten this to work, but we actually found a simpler solution, which was to use network namespaces, a feature of modern Linux kernels. It basically allows us to define a different namespace for the migrations to run in, as opposed to the default one, in which you can have different routes or different IP rules and so on. So we set up a namespace called nonet, and we let it peer with the default namespace so that we can get back to MySQL. Better visualized: the database upgrade process runs in a different namespace from the other processes and is only allowed to access MySQL. And this is roughly how we execute the command: we just say, run the command within the nonet namespace.

So how might you do this yourself, for your own system, for whatever project you're working on? Well, the first thing is to set up some kind of continuous integration, something to trigger the jobs. Zuul is very good at this; if you're using Gerrit, it's an obvious choice. You can read Zuul's documentation to set it up, and you can configure it to trigger on patch sets and launch jobs for TurboHipster. We've looked briefly at how to configure TurboHipster. You need copies of however many databases you want to test against; maybe you could simulate a database to test against. But you'll probably need to tweak or rewrite parts of the shell script, specifically the part that calls nova-manage, which will need to call however you do migrations in your project instead. It's actually not that complicated; it's a one-line change, and maybe we could make that more modular. The output, or the log parsing, is also a little bit of an issue. You'd have two options there: you could write a parser for your logs that interprets the output of your database migrations, or you could change the output of your database migrations to look more like Nova's.
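Coming back to the namespace trick for a second: running inside the namespace really is just a command prefix. A sketch of the wrapper, using the nonet namespace name described above:

```python
# Sketch: run the migration inside the 'nonet' network namespace so
# the untrusted code can't reach anything except the peered MySQL.
import subprocess


def run_in_nonet(cmd):
    # Equivalent to: sudo ip netns exec nonet <cmd>
    return subprocess.run(
        ['sudo', 'ip', 'netns', 'exec', 'nonet'] + cmd,
        capture_output=True, text=True)


result = run_in_nonet(['nova-manage', 'db', 'sync'])
print(result.returncode)
```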
So, as promised, we found some interesting bugs doing this. Early on in this process, before we were voting on patch sets, a migration landed that would take 20 minutes to run against a database with 45,000 block devices. And that's actually not that many, particularly when you consider some of those may be in a deleted state. So this was very, very slow, and the DB CI we were running at the time caught it. When we let the developers know about it, they went: oh, that's probably not very good. So they proposed a new patch to improve it and waited for the CI system to finish. They looked at the results and said: that's actually an improvement, but it could be better. So they proposed a new patch, and they iterated on this, looking at the results from the CI, until they got it down to an appropriate time. They actually managed to get the whole operation down to sub-second, from above 20 minutes, which is quite impressive. It turns out they just did everything in SQL, because it turns out database engines are good at handling data. You can see how they did that by visiting this link, which you can pull from the slide there. I believe the original was pulling all the data out of the tables, manipulating it, and writing it back in, so it wasn't ideal.

Another example is one I've already used a little in this talk. We had a column that was originally a string, but due to some information being removed or refactored (I believe it might have been moved into metadata instead), it was now just a true-or-false string, effectively a boolean. So why don't we make it a boolean? Well, changing the data type of a column requires a whole-table lock and a rewrite of the entire table. This is fine when you have five rows to test against, but when you're testing against 500,000, it's a very long process. So how do we handle this kind of problem when we actually catch it? You open a discussion with the developer: why do we do this, why do you need to change the column type, can we optimize it, are there alternatives, and so forth. The result of this one was that to maintain API compatibility it has to be a string anyway, and when the API version is bumped, the column itself may be dropped altogether. I'm not entirely sure of the details, but needless to say, for now the patch didn't merge. And this is a good example of TurboHipster catching something that would have been incredibly painful for operators.

We gained some secondary effects from doing all of this as well. We've run, I've lost count, thousands and thousands of tests against multiple engines. We tested against MySQL and Percona; we only tested against those because those are the ones we have example datasets for, but we should also be testing against Postgres and other systems as well. We're able to use this information, and looking at it we can actually determine that Percona is, on average, slightly faster than MySQL, at least at these migration operations. And we actually use this information to tune the acceptable times for a particular dataset: we're able to take two times the standard deviation and say that the migration should take no longer than this amount of time for any given patch, and if it does, the new patch fails. So we can do some interesting tuning from that.

There's still a little bit of guesswork, though. What is an acceptable baseline for any new patch? The timing baseline has to apply to every single table, so while it might be a 100,000-user OpenStack database, there might be a table with only 10 rows in it, or one with many, many more. So that baseline timing's a bit tricky, and it's also a little bit of guesswork. We think 60 seconds is more than enough time for most migrations to run within; if they need more than 60 seconds, then let's open a discussion as to why. That's set a little bit arbitrarily, but it's also based on our experience of having seen how migrations behave against real databases.
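That two-standard-deviations tuning is simple enough to sketch; something like this, where the 60-second floor is the arbitrary baseline just mentioned:

```python
# Sketch: derive a dataset's allowed migration time from historical
# run times, as mean plus two standard deviations, never below the
# 60-second baseline.
import statistics


def allowed_seconds(history, floor=60.0):
    limit = statistics.mean(history) + 2 * statistics.stdev(history)
    return max(floor, limit)


# e.g. consistent ~45s runs stay on the 60-second floor:
print(allowed_seconds([42.0, 48.5, 45.1, 51.2, 44.0]))
```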
So, where to from here? Zuul currently only triggers off Gerrit events, so patches submitted to Gerrit. Zuul's actually quite modular; it wouldn't be difficult for it to launch off, say, GitHub pull requests. That would be a nice improvement, so other people could utilize this a little more easily. Dataset anonymization is important to us; we need to make sure there's a way of ensuring your data is safe if you do this kind of testing. We're very close to finishing implementing nodepool for our TurboHipster workers, so that when there aren't many patches being proposed there can be just a couple of TurboHipsters around, but when there are many, many patches, it can scale out to a few dozen nodes or whatever is required. Nodepool's quite an interesting project to look at. We should also probably test against other engines, so Postgres and MariaDB, and there are some DB2 people who are interested in this as well. That's not particularly complicated; it just requires a little bit of tweaking of the shell script so that the seed data is set up correctly. SQLAlchemy will work with these engines, so we don't actually need to worry about what Nova's doing, necessarily. And we want to work on reducing our false positives. We're getting pretty good at this; we have some interesting side effects, like noisy neighbours affecting performance, and we've done a few things to improve on that already. It will get better as it goes on.

So there's a bit of time left for questions, if there are any. Yeah?

You mentioned downgrade testing several times, so I'm asking, and it may be a little off topic: do you think it's actually reasonable to test downgrade migrations? Let's consider a case. You have a big dataset, you try to upgrade it, and the upgrade failed. Would you trust the downgrade code to return your database to a working state?

So, before you upgrade, I would kind of hope you took a backup, and that might be a good source to use in that scenario. But that's a very good example of why you want to do this kind of testing, because the upgrade wouldn't have failed if you had been testing like this. However, you need to consider the case where the upgrade was successful, or at least seemed so, and your cloud was still running; you had users launching new instances or making changes, and then you discover: oh, we broke some other component over here, or we want to roll back. Can you restore the backup you took? Because what happens to those people who have created instances since? So downgrades possibly allow you to do it in a less lossy way. I mean, the example of dropping a column and then having that column added back in on a downgrade, well, that's lossy as well, right? You're not going to have that data. So then the question is: whenever you have a migration that affects data in a table, do you take a backup of that? And there are migrations within Nova that do take backups into shadow tables, so that when you do a downgrade, it can restore the contents. However, what I'm actually alluding to is that this is a complicated problem. We don't have a good answer for it, and there's a debate in the Nova project about whether it's useful. We believe it is useful, because I don't think you can rely on backups alone, particularly if you need to roll back for any reason. I think, ideally, we just need to get really, really good at testing; then we definitely won't need downgrades.

Okay, thank you. No worries. Any other questions? Yeah. Are you going to shout it out? And I'll repeat it.
Are there any projects running TurboHipster in the upstream gate? And if not, for all the projects, do you suggest that they run their own infra, and what is the strategy?

Okay, so projects don't necessarily run their own infra; it's only third parties that run their own infrastructure, and third parties need to do it for various reasons. You do it where you need to run tests on hardware that the infrastructure team don't have, for instance if you have some proprietary product you need to test against. The reason we do it as a third party is that we can't hand the databases over to the infrastructure team, which means that if something does go wrong, they can't fix it. So we have to sit separately from them. And when you sit separately from them, you can't gate on it. We can run checks and return comments back into Gerrit with votes, but we can't gate, because there can only be one canonical gate: the gate has to predict what the state of the project will look like once the code merges. So we can only ever vote at the check level. Does that answer your question?

For example, let's say for Solum we wanted to set up something like this. What would you suggest we do?

Okay, I suggest that you talk to me later. But there are a few things, right? Firstly, the Solum project might have some volunteers who are willing to run a parallel infrastructure like we have. The other thing is that if you have example datasets you're happy to hand to us, we can run these for you; we can set up a shell script and have one TurboHipster running in parallel to the rest of the infrastructure, running all these tests. But you know, it raises an interesting question: if your company is a private or public cloud provider and you want to test your own datasets, you're probably going to need to do it in private. Yeah. Anyone else? Yes?

Okay, the question was what version of MySQL we're using to test. It's a good question. We're using, I think, 5.5; I think it would be whatever the stable release in Ubuntu 12.04 LTS is, but don't hold me to that. We actually came up against a few issues here: actually, the reason for one of the schema drifts, for one of the deployers who couldn't upgrade, was that they were using a different version of MySQL at one point and then changed. It got very complicated and very messy; we did actually fix it for them upstream. But yes, that's interesting. The takeaway there is that we should probably test multiple combinations of datasets against database versions, but we're not doing that currently; that's kind of difficult. Anything else? Yes?

Okay, so the question was: now that we've done this, how do we feel about the migrations being written upstream, separate from the actual data? Well, the developers are the ones writing the code, and we have a pretty well-established code review system. So developers are making choices such as: maybe this information should be in a metadata table as opposed to the instances table, or somewhere else. They make these decisions for the benefit of the project, and this tool will help catch where they're going to cause pain for operators. I guess the flip side is that if an operator were writing the patch, they could at least have tested what they intend to do against the datasets that they have.
But I think there's a bit of a discrepancy here between what the operators have and what the developers have. And developers need to be making decisions regarding the database that will improve the project looking forward. And of course, we're never going to get the schemas right on the first try. Yes?

Right, so the question is, and correct me if I'm wrong: after the migrations are done, what are we doing to test the resulting database, as opposed to just the performance? Yeah? Okay, so the unit tests kind of cover this scenario: the unit tests will inspect the result of the database after the migration's done. It's also covered by the functional testing and the integration testing. So as long as your cloud hasn't got a lot of schema drift in it, it should be fine. Now, how do we catch that schema drift? That's tricky, and we don't necessarily do anything for that. Perhaps what we need to be doing is running Tempest against existing databases, but that's a different, more complicated problem; we were interested in improving this for operators.

I think I've got time for one more, but it has to be quick. Or we can finish there. Oh, yeah? Questions? Yeah, if they're quick.

Did you run into problems after you had tested all the migrations and then switched to a refactoring of them, and ran into problems because of some case, let's say, that you didn't have before?

Okay, so the first question was a lead-on from the previous one: how do we make sure the data looks correct going forward? Well, again, I think we need to run Tempest against the dataset to ensure that, but the unit tests will catch these kinds of things as well, as I alluded to. And the second question was... would you mind? Sorry. Okay. Well, we've run out of time. That's all right, we can talk later. But thank you.