I'm going to be talking about databases and how you apply techniques to database development to enable continuous delivery. You may be doing continuous delivery on the app side but not having that much success on the database side, so what are the practices that will help you make it a success? We'll go through things like why you do continuous delivery, collaboration, and a bunch of other practices, one at a time, and figure out how each of them works. First, a two-minute primer: why do you do continuous delivery in the first place? Basically to release frequently so that you get fast feedback. It also reduces the risk of the release itself. If you batch up a bunch of stuff and release once in six months, you're carrying a lot of risk in what you develop — are the business features actually useful? — plus technical risk: will this deploy properly, will it fail, are there bugs, integration risk, things like that. Release frequently and you've reduced that risk. There's also learning and responding to customer needs. Business is changing so fast nowadays that if you don't have something in the market to counteract what your competitors are putting out, you're at a big loss. If you're on a six-month cycle and your competitors already have something in the market, you're basically giving them six months to get more customers, and then you're behind. If you release frequently, you can counteract that competitive force much faster. That's why continuous delivery is important. So how do you achieve it? One part is close collaboration between everyone involved, from business all the way down to ops.
Close collaboration between all of them helps you achieve continuous delivery. A corollary to that is automation. If you're releasing frequently, you don't want to do those frequent releases manually — it's a lot of manual work, mistakes happen, and so on. So you want to automate as much as you can of the whole delivery pipeline. You also want to reduce the gap between development and ops. It's no longer "I wrote the code, you go maintain it and watch it, and you'll be the one woken up when the pager goes off." If you're developing the code, you should be responsible for it and work closely with the ops people to make sure they don't get woken up when a pager goes off. So how do you bring the database team into this pipeline? That's the whole notion. If you have a database silo sitting somewhere while the rest of your organization is going agile and doing continuous delivery, that silo is not going to help you with continuous delivery. So how do you bring them in, and how does all of this apply to databases? Let's go one step at a time through the techniques you already use on the development side — collaboration and everything else you do for agile — and see how they apply to the database. Start with collaboration techniques. You pair with devs, you pair with testers, you pair with BAs. When was the last time you paired with a DBA to design something? If there's a story that needs some data work, pair with the DBAs. Don't just do something and throw it over the wall to them. Make sure they're involved in the design of that object, that data schema, that data architecture, whatever it is.
What pairing gives them is the context of the requirement — what is it you're trying to achieve. Throwing a requirement or a script over the wall doesn't give them that context. And what you gain from the interaction is knowledge of existing data sources and the legacy data you have to deal with in production, which you don't necessarily have in the dev environment. You can also version control everything on the database side. Devs are used to version controlling everything; you can do the same for the database. You can put literally everything in there — here's a screenshot from one of my apps: that's the DB folder, and literally everything is under it, from the delta scripts to the data model. If you're using ERwin or something like that, the model is in version control too. If you have anything specific for QA or performance, that's also in version control. Once the DBAs and the devs start putting all the database-related stuff in version control, visibility increases, and at the same time you get a nice audit of who changed what, when, and for what reason — which devs are already used to. When you check something in as a developer, you reference the requirement that forced the change; the commit records when it was checked in, by whom, and why. That's generally not available on the database side. Do this, and you bring the database into the same audit trail that version control already gives you. You can also apply a lot of automation techniques, starting with letting developers provision their own application database. Generally speaking, we ask the DBA: give me a schema, an application schema, a dev schema, and so on.
And after a couple of days they reply and give you a schema to work in. Or sometimes everybody works in one common schema. What if you let the developers provision their own schema — a Docker image or a Vagrant script that brings up a machine with a database on it that you can connect to and work with? Here's a sample of this; the repo is actually available, so you can clone it and play with all the examples I'm showing here. As a developer joining the project, I just clone the repo I'm going to work with. Once I clone it, I do a few things — for example, change the build properties file to put in my own username. Let's say I put "pramod" as the username I want in the database. Then I fire off an Ant build target called create-schema. What that does is create a user for me called pramod and assign it the grants it needs. So I have my own schema right after cloning the repo, by myself. This is provisioning by the developer, for the developer — no emails, no throwing stuff over the wall. Once you have that, you've got an empty schema, and you say: great, what do I do with this empty schema now? There's stuff for that too, and it doesn't take much time to get your schema. Once you have it — the font is a little small here, bear with me — you can run one more target, called dbinit or create-my-schema or whatever you want to call it.
What that does is work with the schema you just created for yourself, called pramod, and create all the objects your app needs: tables, triggers, stored procedures, functions, views, whatever you want. All of that can be scripted, put in script files, and recreated for you from those version-controlled scripts. So you provisioned your own schema, and now you're provisioning all the database objects you need yourself. Once that's done, if you're a dev, you can create some sample test data — a bunch of customers, a bunch of invoices — that your integration tests can work with. You can create your stored procedures, triggers, views, functions, whatever. All of these are scripts coordinated by a build file — I'm using Ant here, but you could just as well use Maven or Gradle or whatever the latest thing is. When all of that is done, you have a schema with all the objects your app needs, and you didn't have to go find someone to do any of it. This is automated provisioning for the dev, and it reduces friction — it's what we already do on the application side. If a developer says "I need to run the code myself," they provision their own environment for it. Why not do the same with the database? Generally, though, this is what we're used to: a bunch of people all working against a common schema. How many of you are in that situation — a common dev environment for everyone? Three, four? And how many of you are in the other situation? Maybe one, right?
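The create-schema and dbinit steps above can be sketched in a few lines. This is a minimal illustration, not the talk's actual Ant scripts: it uses Python's built-in sqlite3 as a stand-in for a real database server, and the table names and sample data are invented for the example.

```python
import sqlite3

# Stand-ins for version-controlled DDL and seed-data scripts. In the talk
# these are separate .sql files coordinated by an Ant build file.
SCHEMA_DDL = """
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    total       REAL NOT NULL
);
"""

SAMPLE_DATA = """
INSERT INTO customer (id, name) VALUES (1, 'Ada Lovelace');
INSERT INTO invoice (id, customer_id, total) VALUES (1, 1, 99.50);
"""

def provision(db_path: str) -> sqlite3.Connection:
    """Give one developer a fresh, fully provisioned private database:
    the create-schema step plus the dbinit step, with no DBA round-trip."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA_DDL)   # all tables, views, etc. from scripts
    conn.executescript(SAMPLE_DATA)  # seed data the integration tests use
    return conn

if __name__ == "__main__":
    conn = provision(":memory:")  # each dev would point at their own file
    print(conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0])
```

Because everything comes from scripts in version control, every developer can rebuild an identical schema on demand, which is the point of the whole exercise.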
So in this scenario there's a bunch of people on the team all working against one dev database. When you're in this situation, say Andy and Joe want to make a change — rename a table, split a table, split a column, drop a column, whatever it may be. The moment they make it, everybody else gets that change, because they're all pointed at the same schema. Whether the others wanted the change or not, whether Andy and Joe are even comfortable with it yet — maybe they wanted to try different versions — everyone gets the same change. And if it's a breaking change, everybody's build breaks. You've suddenly introduced a stoppage in the team: nobody can run tests, nobody can do anything with the database because it's in a weird state, until Andy and Joe check in their fix. Instead, if everyone has their own schema, Andy and Joe can do whatever they want with theirs and it doesn't affect the rest. As long as they don't check in, there's no effect on the others; and even when they do check in, the others take the change only when they want it. You sync when you're ready to sync, not just because someone else pushed. That frees Andy and Joe to experiment, and I'm a firm believer in experimenting. Whatever design you come up with, if you're not comfortable with it, you should be able to change it. They may come up with some table structure to match a domain object and not be happy with it — give them the opportunity to change it.
Instead of checking in with the fear of breaking others, give them the opportunity to experiment. The more they experiment, the better the design they'll come up with before they check in; once they check in, everyone else can sync up and take the change. This also frees you in another way. What I'm showing here is a common, IT-hosted or ops-hosted database — the same scenario, still ops-hosted, but everybody gets their own schema. You can take it further with Vagrant or Docker and run the database on everybody's laptop. At that point you've freed the developers from the common infrastructure entirely — the database is usually treated as common infrastructure, but put a Docker image or Vagrant script on top of it and it becomes independent, and developers can work from home, on planes, on trains, wherever, without having to VPN into your office. Now, obviously, with everybody making changes you'll ask: how do you bring it all back together and integrate? We can apply the integration practices we're used to in developer land to databases too. Let's see how. For example, you may currently be in the situation where the data team does all the change management: you request a change, they make it in the dev DB, they make it in the DB the CI server runs against, and they make the same change in QA, UAT, and production — the change is controlled by them. Instead of that model, it's better to move to one where whatever change is necessary to make a story work, you make it yourself, pairing with the DBAs, and once the change is made, you check it into source control.
And instead of working against the same dev database, you create a separate schema for the integration database — a fresh copy — and everybody's scripts get rerun there to make sure everything runs together. That migration script, when you check it in, triggers a CI build, and the CI build applies your changes using the very same migration script you created and checked in. Once the build is successful, you package it up: the migration scripts along with your artifact — a jar or war, maybe a DLL in the .NET world; in Ruby maybe you just cut a tag. Whatever you do to create the artifact, you create it together with the migration scripts, and when you're ready to deploy, you deploy the jar or war or whatever to the environment and apply the same scripts to that environment. This does three or four very interesting things. One: how many times have you deployed an app to an environment and then realized a column or table or view is missing, and the application is throwing "column not found," "table not found" errors? All of that just goes away, because you've linked your app to your database structure right there. When you deploy both of them together, the database and the app are always in sync — you'll never have that problem again. And the integration part makes sure you're testing your app against the database it's supposed to go live with.
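The "same script applied in every environment" idea can be sketched as a tiny migration runner. This is a simplified illustration of what frameworks like dbdeploy or Flyway do — a changelog table records which deltas have already run, and only pending ones are applied, in order — using sqlite3 and invented migration contents.

```python
import sqlite3

# Hypothetical versioned delta scripts, as they'd sit in version control.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
}

def apply_pending(conn: sqlite3.Connection, migrations: dict) -> list:
    """Apply every delta not yet recorded in the changelog, lowest number
    first, and record each one. Returns the versions applied this run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (version INTEGER PRIMARY KEY)")
    done = {v for (v,) in conn.execute("SELECT version FROM changelog")}
    applied = []
    for version in sorted(migrations):
        if version in done:
            continue  # this environment already has the change
        conn.execute(migrations[version])
        conn.execute("INSERT INTO changelog (version) VALUES (?)", (version,))
        applied.append(version)
    return applied
```

Running the same function against dev, QA, UAT, and production is what keeps every environment's schema in step with the packaged artifact: each environment applies exactly the deltas it is missing, nothing more.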
So if you have a bunch of integration tests on the application side that talk to the database — insert a customer, fetch a customer, create an invoice, get an invoice — they go through all the layers, hit the database, create the data there, and read it back. Those integration tests make sure the whole stack works, that the object is persisted and retrieved, rather than just testing a mocked-out top layer. By the time the build is created and the migration scripts are packaged, you're pretty sure this app works with this set of changes. The other interesting thing is that all the changes made to your database are now in your version control system. Whoever wants to see what changed in the database has a list of changes available to inspect and audit. Especially in regulated environments — banking, government — they want to know exactly what changed for this deployment, and you can just say: here's the log from my Git repo, or CVS or SVN, whatever it is; that's what changed, and here are all the files we changed. It's very easy to give them that, instead of someone maintaining it by hand. [Audience question about how migration scripts move the database from one state to the next.] Yes, we'll get there — next two slides.
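The kind of through-the-layers test described above can be sketched like this. CustomerRepository is a hypothetical persistence layer invented for the example; the point is that the test drives the real data access code against a real (here in-memory) database rather than a mock.

```python
import sqlite3

class CustomerRepository:
    """A hypothetical DAO: the thin persistence layer the test exercises."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

    def save(self, name: str) -> int:
        cur = self.conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,))
        return cur.lastrowid

    def fetch(self, customer_id: int):
        row = self.conn.execute(
            "SELECT name FROM customer WHERE id = ?",
            (customer_id,)).fetchone()
        return row[0] if row else None

def test_customer_round_trip():
    # Persist through the real layer, read back through the real layer:
    # the object survives the full trip down to the database and back.
    repo = CustomerRepository(sqlite3.connect(":memory:"))
    cid = repo.save("Ada Lovelace")
    assert repo.fetch(cid) == "Ada Lovelace"
```

A test like this fails immediately if a migration the app depends on is missing, which is exactly the "column not found in production" class of problem the pipeline is trying to eliminate.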
So those two or three aspects give you the ability to track the database — what happened, who changed what — and to link the database with the application. Your application is just objects, and the state of those objects is stored in the database, so the two need to be in sync at all times. That's why it's important to link the database code with the application code and keep them at the same metadata level. When you do this, you can test the application code with the database in a single place, making sure all the layers work together. You can generate the code and the database artifacts as a single linked unit — build number four has this jar and build number four has this database schema, so it's very easy to tie them together. You can keep all the integration-level tests in one place, and you can show the current state of the database to everyone: this application goes with this schema, visible to all, instead of someone magically creating the database for you. Here's a screenshot of Jenkins, for example: you build the application, and within that you package up all the change scripts. Now, taking this further: how do you track changes? You keep making changes — how does developer one know that developer two isn't changing or renaming the same object? You write each change as a delta script, a migration script. There are a bunch of frameworks that work this way: Rails migrations is one, and Flyway, dbdeploy, Liquibase, and DbMaintain are all frameworks that let you manage these kinds of migration scripts.
So each change is a delta script, and migration scripts are a development-time activity, not a deployment-time activity. When you're about to make a change, the devs have all the context of the story surrounding it, and if you work with your DBAs to script the right kind of change, the DBAs get that context too — all of it, at development time, which is very important. Imagine instead that the devs make a bunch of changes and at deployment time the DBA is just running a database diff trying to figure out what changed. They have no context: why the change was needed, how it was made. If you're splitting a column, how was it split — at spaces? At commas? All that context is lost. When you're actually making the change, the context is right there with the developer, so if you pair the DBA with the developer at that point and make the change there, it's much more useful — it becomes a context-rich migration. That's why it's a development-time activity, not a deployment-time one. You can then package those migration scripts into an automated deployment package and reuse the same scripts on your machine, on other developers' machines, in QA, UAT, production, or any other environment you have — the same migration script moves between environments, so you can put them into a proper delivery pipeline. Here's an example of how that looks — I think this was your question. There are a bunch of developers making changes, and you sequence those changes: the first change I make to the database is change number one, and I make my change.
When you make your change, you check it into your source control repository, and when another developer does a pull, they get that change, change number one. Now take the scenario where two developers make different changes at the same time and both number them "one." They don't know about each other's change because neither has been checked in or pushed yet, so even when they sync they don't see it. When both push to the main repo, the CI server will fail, because it sees two changes named one. All the frameworks I mentioned will complain that you cannot have two changes numbered one, because that breaks the sequencing of changes — wasn't that your question? The sequence is very important because the state of the database moves from one version to the next to the next; it's always a forward-moving change that you have to maintain. It's just like Git: Git basically stores change information only, and when two changes arrive at the same time, there's synchronization to figure out which came first, which second, which third. That's what we're doing here too. Once you check in your change and sync the source control repository, the CI server picks it up and runs it against the integration database, so all the current changes get applied there. You may decide that the moment the build is successful, you push the change to QA, because you want the latest changes there — so QA is always current, with all the changes applied.
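The duplicate-number check the frameworks perform can be sketched in a few lines. This is an illustration of the rule, not any framework's actual code; the file-naming convention (number, underscore, description) is an assumption borrowed from dbdeploy-style layouts.

```python
def check_sequence(script_names):
    """Order migration scripts by their leading change number, and fail
    fast when two scripts claim the same number — the situation where
    two developers both created a change '001' without syncing."""
    seen = {}
    for name in script_names:
        number = int(name.split("_", 1)[0])  # '001_add_invoice.sql' -> 1
        if number in seen:
            raise ValueError(
                f"duplicate change number {number}: "
                f"{seen[number]} and {name}")
        seen[number] = name
    return [seen[n] for n in sorted(seen)]
```

On the CI server this check runs before anything is applied, so a numbering collision breaks the build immediately — one developer renumbers their script and pushes again, and the forward-moving sequence stays intact.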
The pre-prod or UAT database may be a little behind, because you want it stable for the business users to check things, so it stays behind and only changes one through five have been applied. And because the changes haven't been released to production yet, production is still at one through four. When you decide to push from UAT to production, all you have to do is take change number five and push that across. It's the same script being applied in each environment, and by now it's been through testing four times already — and all the devs have run change number five on their own machines multiple times — so confidence that the script runs properly is very high. You can deploy it without any user intervention. You can even take these frameworks further and have the app upgrade its own database: when the app starts, it figures out "I'm at four, I need to be at five," applies five, upgrades its own database, and then starts, instead of waiting for someone to apply the migration for it. The coordination is done by the app itself rather than a third party, which opens up interesting possibilities — especially if you have fifty of these in a microservices architecture, you really don't want to be figuring out which change needs to be applied where. Let each app do its job: it upgrades itself and keeps going. That gives you a powerful metaphor for continuous delivery, the pipeline keeps flowing, and most of the DBAs' work happens up front instead of in change management at the end — they're coordinating with the devs, making sure the changes being applied are the right changes.
If there are design standards, performance restrictions, indexing concerns — all of that is handled up front, instead of complaining about the devs afterwards. That's what happens most of the time: the devs make changes, and when the DBA sees them later they complain — this naming standard isn't right, this index isn't right — and by then it's too late to go back and change it. If you focus the DBAs' effort up front instead, everything flows straight through using the migration scripts concept. So let's come back to deployment. Now that you have these migration scripts, you can upgrade basically at any time, and it's a development-time task, not a deployment task. Once you've figured out during development what the migration script is, you can apply it to any environment at any time. You can even go backwards. A funny scenario that comes up: there's a bug in production, say, and someone wants to fix it and recreate the database as of build four, because that's where the bug is happening. You take the repository as of that deployment, do a checkout, and recreate the database at version four. You don't have to go get a backup copy from somewhere "as of version four" — you can reprovision that state whenever you want. It's a very interesting side effect of doing databases this way. You can package all of this up, apply it to whatever environment you want, and then deploy frequently: every change gets deployed as the app gets deployed. The cycle is continuous — on a bunch of projects I've worked on, we deployed literally two or three times a day.
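Recreating the database "as of build four" follows directly from having numbered deltas: check out the repo at that deployment's tag and replay deltas one through four against a fresh schema. A minimal sketch, with invented migration contents, again using sqlite3 as the stand-in database:

```python
import sqlite3

# Hypothetical delta scripts as they'd exist at the checked-out revision.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER)",
    3: "ALTER TABLE customer ADD COLUMN email TEXT",
    4: "ALTER TABLE invoice ADD COLUMN total REAL",
    5: "ALTER TABLE customer ADD COLUMN phone TEXT",
}

def rebuild_as_of(target_version: int) -> sqlite3.Connection:
    """Recreate a fresh database exactly as it looked at a given change
    number by replaying deltas 1..target in order — no backup needed."""
    conn = sqlite3.connect(":memory:")
    for version in sorted(v for v in MIGRATIONS if v <= target_version):
        conn.execute(MIGRATIONS[version])
    return conn
```

Replaying `rebuild_as_of(4)` yields the production schema where the bug lives, with change five (here, the phone column) absent — the schema itself is reproducible from version control alone, though production *data* would still come from elsewhere.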
So that kind of technique helps you reduce risk, because each change is very small and the risk it carries is very low. Now, to do this and keep the system running 24/7, you need to apply some refactoring techniques. You can't just change things randomly, because dependent systems may break. So let's see what techniques you have to apply. A database refactoring could be a small schema change, a data change, or a DB code change, done to improve the design or to accommodate new requirements. The good part is there are a bunch of techniques and patterns for this. The thing to remember is that you cannot do database refactoring the same way as code refactoring. Renaming a column has a bunch of side effects. Renaming a method has side effects too, but they're constrained within your app, and most of the time you can find them just by running the compiler — it finds every place the old name is still referenced. In the database that's not true: you rename a column and you have no idea where it's being used or how, and the supporting apps fall apart. This is generally the case in enterprises: you're working on this app, this is your database, and there's a bunch of other systems talking to it directly. This is the anti-pattern Mary was mentioning this morning — database-based integration. A lot of people talk directly to your database, and imagine you go and change a column name. Everyone using it will complain: where's my column, what happened? A bunch of them will fail, and some may only find out a month later, by which time you've lost the context of what happened.
So to get away from this, you need to get into the habit of giving interfaces to your other users. We're used to providing and deprecating APIs on the code side — how do you do the same on the database side? How do you give a temporary API for the database that accommodates something that has already changed, while making sure the old application stays alive and doesn't complain? For example, if you rename a column, the old app should still work, but the new column should be there too. So let's see how that can be done. When you want to make this kind of change, you start the change, make it, and put in some scaffolding code — I'll elaborate later on what I mean by scaffolding. Once you start the change, you're expanding your database to accommodate it, and then there's a transition period in between: as on the previous slide with the other applications, you notify people — through the PMO or an email — saying, I'm going to rename this table, or split this column, whatever the change is. During that transition period, the old schema and the new schema are both live at the same time. Once you're done, you end the refactoring: remove the old schema, remove the scaffolding code — that's the contract phase, where you contract away the change you've been keeping around. So remember these three phases as we go through the examples. Keeping the old and the new alive is very important, because your database should be able to handle multiple versions of the app, multiple versions of the calling client, and you can create interfaces in the database to support that kind of thinking.
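The simplest such database interface is a view that preserves an old name during the transition. A minimal sketch, assuming a hypothetical rename where the physical column is now full_name but legacy clients still select a column called name — table and data are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# After the rename, the physical table carries the new column name.
conn.execute(
    "CREATE TABLE customer_tbl (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO customer_tbl VALUES (1, 'Pramod Sadalage')")

# The view is the temporary API: old clients keep running
# SELECT name FROM customer, unaware anything changed underneath.
conn.execute(
    "CREATE VIEW customer AS SELECT id, full_name AS name FROM customer_tbl")

legacy_row = conn.execute(
    "SELECT name FROM customer WHERE id = 1").fetchone()
```

When every client has moved to the new name, dropping the view is the contract step. One caveat worth knowing: plain views like this are read-only in many databases, so clients that write through the old name need extra scaffolding (for example INSTEAD OF triggers) during the transition.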
So you can wrap tables with views, create calculated columns, create triggers to sync data — there's a bunch of techniques you can apply. Let's look at an example. There's a customer table with an id and a name, and I really want the name split into first name and last name, because I want to do interesting things with it. The name column is the old interface. Here's the start state, the transition period, and the final contract stage. In the start state, name has a value like "Pramod Sadalage," with a space in between. During the expand phase, name is still there, and I also have a first_name column with "Pramod" and a last_name column with "Sadalage" — I've split on the space as the separator, and the old column and the new columns are both there. If some client I don't know about — some third-party application — comes in looking for the name column, because it doesn't know about first_name and last_name, it gets the data it's looking for; and my new application, which cares about first_name and last_name, finds its data there too. When everybody has moved over to the new way of looking at the schema, I can contract and remove the name column entirely — at that point nobody should be accessing name at all. Start, expand, contract. So how do we implement this change? At the start, name is what it is — that's the current state. I can expand in two different ways. One is without any data migration: I just add the two columns, say I'm done, and let the application deal with it lazily — when it reads and finds nothing in first_name and last_name, it reads the name column, splits it, puts the data in the new columns, and persists it back.
So that's a lazy migration way of doing this. Or I could do it with data migration: I say, the name is this, I'm going to split it at the space and update those two columns with that data, like an update script that I run. So either way, whichever direction you pick, there is some intelligent thing you have to do, either on the app side or on the database side, right? And then at the end, I have the contract stage, where I keep first name and last name and the name column is removed. So this is still theoretical. Let's see how you would implement this as a change. This is a very simple scenario using DBDeploy as a framework: I say alter table customer, add the first name and the last name, and if I want to roll back the change, I can drop those two columns and I'm done. So I'm not doing any kind of data migration here, nothing's happening; I'm just adding columns, or removing columns if I want to roll back the change, right? So let's go a little bit further with synchronizing data. If you go back here, what happens if someone puts in a new row, let's say Martin Fowler? Someone adds a new row with the name Martin Fowler. How are the two new columns going to get that data? Or if someone adds a first name of Martin and a last name of Fowler, how is the name column going to get that Martin Fowler combination? So you could use a synchronization method for that, right here. I add those two columns and I create a trigger, which is basically a before insert or update trigger. If the old column has a value and the new ones are null, I'm going to split it. If the new ones have values but the old is null, I'm going to combine them and populate the name column. So the data is always synchronized for the old application and the new application at the same time. You are giving an interface now to your schema so that both of them survive at the same time, right?
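The synchronizing trigger can be sketched in SQLite. One caveat: Oracle lets a `BEFORE INSERT OR UPDATE` trigger rewrite the incoming row, but SQLite does not allow assigning to `NEW`, so this sketch uses `AFTER INSERT` triggers that update the freshly inserted row instead (same effect for inserts; a production version would also cover updates):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT,
                       first_name TEXT, last_name TEXT);

-- Old writer supplies only name: derive first and last name.
CREATE TRIGGER split_name AFTER INSERT ON customer
WHEN NEW.name IS NOT NULL AND NEW.first_name IS NULL
BEGIN
    UPDATE customer SET
        first_name = substr(NEW.name, 1, instr(NEW.name, ' ') - 1),
        last_name  = substr(NEW.name, instr(NEW.name, ' ') + 1)
    WHERE id = NEW.id;
END;

-- New writer supplies first and last name: derive the legacy name.
CREATE TRIGGER combine_name AFTER INSERT ON customer
WHEN NEW.name IS NULL AND NEW.first_name IS NOT NULL
BEGIN
    UPDATE customer SET name = NEW.first_name || ' ' || NEW.last_name
    WHERE id = NEW.id;
END;
""")

con.execute("INSERT INTO customer (name) VALUES ('Martin Fowler')")
con.execute("INSERT INTO customer (first_name, last_name) VALUES ('Kent', 'Beck')")
for row in con.execute("SELECT name, first_name, last_name FROM customer ORDER BY id"):
    print(row)
# ('Martin Fowler', 'Martin', 'Fowler')
# ('Kent Beck', 'Kent', 'Beck')
```

Either style of client can insert, and both views of the data stay consistent.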
And the reason you are doing this is that you cannot force your whole enterprise to change to the new schema all at the same time, because everybody is on a different release schedule, everybody has different priorities, and things like that. So this is what I'm doing. Even here I'm not doing any data migration; this is just a synchronizing technique. You can take it further and say, I'm going to add those two columns, I'm going to extract the first name and last name from the name column, and I'm still going to put the trigger in. So all the existing data gets migrated and new data coming in gets synchronized, right? And you can take this further still. At some point you say, okay, I'm going to contract this and drop the name column. Now, there may be situations where your customer table has millions of rows and you don't want to take the hit of a drop, during which the table gets locked. Maybe for five minutes or so the table is out of commission, and you don't want that downtime at all. So you can again apply a bunch of techniques. In Oracle, for example, you can say set unused on that column. Oracle just marks the name column as unused, it no longer shows up in the metadata anywhere, and it happens instantaneously; there is no lock applied or anything like that. But you still got what you wanted, which is the contract phase, and later, maybe in a month or so when you have some downtime, you can remove that column completely to reclaim the space, right? So again, this is where your DBAs can help you, because they will have much more knowledge about the database technology and its tips and tricks, and you can learn a lot from them about how to use those to enable continuous delivery. So basically, when the drop takes forever, you can mark the column unused instead. But what about when I actually do the drop?
There may still be old applications out there relying on the name column, because they have not been upgraded, or people don't want to invest in changing an application that is slated to be rewritten in two or five years; they don't want to invest anything in changing its state. You can say, okay, fine, we can still drop the name column, and I can add a virtual column which is basically a combination of first name and last name. It shows up as name in the metadata and people can query it. You can't write to it, but you can query it. That's a virtual column in Oracle; in MySQL it would be called a generated column. So that is one more way to give an interface to the database for the outside users, right? Using techniques like these, you can do a lot of smart data migration and database migration while you are constantly deploying new stuff to the database. This lets the database move at the same speed the app is moving, because developers can move much faster when they can constantly refactor, deploy, refactor, deploy. If your database is also able to move at that pace, then you can achieve true continuous integration and continuous delivery in the whole stack, instead of just the top, the application stack, right? So let's take another example. The table here is named cust_order and I really want to name it customer_order, because that makes much more sense; it matches my domain object and things like that. So I could do this: this is the starting state, this is my expanded state, and this is my contracted state. You say alter table cust_order, rename it to customer_order, and then I apply a view called cust_order, which matches the old name and is a one-to-one view on top of customer_order, right?
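The virtual-column idea can be sketched with SQLite's generated columns (available in SQLite 3.31+; the syntax differs from Oracle's virtual columns but the effect is the same, and the schema here is assumed from the running example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# After the contract phase, only first/last name are stored; name is
# computed on read, so legacy readers still see a name column.
con.execute("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        name TEXT GENERATED ALWAYS AS (first_name || ' ' || last_name) VIRTUAL
    )
""")
con.execute("INSERT INTO customer (first_name, last_name) VALUES ('Martin', 'Fowler')")
print(con.execute("SELECT name FROM customer").fetchone()[0])  # Martin Fowler
```

Old read-only clients keep working unmodified; any client still writing to `name` has to be migrated first, since the column is no longer writable.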
So whoever comes into the database looking for cust_order now hits the view and gets the data they are expecting, and whoever comes in looking for customer_order gets the table and the same data; they are both served now, right? So you don't necessarily have to go and modify the old applications to match the new schema, because we have now given them an interface layer into the database without them having to change. So there are interesting techniques like this you can apply to the database while you are moving at the pace the application is moving. You can also put data in the migrations, right? Sometimes we build functionality for which there are no online screens for people to enter data; a list of states, a list of countries, or a list of currencies are common examples. Nobody wants to build screens for that. You just put the data directly in the database and the app uses it in dropdowns or other places. Business units are another example, and you could do the same for currencies. You can also do this when a bug has introduced a bunch of wrong data and you want to delete it. All of this can be part of migrations, because you are fixing the bug, changing the code, and fixing the data for that bug fix, and both need to go out at the same time. So data fixes can also be part of a migration. A big tip I can give you here: don't do large refactorings in one go. If you want to change a bunch of stuff, break it down into smaller pieces and apply them as you go, because really large refactorings are very risky to deal with and also very risky to roll back. You can sequence a bunch of small refactorings together to create the big effect you're looking for.
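The cust_order rename above is a direct instance of the view-as-interface technique. A minimal sketch in SQLite (the `total` column is just a placeholder for the real table's columns):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cust_order (id INTEGER PRIMARY KEY, total INTEGER)")
con.execute("INSERT INTO cust_order (total) VALUES (42)")

# Rename to match the domain, then leave a one-to-one view behind
# under the old name as an interface for not-yet-upgraded clients.
con.execute("ALTER TABLE cust_order RENAME TO customer_order")
con.execute("CREATE VIEW cust_order AS SELECT * FROM customer_order")

print(con.execute("SELECT total FROM customer_order").fetchone())  # (42,)
print(con.execute("SELECT total FROM cust_order").fetchone())      # (42,)
```

In the contract phase, once every client queries customer_order, the view is dropped by one more checked-in migration.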
That's the same thing we do in code. You won't change 50 classes at the same time; you change one, check in, sync, change the next, check in, sync. The same technique applies on the database side, right? And migration scripts should all be checked in, and you run the same scripts in all of the environments. One of the rules I follow is that whenever a migration script is applied to any environment, it has to be a checked-in migration script. You are not manually changing the database anywhere other than the developer's own database. Everywhere else, you are just applying migration scripts. If there is a fault in a migration script, check in a new migration script that fixes the fault instead of manually fixing the database. So don't introduce any manual changes in the middle of the pipeline, right? And changes to data are also migrations. So when you go to deployment, it's basically a matter of injecting the properties of the database, the username, password, and URL, and you can run the same ant target everywhere. It doesn't matter which database you are upgrading; you just pass the connection properties for that database and upgrade it as you go, right? So that's all I had to say. That's the book I was talking about, and this is about continuous integration and databases. So I'm ready for questions. Yeah? A few questions, actually. Sure. The first one: a lot of teams try doing this, and while it's all good, we ran into some problems with load testing, because on the integration server we have everybody's migration scripts, but in the place where we do load testing we load a whole bunch of production-like data.
And then if you want to do this, every single developer, even though they're sometimes not concerned with the load testing or performance, is also responsible for all of the migration scripts, including the ones that, say, simulate 100,000 additional rows. And then when they drop a column or add a few new columns, we ran into a big problem, because now there's a lot of overhead on each of the individuals in the group. That's one. The second one: when you're trying to do this, on the database side it's fine, but what about the application developer? Say they are doing an insert through a stored procedure; it's an insert statement, right? And if you convert that table to a view, then all of your stored procedures are probably not going to work, right? Yeah, so that is another issue. So how do you tackle those two? And a third one, in relation to what you talked about this morning on NoSQL: would that be a better option going forward, or would it make this kind of methodology irrelevant? Sure, so let me take the third one first. What if you use NoSQL instead of relational databases? Then you don't have this whole problem of schema migration. But like I said during that talk, when you say schemaless, there is no schema in the database; the schema is in the app. So you carry all that complexity in the app, right? Imagine the same name versus first name and last name change here. Now some documents have name and some documents have first name and last name. Your application code needs to be intelligent enough to figure out what is there, what is not, and then convert it to the right domain object, right?
And as you carry more of these changes, imagine there are 50 of these kinds of changes, your code becomes more and more complex. So you're carrying a lot of technical debt on the application side instead of on the database side. So that's the third one. The first one is about what happens if you have a lot of data, especially to recreate bugs and things like that; every developer has to carry that and then worry about, okay, my column drop takes more time, or updates take more time, and that kind of stuff. The pattern I have seen for that is to create a separate script that will create that data for you, basically a data multiplier. For example, you create three customers and tell the script, make 500 or a thousand out of these. That script is a separate script that you keep in a different place. If a dev needs lots of data, they run the script and it multiplies the data; if they don't need it, they don't run it. So not every developer is constrained by that size. A lot of times we also don't have enough data generators for testing, so we just copy production, and that creates other problems: now you have to deal with privacy and all the other stuff, and you have to anonymize production-sized data sets, and who wants to carry 22 terabytes of data in some other environment, right? So if you have a script to multiply your existing test data, like, I have one customer, make me 500,000 of these with random names and random customer IDs, then you run it if you want it and skip it if you don't. Then you are working with as small or as big a data set as your need requires. What was the second one? I forgot.
Right, so if you have a stored procedure that needs to insert into a table and stuff like that, at that point the stored procedure is also code in your version control repository, right? So you can deal with that: change the stored procedure there to point to the table instead of the view. What if the stored procedure is owned by some other team? At that point you have to negotiate with them to see what works, or maybe create a newer version of the stored procedure for you and keep the older version for them, or you can encapsulate the stored procedure itself and, based on the version, route it to the right table, that kind of stuff. You have to do that through the code itself. But during the transition phase, you already need a duplicate of the stored procedure? Yes, you do. And in the contract phase you basically have to remove that as well as change the code? Yes, yes, you have to. And that's where integration-level testing helps, right? If you have integration-level tests that go all the way down to the database and come back up, all of that layer gets tested, and all those changes get tested as you go. And the whole notion here is to keep the old apps alive. If you don't have that requirement, then you can just change the code yourself, because it is under your control. So it depends on what kind of requirements you have on that particular database: does it affect others or not? In some places I have seen the database extracted as a separate repo, and it's a dependency for every project. So if you make a change in that central repo, then you have to run contract-level tests for all the applications, and if something fails, you either go fix that application to deal with the new change, or you make a change in the database such that it can handle all of the applications, right?
So as a dependency, this is what we do even with libraries we develop, right? When we develop a library, we externalize it: it's a separate repo, or we maybe even publish just a jar, things like that. You do the same with the database: you extract it out from your application, or from multiple applications, create a common repo for the database, and everybody depends on that. If somebody makes a change in that database, it triggers downstream builds for all the applications, and if an application's contract with the database fails, you know, okay, there is a problem somewhere that you need to fix, right? Yeah? Do you create tests for your database migrations? You could create those. It is a very expensive proposition if you really think about it. Okay, I've been told this is the last question. So it's a very expensive proposition, because you have to keep the previous state and the new state around at the same time to test that migration. So usually what I have done is: given the new state, can I still get the same customer with the same properties, and assert on that customer. Do you have the same first name? Do you have the same last name? And if I have a method called getName, which combines first name and last name, does it still match the old value, right? So you can create that kind of test on top, but maintaining the old state and the new state gets really expensive really fast. From experience, do you actually do that? Yes, we do it the way I just said, where you actually assert that the customer is still the same. If a customer was there with four orders that had three items each, then I get the customer back again and ensure that they have four orders with three items each and that the value of those orders is the same, that kind of stuff, right?
To ensure that the data integrity is there. Yeah, I'm here so you can ping me later, but I guess we'll have to shut this down here, yeah? Thank you. Thank you.