So, this is a quick introduction to how you can do continuous delivery for databases. Databases are usually seen as slow, high-risk things: do not touch them unless you really need to. How can we break that assumption, that misconception? I am sharing my experiences with that. This is a beginner-level talk; there is a lot more that can be done on this topic. What I am trying to convey is that it can be done, so go and explore it yourself. That is the whole idea of the talk.

Quick show of hands: how many of you are writing code on a daily basis, programmers for that matter? And of those, how many of you are managing databases, managing in the sense that you interact with them? Cool. So I hope I will be able to convey something to you.

A quick introduction: I am a co-founder of a company called Yoga Tree, a digital marketing product for yoga studios. As we started this a few years back, one of the things I really struggled with, and I think you have too if you have worked at a product company, especially a start-up, is that the product evolves. How do you support that evolvability? You think this is what the users want, or this is how users should be doing it, you go out to the market with that, and you see that is not how they are using it. Then you change, sometimes slightly, sometimes in a bigger way. How do you adapt to that, and how do you support it, especially for the database? If you have live users in production and you have to make a drastic change to your product, how do you deal with the slow, high-risk nature of database changes? Anyone who has had a database outage never wants to have one again. But the solution is not to stop touching the database. You have to come out of that fear and find out how you can make more changes to it while making each change less risky.
And that is where the concept of refactoring comes in: doing things in small steps, so that at the end of a series of small steps you get a bigger change out. The same refactoring concepts can be applied to databases too, and that is what is called database refactoring. Many of the concepts and examples I am using in this presentation are from the book called Refactoring Databases; I highly recommend you check it out. Database refactoring helps you evolve: it supports evolutionary architecture and it supports continuous delivery, because the steps are small. You are not making a big-bang change; you are making changes in small increments.

So, enough talk, let us look at some examples. These are real examples that I did in the product; I have reduced their scope so that they are easier to convey. This is a digital marketing product, so we have the concept of people taking memberships with yoga studios. You join a yoga studio and take a membership, it can be for 3 months or 6 months, you pay for it, and the studio records it in their system. When we started off, we called these billing cycles, and we had all of these fields in one database table. Then after some time we got feedback from customers that some members do not pay everything in one shot. They pay in installments, especially if it is a yearly subscription. So this kind of structure, where you have all the payment details in one table, will not work in this case: you have one cycle but multiple payments. We had to split it, and that is the split-table refactoring we are doing here. This is what it should be: a one-to-many relation between cycles and payments.
That way we can record that these payments happened for this particular cycle, and we can also find out the balance and so on. So how do you do that in a continuous delivery fashion? We have to split this table into multiple tables, but we do not want to do it in a big-bang way, because if something goes wrong, the entire data set is screwed up. Instead, we use a parallel implementation. We introduce the payments table but do not delete the existing fields, and we copy the data: when we add a cycle, we store that data into payments as well. You can do that with a database trigger, or with your ORM callbacks, whatever it is; the specific technique does not matter much. The idea is that the two run in parallel.

The advantage is that if you have a lot of downstream dependencies, say native mobile apps, then their APIs do not have to change on day one. It is not a breaking change. You can change them over a period of time, and that gives you enough confidence that nothing has broken. How long that parallel transition period lasts is up to you, depending mostly on your downstream dependencies. And in the end you contract: you expand, and then you contract to whatever schema you actually want. Only at that point do you delete the old fields, but by then you have already migrated the data to the new tables. So on the first deploy you take care of the migration, and using callbacks or triggers you make sure that the data being saved into the old fields is also saved into the new ones; that can happen in the background. This is the entire refactoring process: first you introduce the new changes, then you have a transition period of migrating to the new way of implementing it.
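The expand step of this split-table refactoring can be sketched with SQLite. The table and column names here are hypothetical, not the product's actual schema: the old `billing_cycles` table keeps its payment columns, a new `payments` table is added alongside it, and a trigger mirrors every write into the new table during the transition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Existing table: payment details still live alongside the cycle.
CREATE TABLE billing_cycles (
    id INTEGER PRIMARY KEY,
    member TEXT,
    amount_paid REAL,   -- old field, kept during the transition
    paid_on TEXT        -- old field, kept during the transition
);

-- Expand: new one-to-many table; the old columns are NOT dropped yet.
CREATE TABLE payments (
    id INTEGER PRIMARY KEY,
    cycle_id INTEGER REFERENCES billing_cycles(id),
    amount REAL,
    paid_on TEXT
);

-- Parallel implementation: every write to the old fields is
-- mirrored into the new table until all readers have migrated.
CREATE TRIGGER copy_payment AFTER INSERT ON billing_cycles
BEGIN
    INSERT INTO payments (cycle_id, amount, paid_on)
    VALUES (NEW.id, NEW.amount_paid, NEW.paid_on);
END;
""")

# Application code keeps writing to the old table, unchanged.
conn.execute("INSERT INTO billing_cycles (member, amount_paid, paid_on) "
             "VALUES ('Asha', 120.0, '2020-01-05')")
rows = conn.execute("SELECT cycle_id, amount FROM payments").fetchall()
print(rows)  # → [(1, 120.0)]: the trigger mirrored the write
```

The contract step, later, is a separate migration that drops `amount_paid` and `paid_on` (and the trigger) once every reader uses `payments`.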
Once you have completely transitioned, you contract. That is the crux of expand-contract.

Let us look at another example, called split column. In this case we are not migrating to a new table; we are splitting a column of an existing table into multiple columns. In this example, customers can book trial classes with yoga studios. They do not want to join without evaluating how good the classes are, which happens in many cases. So they can book a trial, attend one or two classes depending on how many trial classes each studio offers, and then decide whether they want to join. We record that in a table called trial bookings. We have a field called status, and that status can be one of multiple things: whether they attended, whether they attended and joined (that is also important, that is the funnel), or whether they did not attend. After a while we realized that too many things were being stored in that one field; it was doing too much. So we had to split it into two kinds of data: whether they attended, and whether they joined and became a member, because that is where the conversion really happens.

We implemented that the same way: we should not delete the old column and add the new ones in the same migration. We introduce the new fields, have a migration script to migrate the data, and also change the code so that any time data is added to that table, it is added to the new fields as well. Then in the end, when you have completely transitioned, you contract. It is the same refactoring process. The idea here is that for any major change you are making, do it in really small steps so that you can observe and adjust.
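A minimal sketch of the split-column expand and backfill, again with hypothetical names and status values (the real product's enum is not shown in the talk): the overloaded `status` column stays while two new fields are added and populated from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE trial_bookings (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    status TEXT  -- overloaded: 'attended', 'attended_joined', 'no_show'
)""")
conn.executemany(
    "INSERT INTO trial_bookings (customer, status) VALUES (?, ?)",
    [("Ravi", "attended_joined"), ("Mina", "attended"), ("Tom", "no_show")],
)

# Expand: add the two new fields; the old status column stays for now.
conn.execute("ALTER TABLE trial_bookings ADD COLUMN attended INTEGER")
conn.execute("ALTER TABLE trial_bookings ADD COLUMN joined INTEGER")

# One-off migration script: backfill the new fields from the old one.
conn.execute("""UPDATE trial_bookings SET
    attended = status IN ('attended', 'attended_joined'),
    joined   = status = 'attended_joined'""")

# The conversion funnel now reads directly from the split columns.
funnel = conn.execute(
    "SELECT SUM(attended), SUM(joined) FROM trial_bookings").fetchone()
print(funnel)  # → (2, 1): two attended, one converted to a member
```

During the transition the application writes to both `status` and the new columns; the contract step drops `status` once nothing reads it.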
And you do not actually lose data this way, because if you delete a table or a field and something goes wrong, you have to go back and restore everything; that is a very high-risk scenario. That is the idea of expand-contract: you have a parallel implementation, the earlier code as well as the new code, you observe it for some time, you make sure all the dependencies have actually moved to the new way of implementing, and then you contract to the new way.

So what is required for this? It is a very simple concept, but you cannot just do it like that; there are some prerequisites. First, you should have versioning for all your database scripts, so that you can identify each of them and treat the changes like code. You have to treat it like code: every change has to be in version control, and there should be a way to identify each migration. If you are working with any ORM, it will usually have a way to give a unique identifier to each script, or you can use a timestamp. If multiple teams or multiple people are working on database changes, there should not be any clash between the unique identifiers; there are ways to handle that. But fundamentally, you should have a version for each of your database scripts so that it can be identified, and in case you need to roll back, you can roll back to a specific version of your database. That is basic.

The next thing is automation. You have to have testing and deployment automation to make sure that when the code migrates to production, the database changes go in with it, and you need enough tests to make sure nothing is broken.
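One way to implement that versioning prerequisite is a tiny migration runner; this is a sketch under my own assumptions (the `schema_versions` table name and the timestamp-prefixed identifiers are illustrative, not from the talk). Each script gets a unique, ordered version; applied versions are recorded so deploys are repeatable.

```python
import sqlite3

# Each migration gets a unique, ordered identifier. A timestamp
# prefix keeps identifiers from clashing across team members.
MIGRATIONS = {
    "20200105_create_cycles":
        "CREATE TABLE billing_cycles (id INTEGER PRIMARY KEY)",
    "20200212_create_payments":
        "CREATE TABLE payments (id INTEGER PRIMARY KEY)",
}

def migrate(conn):
    """Apply, in order, every migration not yet recorded as run."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_versions "
                 "(version TEXT PRIMARY KEY)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_versions")}
    for version in sorted(MIGRATIONS):
        if version not in applied:
            conn.execute(MIGRATIONS[version])
            conn.execute("INSERT INTO schema_versions (version) VALUES (?)",
                         (version,))

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # re-running is a no-op: the version log makes deploys idempotent
applied = [v for (v,) in conn.execute(
    "SELECT version FROM schema_versions ORDER BY version")]
print(applied)  # → ['20200105_create_cycles', '20200212_create_payments']
```

Real ORM migration tools (Rails, Django, Flyway and the like) do essentially this, plus locking and rollback scripts.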
That is a fundamental thing; it is not specific to this approach, but you also have to make sure the migrations are executed correctly as part of the deployments.

The last bit is the strategy you want to apply. Even though this pattern is useful, it does not make sense to use it all the time. If the change is not that critical, maybe you can just delete and add the column in one step, because the parallel approach requires effort. The major problem with this kind of parallel implementation is that if you expand but never contract, you create technical debt. So as a team you have to have a strategy for the timeline. You cannot just blindly say you must contract within two weeks: it depends on the size of the team, the code base, your downstream dependencies, and the criticality of the change you are deploying. How do you approach that and make those decisions? That is the strategy bit, and you have to keep evaluating what works and what does not, and adjust. This is critical, because even though it is a good pattern, without a good strategy it can create more issues than advantages.

And this idea of parallel change is not restricted to databases. I think most of you are familiar with branch by abstraction; it is the same idea. If you are making a major design change, or a framework change for that matter, you do not have to do it as a big-bang thing; you can do it in small steps. That is the whole idea of continuous delivery. For example, if you have an existing implementation you need to change, rather than changing it everywhere at once, you introduce a new one alongside it, slowly move over, and in the end get rid of the old one.
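Branch by abstraction in code looks much like expand-contract in the schema. Here is a toy sketch with a made-up routine (the function names and the trivial word-count logic are mine, purely for illustration): callers go through one abstraction, so the old and new implementations can run in parallel and be compared before the old one is deleted.

```python
def old_word_count(text):
    """Existing implementation that callers currently depend on."""
    return len(text.split())

def new_word_count(text):
    """New implementation being rolled in behind the abstraction."""
    return sum(1 for word in text.split() if word)

def word_count(text, use_new=False, verify=False):
    """The abstraction layer callers use during the transition."""
    if verify:
        # Parallel change: run both, flag any mismatch before cutover.
        old, new = old_word_count(text), new_word_count(text)
        assert old == new, f"parallel change mismatch: {old} != {new}"
        return new
    return new_word_count(text) if use_new else old_word_count(text)

result = word_count("small steps beat big bangs", verify=True)
print(result)  # → 5, and both implementations agreed
```

Once every caller uses `word_count` and verification has run clean for long enough, the contract step deletes `old_word_count` and the flags.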
But there, too, the usual problem is that you keep both implementations and never contract. As a team you have to have certain agreements or guidelines to make sure that does not happen.

When I have given this talk at conferences, the question that most often comes up is: the idea makes sense, but who actually does it? Is it good for us? So I thought it would be good to add some case studies to show that this is not just some pattern I am talking about; it is actually being used. I will share the slides, so you can get the links separately. In one article, they talk about how changes are released at Facebook, and they follow the same approach. It is not a recent article, but I am sure they still follow it. For their database changes, they make sure a change can always be rolled back, and in my opinion the only way to do that is parallel change: you support both the earlier version and the current version, and over a period you move to the new version.

The next case is not about database changes, but about parallel change as a concept. I am sure almost all of you have used GitHub. One of the main features of GitHub is that when you submit code, it shows you the differences, so you can see whether it can be merged or not. They had a problem with that merging library, and they fixed it with parallel change. A bug in that area would be a huge thing for GitHub, so to make the change safer they did a parallel change, branch by abstraction: they had the new implementation running in the background, observed it, and then switched over to it without affecting any users. Any major change should be taken in small steps.
That is the point. Just to end: things will always go wrong, so you should make sure you are making only small steps. That is why it is important to have very low-risk releases and to do things in small batches, continuously. And that can be done for databases too, with small steps. These are some of the references: Continuous Delivery, for sure, but the patterns and refactoring methods I mentioned are from the book Refactoring Databases. There are a lot of refactoring techniques in it; I just took two or three. Refer to it in case you are interested in exploring this further, and there are links to the different case studies I used. So I think that is it. Thank you.

Thank you, Chad. Any questions? Maybe one or two questions.

Hi. I am Karthik from Boeing. Actually, we have a situation with database migration, parallel databases, refactoring. We feel it can be a dangerous element sometimes.

Sorry?

It is a dangerous element when we work with the databases, because when we do production releases, the database segment is a primary factor. When we started migrating from a small setup to a parallel one, we found that the old data was merging while the new data was being segmented into different branches, so when we funnelled it down it was not happening properly. How can we refactor this and get it back?

Can you repeat that?

For us, the parallel database approach did not work. So what is the best method we can use to get out of it?

Are you doing blue-green deployments?

Yes. Our deployments happen every three or four weeks, and we started migrating slowly. But at one point we thought it was not working, because new data sets were also being included into it.
The tables were different, we created different table sets and different merge sets, and the code was different; we could not match the sequencing.

So that is the bit I spoke about in the strategy. You have to re-look at how you are actually deploying, and at your strategy for making sure the parallel implementation is done correctly. How can you capture these issues earlier than production? How can you simulate that in other environments, even in the development environment itself? Maybe it is a lack of automation, or not having enough of a strategy for how long the parallel implementation should be there, or not having enough monitoring in production, those kinds of things. You have to re-look at where exactly it is not working. I am afraid I will not be able to give an answer to this here; maybe we can chat afterwards.

Yeah. It does not have to be tied up with toggles. I am not sure.