gonna be talking about DevOps practices and how they apply to the database team. Jez Humble kind of made this statement some time back: rapidly delivering changes, automation, and all that stuff, that's what DevOps is. So fine, what does that mean? Developers basically create stuff, and the ops people, which is where part of the DBA role sits, run it. So the ops side assumes it's not my machine, it's your code, and the developers say it's my code, it's your machine that's wrong, and all kinds of stuff happens here.

So let's see how we can break this down. One side is incentivized to deliver faster and the other side is incentivized to keep things stable, and that's where the crux of the conflict is; I think Katie talked about incentives here. So how do you improve this? How do we make sense of it? In reality, all of our jobs are to help the business deliver so that it can respond to changing market needs. That's all of our job, no matter which side of the shop we sit on. And that's how I see DevOps too: it's our job to keep the business nimble enough that it can react and respond to whatever changes are happening in the market.

So what does it mean to be DevOps enabled on the database side? What does it mean for developers, for DBAs, for everyone else? I call these practices; you can call them stuff I have learned over a period of years.

First, you need to pair your DBAs and developers. Whatever they are doing, say creating a schema, they need to pair on the solution they're going to come up with, instead of the developers coming up with their own solution and throwing it over the wall to the DBAs with "now make this run in production." That's not good. Bring the DBAs in earlier in the life cycle so that they have input on the solution being developed, the kind of design you're coming up with, and things like that. This also allows continuous learning on both sides. A lot of times DBAs are used to seeing the volumes and the things that happen in production. The developers are happy with their 10 customers and 20 orders of test data, and this thing runs awesome with 10 customers, but you go to production where there are five million customers and stuff breaks. The DBAs know about that, and you can use that input while developing.

The other big thing is to put all of your database artifacts alongside the other artifacts of your application. If your database supports more than one application, make sure all of your database artifacts are in some kind of version control that everybody has access to and can see. There's nothing magic about what goes on in a database: schemas, stored procedures, triggers, whatever you have, maybe even backup scripts. If you have ER diagrams or data flow diagrams, put all of that in version control too. Then if someone has a question about something, you can point them to exactly where it is and they can take a look at it on their own time. It's not version control if it lives on a DBA's desktop or a DBA's laptop, right?
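As a rough illustration of that idea, the folder names here are just an assumption, not from the talk, a layout like this keeps the database next to the application code:

    my-app/
        src/              application code
        db/
            migrations/   numbered schema change scripts
            admin/        backup and maintenance scripts
            performance/  performance and diagnostic queries
            testdata/     special data sets for quality testing
            diagrams/     ER diagrams and data flow diagrams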
Put everything in version control, no matter what kind of artifact it is. Here's an example of this from some time back: everything about the database lives in a db folder inside the same application's folder, and we've got scripts for performance data, scripts for maintenance, database admin, whatever you have, all of it in one place. If there's some special data for quality testing, fine, that's also in there. So people can look at it and look at the history of this stuff. This also enables auditing: if someone wants to see what happened, everything's there, you can look at version control and figure it out.

Another part of this is to generate as much of the database boilerplate code as you can. There are lots of triggers people put in for auditing purposes, lots of simple views people write, synonyms, grants, all kinds of stuff that happens behind the scenes and is handwritten. You don't need to handwrite that. If there is a pattern to that code, just generate the heck out of it in a pipeline and you don't have to think about it again. Someone changes a table definition, someone renames a table, all of that can be generated and applied in a pipeline so you don't have to hand code this stuff.

Here's an example from one of the clients I was working with. They wanted the data in a schema that applications cannot access directly; it has to be accessed via some other schema that has read access, write access, and that kind of thing. So I could have sat there and written all of those grants myself by hand, and as the tables and schemas changed, I would have had to keep modifying that script. Instead I just wrote a small Ruby script that generated all the grants and created all the synonyms, and put it in a pipeline (there's a small sketch of this idea a little further down). So whenever a developer changed a table, renamed a table, or added a new table, a new script got generated and applied automatically, and I never had to look at that part of the job again, because I had automated myself out of it. And by the way, automating yourself out of a job is a good thing, that's what I believe. If you have automated something, then you can focus on something of higher value, something more important to the company, than keeping on doing it manually.

So here's an example of automating some of this stuff. You could use Docker images to give someone a standard database setup for development. Think of a complex set of test data that developers can work with, a standard set of customers, especially in PII, HIPAA, or PCI compliance kinds of environments. If your setup data is complex, you can create a Docker image that has a set of 10 customers and a set of 10 addresses, interlinked, with all kinds of complex setup already done. And you can publish that out so that the developers can just pull the Docker image and start working against it, instead of having to set that stuff up manually, or use some code to set it up that then has to change whenever the requirements change. So in many places I have used this kind of setup, where there is some test data that works against a schema.
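The actual generator mentioned above was a Ruby script; purely as an illustration, a SQL-only version of the same idea against the Oracle data dictionary might look like this (the schema names app_owner and app_access are made up):

    -- Emit one GRANT and one SYNONYM statement per table in the owning schema;
    -- the pipeline runs the generated statements after every schema change.
    SELECT 'GRANT SELECT, INSERT, UPDATE, DELETE ON app_owner.' || table_name
           || ' TO app_access;'  AS grant_stmt,
           'CREATE OR REPLACE SYNONYM app_access.' || table_name
           || ' FOR app_owner.' || table_name || ';'  AS synonym_stmt
      FROM all_tables
     WHERE owner = 'APP_OWNER';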
You basically load that test data into a database, create an image out of it, publish that image via a pipeline, and then you can use it for whatever kind of testing: functional testing, unit testing, integration testing, sometimes even complex performance testing if you have that kind of thing going on. That kind of pipeline is super important to abstract you away from the complexity of setting up test data. It really helps: the DBAs and the developers can focus on how to get the test data into that original schema, the rest works through the pipeline, and questions like how do I get my schema set up, how do I get my data set up, just go away.

Here's one other thing I do: let developers provision their own database. Instead of "let me create a database for you, let me create a schema for you, let me load some reference data for you," I abstract all of that out and put it in a build script, so that when developers check out the code base they can literally create their own schema, their own tables, and their own data. Here's an example of how this flows; hopefully people in the back can see the code here. I just did a git clone of a GitHub repo and pointed it at the database I'm using, so it works against a schema and creates a schema for me. I'm showing an Oracle example here: it logs in as a system user and creates a schema for me, so the moment I clone my repo I get a schema for myself without getting into that mess of sending an email to a DBA and waiting. It also frees up the DBA from doing these kinds of mundane tasks, so again they can concentrate on the higher-value things they want to do. Simple stuff that can be done. The Ant target creates the schema here; I know Ant is out of fashion right now, but I'm just showing that it creates the schema. You could do the same with Maven or Gradle or Rake or whatever, it doesn't really matter, but you can automate that step so that everyone gets their own schema without having to worry about how to do it.

The other important thing is that when you're working on application development, your database schema is gonna change all the time. That's what responding to market changes entails: when your business responds to market changes, it forces changes on your application, which forces changes on your database. So your database is constantly gonna change. Instead of waiting until deployment time to figure out what changed, it's better to deal with it in the early phases, and a good way to do that is with database migration frameworks. Here's a list of some of those. I found out about RoundhousE yesterday talking to someone here at the conference, but a bunch of the others I have used; I was on the team that open sourced dbdeploy way back in 2004. There are frameworks for pretty much every stack: there are several for Java, for Ruby there's Rails migrations, and if you're using .NET, RoundhousE, I guess. There's also some commercial software like DBmaestro, Redgate, and Datical.
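As a rough sketch of what a couple of numbered migration scripts might look like (the file names, numbering convention, and columns are assumptions; each framework has its own flavor, but they all record which scripts have already been applied and run only the new ones):

    -- db/migrations/041_add_phone_to_customer.sql
    ALTER TABLE customer ADD (phone VARCHAR2(20));

    -- db/migrations/042_create_order_status.sql
    CREATE TABLE order_status (
        status_id    NUMBER        PRIMARY KEY,
        description  VARCHAR2(50)  NOT NULL
    );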
So if you are in the business of deploying changes to your database, please use a migration framework. Don't try to do this manually with some scheme of your own. Here's an example of how a migration framework works. You're basically saying, I want to apply these changes; it's working with my schema here, Malmo in this example. It takes a bunch of changes that I coded, and it's also loading some test data, and some stored procedure code, trigger code, or whatever it is. All of this is built into the build script, so it runs the exact same way for every person who uses it: doesn't matter if it's developers, QA, or DBAs, it all runs to the same pattern. That's one of the things that is very important: does it run in the same fashion for everyone? This is a repeatable script you can set up that works for everyone.

Of course, who has not been in the situation where your database is used by your application plus a bunch of other side projects? This is the standard database-level integration that happens, and every enterprise goes through it. If you are in this kind of situation, then you're wondering: if I'm pushing changes to the database constantly, won't it break the others? What if a report is running against the customer table and I rename the customer table to client, or I split the table, or make a bunch of these kinds of breaking changes? Because in a lot of ways your database schema is an interface to the data, and if you break or change that interface, your applications are gonna break. Imagine someone renames the customer table to client and suddenly a bunch of stuff doesn't work anymore.

To tackle this, there is the concept of database refactoring that Scott Ambler and I wrote about a long time back, and the crux of it is that there is a timeline for a change. When you start making a change, there is a transition phase where you wanna keep the old and the new, like the client and customer table names, both working at the same time, and at the end you say, okay, I'm done with the customer table, I wanna call it client from now on. So for the next four or five slides, think about this pattern: you start a refactoring, there's a transition period, and there's an end period. Just like you would do with any API: you start with an API, come up with a new API, put the old API into deprecated mode, and maybe a year later you take the old API out. It's the same concept applied at the database level, and it's also known as expand and contract.

I'm gonna walk through a bunch of examples here; if you are not used to seeing SQL code or SQL scripts, I apologize right now. There's also a bunch of database refactoring patterns at that website, and there's a link to it at the end of this, so no need to take notes right now. Database refactoring, just like code refactoring, means you need to preserve behavioral as well as informational semantics.
If it was a customer table before and it stored customer data, then even if you call it something else it still needs to hold that customer data. If you lose the customer data, a lot of people will not be happy with you. So keeping the old and the new working, that's the crux of refactoring.

Let's look at an expand example. I have a customer table with a customer ID and a name to begin with. I'm like, oh, this is not good, I need to split that into first name and last name. So I add first name and last name as two new columns, and the name column is still there. That's my expand phase. At the end, I will end up with just customer ID, first name, and last name. Looking at the data, I start with a full name in the name column, then during the transition the full name is still in the table alongside the split values, and when I'm done I just have first name and last name split out.

Now at this point there are multiple choices you can take. If you're a startup with zero customers, you can deploy the end state immediately; no need for that middle stage at all, because there's no data to care about. But most companies have existing data, so you're in a situation where you want to preserve it. So there are two options: I can do this without any data migration, or I can actually migrate, splitting the strings up using a space as the delimiter and so on. Either way I ultimately end up with first name and last name in the database.

In the simple scenario I just add the two columns and I'm done, because the application takes care of reading the old column, and when it persists, it persists into the two new columns. I call this lazy migration: the application is dealing with it, and the database has nothing extra to do.

Another option is synchronized data. When I add those two columns, I also put in a trigger that says: when someone writes to the name column, split it up and write to the two new columns, and when someone writes to the two new columns, combine them and write to the name column. So whoever comes to the customer table looking for the name column finds the data they're looking for, and whoever comes looking for the two new columns finds data too. Now you're keeping old and new active at the same time, and the data is being synchronized constantly.

You can make one more variation on this: synchronize and migrate at the same time. When I first add the two columns, I migrate the existing data up so it's all taken care of, and from then on, whenever people use either shape, I keep moving the data around. Again, I'm keeping old and new happy, so all the reports and all the feeds you're sending stay happy.

Then on the contract side, I can just drop the old column (a sketch of the whole sequence follows below). But imagine you have 100 million customers, 100 million rows in this table: it's gonna get locked for 15 or 20 minutes, and if you care about uptime, you don't wanna do that. You find a time when you can afford to take that lock, and drop it then.
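A minimal Oracle-flavored sketch of that expand, synchronize, and contract sequence might look like this; the column sizes, the space-delimiter split, and the trigger's either-or logic are all assumptions, and a real version would be more careful about names without spaces and about detecting which side actually changed:

    -- Expand: add the new columns while the old one stays in place.
    ALTER TABLE customer ADD (first_name VARCHAR2(50), last_name VARCHAR2(50));

    -- One-time migration of existing rows, splitting on the first space.
    UPDATE customer
       SET first_name = SUBSTR(name, 1, INSTR(name, ' ') - 1),
           last_name  = SUBSTR(name, INSTR(name, ' ') + 1)
     WHERE name IS NOT NULL;

    -- Transition: keep the old and new columns synchronized on every write.
    CREATE OR REPLACE TRIGGER customer_sync_name
    BEFORE INSERT OR UPDATE ON customer
    FOR EACH ROW
    BEGIN
      IF :NEW.name IS NULL THEN
        :NEW.name := :NEW.first_name || ' ' || :NEW.last_name;
      ELSE
        :NEW.first_name := SUBSTR(:NEW.name, 1, INSTR(:NEW.name, ' ') - 1);
        :NEW.last_name  := SUBSTR(:NEW.name, INSTR(:NEW.name, ' ') + 1);
      END IF;
    END;
    /

    -- Contract: once every reader and writer has moved to the new columns.
    DROP TRIGGER customer_sync_name;
    ALTER TABLE customer DROP COLUMN name;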
When a drop takes forever, you have database features like SET UNUSED: the column just goes away from the metadata, but the data is still physically there. That happens very fast and gives you higher uptime, but it doesn't take away the data or the column, and you can take it away later, on a Sunday night or some weird time like that. It can be scheduled out, and it gives you more options for dealing with uptime and those kinds of strategies.

Even then there will be some apps, and I think Katie talked about 400 or 1,000 apps in her enterprise, that nobody's working on and that still want the name column there, because nobody's putting in the effort to migrate them to the new structure. So how do you take care of that? You can add a virtual column called name, and if someone accesses it, the database generates the name at run time from the new columns and shows it back to them. So again, their interface to the database doesn't break. Of course there's a cost to this, like how long do I keep this interface alive? But if you're willing to take that cost, you can always add this and give it back to the users. There's a small sketch of both of these a little further down. Again, this is a way to keep the old and the new happy, because we are constantly deploying, and when you're constantly deploying you cannot guarantee that all the apps using your database can be deployed at the same rate at which you are modifying the database. This gives them an interface to your database that doesn't break.

The next thing is to put your database under continuous integration. If you are in the kind of shop where the DBA team manually applies changes to your dev database, then to QA, UAT, and prod, please, please, please get away from that. If you're going to do this, the DBAs need to start pairing with the developers at the very early phase, when the migrations themselves are being written. You're pairing on schema design, pairing on index design, all kinds of stuff, and you're writing the migrations while the stuff is actually being designed. Then you put them in whatever version control system you have; I'm showing git here, but whatever version control you use, as long as it's something other than a DBA's laptop, that's good. The continuous integration engine, again, whatever you use, Jenkins or GoCD or whatever, doesn't matter, as long as it has its own copy of the database. Notice that this is the dev database here and the integration database here, different databases. We want to run the whole thing on a different stack and make sure it works, instead of running it against the same thing again. You can use Docker images here too: I can spin up a new database, run the whole thing again, verify it works, and then go forward, so you're never in a "works on my database" situation. You are applying the exact same migration scripts, no changes; whatever you checked in gets applied again.

You can take that further and package your database migration scripts just like you package your applications. It could be that you're generating a zip file for the database and a jar or an ear or a war for your application, or even a tar if it's Ruby land or whatever it is.
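Going back to the SET UNUSED and virtual-column ideas above, a rough Oracle-style sketch might be the following; the concatenation expression is an assumption:

    -- Make the old column vanish from the metadata immediately, without the
    -- long lock of a physical drop; reclaim the space later at a quiet time.
    ALTER TABLE customer SET UNUSED COLUMN name;
    ALTER TABLE customer DROP UNUSED COLUMNS;   -- run this one on that Sunday night

    -- For apps nobody is migrating, expose name again as a virtual column
    -- computed at read time from the new structure.
    ALTER TABLE customer ADD (name AS (first_name || ' ' || last_name));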
Whatever package you generate for the application, you generate the same kind of package for the database side, and then you apply that same package to whatever environment, so there's no custom coding when deployment time comes. That gives you a lot of traceability for auditing purposes, and a lot of traceability for going back and figuring out what messed my environment up. Here's an example of how I would set this up: there's an application jar and there's a database upgrade zip, and they go hand in hand. How many of you have run into the situation where you deploy the app and it comes up saying column not found, table not found, view not found? This totally avoids that, because the app ran against exactly the database that you packaged and deployed together. You're basically moving from the notion of the DBA controlling the database to the changes getting verified at the CI level and then packaged, so they can be deployed lock, stock, and barrel along with the same version of the application.

Of course, the next step is that physically the database in production is so much different from a developer's database. How do you abstract that out? Here's an example. Generally on the developer side I have very simple stuff, one or two tablespaces with everything crammed into them, but on the production side things are split out: maybe the heavily used tables are put on SSDs, the low-usage tables on spinning disks, and all kinds of stuff happens there. So how do I make sure that difference doesn't bite my developers, who are designing tables that all land in the same tablespace? We came up with a very simple map. In the developer Docker images, or whatever you wanna call them, all the tables go into one tablespace, I don't really care. But as the changes move up through the environments, I have a map saying this table needs to be in this tablespace, that table needs to be in that tablespace, and in the pipeline I run a script that moves the tables into the right tablespaces. The developers don't have to worry about it, and the DBA made this mapping once and doesn't have to worry about it again either. Any time a new table gets added, I just update the map and the pipeline takes care of pushing it into the right tablespace automatically. In this scenario I'm basically saying: take a table and move it to whatever tablespace it's supposed to be in, based on the tablespace map that I've created. It's a very easy way for DBAs to determine what goes where, without forcing developers to have that complex setup on their workstations or in their developer images.

You can also automate the database change deployment itself. Deployment should be as easy as saying deploy db, upgrade QA, and QA gets upgraded to the latest version; deploy db, upgrade production, and production gets upgraded too. When it's this easy, I use what I call the boat metaphor for deploying frequently. Imagine you're on the shore of Lake Michigan, and there's a boat that goes from the Wisconsin side to the Michigan side once every six months.
And there's no other way to go from Wisconsin to Michigan, just imagine that for a second. If there's only one boat that goes once every six months, everybody's gonna say, I want to get on that boat no matter what. The same thing happens with your applications. If you're releasing once every six months, everybody wants their feature in there, and to get it in they're gonna do whatever it takes: we're gonna cut quality corners, we're gonna take on all kinds of risk. We heard about three-day deployment windows and that kind of stuff. If you deploy frequently, it relaxes people. If I can go from Wisconsin to Michigan on this boat every Tuesday, that suddenly reduces my stress level; if I miss this boat, I can chill out a little bit, maybe drink some Wisconsin beer, and get on the next one. If that is the way you present this to people, it relaxes them and reduces stress on your teams too. And if something's risky, doing it more often reduces the risk. It's a kind of weird way to think about it, but if something's risky, do it more often and it will reduce the risk of doing that thing.

So I say deploy more frequently, and once you start deploying more frequently, using these automated techniques, using these refactoring techniques where you keep old and new alive, and using CI and CD pipelines to manage it, it will reduce your risk and you can literally deploy every Tuesday. I have done this many times: you start the deployment on a Tuesday at 5 p.m., by 5:10 you're done, and then you're free to go back to your normal stuff. This kind of deployment pipeline helps you reduce risk and at the same time frees up the DBAs to do valuable things, and the valuable things are during application design: how do I partition this thing, how do I do the index design, how do I structure the tables? That's where the value is.

I have talked a lot about relational databases, and I know NoSQL databases are in vogue nowadays; the same kind of stuff applies to NoSQL databases too. Whether it's a relational or a non-relational database, the same techniques apply on both sides, maybe with a little difference because there is no fixed schema in NoSQL and there is a schema in relational databases. That's all I had to present. Like I said, I have written some books about these techniques, one about NoSQL and one about databases and CI. So I'm open to questions.

Sure. So where I am now, we're using Liquibase for everything, which is nice, but we don't really have developers spending any thought on rolling back changes. Do you have any advice on how to work together with developers to encourage them to consider rollbacks, and maybe convenient tricks or simplifications? Because a lot of things, well, you know, if I just added a column, that's easy to roll back.

Yeah, rollback was in high demand back in maybe 2006 to 2009, and since then people have shifted to not rolling back but always rolling forward, because rollback is a complicated thing to accomplish. Let's say I renamed the customer table to client. A rename is fine, it's very easy space-management wise, but what if I split the table?
Now the rollback means I need to join the tables back together. Or let's say I drop a column, and rollback basically means bringing the column back. I can't just put the column back, because I need to put the data back too, which means I have to keep a copy of that data as of that instant around. It gets really complicated. So nowadays people say, I always keep rolling forward, and if there are any errors, I roll forward further to fix that bug. The technique now is to keep rolling forward. One of the coolest techniques I used maybe a year back was that the database is always rolling forward an iteration, maybe two weeks, ahead of the application. So if I need a column in this table two weeks from now, the database has it now. That helps us roll forward and take care of all the performance implications of adding columns or splitting tables in production, where the volumes are really high. I take care of that up front, and the application catches up two weeks behind, or one week behind, depending on your tolerance for that kind of thing.

I will, if I remember, and I'll try to really remember: Mark Burgess wrote a really good article a few years ago about the myth of rollback that I really love. I'll try to remember to drop that in the Slack channel, or I'll just ask Paul to go find it and drop it in Slack for me. So that'll be rad.

Hi, so my organization does most of the things you've suggested we get away from, and I guess my question is, where do we start?

Right, so the first thing I would say, and maybe I'm showing my age here, is that back in 2002 we ran into the same situation with a client. The first thing is you need to bring people together. I think we saw this morning that this is a people problem; it's not necessarily a technology or technique problem, it's more of a people problem. DBAs who are used to doing certain things will say, I'll probably lose my job when you automate all of this. That's not the case. We are saying, come up the value chain and collaborate there. That kind of thing is very important. The way I approached that problem was, I literally told the DBA team on the client side, let's have lunch one day a week, sit together, and just talk. Nothing about implementing this, nothing like that, let's just talk. You need to gain trust to implement this kind of stuff. You need to show them that we are not talking about automating someone out of a job; it's more about reducing risk, about getting good sleep every night instead of, like, three nights a week. So I would focus on how do I gain trust and how do I take them to the next level of understanding of what I'm trying to propose. Generally people walk into this kind of conversation with fear, and from there it just becomes a closed-door conversation that doesn't go anywhere.

Hi there. It was interesting, the stuff you showed about upgrading the schema while keeping backwards compatibility for old clients. I was wondering if you had any suggestions for visibility into which old clients are using these deprecated columns and things, other than the DBA looking in the logs.

Yeah, so there are a few kinds of things you can do there.
Tooling-wise, databases don't generally give you something that says this table is being used by this particular application, because as you saw, these are API-level abstractions. Imagine you publish an API to the public: how can you tell who is using what unless they actually call the API? So you can put some kind of abstraction on top, like a schema with read-only access, make the apps run through that, and see who logs in; then you can figure out what kind of users and what kind of apps are coming in. You can also put triggers in place to see what kind of access is happening at the table level. Triggers help you with updates, inserts, and deletes, though, not with selects. So those kinds of logon and table-level triggers can help you see who is accessing this stuff. And then the best method, as I say, is "compiling is a unit test." Who here has heard "compiling is a unit test"? Similarly, you just drop the column and see who complains, and then you know who is using it. That's the last resort, by the way.

Excellent. Thank you so very much. Let's get another round of applause for Rod. Thank you.
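For that last question, a rough sketch of the kind of table-level logging trigger mentioned in the answer; the audit table and its columns are assumptions, and it only catches writes, since triggers don't fire on selects:

    CREATE TABLE legacy_access_log (
        accessed_at  TIMESTAMP,
        db_user      VARCHAR2(30),
        client_app   VARCHAR2(64)
    );

    CREATE OR REPLACE TRIGGER customer_write_log
    AFTER INSERT OR UPDATE OR DELETE ON customer
    BEGIN
      -- Record who touched the table and which module they came in through.
      INSERT INTO legacy_access_log (accessed_at, db_user, client_app)
      VALUES (SYSTIMESTAMP, USER, SYS_CONTEXT('USERENV', 'MODULE'));
    END;
    /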