Well, it looks like it's time. Let's do it. Let's go. Welcome, everyone, to Database DevOps with Containers. We're going to talk about how to build a DevOps pipeline for databases, which is really fun. We're going to focus on the journey from production data to getting that data safely available to others.

This is the part where I tell you I am definitely going to post the slides on my site tonight. I've been that person chasing the speaker. It's never worked out. If ever you hear that, call out the speaker and tell him: no, post it right now. Stop the presentation. Go post your slides. You can go to robrich.org right now, click on Presentations, and here's Database DevOps with Containers. The slides and the code are online right now, so you can get the code up on GitHub. It is online right now.

While you're here at robrich.org, let's click on About Me and learn some of the things that I've done recently. I'm a Cyral developer advocate, a Microsoft MVP, a Docker Captain, and a Friend of Redgate.

Let me tell you about AZ GiveCamp. AZ GiveCamp brings volunteer developers together with charities to build free software. We start Friday after work; Sunday afternoon, we deliver that completed software to the charities. Sleep is optional, caffeine provided. If you're in Phoenix, come join us for the next AZ GiveCamp. Or if you'd like a GiveCamp here in Boston or wherever you're connecting from, hit me up on email or Twitter. Let's get a GiveCamp in your neighborhood too. GiveCamps are really fun.

Some of the other things that I've done: in the book SQL Source Control Basics, mine is chapter eight. I also do a lot of training and consulting about containers, so hit me up on email or Twitter if you'd like to dig in some more. One of the things that I really enjoy: I replied to a .NET Rocks podcast episode, they read my comment on the air, and they sent me a mug. So there's my claim to fame, my coveted .NET Rocks mug.

So let's dive into Database DevOps with Containers. We talked about this guy. Let's take a look at DevOps. With DevOps, our goals are reliability, consistency, and cost. Our mantra: automate all the things. This wasn't always the case, but computers are now cheaper than people. And so if I can get a computer to do it instead of a person, then I can probably get the job done more easily. That's great. That's exactly what DevOps is all about.

Let's take a look at a typical DevOps pipeline. Now, this is a DevOps pipeline for applications. We can see all of the steps involved in getting our software from its instantiation all the way through to deployment. What's really interesting about this DevOps pipeline is that it begins and ends with customers. If your DevOps pipeline doesn't begin and end with customers, I would invite you to revisit it, because really, our goal as developers is to deliver customer value. So we start with customer suggestions that may go into a task-tracking system. As we start a task, we'll check out some source code, we'll make some changes, we'll finally commit that code, and an automated build will kick off. That build might run some tests and finally deploy that content into a pre-production environment where others can take a look at it, or maybe where we're running more invasive tests. Once those tests pass, we can push that content into production and start getting customer feedback. That customer feedback comes back in as suggestions, and we can begin this loop again. So that's a DevOps pipeline, from customer suggestion to customer value. Perfect.
Now, part of what's really interesting about this DevOps pipeline is that if any of these steps fail, we can take that feedback, return, and restart this loop with that additional information. In this case, a test failed. So maybe we don't need to go all the way back to a customer suggestion; maybe we can just go back to the developer's code and make some changes. As the DevOps pipeline fails, we can make some alterations, and once it's good enough, then we'll deploy that content into production. Now, what's wonderful is when we start talking about shift left. Our goal is to fail closer to the developer, so that the developer can make adjustments and not have to wait a long time to get that additional feedback. Shift left is really fun.

So here is a DevOps pipeline for applications. Where's the master copy? If you're looking at this diagram, you may say, well, the master copy is here in the code. And I agree: if we start with the code, we can rebuild the content, we can rerun the tests, we can deploy it into pre-prod environments and finally into production, and we can get that customer feedback. If we lose the source code, we're probably not getting very far.

Now, that's a DevOps pipeline for applications. Let's instead flip to a DevOps pipeline for databases. Now where's the master copy? Yeah, the code is important. We do want to keep track of the schema and any reference data that we have, but the master copy of our database is the data in production. If we were to lose our production database, we couldn't get back to this standard spot again. With applications, we blow away the environment and create a new environment every time. For some reason, customers don't like it when we destroy their data in deployments. The master copy is production.

Our goal, then, with the DevOps pipeline is to start with production and get this data into non-production environments, specific to each type of use. So we might have production data that goes to pre-prod environments. We might have it going into development environments. Let's take a look at that. We'll zoom in a little bit. That's not to say we're eliminating steps; we definitely want a DevOps pipeline that begins and ends with customers, but let's just focus on these four steps.

Our first iteration: we have a production database backup, and we're restoring it to production. We might call this disaster recovery. Now, hopefully we're backing up every night. Hopefully we aren't needing to restore every night, but we want to do that backup really often, so that if anything happens, we can restore this data and continue on with our business. (There's a quick sketch of this backup-and-restore step at the end of this section.)

As we go to different environments, pre-production environments, dev and test environments, and finally to developers themselves, we might have different needs. In production, we probably have really intense hardware making this happen; on a developer's laptop, probably less so. We need to make some adjustments in the data to make it suitable for these other environments. Our goal is to get production data sanitized, anonymized, and shrunk into each environment in a safe way. So today we'll focus on building a DevOps pipeline to get our production data safely into these other environments. We're going to want to anonymize, sanitize, and shrink this data appropriate to each environment. In a pre-prod environment, maybe we don't need to sanitize and anonymize very much; our goal there is to prove that the system will work, just on slightly different hardware.
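Here's that minimal sketch of the disaster-recovery iteration in T-SQL. The database name ProdDatabase and the file path are stand-ins borrowed from the demo later in the talk, not the production script:

```sql
-- Nightly backup of the production database: the disaster-recovery copy.
-- ProdDatabase and the backup path are placeholder names for this sketch.
BACKUP DATABASE ProdDatabase
  TO DISK = N'/var/opt/mssql/backup/ProdDatabase.bak'
  WITH INIT, COMPRESSION;

-- If anything happens, restore that backup and continue on with business.
RESTORE DATABASE ProdDatabase
  FROM DISK = N'/var/opt/mssql/backup/ProdDatabase.bak'
  WITH REPLACE;
```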
When we get to developers, that database is probably way too big, and we definitely need to shrink it a lot. We also need to anonymize it. Let's remove customer email addresses or mangle personally identifiable information so that it's not available to developers. For example, if a developer is testing the email system, we don't want them emailing actual customers from their local workstation. Similarly, sanitize. We probably have secrets in our database: API keys or other details associated with the production environment. We want to sanitize those so that we have developer-safe secrets in all of the places. Anonymize, sanitize, and finally shrink. We want to shrink this database appropriate to the target environment, but we don't want to remove all of the weirdness involved in our database. Let's not delete everything that's older than six months. Let's not delete everything that's in that one big customer. We want the data to be as weird as it is in production, so that developers can experience that and build content that's appropriate for it. If a developer never gets to a query that has more than ten results, they're never going to build a paging function, for example.

Finally, once we get that data into the new environment, we're going to want to migrate the schema to the latest version and ultimately run our tests, our SQL tests or our integration tests, to validate that this database behaves the way we expect. Our DevOps pipeline then has these steps: anonymize, sanitize, and shrink; migrate to the latest version; and validate that the data is correct. And with DevOps, our goal is to automate all the things. If you're doing restores of databases to local development environments by hand, let's see if we can build a DevOps pipeline instead. I'd rather sleep through that process or focus on other tasks.

So how do we automate this? Well, let's use containers. I like this quote; it was by me. So let's look at Docker. Docker is a really elegant mechanism for making this happen: a really automated mechanism for spinning up infrastructure easily. Now, isn't Docker just VM++? With Docker, we're virtualizing the operating system, as opposed to VMs, where we're virtualizing the hardware. I think we're almost past this piece, but Docker didn't invent containers. Docker just made containers easy for the rest of us to consume.

Now, we are virtualizing the operating system, not the hardware, so we need to make the operating system work for us. On the left are virtual machines; on the right are containers. This definitely isn't to scale, but we can see that on the left, we need a copy of the guest operating system in each piece. We're virtualizing the hardware. On the right, we're virtualizing the operating system, so we're just using this host OS. The container needs to contain only what's different between the host OS and what it needs to run its system. We're already using our system more efficiently. That's perfect.

Now, because we're virtualizing the operating system, our containers need to have the same operating system as our host. For Linux containers, we'll use a Linux host. For Windows containers, we'll use a Windows Server host. Now, there's something missing here. We're not talking about Mac or Windows desktop. We're not talking about GNOME or KDE, Red Hat or Ubuntu. Let's take a look at that Red Hat and Ubuntu case. In the case of Red Hat or Ubuntu, the kernel underneath the operating system is Linux.
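You can see that shared kernel for yourself. This quick check isn't part of the talk's demo, and the specific image tags are my choice, but it illustrates the point:

```sh
# On an Ubuntu host, ask the host for its kernel version...
uname -r

# ...then ask a Red Hat (UBI) container and an Ubuntu container for theirs.
# All three print the same kernel version: containers share the host's
# Linux kernel, because we virtualize the OS, not the hardware.
docker run --rm redhat/ubi9 uname -r
docker run --rm ubuntu:22.04 uname -r
```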
And so we can run a Red Hat container on an Ubuntu host, or vice versa. For Windows Server and for Mac, we'll probably do something like this. Now, in production we definitely want to run the way shown here on the right, as close to the metal as possible, so that we can fit the most containers onto our system. On the left is our developer setup. In our hypervisor, Hyper-V or HyperKit, we'll install a Linux virtual machine, and we'll install the Docker engine inside Linux. So now we can run our Linux containers on our Windows desktop OS. That's how we'll do it today.

Now, we do have a Docker ecosystem that makes building these really elegant. We can take a Dockerfile, that's the configuration as code, build that into an image, and then run that as a container. We can choose to share images with others using Docker Hub or another container registry, and pull down images from Docker Hub. What's interesting, unlike GitHub and source-control repositories, is that we're sharing the built images. I don't need to recompile an image once I pull it down. I'll take a Dockerfile, build it into an image, and run that as a container. I can also grab a Docker Compose file that may reference many Dockerfiles or many images up on Docker Hub, build or pull those images, and run them all together on a Docker host, in this case Docker Swarm. Now that's great. Docker creates a really elegant mechanism for us to get from configuration as code to a running system.

Here's the full Dockerfile where we're building up an ASP.NET container. We list each step we need to get from our base image, in this case .NET Core 2.1, to our running site. (There's a sketch of a Dockerfile like this at the end of this section.) Now, what's really elegant about this, and I'll skip the Hub dependencies demo for this, is that each of those commands creates a new Docker layer. And then, turtles all the way down, those layers build on top of the previous images and their layers. So we end up with a really elegant system of standing on the shoulders of giants to accomplish the tasks that we need. As we start it as a container, we get a new read-write layer where we're able to save things like our logs or changes to files. We don't change those base layers; we only change this read-write layer as we run.

So, the Docker magic. We have that same sandbox, the same process isolation, that we get with typical virtualization. We also get a standard hostname and IP address for each container. And we get a Docker router that is able to NAT traffic from the host environment into each container. That's wonderful. From a DevOps perspective: inside the container, we can think of this as a machine. Let me decorate this table here and lay out these particular files; I'll put this configuration file in this very specific directory. Once we close the doors of the container, now from an operations perspective, we can treat it like a process. I just start it up, plug in ethernet, plug in power, and I can control it, constrain its resources, or start it a second time to spin up another container. What's really elegant here is that we're using one toolchain to solve the problems of both developers and operations. No longer do we need that big Word document of manual steps where we hope we deploy it the same as we did last time.

So here on Docker Hub, we have base images like the .NET image that we looked at previously, Node, PHP, Ruby, Go. We also have other images for databases: Redis, phpMyAdmin, WordPress, Postgres, and in particular, SQL Server.
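Here's that Dockerfile sketch promised above. The slide's exact file isn't reproduced in the transcript, so this is a hedged reconstruction of an ASP.NET Core 2.1 build of that era; the project name MyWebApp is a placeholder. Each instruction becomes one of those Docker layers:

```dockerfile
# Build stage: start from the .NET Core 2.1 SDK image and publish the site.
FROM microsoft/dotnet:2.1-sdk AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app

# Runtime stage: only the ASP.NET Core runtime; the heavy SDK layers stay behind.
FROM microsoft/dotnet:2.1-aspnetcore-runtime
WORKDIR /app
COPY --from=build /app .
# MyWebApp.dll stands in for the site's published assembly.
ENTRYPOINT ["dotnet", "MyWebApp.dll"]
```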
Let's use SQL Server in our demos today. So let's take a look at SQL Server. Now, our goal with SQL Server in production is to use database-as-a-service whenever possible. Azure SQL Database and Amazon RDS give us that mechanism where they handle automated backups, point-in-time recovery, and ensuring that our database access works as expected. We can then focus on just consuming that service and building our applications. On-prem, we might use virtual machines. Remember the rule that we should never install SQL Server inside of a virtual machine? I suspect in time we'll also move our databases into containers. There are already mechanisms for running SQL Server inside of Kubernetes, so get ready for that.

In development, let's build a dev-friendly container that contains our production backup data. Now, we're definitely going to anonymize, sanitize, and shrink this so that it's appropriate for each environment. We don't want to leak production secrets or personally identifiable information, but we do want data that is as weird as it is in production, so that developers can build on standard mechanisms. As we get that data from last night's production backup, we're going to want to anonymize, sanitize, and shrink it appropriate to each environment. As a developer spins it up, their first step will be to migrate it to the latest schema and then validate that the database works as expected. Perfect.

So let's dig into a demo. We can come here to this database repository, and we'll be focusing mostly on the database folder. But you can use the schema folder to spin up your database really quickly; here are all of the SQL files to get all of the content into place. And then here in the database folder, we'll focus on the script and the Dockerfile.

So here's that SQL script. This SQL script will take our production database and move it into a developer environment. Our goal is to anonymize, sanitize, and shrink the data. Let's take a look at that database. Well, we start off with some customer data. Thank you to randomuser.me; these aren't real users, but we can see that we've got actual email addresses. We probably want to avoid giving developers actual email addresses in case they're testing the email functionality. We also have the invoice log. Now, this table might be exceedingly large. We probably want to prune this. We don't want to delete all of the big customers, but maybe customers six through ten might be good to delete. Maybe we can delete transaction logs. We don't want to remove the weirdness that is production data; we just want to reduce the size of the database. Then we also have some settings. Now, we want to sanitize these. Here's our API key, and this is a production secret. We definitely don't want this to leak. I have a date here just so that I can keep track of the build that we've done today.

Our first step is to back up this database. Let's back that up. Now we have a new ProdDatabase.bak file, and our goal then will be to restore this in a container. Here's that SQL script. We'll start by restoring our ProdDatabase.bak. We'll move it into place, and then let's anonymize the data: we're going to rewrite our email addresses to @contoso.com addresses, prefixed by a random GUID. We'll update our secrets to sanitize this data: we'll grab a dev-safe value and update that API key. We'll also update that build date, just so that we can get that in place. Now that we've anonymized and sanitized, oh, here's shrinking our data: we'll delete some invoice log data.
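The real script is in the repo; as a hedged sketch of those three moves, with table and column names like dbo.Customers and dbo.Settings standing in for whatever the repo's schema actually uses:

```sql
-- Anonymize: rewrite every customer email to <random GUID>@contoso.com,
-- so a developer testing the email system can never reach a real customer.
UPDATE dbo.Customers
SET Email = CAST(NEWID() AS nvarchar(36)) + N'@contoso.com';

-- Sanitize: swap the production API key for a dev-safe value, and stamp
-- a build date so we can see when this copy was produced.
UPDATE dbo.Settings SET [Value] = N'dev-safe-api-key'
WHERE [Name] = N'ApiKey';
UPDATE dbo.Settings SET [Value] = CONVERT(nvarchar(30), GETDATE(), 120)
WHERE [Name] = N'BuildDate';

-- Shrink: prune the enormous invoice log. The demo ships with WHERE 1 = 0,
-- a placeholder that deletes nothing until a real pruning rule goes in.
DELETE FROM dbo.InvoiceLog WHERE 1 = 0;
```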
In this case, we're deleting all the rows where 1 equals 0, so we're not deleting any. But once we've anonymized, sanitized, and shrunk our data, each of these deletes frees up space. So now let's shrink our database file and our transaction log file, and we'll also change the SA password, in case developers need to do more administrative things in their database.

Now, this script is perfect. We want to run it from our DevOps pipeline, so we'll put it into a Dockerfile. We're starting out with the SQL Server container, so we've already got SQL Server 2019 in here. That's perfect. Let's accept the EULA and set the SA password. Probably we want to pass this in as a parameter, but I just did it right here for simplicity. Now we'll restore our data. We'll copy the files into place, we'll start up SQL Server, and we'll run that SQL script that does our anonymize, sanitize, and shrink process. Now, each of these steps creates a new Docker layer, so we still have the big old production database in here. And so let's start a new container and copy just the shrunk content from this old container into place. Now we only have the anonymized, sanitized, and shrunk database, and we can freely share this with developers. (There's a sketch of the shape of this Dockerfile at the end of this section.)

So let's do that. Let's go build this container. Now, the cool part about this process is that we're pulling in the database backup and anonymizing, sanitizing, and shrinking it. It does take a little bit to spin up SQL Server, so I guess that it's probably about 80 seconds. Now, I wish we could have a trigger that said, once SQL Server has started, automatically do the next task. Unfortunately, we don't, so I just have to guess that 80 seconds might be a good estimate. I've seen it spin up in as little as 20 or 25 seconds, and depending on the load on my machine, I've also seen it take more than two minutes.

Once you have the database in place... oh, there we go. We've got our database spinning up, and we can see that SQL Server 2019 is launching. This is great. One of the things that I think is really cool is that we're automatically running upgrade scripts. So here we wanted to get to database version 904, and we started at 899, so SQL Server automatically upgraded the database all the way up to 904. That's perfect: upgrading this database automatically. If we're using, say, SQL Server 2017 in production, we can automatically get the data to 2019 in this container. Let's practice that and get good at migrating our data into 2019. Then when it comes time to move our production database to 2019, we have lots of experience with that already, and we know the bumpy parts.

Okay, now that we've got our SQL Server database in place, let's run our script. So now we'll take a look at anonymizing, sanitizing, and shrinking our database appropriate to each environment, in this case for developers. There, we've got all of that content in place, and so now we'll finish up this container, copying that shrunk database into the new image that we can then push to our container registry. Those are the commands that we put into our build pipeline: once we finish our database backup, we can kick off the docker build and docker push commands associated with building for each environment.

Now, docker image list: we've got that database build that we just did. Here's dev-database. Okay, let's start it up. docker run, and I will forward 1433 to 1433. Let me go double-check. Yep, and I will start that dev-database container.
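Here's that promised sketch of the Dockerfile's shape. It's a hedged reconstruction, not the repo's actual file: the image tag, file names, and password are stand-ins, and the 80-second sleep is exactly the guess described above:

```dockerfile
# Stage 1: restore the production backup, then anonymize, sanitize, and shrink.
FROM mcr.microsoft.com/mssql/server:2019-latest AS restore
ENV ACCEPT_EULA=Y
# Better passed in as a build argument; hard-coded here to mirror the demo.
ENV SA_PASSWORD=DevSafe-Passw0rd!
COPY ProdDatabase.bak restore-dev.sql /tmp/
# There's no "SQL Server is ready" trigger at build time, so start it in the
# background and guess: 80 seconds is usually enough (20 seconds to 2+ minutes).
RUN /opt/mssql/bin/sqlservr & sleep 80 \
    && /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -i /tmp/restore-dev.sql \
    && /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -Q "SHUTDOWN WITH NOWAIT"

# Stage 2: a fresh image with only the shrunk data files. The layer holding
# the full production backup never makes it into this final image.
FROM mcr.microsoft.com/mssql/server:2019-latest
ENV ACCEPT_EULA=Y
COPY --from=restore --chown=mssql /var/opt/mssql /var/opt/mssql
```

And the pipeline and developer commands wrapped around it might look like this; the registry and tag names are placeholders, not the repo's actual values:

```sh
# Build pipeline: after last night's backup lands, build and publish the image.
docker build -t registry.example.com/dev-database:latest ./database
docker push registry.example.com/dev-database:latest

# Developer machine: run it, then pull again whenever a fresh copy is wanted.
docker run -d --name dev-database -p 1433:1433 registry.example.com/dev-database:latest
docker pull registry.example.com/dev-database:latest
```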
Okay, so if we had something else running on 1433, we could definitely map this port to 2433 or 1434, but now we've got that database running in a container. Let's flip back to SQL Server Management Studio, and let's create a new connection. Now, in this case I've been using minikube to host my containers, so let's connect to minikube. We're putting in our developer sa username and password, and that'll get us to our data store. Oh, did I rush the database? Is it not all the way up yet? Yep, I'm probably going to get a connection failure, and I'll just restart it and then it'll be fine. I love that as it boots up, it also migrates our data to the latest version of SQL Server. We could then run our database migration scripts; maybe we have Flyway or SQL Source Control that can migrate our schema.

Oh, looks like we've got this connection in place. Let's create a new query against this database, we'll pick the web app, and now let's run those exact same queries that we did previously. So let's take a look at our customer data, and we can see now that the email addresses have been randomized. We anonymized our data appropriate to a developer environment. We also have our invoice log, which is a little bit leaner now, because we've deleted some of that data. And let's take a look at our settings. We've sanitized this into a dev-safe value. I do like creating a build date here so that we can see that this happened just now. That's perfect.

Then once we're done with this database, it's really easy for a developer. When we're done, I can, oh, it looks like I might get locked into this. Yep. When a developer is ready to grab the next version, they can just docker pull that database, and they can choose to update whenever they're ready. Now, the beautiful thing here is that if they're not ready, then they don't need to do anything. But if they are ready, they don't need a DBA to hold their hand as they get a new version of the database.

SQL Server on Linux: you'll notice that we backed up our database on Windows and restored it on Linux, and it worked just fine. SQL Server on Linux has really high fidelity, including SQLCLR, which is pretty amazing. You've got the Agent, machine learning, agent-based replication. There are things in SQL Server that don't make sense on Linux, like Reporting Services and Analysis Services, which have hooks into IIS. There is no IIS for Linux, so those don't make sense. But database DevOps with containers is really elegant. We've built a DevOps pipeline that allows us to get production data safely into non-production environments. We may choose to build a script for pre-production environments, for dev and test servers, and for developers, each with different levels of anonymizing, sanitizing, and shrinking. Ultimately, this process runs automatically, and, well, as DBAs, we can sleep through it. That's amazing.

Grab this code at robrich.org and you can replicate these results as well. If you have questions and you're watching this video later, hit me up on Twitter at @rob_rich. For those of us who are live here at the event, what are your questions?

That was awesome, Rob. Thank you so much for that presentation. Most definitely. This is such a fun demo. Yeah. For questions, I would like our guests to go to the track channel. Since we have another presentation coming up in two minutes, it would be better for us to move all the questions to a different channel. That sounds great. I'll see you there in just a minute. Thank you so much.
Yep. Thanks, everyone.