Hey, I'm here with Travis Wright, principal group program manager in the Azure Data organization, and we're going to talk about how you can treat your databases in a hybrid environment just like cattle. Awesome. What does that even mean, treating your databases like cattle? Well, let's get into it. If you've heard this phrase before, of pets and cattle, you know what I'm talking about, and we'll get into what I mean by that. But really what we want to talk about today is how Arc-enabled data services allow you to run your databases in a highly available way on any infrastructure. So let's dive in and talk about this. What do we mean by this? Databases are cattle too. Well, one thing I want to just bring up here is the origin of this, from Randy Bias. He says he was struggling with explaining to customers how cloud-native apps, and cloud more generally, were fundamentally different from what they'd been doing before. It's ironic but fitting that he was watching a presentation about scaling SQL Server that gave him this idea, to think about the difference in how we used to do things before, which is that we treat our applications, including databases, like pets. They're very important to us. Instead of like cattle, which is how we want to treat things going forward. So let's talk about this. So here's a couple of pictures of my pets. I just got this cat. Her name is Mishka. She's super cute, a little kitty. Super awesome. We love her to death. We've got our dog Teek. This is a picture of him when he was a puppy. We love our pets. They're unique to us. We love them. We care for them. They're super important to us. Now cattle, they serve a purpose, but we don't really typically name our cows. Although when I was growing up, I went to visit my grandma on her ranch and she had a calf there. I don't remember what I named it. So that shows you how important it was to me, right? Even that little calf in my experience as a young child, I don't remember, right?
Cows just aren't important, right? To some people they are, right? But I get your point there. And I think it's funny, like when you came up with this session and we talked about it, I was like, wow, I've heard similar things about servers, right? I mean, we had the same thing when we talked about just managing servers and operating systems. And I was like, huh, you can do that for databases too. And so I really find it an interesting topic to see, like, okay, we should probably do that for databases. Yeah, I think a lot of people think of databases as more like pets. You really care about them because they're so critical. You know, the data is just so important, and becoming more important over time. And so I think we kind of tend to think of them more like pets, but I think as we go forward and we evolve, we need to think about them more like cattle, in the sense that we need to make them really resilient and available. And any one little database instance shouldn't really matter, right? And so that's what we'll get into today, how we can kind of evolve forward and treat our databases like cattle, right? And I'll get into kind of what I mean by that. So this is actually interesting. I'm sorry, I don't want to interrupt you here, but this is not something we want to do just because we're in a hybrid session, and not something we want to do just in the cloud, just to be clear. You're talking about doing this also on-premises, or in a hybrid environment, or in a multi-cloud environment, just to make sure. Exactly that. You know, people don't really think about it that much, but for Microsoft and other big cloud providers, when we offer a database-as-a-service solution, we manage millions and millions of database instances. And in that sense, we as Microsoft don't care about any one database instance, as you'll see here in a minute. That one database instance is not important.
I'm sure the client thinks it's important, but as you'll see here in a minute, we actually have a way to make sure that we treat our databases like cattle, yet from our customer's point of view, that database instance is always there. They can always use it. They can love it and care for it however they want. But from our point of view, managing those millions of databases, it's just another database, right? And we'll get into kind of what we mean by that. Also, what we want to do is take the learning of how we've been able to efficiently operate these millions and millions of database instances with four nines plus of availability in the cloud, with just a handful of people back in building 43 in Redmond managing the whole thing. I mean, really, we have fewer than 10 people on call at any point in time, taking care of those millions of databases. So how can we take that scale and efficiency and deliver it to customers, so they can realize those same benefits on any infrastructure, including in their own data center or even other public clouds? Yep. Okay, so let's kind of dive in here, you know. We think about applications, right? Typically applications are set up in a way where you can have many instances of an application, they're stateless, they sit behind a load balancer, your application clients connect to that load balancer, and the connections get routed to any one of these different instances of the application. And any one of those application instances could go down, the connections will be re-routed to the remaining instances, and you're fine, right? And you can also scale them out and things like that. And so in that sense, they're kind of like cattle, right? Your database is singular. That's kind of how we think about it, there's just one database. And so we think of that as being more like a pet, and we have to make sure that it's okay, right? So let's kind of think about how we can evolve this thinking.
And if we think about how cattle workloads can be characterized, right, they have certain attributes. You can sort of think about it that each cow is a duplicate of the others, right? Every cow is the same. Cows can be quickly added to the herd, and they can perform the same function as the other cows. They're not unique, they're not special in that sense. And you don't really care whether a cow dies or not, right? Because they aren't special. Now, to be clear, we didn't hurt any cows in this presentation, but we don't really care so much if just one cow dies. We have others. And the more cows you have, the more capacity you have, right? So if you think about milk cows, for example, ten cows is more than five cows, and you get double the amount of milk, right? So in that sense, as we scale our cattle workloads, we get more capacity as we add more cows. So let's think about how that might look in the sense of databases. So one of the things that we've done historically over time is we've separated out the storage from the compute. So that even if the compute goes down for some reason, we can just bring up a new instance of the compute, mount it to the storage, and keep going. We didn't really lose the data, and that's okay. And in that sense, I kind of feel like these are databases that are cow wannabes, right? They want to be a cow, but they're not really a cow because, yeah, the storage is duplicated, but the compute isn't, right? And so we're reliant on that one instance of the compute being available in order to actually access the data in the storage. You can't quickly add additional database instances this way, and you don't have additional capacity that you can scale out in this pattern. You just have the one database instance for compute, and once you max out that compute, you're done, you can't scale beyond that. And if that one compute instance dies, then we lose access to the data. So that's not really a true cow.
This is a cow wannabe pattern, right? It's okay, it's better than a true pet in that sense, but it's not quite all the way there yet, right? So when we think about how we want to treat our databases like cows, this is where, here at Microsoft in SQL Server a few releases back, we added some technology called Always On availability groups. And what that does is follow a very similar pattern to the application cow pattern, right? You have multiple database instances. Each of them has a copy of the data, and you have those database instances behind a load balancer or a router, similar to how you would have your application clients connect to your applications. And so this is more like a cow pattern, right? You have a database, and each database instance is a duplicate of the others. You get a full synchronous copy of the data across multiple instances. You can add additional database instances quickly, and they will get seeded and populated with the same copy of the data, so that they have that data and perform exactly the same function as the other database instances. And you don't care if any one database instance dies; we can promote one of the remaining database instances quickly to become the new primary, and you can just keep going. And the more database instances you have, the more you can scale out your capacity. As you see here in this diagram, we could have some applications that connect with a read-only intent, and we can balance those read-only connections across a number of database instances. It could be even more than two. And so we can actually increase our capacity by adding these additional database instances here. Okay, so this is actually interesting, because this is how, as a customer, I obviously still think of it as a pet, right? I still create my Azure SQL database there, and it's one database for me.
Like it's one database, but in the background, you are actually replicating it and creating, as you said, these multiple instances. Do I see that as a customer, or is it just a checkbox for me, basically, where, okay, I just turn on this Always On feature and... Yeah, so in our PaaS services in Azure, like Azure SQL Database or Azure SQL Managed Instance, this just happens by default behind the scenes. There's not even really an option, we just make it that way. And this is really great in PaaS services because you don't really have to think about it. But behind the scenes, this is what enabled us to do things like updating your database instances with no application downtime, because we just pull one of them at a time out of the availability group, update it, and then put it back in, right? So you always have multiple instances that are available. And this is also how we increase our resiliency, and how we can offer four nines of financially guaranteed availability, right? If one instance dies for whatever reason, the hardware dies, the database instance runtime dies, it doesn't matter, we still have others that we can fail over to and keep going. Now the question is, how do you get that same level of availability outside of our Azure PaaS services? And that's where Arc-enabled data services comes in, and we're gonna talk about that today. Yeah, I just wanted to ask, it would be great if we could have the same thing on-premises or somewhere else. Exactly. Because the Always On availability group technology has been in SQL Server for a while, and lots of enterprise customers use it. But setting it up is kind of a pain, you know. Like on the Windows side, you have to install Windows Server, then you have to configure cluster services, then you have to install SQL Server, then you have to create the Always On availability group.
And it's similar over on the Linux side, except we use Pacemaker as the cluster manager. And it's just kind of a pain, and it's different for every VM fabric, cloud, and cluster manager you have; there are so many different combinations of how you do this. And so what we wanna do is provide a better, more modern way of doing that. And that's kind of how, as we talked about here, in Azure SQL those services are highly available by design: when you provision a database instance, it's just there. It's just HA already, that kind of thing. But then what do you do if you can't do this in Azure? If you need to have your database on-premises, or you need to be in another public cloud, for example, how can I do this in a way which is consistent with how we do it in Azure, but yet I can run it on any infrastructure, right? And that's the idea. So this is where we have our Arc-enabled data services offerings that are currently in public preview. There are two services: the Azure SQL Managed Instance service, which we'll talk about today, and Azure Database for PostgreSQL Hyperscale. The SQL Managed Instance service is based on the SQL Server engine. So you can take any of the databases that you have running on SQL Server today and bring those over to run inside of an Azure SQL Managed Instance enabled by Azure Arc, on your own infrastructure, wherever you want, and get those high availability benefits and a high degree of application compatibility as you come over, so that it's a pretty seamless migration. But when you do that, not only do you get this sort of automatically provisioned high availability, but you get an always-current service: we're always automatically updating the binaries to give you the latest features and the latest updates, just like we do in our PaaS services in Azure.
You can elastically scale those instances up and down with no application downtime, using the Always On availability group technology, which we'll dive into today. And you get a unified management experience where it provides you backup and restore, you get built-in monitoring, you get built-in automatic updates, and then you can integrate with Azure services like Azure Policy, Azure Defender, Azure Backup, and Azure Monitor to be able to manage all of your database instances at scale, using Azure as your sort of aggregation point there. That's pretty cool. Again, I've heard from many, many customers exactly these reasons; that's why they love Azure SQL, right? They basically say, look, all my databases should probably run on Azure SQL. However, we have reasons why we can't run it in Azure. Like, for example, we have data sovereignty challenges, we have connectivity challenges where the latency is just too high for the application to work, or, for example, they have no connectivity at all. And I see your slide here also says they get disconnected support, so that would basically allow me to run Azure SQL in a place where I don't have connectivity to Azure. Is that true? Exactly. All these sorts of benefits here, these PaaS-like capabilities, you can run them on your own infrastructure, even without a direct connection to Azure. So really powerful stuff, and it's really kind of, as I was talking about earlier, taking all of the things that we've learned and all the technology we've built to operate things at the scale of millions of database instances, and delivering that to customers so that they can realize those same benefits on their own infrastructure. The way I like to think about it is, if you can't come to Azure, we'll bring Azure to you, right? That's what we're doing here.
And Arc-enabled data services is just one of many Azure services which are being Arc-enabled, right? We have Arc-enabled servers, Arc-enabled SQL Server, and Arc-enabled Kubernetes already in preview, and more on the way. Oh, this is awesome, and I really appreciate the work here, because, as people probably think about this, in Azure when we have a service, we obviously build it for thousands of servers, right? Like literally thousands of servers. However, for our customers we probably need to scale that down a little bit. So it's probably not just a one-to-one copy of how the backend works. It probably needs some changes and modifications and... Yes, yes. Make it actually usable by somebody, right? Absolutely. So yeah, think of it that way: it's sort of all the same capabilities, but in a way where people can realistically deploy and operate it in their own environment. And one of the key technologies that enables that is Kubernetes. And Kubernetes is really important in this strategy because it creates an abstraction layer over the underlying infrastructure and virtualization stacks. So that way, whether a customer is using something like VMware on-premises, or they're using maybe EKS on top of EC2 in AWS, it doesn't really matter, right? It doesn't matter what your virtualization engine is, doesn't matter what your storage is, doesn't matter where it is, right? As long as you have Kubernetes over the top of it all, we can consistently program to the Kubernetes API and deploy and operate these Arc-enabled data services on any infrastructure, right? So that way it's super consistent wherever you go. Yeah. Okay, that is also important. Like, I mean, many customers have probably seen this and think, hey, do I now need some sort of Microsoft version of Kubernetes to run that? Or is it something special?
Because they probably already run Kubernetes in their data center, they probably have things like OpenShift or other Kubernetes flavors. And so they're gonna be happy that they can just use their existing infrastructure for that. Yeah, virtually every enterprise I'm talking to these days is at some point in their journey towards migrating to Kubernetes, and almost everything, I think, will eventually be on Kubernetes. And I think it's a sea change sort of on the scale that we saw with virtual machines back in the early 2000s. It's a completely different pattern that a lot of customers are starting to adopt. And yeah, that flexibility to use any Kubernetes is super key. You can use our AKS service, which is a managed PaaS service in Azure. You can use AKS on Azure Stack HCI, or you can use AKS on Azure Stack Hub. Those are all great options, all top-to-bottom provided by and supported by Microsoft. But if you want to bring your own Kubernetes with OpenShift on-premises, or you want to use a managed Kubernetes service like EKS in AWS or GKE in Google Cloud, that's fine too. Now you've got that flexibility. Awesome. All right, so let's talk about a couple of different pricing tiers that we have for high availability in Azure SQL Managed Instance with Azure Arc. You can kind of think about standard HA as being sort of like the cow wannabe pattern that we saw earlier. You know, we separate out the compute and the storage, and the storage is our persistent layer for keeping state, but then if the compute goes down, there's no problem, we can bring it back. And then we've got the premium high availability, where we have the multiple instances in an availability group, and that's more of the true cow pattern, right? So let's look at a diagram and sort of flow of how this actually works.
So with standard high availability, you have our data controller here, which is kind of the brains of the operation, running inside of your Kubernetes cluster. You have some number of Kubernetes nodes here. You have a SQL Managed Instance running here, you've got some load balancer that's routing the connections into here, and your storage is out here on some persistent volume. And this could point to a SAN, or, in the case of a managed cloud service like Azure, it could be Azure Files. So you've got flexibility about where this persistent volume goes, but from the point of view of the database here, it's just writing to the local file system; Kubernetes is taking care of routing the actual file storage operations, the IO, out to this persistent volume. So in this case, if something were to happen to that pod, or even the node on which this pod was running, Kubernetes will automatically reprovision another SQL instance to another node, or maybe even the same node if it was just a pod failure, and it'll remount the persistent volume where the database files are at, and the database engine will just do the standard database recovery, as if the SQL Server process had crashed on a Windows server, for example, and it'll just keep going. And in that sense, this whole process here, as we'll see in a minute, takes about a minute or so, and you can get back up and running, no problem. All of your applications simply need to reconnect to the load balancer, and the load balancer will take care of routing the connection to the new pod that's been deployed. So I think about this as kind of the cow wannabe pattern. We separated out compute and storage, but it's not a true sort of cow pattern here. The true cow pattern comes in with the premium high availability. Here, we have multiple nodes. We have multiple SQL instances, one of which is deployed as the primary, and the others are the secondaries.
And this availability group here gets automatically provisioned for you at the time that you provision a SQL managed instance in Azure Arc. Everything is done. You just click a button and everything comes up, just like in our PaaS services. Now, in this pattern, if you have an application that's read-write, it connects into one load balancer and gets sent to the primary. If you have a read-only application, it hits a different load balancer and gets routed to the two secondaries, right? Now, in the event of a pod failure, what happens is Kubernetes will orchestrate the nomination of a new primary; it'll tell that database instance, you're the new primary, it'll update the read-write load balancer so that it now points to this primary pod, and this application can reconnect and start talking to that primary. And this is just a failover operation. This is really quick. I guess that can also happen super quickly then. Like, again, I don't know if we're speaking in seconds or... Yeah, we'll see the demo of it today. We just finished the first build of this, so it's not quite down to the level of seconds yet, but we'll see that, yeah, eventually over time, as we finish tuning this, this will get down to a small number of seconds to be able to fail over and allow your read-write applications to come back in here and reconnect. Nice, yes. Now, part of the challenge here, I'll explain in a minute what the challenges are in this space, because I wanna get deep into the tech here pretty soon. Okay, so now, what Kubernetes will also do is help us reprovision another secondary. It'll update the read-only load balancer to point to that one. This one will get reseeded so that it catches up to everything else, and you're good to go, right? So that's kind of how the availability groups work on Kubernetes. Okay, now, let's get into doing some demos, all right? Awesome. Enough slides.
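Since the applications are expected to simply reconnect through the load balancer after a failover, the client side usually just needs a small retry loop around its connection attempt. Here's a minimal sketch of such a helper; the sqlcmd invocation shown in the comment is illustrative (the load balancer address is a placeholder, not something from the demo):

```shell
# retry N CMD...: run CMD up to N times, pausing between attempts,
# so a client can ride out the few seconds an AG failover takes.
retry() {
  attempts=$1
  shift
  i=1
  while ! "$@"; do              # keep going until the command succeeds
    if [ "$i" -ge "$attempts" ]; then
      return 1                  # out of attempts: give up
    fi
    i=$((i + 1))
    sleep 1                     # brief pause before the next attempt
  done
}

# In practice the wrapped command would hit the read-write load balancer:
#   retry 30 sqlcmd -S <lb-ip>,1433 -Q "SELECT 1"
```

Real client libraries often build this in (connection retry policies), but the shape is the same: fail, wait, try the same load-balanced endpoint again.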
I like demos, that sounds like a good idea. That's what we're all here for, really, right? Okay, so I've got a couple of different environments here. The first one is a Kubernetes environment, and we're gonna log into this environment, and I'm gonna just show you how things look in it. So first I'm gonna monitor this namespace here inside of my environment. So every five seconds or so this is gonna refresh, but what we see here is we've got this set of pods that are deployed as what I mentioned as the data controller earlier. So this is kind of the brains here that provides all the management, monitoring, backup, provisioning; all those kinds of services are provided by this set of pods here, okay? Now, what I'm gonna do next is this. Going to log in, okay? So now that we're logged in using the azdata command line utility, we can start to provision a SQL instance. So I'm gonna run this command here, azdata arc sql mi create, and all I have to do is just give it a name, that's it. I don't have to specify really anything else, and off it goes. There are other options I can specify, in terms of how many cores I want or how much memory I want or whatever, but if I just take all the defaults, all I have to provide is a name, one simple command. Here you can see that the SQL instance is now coming up, and this pod right here has the SQL container in it that's gonna actually be the SQL engine. Now, when you go to deploy a highly available SQL instance, as we'll see here in a minute, all you really have to do is pass the tier parameter and set it to the business critical tier, and in that case it will deploy an Always On availability group pattern, which we'll see next, okay? But in this case, we're deploying the cow wannabe pattern. It's just a single SQL instance coming up here. You can see that we now have two out of the three pods ready to go. So let's dive in and really take a look at the details of this pod.
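For reference, the create command just described looks roughly like this. Treat the optional flag spellings as a sketch of the preview CLI (check `azdata arc sql mi create --help` for the authoritative list); the instance name is made up:

```shell
# Minimal create: only a name is required, everything else defaults.
azdata arc sql mi create --name sqldemo

# Optional sizing flags exist too; these spellings are illustrative
# of the preview CLI, not guaranteed:
azdata arc sql mi create --name sqldemo \
  --cores-request 4 \
  --memory-request 8Gi
```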
So I'm gonna run a command here, azdata arc sql mi show, and this is gonna show us the details of this instance. Now you can see down here, for example, that this is currently in the arc namespace inside of my Kubernetes cluster. We're connecting to it through a load balancer. I've requested five gigs of storage for the data and the logs, and I've got these endpoints here for the log search and metrics dashboards. And then this is the endpoint that I connect to if I want to connect to this SQL instance from an application, or from a database tool like Azure Data Studio or SQL Server Management Studio. And you can see that right now it's still in the creating state, and we're sort of at this zero-of-one ready state, okay? Now the other thing I wanna point out here is that automatically, when we deployed this database instance, we told Kubernetes to deploy a service, and a service is really that kind of load balancer that we talked about earlier that will take care of routing incoming connections to the pod wherever it's located. So my applications will always just connect to this IP address and this port number here, and Kubernetes will take care of routing it out to the pod. So even if it moves around from node to node, no problem, Kubernetes will make sure that the application connections get routed to the right place. Okay, now let's also describe this pod. I'm just gonna show you inside of here, we've got a few different containers. Here's the Fluent Bit container, and here's the SQL MI container, okay? So this is the SQL engine container running inside of there, and down here we'll see that /var/log is mounted to this persistent volume claim right here, and /var/opt, where the database files are at, is mounted to this persistent volume claim. And if we go down here now, we can see that these mounts right here are indicated.
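To poke at that service yourself, the standard kubectl commands work. The namespace and service name below are illustrative (the real names come from the `azdata arc sql mi show` output, not from this sketch):

```shell
# List services in the namespace; the SQL instance's external service
# carries the load-balanced IP and port that applications connect to.
kubectl get svc -n arc

# Drill into one service to see its type, external IP, and target port.
# "sqldemo-external-svc" is a made-up name; use the one from your list.
kubectl describe svc sqldemo-external-svc -n arc
```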
And so if we go look at this persistent volume claim here, all right, scroll down here as I get back to my notebook, we're gonna go look at this persistent volume claim in detail. So this is the persistent volume claim. A persistent volume claim basically just allows a pod, or in this case a container, to have a claim on an underlying persistent volume. And this persistent volume claim is currently bound to this persistent volume here. So now we can go and actually look at the persistent volume. You can see here that it's using an Azure disk as the underlying storage, since I'm running this in AKS in Azure. Yeah, so if I would run that on-prem, this could also be, let's say, a remote file share, or whatever you're using as your persistent storage option in your Kubernetes cluster, right? Yeah, exactly. So if you're using something like a NetApp storage device, then you could use persistent volumes on NetApp on-premises. That's the beauty of Kubernetes: there are all these different storage plugins that abstract which underlying storage you're using. All you really care about when you provision applications is that you specify your storage class, and Kubernetes takes care of the rest. It sets up all the PVs for you on whatever storage you choose. Okay, so now the kind of interesting thing about this: now we're looking at the persistent volume, and you can see that this is the disk URI, right? So if I grab this particular disk name right here and I go up into the Azure portal, we can see that disk right here. And we can see, for example, that this disk is a five gig disk, right? Just like we saw earlier when I showed you the persistent volume in the pod configuration, it is configured to be five gigs of storage, right? I can choose a bigger size if I want to, or whatever. That's what it currently is, right?
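Tracing that pod-to-claim-to-volume-to-disk chain yourself looks roughly like this. The claim and volume names below are hypothetical placeholders; on AKS, the PV's source section is where the Azure disk URI shows up:

```shell
# 1. Find the claims in the namespace ("data-sqldemo-0" is a made-up name):
kubectl get pvc -n arc

# 2. Describe the claim; its "Volume:" field names the bound persistent volume:
kubectl describe pvc data-sqldemo-0 -n arc

# 3. Describe that persistent volume; on AKS the source section includes
#    the underlying Azure disk's URI (volume name is a placeholder):
kubectl describe pv pvc-1234abcd-0000-0000-0000-000000000000
```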
Okay, so that's kind of how the storage gets mapped through. Now we can see that the SQL instance is fully up and running. We've got three out of three pods, which means that our SQL container is also up and running. So now what we're gonna do is run this command here, which is a kubectl exec. This is a command to go inside of the container, if you will, and execute the command over here to the right of this double dash. So in this case, this is a Linux container. So I'm gonna ls this directory here, which is where the database files are at inside of my container. So you can see here some familiar files, master.mdf for the master database data file, for example, right? But that's really all we've got so far, just the standard system databases inside of here. So what I'm gonna do now is pull down the AdventureWorks database from GitHub. We're essentially doing a wget here that'll pull down that database backup file. And if we go back up here now and run this ls command again, we'll see that the AdventureWorks database backup file has been downloaded to our data directory. So now that the backup file is there, we can restore it just like we would any other; sorry, this is kind of hard to navigate on this giant-size font here, but we're gonna change this here and here. Okay, so now we're going to restore that database. Just a standard restore command here. We're gonna map it over to the data directory here. And so now we're restored, right? So if I clear this result, go back up here, and run this again, we can see that we have our AdventureWorks MDF data file and LDF log file now, right? So now we can just connect to that and run a standard query. So I'm just gonna select from the person table, for example, right? So I'm just gonna run this query, and there you go. So what we've shown is that this is just SQL, right? You can connect to it and do things with it just like we would normally.
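The download-and-restore steps look roughly like this. Pod, container, endpoint, and credential names are illustrative, and the logical file names inside the backup should be verified first with `RESTORE FILELISTONLY` since they may not match what's assumed here:

```shell
# Pull the AdventureWorks backup into the data directory inside the SQL
# container (pod and container names here are made up for illustration):
kubectl exec sqldemo-0 -n arc -c arc-sqlmi -- wget \
  https://github.com/Microsoft/sql-server-samples/releases/download/adventureworks/AdventureWorks2019.bak \
  -O /var/opt/mssql/data/AdventureWorks2019.bak

# Restore it through the external endpoint with a standard RESTORE,
# moving the data and log files into the mounted data directory.
# Logical names below are assumed; check them with RESTORE FILELISTONLY.
sqlcmd -S <external-ip>,<port> -U <admin-user> -P '<password>' -Q "
RESTORE DATABASE AdventureWorks2019
FROM DISK = N'/var/opt/mssql/data/AdventureWorks2019.bak'
WITH MOVE 'AdventureWorks2019'     TO '/var/opt/mssql/data/AdventureWorks2019.mdf',
     MOVE 'AdventureWorks2019_log' TO '/var/opt/mssql/data/AdventureWorks2019_log.ldf';"
```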
And we've made some changes to the disk now, right? Since we provisioned it, we've downloaded the backup file onto that disk. We have restored the database. We've run a query against it. I can even insert data into the table and make those kinds of changes. That's all going off to that Azure disk. It's not inside of that pod, right? So now if I go and kill my pod, right, I kill that wannabe cow, that pod is gonna go away. You'll see it down here. It's gonna go down. And so it's already gone down. It's coming back up now. So you can see that my timer here is at 10 seconds. So while this pod is coming up, and this takes about a minute, right, let's ponder what's happening. The pod containing the SQL container and the two agents that are responsible for managing it was deleted, right? So that means that all three of those containers were gone, right? But Kubernetes is designed to be resilient. It's a desired state system. So it has a stateful set back behind this that is designed to keep this at the desired state. And the state here, in this case, means that this pod should exist, right? And so Kubernetes will keep trying to make this pod happen, even if it gets deleted for some reason. And so that's what's happening here behind the scenes: Kubernetes is the one that's coming in here and kicking this off automatically, re-provisioning this pod to get it back to the desired state. Yeah, so that's awesome. So that means, again, we just make sure that I still have it, even if someone would go in and accidentally delete it, or something would crash. I mean, potentially I would also deploy that on different physical machines in the end, on the underlying fabric, to make sure that when one physical machine crashes, this would have basically the same effect. And then Kubernetes would go and say, hey, okay, then I provision some new stuff here. And that also doesn't have an impact on the files, right?
So the volume you showed earlier, my guess is I could still just access these files and see them there, on that volume. That's right, it's just an Azure disk. You could mount it to another VM and browse that file system if you wanted to, for example. So in that sense, yeah, the files are not gone. They're still there, right? And so if I go back up here and run this query again, you'll see that it'll just connect back to the database and run the same query we did before. The database is still there. That AdventureWorks database that we restored, we didn't lose it, right? So that's how the wannabe cow pattern works, right? Now, this is pretty good, right? I mean, it took about, what, 75 seconds to get this database instance back up and running. That's okay. That's maybe not as fast as we might like it, but it's pretty good. I think many customers would already be happy with that; in some scenarios, they would be very happy if they could do that, right? I mean, that's already sufficient for many scenarios. Of course, I probably also wanna see now the scenarios where this is not good enough. So for those scenarios, think about how you get to the true cow pattern, where you can immediately fail over to a hot standby database instance that has a full copy of the data. So theoretically, we should be able to fail over faster, right? And also, this doesn't help us scale out. It's just one database instance, so there's no read scale here like you get with a true cow pattern with the Always On availability group, right? So we're not quite to cow here yet, but we're getting pretty close. All right, so now let's go take a look at the sort of true cow pattern, if you will. We're gonna set up this environment here. This is a different environment. And in the near future, when we put this out into the public preview, this is how you'll deploy a business critical SQL managed instance. 
Same thing as we saw before, except there are these additional parameters here where instead of the default general purpose tier, you choose the business critical pricing tier, and you specify the number of read replicas that you wanna have. And we'll take care of provisioning the Always On availability group for you. So you really don't have to think about it like you do when you're trying to set it up in a Windows or Linux Pacemaker environment; it just automatically happens for you. Okay, now when you do this, I've already got one set up because it takes a little while to deploy it, but let's kind of look at what this looks like here. So same as we saw before, except, you know, here's all the management pods and everything for the data controller. But now this one instance that we deployed, SQL test, has three pods, each of which has these four containers, one of which is the SQL engine container inside of the pod. So this is how we get to true cow, right? As we get multiple of these things going, each of these has a full copy of the data, and that's how we can manage things. Now, I wanna show you, for, you know, SQL Server DBAs or experienced users, sorry, I'm on a 4K experience, I can't scale this, but those of you that are familiar with this, you're familiar with the SQL Server Management Studio Always On availability group dashboard. Here we've got our availability group. It's in a healthy state. We can see SQL test zero, one and two, and one of them, the top one here, SQL test one, is the primary, and it's set up for synchronous replication. So, you know, for those of you that are familiar with Always On availability groups and how you would manage those inside of SQL Server Management Studio, this is exactly the same, right? And in this case, I'm connected to Azure SQL Managed Instance enabled by Azure Arc on Kubernetes, but the experience is the same, and that's important for those that are coming from a SQL Server background. 
We want them to have a familiar user experience as they come into this new way of doing things. Okay, so let's go ahead and log in with azdata. This is gonna log me into the data controller, and now I'm gonna do the same show command that we saw previously so we can get some details about this instance. In this case, you'll notice that there's something a little bit different here: it says that the replicas is three, right? Because we specified we want three replicas. And here I'm using local storage. Now, this is a really important thing to understand, because with the wannabe cow pattern, you have to use remote storage, because you only have one copy of the data, right? And so in that sense, you don't want it to be on the same machine as your database engine, because if that entire machine goes offline, there's no way to recover, right? Now, does that mean, of course, that I could use very, very fast local storage here? Is that another benefit of that? Okay. Because remote storage is pretty fast, but local storage is the fastest, right? It'll give you the best performance. And so one of the really big benefits of using Always On availability groups is you can create copies of the data. Each database instance on each physical machine has a full copy of the data. And so even if you lost an entire physical machine, it's not a problem, because you have multiple copies of the data. So this is really important. That's why this is a feature that's available in the business critical pricing tier: it really gives you an opportunity to have the maximum storage performance, which as we know for databases is super important. Okay, now, what I wanna do here is bring up our metrics dashboard, because this will help us keep an eye on what's going on. So let's go ahead and bring this up here. 
So the metrics dashboard, the way this works is we collect all of the monitoring metrics about a given database instance into an InfluxDB database. And then we chart things like transactions per second, batch requests per second, wait statistics and so on over time so that you can see sort of the performance of each of your database instances. And in this case, we can see SQL test zero, one and two. These are each of the database instances, and we can sort of pivot back and forth between each of these database instances and see what's going on. All right? Yeah. So this is all built in. This is one of those examples of the built-in management services that come as part of Arc-enabled data services. Okay, so now, over here, I'm gonna run this command, which is going to execute this transaction. This is connecting to the database instance at the IP address that is the read-write connection, right? And it's connecting using a password, and it's executing the T-SQL that's contained in this Transact-SQL file here. So what it does is it just looks for a table to exist. If it exists, it drops it, and then it creates the table, and then it does a bunch of data inserts in there and kind of does some calculations to basically put some load on the system. And now that we've kicked this off, you can see that there's a spike in the transactions per second happening up here. Yeah, nice. Okay, and so, yeah, this will just keep running, right? So this takes a few seconds to run each time, plus there's a one-second wait built in here. And this will just keep running over and over again to simulate some load on the system. And in this case, read-write load, right? It's not just read-only load. Now, over here, what I'm gonna do is keep an eye on this environment. And you can see here that SQL test zero, one and two are all running currently, and this will just keep refreshing every five seconds. 
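The load script described above (check for the table, drop it if it exists, recreate it, insert a bunch of rows, wait, repeat) can be sketched roughly like this. This is a hypothetical stand-in, not the actual demo script: Python's built-in sqlite3 plays the role of a connection to the managed instance's read-write endpoint, and the table and column names are made up.

```python
import sqlite3
import time

# sqlite3 stands in for a connection to the read-write endpoint of the
# managed instance; the real demo runs a T-SQL file over that endpoint.
conn = sqlite3.connect(":memory:")

def run_load_iteration(conn, rows=100):
    """One pass of the load loop: drop/recreate the table and insert rows."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS load_test")           # drop if it exists
    cur.execute("CREATE TABLE load_test (id INTEGER, val REAL)")
    cur.executemany(
        "INSERT INTO load_test VALUES (?, ?)",
        [(i, i * 1.5) for i in range(rows)],                # bulk inserts = write load
    )
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM load_test").fetchone()[0]

for _ in range(3):                 # the real script loops indefinitely
    inserted = run_load_iteration(conn)
    time.sleep(0.01)               # the demo uses a one-second wait between passes

print(inserted)                    # 100
```

Because every pass issues writes, the traffic has to land on the primary replica, which is why the transactions-per-second chart for the primary spikes while the script runs.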
And over here, we're going to run another command, which every couple of seconds is gonna run a query and tell us which database instance it's connected to. Right now it's connected to SQL test zero. And right up here, we're monitoring SQL test zero, and the transactions per second is happening there. All right, so now, going back to our notebook here, I wanna show you how this works. Actually, let's do this. I'm gonna delete this pod and explain a couple of things while that's happening. So we're gonna delete SQL test zero. That's the current primary that we can see here, right? So we're gonna delete that. Oh, no, right? You can see up here, in a second here, the transactions per second will drop off. You see the connection drop, right? Because that database instance, the current primary, is not available currently. Over here, this connection query is not working either. And over here, you can see SQL test zero is back to a 3/4 state, and it's 24 seconds, right? So this is now in the process of redeploying that pod. The agent containers that are inside of here, these three, have already come up and they're good, right? So now we're just waiting on the SQL container inside of this pod to also be available. And currently that takes a couple of minutes. So I'm just gonna explain a couple of things about how this all works while that happens. As I mentioned earlier, we'll improve this over time. We just finished this. So this is hot off the press. We haven't even released this yet. So this is a preview of things, but let me explain how this works. So in Kubernetes, there's this concept called a config map. A config map, you can kind of think of it as almost like a mini database storing configuration in a way that's accessible by services and pods and things like that that are running inside of Kubernetes. So it's kind of a way to share state, if you will. 
And this is backed by an etcd database, typically, in Kubernetes for high availability. So in our case, we've got this config map here. Each of the instances has its own config map that stores the configuration about that instance. And then we've got this contained availability group role map here. And this is the one that's sort of the magic. Let me show you what this one looks like. So inside of this config map here, this is the key. We have this primary replica right here, and it says SQL test zero. So as of right now, the SQL instances are sort of coordinating amongst themselves, talking to each other like, hey, are you available? Are you available? And currently, before we started this whole thing, SQL test zero was sort of nominated as being the leader of this replica set. And so all read-write traffic was being routed to that instance. And you can see over here that the lease duration is 30 seconds. What this means is that, as soon as we detect a lack of availability on the primary, we don't necessarily want to immediately fail over, like within a second of us seeing that, because that'll sometimes create too much failover activity just because there's a little blip on the network. We don't want to do something like that. So that's where the lease duration comes in. Basically what this says is, hey, if the primary is not available for 30 seconds, then we're going to fail over. So basically this guy gets a lease on being the primary for 30 seconds. But if he doesn't re-establish himself as the primary within those 30 seconds by heartbeating, then the other instances are going to take over and say, okay, you're no longer the primary. We're going to vote amongst ourselves and decide who's going to be the new primary and promote that one. And we'll see that happen here in a few minutes, or probably a couple of seconds more. So this is kind of how things are orchestrated. Very simple. 
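A toy version of that lease mechanism can be sketched in Python. This is a simplified illustration of the idea, not the actual Arc implementation: the config map is modeled as a plain dict, and the heartbeat timestamps and election rule (lowest-named surviving replica wins) are assumptions made up for the example.

```python
# Toy model of lease-based primary election over a shared config map.
# The real system stores this state in a Kubernetes config map backed by etcd.

LEASE_DURATION = 30  # seconds; matches the value shown in the demo

config_map = {"primaryReplica": "sql-test-0"}

def maybe_failover(config_map, last_heartbeat, now, replicas):
    """Fail over only if the primary's lease has expired."""
    primary = config_map["primaryReplica"]
    if now - last_heartbeat[primary] <= LEASE_DURATION:
        return primary  # primary renewed its lease in time; no failover
    # Lease expired: the remaining replicas elect a new primary.
    # (Election rule here is a made-up stand-in: lowest-named survivor wins.)
    candidates = [r for r in replicas if r != primary]
    config_map["primaryReplica"] = min(candidates)
    return config_map["primaryReplica"]

replicas = ["sql-test-0", "sql-test-1", "sql-test-2"]

# A short network blip (10 s without a heartbeat) does NOT trigger failover.
hb = {"sql-test-0": 0, "sql-test-1": 10, "sql-test-2": 10}
print(maybe_failover(config_map, hb, now=10, replicas=replicas))   # sql-test-0

# The primary is gone for longer than the 30 s lease: a new primary is elected.
print(maybe_failover(config_map, hb, now=45, replicas=replicas))   # sql-test-1
```

The lease is what separates "a blip we ride out" from "a failure we act on": failover happens only after a full lease window passes without a heartbeat from the current primary.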
If you think about it, right? It's just this little config map that keeps track of this information here. And this is also how in the future we'll be able to enable manual failover. So if somebody wants to come in and trigger a manual failover to happen, they can do that. So you can see now that the queries have come back online. Up here we can see our activity showing up on the chart, and these queries are being successful again here. And over here we can see that SQL test two is now the database instance that we're connecting to. So this was automatically rerouted. We can see that SQL test two is now at 4/4. Everything's running fine. And so this is how we can truly do cattle, right? Because it meets all the criteria, right? We don't care about any one database instance. We can scale out and add additional cow DBs that do the same thing, right? And get that scale-out performance for read connections as well. So hopefully this was a good tour of what's coming with Always On availability groups in Arc-enabled data services, and it helps people see a new way of operating your databases. Yeah, no, definitely, it's super interesting. Especially, I like the cow thing, how you use that to explain this. So for me, the wannabe cow thing is basically like you have one cow, and if that cow dies or something, you just get a new cow which looks exactly the same and does exactly the same. But obviously that takes a little bit of time. And here you basically say, okay, I have three cows, and the advantage is they also produce more milk. But if one dies, you still have two other ones, right? And then I would still replace the one, but during the time I need to replace it, the two other cows would still produce milk and still work. So it really works well to explain this that way. I love how these things are working. 
And as you said, it also reminds me a little bit, I know I probably shouldn't say that, of some things in Windows failover clustering, where it sort of addressed the same challenges in a way, right? The same challenges we had there: to make sure that if one node fails in a cluster, we fail it over. And then I remember, I think we also had a value where we said, hey, how fast should we decide to fail it over? Maybe, again, there's just a flickering network, as you said, or maybe just something didn't answer, like these 30 seconds you mentioned there. Yeah, so, pretty cool stuff. It's actually very similar to Windows Server cluster services. It's just a different way of doing it, using Kubernetes. And the thing about that is that it's just consistent everywhere. You can deploy it on any infrastructure, going back to our beginning here, that this is really about embracing hybrid and bringing Azure to you. This is how we do things in Azure, and now you're gonna be able to do it the same way in your own environment. And what I also love, I mean, as you said, it's an Azure service and it already has a lot of stuff built in which I don't need to take care of, right? As you said, like availability groups and all that stuff. I remember setting that up for a bunch of systems and the clusters and stuff like that in the past. And I needed to do that to get the availability, but actually I didn't want to do it because, as you said, it was a little bit of a pain. It's some extra work I don't want. If I can just get it by setting a switch like you showed, the tier, to business critical, that is obviously way easier than setting up the whole thing. And then also the monitoring tools, like the InfluxDB built in and the Grafana dashboard. 
I also like that very much, that I don't have to set up a monitoring solution for all the things. I think our systems folks can really appreciate that, right? It's just there. You don't have to go set it up and everything, right? Yeah. And it works everywhere in the same way. So yeah, absolutely. It's a new way of doing things. No, that's pretty cool. I really, really love this. And I also appreciate it very much, by the way. I want to highlight this again: this is a super early version you just showed us. The BC tier is not yet something you can try out, right? It's something really, really new. And I always appreciate when people show me such awesome new stuff and technology. Absolutely. So, I now obviously have a lot of time after recording all these videos. I definitely want to go out and try these things. So I heard that Azure Arc enabled data services is in public preview, so I can go and try it out, right? Yeah, exactly. So the wannabe cow pattern we saw today, you can go try that out right now. The full cow pattern will be available very soon; I'm hoping by the end of February we'll have that out in the public preview release. So that's coming, but yeah, go try out everything else that's available right now, with both Azure SQL Managed Instance as well as PostgreSQL Hyperscale. Awesome. And so if I want to learn more about this, because I really want to learn more, do you have a couple of resources to share with us? I've got a couple on the screen here. There's a lot, right? So Azure Arc is a really big thing across all of the different Azure teams, right? And so there's a lot to go learn here. So lots of links here. Feel free to dive in, especially on the right column here. I'll just make a plug for data services because that's my thing, but definitely go try that out and feel free to reach out to me as well. 
Just shoot me an email, hit me up on Twitter. Always happy to have a chat with folks and get some feedback on how things are going, or help people. Awesome. Thank you very much, Travis. It was a pleasure to have you on this call. I learned a lot. I always like these sessions because I get a lot of information out of them. I hope all our viewers also learned a lot. And if you want to learn more about Azure Arc and about other hybrid technologies we have at Microsoft, we have some great sessions on Azure Stack HCI and on AKS on Azure Stack HCI, if you want to dive more into the Kubernetes side of things. We also have some great sessions on Azure Arc enabled servers, so how you can actually manage your servers in a hybrid environment. And I hope, again, this session was really, really helpful. Travis showed you a couple of different things, like how you can actually take the Azure SQL services and bring them to basically every platform. It doesn't matter if you run that on-premises in your data center, at your branch office or at your retail store, or even at other cloud providers. So that is really, really great. And then obviously how the management and the whole plumbing behind it works, and how we get all that resiliency. Now, thank you very much again, Travis. If you want to watch more of these sessions, go to aka.ms/ITOpsTalks. That's where you'll find more of these sessions and all of the videos we have prepared for you. Thank you very much. All right, sounds good. Thanks everyone. Thanks, Thomas.