This is Trove in the Real World: adventures in running production code on upstream code. My name's Andrew Conrad, I'm an engineering manager with HP, working on the Helion Dev Platform, all of our platform services. I'll let Nikhil introduce himself. Hi, my name is Nikhil Manchanda, and I'm the project technical lead for Trove for Kilo. I also work at HP on the Helion Developer Platform.

So we'll jump right in, here's our agenda today. We're gonna talk about the real world, and that is explicitly not DevStack; we're gonna be very clear about that. We're gonna talk about disasters, which seems like a pretty open-ended topic, but we're gonna hit on disasters a little bit. We're gonna talk about how to use stable upstream Juno Trove code to survive disasters. Nikhil will talk a little bit about Trove in Kilo and beyond. We're gonna show a bunch of code today, and he's gonna talk about how this code gets upstream, or his thoughts on how the code gets upstream. All the code samples we show today are gonna be available on GitHub, and at the end we'll tell you where that's available. We'll have a bunch of follow-up to this conversation, and we'll do Q&A. For any real-time discussion or comments about the talk, we have a Twitter account you can post to: it's at Helion Developer. We may show some of this up on the screen, so please write something your grandmother would be proud of, because we may be displaying it.

Okay, so let's, whoops. Did we drop a slide? I don't think so, I'm not sure. Well, let's just go right into it. So, we had a slide here that talked a little bit about DevStack versus the real world. One thing we always say is that everything just works in DevStack, and we know why that is: it's a single process, it's on a single box, you don't have any hardware failures. Everything always works. Keele is one of the engineers on one of my engineering teams, and one of my favorite quotes from him is, quote, "But it worked in DevStack, right? On my box." So we're gonna try to get away from that today, and we're gonna talk a lot about the real world, where everything fails.

Failure. Okay, so there is a slide here, just pretend it's here. The real world is all about embracing failure. Failure is not a corner case, it's the common case. Everything fails all the time; you have to assume that. Basically, when we build Trove and some of the other platform services, that's what we're thinking about most of the time. You're not thinking about the perfect case, you're thinking about the failure case. So, a little thing about bad things happening all the time. In the database world specifically, this is probably not a surprise to most people: databases crash, network partitions happen, VMs crash, hardware fails. Occasionally a tornado may hit your data center; it's a rarity, but it is a possibility. Two rules about databases: the data is always durable, the data must survive everything; and the data store must always be available. So this is availability and durability, the two keys in Trove and in most database services, including what we're shipping with HP Helion.

So, we want you to help us create a disaster today. We're gonna have some audience participation here; we want you to tweet. And there are two options here. We realize we're in the city of love, the city of light, Paris.
So, the first option is: please use the hashtag "I luv Trove in Paris", and specifically you have to spell it L-U-V. We're in Paris, so I'm trying to speak a little bit of French. That's the nice option. For the ones in here who are a little more aggressive, a little more, I don't know what the right word is, we have another option: "blow up Trove in Paris". What's gonna happen is, depending on which hashtag gets the most tweets, we're actually gonna bring down our master database that Nikhil's gonna nicely set up in a second. We're actually gonna destroy this thing and we need your help to do it. So I'm gonna hand it off to Nikhil now. He's gonna talk a little bit about the code that was put into Juno for Trove, and it will be part of our demo today. Excuse me.

Thank you, Andy. So, before we dive into the demo, I just wanna give folks a bit of background on the MySQL asynchronous replication support that went into Juno, which forms the backbone of how we're going to do this failover. Trove has support for async replication, as a few of you already know, and the way you use it is through the Trove create API: it now takes an optional parameter called replica_of, you just pass it the master's UUID, and then Trove goes ahead, takes a snapshot of your master, provisions a slave based on that, and sets up replication between the two. So that forms the basis for having another instance with the same data that we can fail over to. And then, when something bad happens to the master, a manual detach of the slave needs to happen so that you can promote it to be its own master. To do that we have the Trove update API, where you can call the slave and say, hey, update yourself and detach yourself from the replica source, because I want you to not be read-only and to be a new master.

So, I'm going to quickly walk through the demo. Over here, this is a Trove box. Unfortunately I couldn't get this running on the cluster, so Andy, it's gonna have to be DevStack over here. It works in DevStack. But since it's DevStack and everything works perfectly well, we're gonna have to simulate the failure. Just as a reminder, tweet "I luv Trove in Paris" or "blow up Trove in Paris", and we'll use that to create the disaster. We did try all these scripts on our real production systems before we came here, but for the sake of the demo we decided to do it on DevStack. Okay, so I'm going to go ahead. There's a master DB. It has an application running; it's a very simple application, it just shows the OpenStack services. And I'm going to create a slave based on that. Just trying to make sure, sorry, one second, that the screen here is showing something different from the screen there, so we're on the same page and you guys can see what I'm doing. So, I'm going to create a Trove slave; it's going to be a replica of this master. I'm just going to kick off the replica create over there, and that will take a few minutes to build, and as you can see, Trove is taking a snapshot of the master and the slave is starting to build. Now, Nikhil, did you create the slave in a different AZ when you did that? Oh, no, excuse me, this is DevStack.
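The two API calls Nikhil just described, a create with replica_of and then an update to detach, boil down to roughly the following with python-troveclient. This is a minimal sketch: the keyword names (replica_of on create, detach_replica_source on edit), the client constructor arguments, and the flavor and volume values are as we recall them from the Juno-era client, so verify them against your installed version.

```python
from troveclient.v1 import client

# Illustrative credentials; authenticate however your cloud's Keystone expects.
trove = client.Client('demo', 'secret', project_id='demo',
                      auth_url='http://keystone.example.com:5000/v2.0')

MASTER_ID = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'  # UUID of the existing master

# Create a replica: Trove snapshots the master, builds a new instance from
# that snapshot, and sets up MySQL async replication between the two.
replica = trove.instances.create(
    name='masterdb-replica-1',
    flavor_id='7',                 # illustrative flavor
    volume={'size': 10},           # illustrative volume size, in GB
    replica_of=MASTER_ID)

# Later, when the master dies, detach the replica from its replication
# source so it stops being read-only and becomes its own (new) master.
trove.instances.edit(replica.id, detach_replica_source=True)
```

After the edit call, the promoted instance accepts writes; nothing on the old master is touched, which matters for the postmortem discussion later on.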
In production, what you'd actually want to do is create the slave in a completely different AZ, because your AZs might share a power source, and if one power source goes down, you still want your slave instance to be up so you can fail over to it. In this case I don't have AZ support for the demo, but in the real world you'd absolutely want to create the slave in a different AZ.

I want to give you guys a bit of background on how the failover is actually going to happen. If you notice here, when I did the create, the hostname that came back is actually a DNS hostname. We're using Designate under the covers here; Designate is an OpenStack incubated project now. So we're going to use DNS to do the failover, and the application is going to talk to the master using DNS. I just noticed that resolv.conf is not pointing at the right name server, so I'm going to go ahead and correct that. Also, it might be better if more of you can see this. Since this is local DNS, I'm just going to fix it locally. This is a disaster we didn't plan. I just want to make sure that the application is actually up and running. There we go. So DNS is working, Designate is working as it should, and the application is talking to the master using the DNS hostname. And since the DNS resolution is actually going to the local box, things are working. So, let's check on the status of our slave quickly. The slave is still building; it takes a couple of minutes. So this up here is the slave's DNS name, and if you look at the same thing on the master, you'll see the DNS name that we provisioned for the master.

Okay, so let's go back to the presentation and talk about what's actually happening under the covers. When Trove is setting up your instance, Trove has built-in support for this: the Trove guest agent that gets provisioned on the instance sends heartbeats back to the Trove control plane through the Trove conductor, and that heartbeat reflects the status of the data store on the instance. If it's MySQL, it checks port 3306, makes sure mysqld is running, and then says, okay, you're active, people can actually connect to you. So we use those heartbeats as the means to figure out that the instance is active, and we automatically monitor them. And there are really two cases here for what failure looks like. You can consider it a failure if the guest agent is actively monitoring MySQL and sending back a heartbeat saying, hey, I can't actually talk to the data store, the data store's not up. Or, as Andy mentioned, the VM has crashed, so there might not be anything on the guest agent's side to send that heartbeat back at all; in that case, the absence of a heartbeat, after some threshold, is also something you'd want to characterize as a failure and use to trigger a failover. So the idea here is: monitor heartbeats, and if you get a bad heartbeat, or if you get no heartbeat, then that's a failure.
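That "bad heartbeat or no heartbeat after some threshold" rule can be boiled down to a few lines. Here is a minimal, self-contained sketch of the decision logic; the threshold values and the idea of tracking the last healthy heartbeat separately from the last heartbeat of any kind are illustrative, not the exact code from the demo.

```python
from datetime import datetime, timedelta

# Illustrative thresholds -- a point made later in the talk is that these
# should really be operator-tunable knobs.
BAD_HEARTBEAT_TIMEOUT = timedelta(minutes=5)    # guest reports the datastore is down
NO_HEARTBEAT_TIMEOUT = timedelta(minutes=15)    # guest has gone completely silent


def needs_failover(last_heartbeat, last_healthy_heartbeat, now=None):
    """Decide whether a master should be failed over.

    last_heartbeat         -- when we last heard anything from the guest agent
    last_healthy_heartbeat -- when the guest last reported the datastore healthy
    """
    now = now or datetime.utcnow()
    if now - last_heartbeat > NO_HEARTBEAT_TIMEOUT:
        # No heartbeat at all: maybe the VM crashed, maybe it's a network
        # partition, so we wait longer before pulling the trigger.
        return True
    if now - last_healthy_heartbeat > BAD_HEARTBEAT_TIMEOUT:
        # Heartbeats keep arriving but they keep saying the datastore is
        # down: fail over on a shorter fuse.
        return True
    return False
```

For example, a master whose guest agent went silent twenty minutes ago returns True here, which is the crashed-VM case just described.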
Disaster time, why don't you... So, should we make sure the slave came up before we do the test? And for your demo, you're monitoring the master; you could be monitoring both the master and the slave. Right, so right now for this demo we're monitoring heartbeats on the master, and I'll walk through the code that we're actually using to do this. It's an extension of the Trove task manager; the way we've written Trove, it's pluggable, so you can extend the current task manager and make it do other things. And when we walk through the code, it's all using standard OpenStack client APIs. The reason I'm extending the task manager is that it has a periodic task mechanism you can leverage to do this: you extend the current manager with some new functionality and plug it right back in. So, it looks like our slave came up, so, in that case...

So, you wanna create the disaster? Hopefully you've tweeted by now, either "I luv Trove in Paris" or "blow up Trove in Paris", so let's go see what kind of audience you are. This is a little script that Andy wrote that basically polls the Twitter stream and shows the results for "I luv Trove in Paris" versus "blow up Trove in Paris". It looks like we have a few people who have tweeted "I luv Trove in Paris", but, oh, there are only three for "blow up Trove in Paris". Come on guys, you can do better than that. Looks like there are more people falling for the beauty of Paris, the city of love. Once you hit it one more time, we'll see if we get any results. Okay, are there any? We're not talking about database consistency today; Twitter is not the most consistent database in the world, so you get some random results. Oh, there you have a few more, all right. Oh, it's a tie, this is a hard one. Why don't you show how we're gonna blow up the database, and then we'll come back and check one more time.

Okay, so basically, to blow up the database, what we're gonna do is go ahead and stop MySQL on the database, crash it, and so the guest agent, which is still running, will send back a "shutdown" heartbeat saying MySQL is shut down, it's not coming back up, I don't know what to do, this is my status. And, as with most failures, excuse me, most failures don't come with what we've built here, which is a big red button that I'm gonna go ahead and press, and that's going to cause the database to fail. So, I'm tending towards "I luv Trove in Paris". We'll run it one more time. Okay, let's see if that's what the audience is feeling. Uh-oh. Oh, let's see, so what's our final, whoa, whoa. Oh, wow. How did that happen? It's that kind of audience. All right, all right. Well, we'll have to fall back to plan B: blow up Trove in Paris. I'm not entirely sure. Okay, let me go hit the button and close my eyes. Andy, tell me if this goes well. Uh-oh. Wow. Okay, let's see what happened here. Uh-oh. Well, it's a heartbeat, so it happens on a schedule, but our application should actually have been taken down at this point, right? The Trove heartbeats happen every two minutes, so hopefully, there we go. You can see the master DB right now: shutdown, can't reach MySQL. The heartbeat has come back saying it can't reach MySQL at all, so this is the case where the failure scenario should kick in. Let's go look at our monitoring task manager, which I'll shortly show you the code for, to see what's going on under the covers.
Oh, and look at that, it's not able to connect remotely to the MySQL master, so it should have gone through and actually kicked off the failover code. Let's see what happened. Oh, our application's back up. If we look at the state of the Trove instances here, our master is shut down, our application is talking to the slave, and if you look at the slave show, a very interesting thing has happened: the hostname of the slave has now been switched out, and that's actually the hostname of the master. So under the covers, we're using DNS to do this failover. I've always been wanting to say this in Paris, so: et voila.

Let me quickly go back to the slides. So that was the failover demo, and I promised I'd walk you through some of this code that we have. If you have questions, go ahead. Yes, you do, so there are two ways of doing this. You can use DNS; the way we've set it up, we have the DNS servers and we're talking directly to that particular DNS server, so we don't have to deal with DNS replication, and the TTLs are set really low so that if something happens and you're using it for failover, your host is not caching the DNS record and going to the old host. You absolutely have to deal with that. The other way of doing this is floating IPs: you give the master and the slave public IPs and then switch the IPs over when you do the failover. Again, it depends on what you want to go with; Trove has APIs that support both, so if your DNS infrastructure allows it, you can go with DNS, or else with floating IPs.

Any other questions? Just feel free to raise your hand. Yes, so Trove is monitoring the database. Standard Trove today in Juno monitors the database based on the heartbeats, so it sends back a status saying "I'm up and running" or "my data store is shut down" (that's the shutdown status you guys saw). But we've extended the regular Trove task manager to also do this monitoring part, and that's the extension I'm showing you here. So this is the monitoring manager, and we've configured the Trove task manager module to use it. It basically just extends the task manager, and the crux of the code doing the failover is this part, monitor_ha, a periodic task that goes ahead and monitors Trove instances. First it finds the list of master instances that have slaves, the ones it wants to watch for failure scenarios, and once it's done that, there's a condition for when it should engage failover. Like I mentioned earlier, there are really two scenarios: either the instance has actually reported a failure, or it's still reporting active. In the case where it's failed, you know something bad is happening, so your timeout there is probably shorter; but in the case where it's reportedly active and you just haven't heard from it, haven't had a heartbeat, you might want a different timeout, just because it might be a network issue or latency or something like that keeping the heartbeats from getting across. So that would have detected the case when the VM just dropped. All right. Would it have? It wouldn't have made for a very good demo, though, because the timeout I have it set at is 15 minutes. Yeah, fair enough. Right.
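The monitoring manager itself is a few hundred lines, but its shape is roughly the sketch below. To be clear, this is an illustration rather than the code from the demo: the module paths are the Juno-era ones as best we recall them, and find_masters_with_replicas, do_failover, and the last_*_heartbeat attributes are stand-ins for the real helpers in the GitHub samples.

```python
from trove.openstack.common import periodic_task   # Juno-era module path
from trove.taskmanager import manager as task_manager


def find_masters_with_replicas(context):
    """Stand-in: the real code asks Trove for all masters that have at least
    one replica attached -- those are the instances worth watching."""
    return []


def do_failover(context, master):
    """Stand-in: detach the replica and repoint DNS (see the failover sketch
    a little further down)."""


class MonitoringManager(task_manager.Manager):
    """Drop-in replacement for the stock task manager that also watches
    replication masters and kicks off a failover when one goes dark."""

    @periodic_task.periodic_task(spacing=120)  # roughly the heartbeat interval
    def monitor_ha(self, context):
        for master in find_masters_with_replicas(context):
            # The decision is essentially the needs_failover() logic sketched
            # earlier: short fuse for a "datastore is down" heartbeat, longer
            # fuse for silence.  The heartbeat attributes are illustrative.
            if needs_failover(master.last_heartbeat,
                              master.last_healthy_heartbeat):
                do_failover(context, master)
```

You then point trove-taskmanager at the new class through configuration; in our deployment that was the task manager's manager-class option in trove-taskmanager.conf, but check your release for the exact option name.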
So we'd be waiting here for a long time before it would have detected that the VM had crashed under us, but yes. So your script basically detects that something bad happened, doesn't ask any questions, and just switches over. Right. So, coming to the meat of this script, which is actually doing the switchover part: we've identified that failover needs to happen. This part just goes ahead and authenticates a bunch of OpenStack clients, not very interesting. This was that failover API that I showed earlier, where you call instances edit and detach the replica source; basically at that point your slave is no longer a slave getting data from the master, it's its own master, so that's the equivalent of promotion. And then this part is talking to Designate to actually switch the DNS records underneath, so that all your applications now talk to your slave instead of going to the master. There's also one more step after that; all of these functions are up here, and this code will be posted on GitHub if you want to see what's going on. We want to update Trove to make sure the records in Trove are consistent with the Designate records, so we go ahead and say the master has this new DNS name and the slave has this new DNS name.

I think we have a question back here. Oh, sorry, the lights are bright so I might not always see your hand. That's a great question. These are some of the questions that we're actually coming up with in the Trove design sessions, because what we're finding is that it's really up to the deployment, the environment, and the sort of applications running on it. There are a lot of people who say, hey, I can wait longer; if it's just a temporary network outage that will fix itself in five minutes, that's okay for my application. But there are other people who are more sensitive and say, the very first moment I know it's starting to fail, I want it out of there. What we're finding more and more is that we really need to provide the knobs rather than one one-size-fits-all solution; we need to build in ways for those monitoring controls to be tweaked in Trove.

So, now that failover's done and you're talking to the slave, are we done? Well, not really. There's some other stuff that needs to happen, and the script can be extended to do this today using the standard tooling. You've promoted your slave to master, you've used DNS or floating IPs to fail over, but a reset also needs to happen: your new master now doesn't have a slave attached to it, so if this were to happen again, you're leaving yourself vulnerable. Your failover strategy could also consist of spinning up a new Trove instance, just as I showed you, as a slave of the new master. And then the other good thing is, I know at least the team in Seattle that I work with is very DevOps-oriented, and the old master is still around, so you can go SSH in and do whatever you need to do to figure out why it went down in the first place. It's still around for postmortem debugging and analysis. I'm not a big DevOps guy, and I don't consider that a party, but you should try it sometime, Andy. It becomes a way of life.
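For reference, the promotion-plus-DNS-swap walkthrough above reduces to something like the sketch below. It assumes an already-authenticated python-troveclient v1 client and a Designate v1 client built elsewhere, and the record-handling calls follow the v1 designateclient API as we remember it, so treat the details as assumptions rather than the exact GitHub sample.

```python
def failover(trove, dns, domain_id, master_hostname, replica_hostname, replica_id):
    """Promote a replica and repoint the master's DNS name at it.

    trove -- an authenticated troveclient.v1 client
    dns   -- an authenticated designateclient v1 client
    The hostnames are the Designate-managed FQDNs of the two instances.
    """
    # 1. Detach the replica from its dead master so it leaves read-only mode
    #    and becomes a standalone master (the "instances edit" call above).
    trove.instances.edit(replica_id, detach_replica_source=True)

    # 2. Repoint the master's A record at the replica's address.  With a low
    #    TTL, applications resolving the old name pick up the change quickly.
    a_records = {r.name: r for r in dns.records.list(domain_id) if r.type == 'A'}
    master_record = a_records[master_hostname]
    master_record.data = a_records[replica_hostname].data
    dns.records.update(domain_id, master_record)

    # 3. (Omitted here) the full sample also updates Trove's own records so
    #    the hostnames Trove reports stay consistent with Designate.
```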
So this specifically talks about asynchronous MySQL replication. I'll talk a bit more about what we're doing in Trove specifically to handle that; there's a slide in this deck where I want to speak to that. There's a lot of state-of-the-art technology already out there that we're looking to leverage. We built a clustering API in Juno that we're looking to extend as well; I'll get to it. But actually that's a good segue: failover and HA in Trove. Right now we have the tools, the APIs, to actually do this, but Trove doesn't do it for you. Somebody said they wish we'd write the code so that Trove does this by default. We're absolutely looking at this, and this is what we have planned for Kilo. We're still trying to work through all of the designs to figure out what's right for Trove to do. There are a lot of ways in which Trove could do this monitoring, and from an OpenStack perspective there's no clear solution: should Trove use a third-party solution to do the monitoring? Should we do it based on our heartbeats? I think heartbeats are a good first step, and they're already baked into Trove, but there's also a lot of talk about making that solution pluggable. So if you're interested in these areas, attend the design summit sessions on Thursday; there's a specific one on failover and another specific one on clustering.

You mentioned the split-brain scenario specifically. To target some of that, async MySQL replication might not be the way you want to go; you might actually want to spin up a Galera cluster or something like that, where you have three instances, and there's already technology out there that solves this for us. So how can we make Trove leverage some of that? We have a clustering API today, but we only have Mongo clusters. In Kilo, we're absolutely gonna start looking at provisioning Galera clusters, and my view is that by the end of Kilo we'll be there: Trove will have the ability to stand up at least a three-node Galera cluster. That's on the roadmap. Again, if you're interested, the design sessions are on Thursday.

So it sounds like you're promising that everything we showed today plus clustering will land in Kilo. Well, Andy, this is OpenStack, right? Open source is its own beast, so I'm not sure I can promise; this is the real world. We're gonna do our best to make sure we get there. That's cool. I mean, I think the point we're trying to make today is that everything you need is in Trove; you just need to write some code outside of Trove, and we're gonna make a lot of it available. We're actually gonna extend a little bit of what Nikhil showed today and have that as free samples that people can download and use. Obviously, like Nikhil said, eventually all this should just be part of Trove, and it shouldn't be stuff you have to run externally.

So the location for the code samples is right there; we're gonna leave this slide up during the Q&A so you can copy that down. There are a bunch of design sessions, both Thursday and Friday. Are there any you specifically wanna call out? If you're interested in failover, definitely make it to the Replication v2 design session, where we're gonna be talking about failover. And if you're interested in clusters and Galera clusters, the Building Out Trove Clusters session is the place you wanna be.
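For the curious, driving today's clustering API from python-troveclient looks roughly like the sketch below; MongoDB is the only datastore with cluster support in Juno, and the instance-spec keys ('flavorRef', 'volume') mirror the REST payload as we recall it, so verify them against your client version.

```python
# Assumes an authenticated troveclient.v1 client named `trove`, as in the
# earlier sketch.  Datastore version and flavor are illustrative.
cluster = trove.clusters.create(
    name='mongo-demo',
    datastore='mongodb',
    datastore_version='2.4.9',
    instances=[{'flavorRef': '7', 'volume': {'size': 2}} for _ in range(3)])

print(cluster.id)
# Poll trove.clusters.get(cluster.id) until the cluster reports active.
```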
If you wanna contribute to Trove and you're not interested in either of these, we'd still love to have you; there's a lot going on in Trove other than failover and clusters. And if you have conflicts and can't make any of the design sessions, definitely come to the contributors meetup on Friday morning. We'd love to have you. Also, if you can't make any of them, find us on Freenode in #openstack-trove. And if you want us to explain more about the demos we showed today, or the support that will show up in Kilo, our contact information is there; we're both here through the end of day Friday, so feel free to contact us. We'll also be watching the Twitter account, so if people have questions, feel free to post there and we'll answer as much as we can. There are two more Trove sessions. Unfortunately, one already happened on Monday, but there's also one at 5:20 today: another one of our HP contributors, Vipul, where's Vipul, right there, is gonna be giving a talk about the Percona Server features in OpenStack and doing ops with that. That's at 5:20, the last session of the day. So that's our contact information; feel free to contact us, like I said. And that's all we have for you today. We'll take questions now for a few minutes. Why don't we leave that slide up so people can copy it.

Today we don't have any Trove APIs that support that. We should probably discuss how often that scenario comes up, and if it's something we need to handle, we should talk about a way to get that going. So sorry, the question was: if the master is out of rotation, is there any way of putting that very same instance back in as a slave of the one that just got promoted? We don't have a way of doing that today. And if it was just an intermittent network failure or something and you diagnose that it's not really a problem with the instance, and that's a scenario we want to support, then we should talk about having an API to do that, but we don't currently. Great question.

Next question. So, Trove supports multiple slaves for the same master: you just issue multiple create calls, passing the same master UUID, so you have one master with N slaves (there's a short sketch of this at the very end). We were trying to do some benchmarking to see if it falls over at some point and what that point is, but we've built it in such a way that you can have N slaves.

So the question is, is this integrated with Heat? The answer is that Trove has configuration you can switch to tell Trove whether or not to use Heat to do the provisioning. So you can either provision your instance using Heat, or provision it by going to the individual OpenStack services themselves. The task manager, which is the component of Trove that actually does the provisioning, I don't remember the exact configuration value, but if you switch it over, it can either talk to the Nova and Cinder endpoints directly or go through the Heat API. This demo is not running using Heat. I don't know of any bug that would keep you from doing that, because underneath, the actual configuration of the slave is all handled by the guest agent.
So Heat is basically just being used to get the instance up and running and put the Trove guest agent on there, and then it's up to the guest agent to do all of the configuration. So it should just work. That said, the demo itself is not using Heat, and I don't believe there's a bug with that, but if there is, definitely file a Launchpad ticket and we'll look at it. It is supported.

Very interesting question. Again, this comes back to the same thing: these are questions coming up in a lot of the design sessions. How do we build a set of knobs to let people pick a certain slave and say, this is the slave to actually fail over to when the master goes down? These are things we're trying to figure out during the design sessions as well, so come to the design session. I do want to mention, we actually shipped the Helion Dev Platform with replication; it just went out a couple of days ago. And is it any day now that our public cloud will be updated with the same bits? People are working on it as we speak. So you can actually go up to the HP public cloud, play with this script, and do the same thing that Nikhil did today. I believe we're the first public cloud with replication, that we know of.

Any other questions? Comments? Very interesting, so feedback: it's a hot topic. There are folks looking at it as we speak. I know Amrith is here, I don't know if he's in the room somewhere, it's blinding. Oh, there are Amrith and Dennis. There's a bunch of folks actually talking out scenarios and working through how we can build Cassandra clusters using the current Trove clusters API as well. That's also part of the design session we're going to be talking at, so great question. Cool, that's all we had in terms of the demo. Thank you for blowing up Trove in Paris, and thank you.
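Circling back to the multiple-replicas answer from the Q&A, here is the short sketch promised there: it really is just repeated create calls against the same master. Client setup, names, flavor, and volume sizes are illustrative, as in the earlier sketches.

```python
# Assumes an authenticated troveclient.v1 client named `trove`, as in the
# earlier sketches.
MASTER_ID = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'

replicas = [
    trove.instances.create(
        name='masterdb-replica-%d' % i,
        flavor_id='7',
        volume={'size': 10},
        replica_of=MASTER_ID)
    for i in range(1, 4)          # one master, three replicas
]
```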