All right, good afternoon. We are officially past noon, and I am standing between you and lunch, so I take that responsibility very seriously, and we'll make sure this talk is on time and to the point. Today I'm going to be talking about juggling Argo Rollouts for progressive delivery across multiple services. How many people are using Argo Rollouts today? Hands raised. Okay, actually not a ton of people in this audience. So a lot of you are new to Argo Rollouts, and hopefully these concepts will translate and make sense.

My name is Dan Garfield. I'm a co-founder and Chief Open Source Officer of an amazing company and team called Codefresh. We've got a booth out there. We helped create ArgoCon, we maintain Argo, we helped graduate Argo, and we've been involved in this space for a long time. You can follow me on Twitter at @todaywasawesome. I mostly just snarkily complain about Kubernetes and things like that, so if you're into people that complain and whine, you already know Twitter is the right place to go. In addition to being an Argo maintainer, I also helped create the OpenGitOps standards. If you've seen the GitOps principles, that was an initiative I helped start with folks from Weaveworks, AWS, GitHub, Azure, and Red Hat. I don't want to leave anybody out.

All right, let's jump into this. Progressive delivery with Argo Rollouts is very powerful, and since it sounds like many of you aren't necessarily super familiar with how Argo Rollouts works yet, here's a brief review. Argo Rollouts works by having a controller that sits on your cluster and looks for new deployment versions. A Rollout is basically just a Kubernetes Deployment. It's really that simple. You can take an existing Kubernetes Deployment and just wrap it in a Rollout. It's the exact same spec with additional options.
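To sketch that idea, here is roughly what wrapping a Deployment in a Rollout looks like. The names, image, and Services below are illustrative placeholders, not from the demo app; the spec mirrors a Deployment, plus the `strategy` block that Argo Rollouts adds.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: frontend                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:                           # identical to a Deployment's pod template
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: example/frontend:1.0   # placeholder image
  strategy:                           # the part a plain Deployment doesn't have
    blueGreen:
      activeService: frontend            # Service that users hit
      previewService: frontend-preview   # Service for testing the new version
      autoPromotionEnabled: false        # wait for a manual promote
```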
So Rollouts are very simple to implement in that way. Basically, the controller will monitor for changes to the Rollout, and then it will spin up new versions and pods according to the plan and adjust the Ingress and Service so that the traffic you want getting there is getting there. It could be a percentage of traffic if it's a canary release, or it could be a blue-green deployment where it spins up everything. And then it has a mechanism to run tests with an AnalysisTemplate, where you can pull metrics from Prometheus or Wavefront or whatever the heck you want to decide whether to continue and complete your rollout. So far so good, right? It's very simple progressive delivery, but it's all done in a declarative format.

Argo Rollouts is extremely popular, and a lot of people use Argo Rollouts without Argo CD. I'm going to talk a little bit about some of the differences if you use this with Argo CD, as far as this talk goes, and some of the things that are just standalone. Salesforce famously uses Argo Rollouts for, like, everything; they've got tens of thousands of rollouts going. Every time you do an update, this is really a way of de-risking that update, so that the blast radius of potential failure is a lot smaller.

Okay, let's review really quickly: how does a blue-green deployment work? You've got your prod version, which is currently blue. It's the only thing that's deployed, and all of your users are currently getting traffic from it. In a blue-green deployment, we deploy the new version, spin it up, and then we can actually run tests against that new version and make sure everything is good before switching all the traffic over at once. And if there were a problem, you can do a hot swap back to the old version, right?
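As a rough example of that analysis mechanism, here is what a Prometheus-backed AnalysisTemplate can look like. The metric name, threshold, and Prometheus address are assumptions for illustration, not values from the talk.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name                # passed in by the Rollout
  metrics:
  - name: success-rate
    interval: 1m                      # re-evaluate every minute
    successCondition: result[0] >= 0.95
    failureLimit: 3                   # abort the rollout after 3 failed checks
    provider:
      prometheus:
        address: http://prometheus.monitoring:9090   # assumed in-cluster address
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}",code!~"5.."}[5m]))
          /
          sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
```

A Rollout references this template from a step (canary) or from pre/post-promotion analysis (blue-green), and the controller aborts the update automatically if the condition fails.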
So you can do a really quick recovery if something went wrong after the switch. A canary release works similarly, except it's just a portion of traffic that gets sent over. Everybody's like, yeah, get on with it, we know this. Very simple, right? We've all seen that.

Okay, so many services aren't that simple, and that's what this talk is really about: handling services that are more complex, specifically applications that aren't monolithic. So think about a simple blue-green release. Well, you're deploying microservices, right? We're not deploying monoliths, or distributed monoliths. We're deploying microservices. And these things are interdependent. You might have a front end whose version depends on a back end version. If you're rolling out a new feature, and this is very common, think about it: how often is a new feature confined to a single service? Very rarely, right? Most of the time when you roll out a new feature, you need to make a change to a UI and a back end and maybe some other services in order for that feature to actually fully function.

So what can happen with some applications is you deploy your new front end (here the slide is actually showing a canary release, now that I look at the arrows, so pretend that arrow wasn't splitting) and you've deployed a new back end. But if there is a version mismatch between your current front end and the back end, it'll explode, right? Maybe the front end is trying to pull a value that doesn't exist on the old back end. So you have to be able to version these things and match them. And what happens a lot of the time is people get to this point where they're like, uh-oh. I tried to use a canary release. I tried to use a blue-green release.
And I tried to deploy a new service, but it was talking to an old service that wasn't ready for it, and the thing exploded and fell over. And so you just give up. You're like, canary is too hard, blue-green is too hard, progressive delivery is too hard, it's too complex to do. So you give up, you uninstall Argo Rollouts, you unhook your deployment, and then you go home and you cry. And your boss is like, yeah, canary maybe one day, blue-green maybe one day, but it's too hard. Cool, so that's the end of my talk, thanks for coming. No, no, we can do something about that, right? There's a way to solve this.

So we're going to talk about three scenarios, and I've got a giant QR code where you can follow along. This is built on a blog post and a technique put together by Kostis, who's sitting in the front row and is on the Codefresh team. He's brilliant, far smarter than we deserve, and he actually came up with these techniques. He's also an Argo Rollouts maintainer and does a ton of work.

First, let's introduce our application. Our application is really simple: it consists of a front end and a back end. One of the amazing things about this application is that it tells you what version it's running and what version of the back end it's talking to, so we always know what's going on. It's a sample application, available on GitHub, and it's linked in the blog post I mentioned earlier. I'll have another link later where you can grab it.

So let's go through a modern application. When I say a modern application, I'm talking about that twelve-factor app that is fine with talking to different versions. It's got a versioned API, right? So if I update the front end, it knows that when it's talking to the back end, it can fail gracefully if the feature is not available. It won't expose the feature in the front end; if it is available, it will expose it.
This is a modern application, and this is how we should be architecting our applications, right? If we're not architecting this way, then they're more monolithic. For an app like this, a blue-green deployment is pretty simple. In stage one, only version one is deployed. In stage two, we deploy version two of our back end, and we can run smoke tests against it, make sure it looks good, do QA, whatever. But the user is still only getting version one across the board. Then we switch the traffic over at once so that the front end is talking to the new back end, and in stage three we can run tests against that. In stage four, we introduce the new version of the UI that relies on those back end changes. People are still only getting the old front end, but once we've tested it, we switch everybody over to version two.

So we've got essentially two different progressive deliveries, two different blue-green deployments, happening here. They're happening linearly, but they're independent of one another. This is the best-case scenario, because you're using a modern application with a versioned API, so it's not going to explode. Even though you have to stage out these deployments to happen one after the other, this is fine, right?

Scenario two is very similar, but in reverse: I deploy a new front end first. Maybe your front end is architected so it doesn't rely on the back end. In that case, in stage two the front end gets deployed, still pointing to the old back end. In stage three you've switched over, and the users are now getting the new front end. In stage four you deploy the new back end and run smoke tests, and then in stage five everybody's getting the new version of both.
So this is the same as the first scenario except in reverse, because depending on what you're doing, you might have to order the front end and back end differently. That's actually pretty simple to do today. It's just two simple Argo Rollouts: you set them up and then you have some orchestration to stage them out. Maybe a CI/CD pipeline that kicks off one and then kicks off the next when you're done, or maybe you fire off a job that creates the next rollout, whatever it is.

But these scenarios are unfortunately not available most of the time for a lot of users, and we would consider those the legacy scenarios. The question is: can I mismatch versions between front end and back end? If you can do that, you're in the modern boat. If you can't, you're in the legacy boat, and there are a couple of different situations that would put you there.

The first one is that you have a feature that requires changes to two services at the same time. That's what I mentioned: most of the time we deploy new features, they require updating more than one service. I would say in my experience it's a plurality at the very least, 40% or 60% of the time, that it happens that way.

Second, some services don't provide versioned APIs. A lot of companies that are just deploying services don't version their APIs because they don't view it as worth their time. How many of you cut a release, like a Git release, when you deploy a new version of software? Hands up? Yeah, not that many. It's rare, right? Most people don't do that extra step because they're like, I'm the only one consuming this, or it's just my team, or it's just a couple of teams, so I'm not going to cut a release. So maybe you don't have versioned APIs.

Another issue is that maybe your integration tests can't fully run, right?
Until both of those services are deployed, you actually can't run all of the tests that you need to run. That's another scenario. And then there's a bonus situation that we're going to cover here: what if you're only updating configuration and not binaries? This is an artifact of using Argo Rollouts. Here's the scenario: imagine you have a deployment and you've got a new image. If you've used Argo Rollouts, you know I update my image tag, I deploy the new Rollout, and the Argo Rollouts controller picks it up and deploys it. But what if I change a ConfigMap that is loading values into that pod? Everybody knows: you update the ConfigMap and then nothing happens, right? The pods that are currently running aren't going to restart. There's no change to the binary, and there's no change to the configuration of your deployment spec, so it's not going to kick off a new deployment, because all you changed was a ConfigMap.

So you'll sometimes see people using Argo CD with a post-sync job that kills all their pods to force them to restart. Doesn't that sound terrible? It's an awful way to do it, an awful experience. They're like, yeah, good news, I kill all the pods. And what happens if they don't restart? You just have downtime. This is terrible; why would you do that? Kostis came up with a really brilliant technique to handle that scenario, and we're going to cover it here.

Okay, so this is what we're actually here to talk about: scenario three, the legacy application. In this case, as you can see on the user timeline, the user only ever gets matching front end and back end services throughout. They're only going to be using version one, or they're only going to be using version two, but they're never going to be using a mismatch like in the previous two scenarios.
So in stage two, we deploy a back end and we can run tests against it. In stage three, we deploy the new front end, and now we can run full integration tests on those two services that are linked together. Then in stage four, we switch over the traffic and actually pull down the previous services, since we're not running them anymore, and everybody moves and switches all at once. That's a blue-green.

Now, everything I'm showing you applies equally to canary releases. This exact same technique will work with a canary release just fine. For the sake of demoing, it's easier to demo as a blue-green because it's more obvious what's happening, but you can use these techniques in both cases.

The requirements to get your legacy apps deploying in synchronicity in the way we've described are, first, you need a configurable URL for accessing the services. On the left-hand side I have an example of some application code that specifies where the back-end host lives, and it's taking that from an environment variable. This is just good practice anyway, and if your developers aren't doing this, it's typically a pretty easy change for them to make. If you say, hey, this is going to allow us to do progressive delivery, it's going to reduce our blast radius, it's going to make our services more reliable, and it's going to lower the barrier for you to deploy new stuff, they should be on board. On the right-hand side, you can see this is from either a Deployment or a Rollout, and we're just pulling the value for that back-end host from a ConfigMap.

Okay, the second component to make this work is a ConfigMap generator. Now, for those of you that are big Helm users, you might not be familiar with this, because it's a Kustomize feature. I'll show you how it works.
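A minimal sketch of those two pieces, with made-up names (`BACKEND_HOST`, `app-settings`): the application reads the URL from an environment variable, and the Rollout's pod template fills that variable from a ConfigMap.

```yaml
# Fragment of a Rollout (or Deployment) pod template: the back-end URL
# is injected from a ConfigMap instead of being hard-coded in the image.
containers:
- name: frontend
  image: example/frontend:1.0        # placeholder
  env:
  - name: BACKEND_HOST               # what the app code reads at startup
    valueFrom:
      configMapKeyRef:
        name: app-settings           # hypothetical ConfigMap name
        key: backendHost
```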
So with a ConfigMap generator, on the right-hand side you can see that I've got a configMapGenerator and I've set all my values. On the left-hand side, I've got my normal configuration saying that I'm going to pull the value from a ConfigMap, just like I would in vanilla Kubernetes, but here I'm using Kustomize to do it. Now, you're all like, I don't understand why you're talking to me about this. You'll see in a second. What's going to happen when I render this kustomization is that it generates a unique ConfigMap name, and all of the references in my Deployment or my Rollout get updated to pull from that now-unique ConfigMap.

That means I actually have two ConfigMaps when I kick off my rollout: the one that existed before, and the new one. The new Rollout references the new ConfigMap, the old one references the old ConfigMap, and there's no ambiguity, so I don't have to go kill pods. Can you imagine the alternative? You update your ConfigMap, pods accidentally die, and your application is now broken because it's taken the new configuration. How do you fix it? You roll back your ConfigMap, and then you kill pods again. What? Terrible. Terrible. Don't do that. It's bad. Using this setup, your ConfigMap is versioned along with your Deployment or your Rollout, and that's going to be key to this technique.

Again, I've thrown up the link to the blog post here. I'll let this slide sit for a few seconds if you want to scan it; the slides will be available afterwards. Okay, let's show you how this all works in action.
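A sketch of what that kustomization can look like, with illustrative file and key names. When you render it, the generator emits a ConfigMap with a content hash appended to its name (something like `app-settings-abc123`) and rewrites every `configMapKeyRef` to it in the Rollouts, which is what makes each configuration version travel with its rollout.

```yaml
# kustomization.yaml (names are illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- rollout-frontend.yaml
- rollout-backend.yaml
configMapGenerator:
- name: app-settings                 # referenced by configMapKeyRef in the Rollouts
  literals:
  - frontendVersion=2.0
  - backendVersion=2.0
  - backendHost=backend-preview      # where the new front end should point
```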
So to do this, and hopefully you can see these reasonably well, I've got my service, here's my application, and I've got two views that look identical right now but are on different ports. These are the preview version and the customer view of my application. And you can see I've got two Rollouts: this is for my front end, and this is for my back end. They're both currently sitting steady; we haven't deployed any changes. Now, of course, you could do all of this with GitOps by making Git commits, but for the sake of the demo we're going to keep it simple.

So here's my configuration, here's my kustomization, and you can see that I'm using a configMapGenerator. I have a setting I'm passing to my back end to specify what version it is, a setting for my front end to say what version it is, and a back-end host specifying what the front end is supposed to point at. Now remember, this lives on the Rollout, right? So if I don't expose that new front end to users, people aren't going to get that traffic.

Okay, so let's start by updating the back end. We're going to move this to 2.0 and apply it with a kubectl apply of the manifest. Again, you could do this with GitOps, right? Okay, we can see this kicks off a new rollout, so now I have my deployment here. By the way, if you're using port forwarding, any time you change a Service you have to restart it; I'm just going to restart it in the background every time. You'll notice that there's no change for users, right? Because I've only deployed the back end; I haven't deployed a new front end. So now, to kick off the new view for my users, I'm going to update my front end, and I'll also specify that the new version of my front end points at the back end preview.
And that's the Service name it's looking up in Kubernetes, just using local Kubernetes DNS names, right? Okay, and if I look at my front end, you can see it currently only has one version available, while the back end now has two versions, one in preview, not yet serving users. So I apply my changes again, and we see the rollout kick off, and now the new version is deployed. So now I have two new versions, and if I restart my port forwards, the users still see no difference: they're both getting the old front end, while my preview is now getting version 2.0. So I can run smoke tests against this, I can run integration tests against this, I can do whatever validation I want. The users aren't getting a mismatched version, and the preview doesn't have a mismatched version either.

Then, to complete this, all I have to do is finish promoting my blue-green deployments, right? Right now the new versions are only available as preview, so to switch users over, I just do a rollouts promote, and I'll promote the front end first. The front end is already pointing at the new back end, and this is going to delete the old deployment. You can see these pods are still here, but they're going to be deleted in 24 seconds. Let me restart my port forwarding, and if I refresh this, now everybody's getting version 2.0 altogether. They're synced up, so you have synchronicity, you have consistency for the user. And if I were to do a rollback at this point, it would switch everybody back to version one, just off of switching the rollout on my front end. Now, the preview version of my back end is still hanging around, but I don't need it anymore, so I can go ahead and promote my back end as well. That kicks off, and you can see it's now starting the deletion, but the traffic has already switched over.
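The demo steps above can be sketched as commands against a cluster. The rollout names here are the hypothetical front end and back end from earlier; the watch and promote verbs come from the kubectl Argo Rollouts plugin.

```shell
# Render and apply the kustomization (a Git commit + Argo CD sync works too)
kubectl apply -k .

# Watch a rollout pause with the new version in preview
kubectl argo rollouts get rollout backend --watch

# Once preview tests pass, promote the front end first, then the back end
kubectl argo rollouts promote frontend
kubectl argo rollouts promote backend
```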
And again, if I switch these over, you'll see there's no change here; they're getting version two, right? Okay, so hopefully it was clear what was happening. There were a lot of moving parts, right? And I think we'll have time for a question. Yeah, what's your question? Go ahead, and I'll repeat it. Yes, that's right. In this case, and this is for demo purposes, and there are comments in here, when I reset this I should have a final step to change the back-end host back to the front end's default, just to reset it for the next time. But I could do it the next time that I run it, whatever; it doesn't matter, because the front-end service is going to be automatically pointing back. But you're exactly correct.

Okay, now, you can obviously automate this process. You can see here I've got a CI/CD pipeline, and in this case it's deploying the new back end, running tests, and then waiting for a manual approval. When people hit approve, it moves to the next step and deploys both, and then you can do the promotion. So of course you can automate this, but you would need something like a CI/CD pipeline to do it today; this is legacy, after all. And there are, of course, GitOps ways of doing this that Codefresh actually announced this week, which you can go check out at our booth, so you can find the GitOps way to do promotions.

But there are a few warnings and caveats with this technique. The big one is that if you're using the ConfigMap generator with Argo CD, when you do your promotion or rollout, Argo CD may delete the old ConfigMap before you want it to. Then, if you go to do a rollback, the pods won't be able to start because the ConfigMap is missing. To fix that, all you need to do, and you can see I've actually done it here, is add PruneLast to your ConfigMap generator. That'll make sure the ConfigMaps are the last thing to get deleted.
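One way to attach that option, assuming the generated ConfigMap setup from earlier: Kustomize's generator `options` can stamp Argo CD's `PruneLast=true` sync-option annotation onto the generated ConfigMap. Treat this as a sketch; the exact placement may differ in your setup.

```yaml
configMapGenerator:
- name: app-settings                 # hypothetical name from earlier
  literals:
  - backendHost=backend-preview
  options:
    annotations:
      # Argo CD prunes this resource last, so old ConfigMaps survive
      # long enough for a rollback to reference them.
      argocd.argoproj.io/sync-options: PruneLast=true
```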
So if you have to do a rollback, they'll still be available for you. That's pretty important, because if you want to do a rollback and the ConfigMap isn't there, the rollback will fail to complete, and that would be a big problem. And that's basically it. There you go. Thank you. It is lunchtime now, and the session is officially over, but I'll be hanging out here if you want to ask any questions. Thank you again; I really appreciate the great audience and the wonderful warm reception.