Welcome, CNCF community. Thanks for giving me the opportunity and for joining "Automating SRE from Hello World to Enterprise Scale with Keptn." This is really an overview and introduction to our CNCF sandbox project Keptn. You have all the links here that you need to find out more about Keptn: keptn.sh, follow us at the Keptn project, star us on GitHub, or join the Slack channel. I am Andi Grabner. I'm a DevRel for Keptn, and if you want to know more about me, feel free to reach out. We will also have a live webinar on the CNCF webinar schedule coming up next week. There I will be joined by Jürgen Etzelsdorfer, and we can both show you more about Keptn live. You can ask us questions, and we'll navigate you through the product. But as a first step, if you want to learn more, definitely check out our website. From there you can reach all the tutorials and get access to additional resources like previous recordings on different use cases, and also testimonials to see how other users are using Keptn and how they benefit. We've also just recently released Keptn 0.8. It's March 2021; depending on when you watch this, there might even be newer versions, but this is the latest and greatest as of the time of recording. But let me first go back and give you a little overview of why we actually built Keptn. We built Keptn to solve a couple of problems that we've seen in our own organization, but also with users in our community. One of them: a lot of DevOps teams are challenged with very monolithic automation in their pipelines that becomes hard to maintain. What does this mean? An example is from Christian Heckelmann. He is a DevOps engineer, and he is constantly challenged with pipelines that are broken. He constantly gets pinged on Slack: pipeline broken, please fix.
He is managing, as you can see, 2,800 projects and 966 CI/CD pipelines. And this is obviously challenging, especially if things are broken. And why are they broken? Because over time, they became very complex. Here's one of the screenshots, and this might be something that you can also relate to. Some of these pipelines start small, but all of a sudden, well, it escalates quickly. We end up with very complex scripts that are doing a lot of amazing things but are really hard to maintain and keep up with, especially if you then have different permutations. So this is the first problem we see out there and that we want to address with Keptn. The second problem concerns the DevOps teams in charge of tool integrations and pipelining: these pipelines tend to contain tool integrations that are often custom-made, custom-built, and then copy-pasted around because of a lack of standards. This is an example from Dieter. He's a senior ACE engineer here at Dynatrace, and he says onboarding or updating pipelines is manual and often error-prone. Now, while his environment is much smaller than what we saw from Christian, the challenge is that you often start with one pipeline for one service, then you copy and paste it and modify small things for the other services. So you end up with a lot of different pipelines for a project, and then this multiplies into the classical snowflake effect. Now, what's interesting: Dieter has done some analysis to see how much duplicated code we have across all the different pipelines in our different projects. And it's very, well, eye-opening to see that there's a lot of red here, which means a lot of duplicated code. That means if there are bugs in there, or something needs to be changed, you need to change it in many different places, and often you don't even know anymore where to change it. So this is another problem.
We wanna make this easier because we're spending too much time. Another problem that we want to solve: we see a lot of SRE teams that are trying to bring SRE practices around SLIs and SLOs, around performance testing, around chaos engineering, and to scale them across their organization, but it's really hard to automate that at scale. Roman Ferstl, managing director at Triscon, has been working with organizations that are limited in the number of tests they can run per year, or the number of apps they can test and validate against their SLOs. The reason they are struggling is that a lot of the work is done manually. A lot of tests have to be rerun because they're only run, you know, let's say 15 times a year, so a lot of things change in between. This also means they have only about 10% of the projects onboarded and haven't scaled it across the organization. The reason for all of this is that a lot of manual time is spent on script creation, configuring monitoring, analyzing test results, and analyzing SLOs, which you want to do if you want to get broader with your SRE practice, not only in production but across the lifecycle. Now, these are three problems and three challenges. Next I want to show you three examples of how Keptn users have been helped by Keptn to solve their problems. Sumit is at Intuit. They are using Argo, Gatling, and Jenkins for fully automated distributed load testing, and now they're using Keptn to completely automate the test analysis. Keptn has a capability called SLO-based quality gates. So they run their tests in their existing tooling and then hand over to Keptn to fully automatically and continuously evaluate SLOs, something they had done manually before. Now Keptn allows them to scale. Coming back to Roman, who I brought up earlier: remember, he had something like 15 to 20 tests per year and only five apps.
Well, now they run 15 times the number of tests on 10 times the number of apps, thanks to the automation that Keptn brings, because Keptn runs tests more consecutively, more continuously, more automated, and also automates the analysis. This really enables them to do automated performance and resiliency testing. And the third one: remember Christian, who was challenged with the ever-growing number of pipelines. They have now moved over to Kubernetes, which means new microservices and new pipelines that have to be onboarded, and they didn't want to repeat the mistakes from the previous architecture. So now they're using Keptn to orchestrate the whole end-to-end delivery pipeline: calling GitLab, Keptn triggering the automated tests with Katalon and JMeter, using Helm for deployment, but then also doing the automated quality-gate evaluation. So these are some of the stories, and you can actually find videos of these three gentlemen and more if you go to the Keptn website, under Keptn resources. There are some other nice testimonials you can also find on the website. What I really like is Taras from Facebook, who says Keptn feels like a reference implementation of Google's Site Reliability Engineering book and the Site Reliability Engineering Workbook. It was really nice for us to hear that a lot of people understand that we really try to help, especially the SRE community, to bring automated SRE practices into cloud-native continuous delivery. All right, so now: what is Keptn? Keptn is something different for different personas, whether you are in ops, an SRE, a dev, or a performance engineer. Whoever you are, Keptn allows you to pick a use case where you're currently struggling with automation, automate it in the way you want, and integrate it with your existing automation tools.
So Keptn allows you to pick the use case that you want to automate: quality gates, delivery, SRE automation, or auto-remediation in production. Depending on the use case, you then bring your configuration. For instance, for the quality-gate evaluation, you bring your SLI and SLO definitions. For your performance test automation, you bring your workload definition. For auto-remediation in production, you bring your runbooks. And best of all, Keptn doesn't execute these things itself. Keptn is an orchestrator. Keptn connects to your tools, so you can bring the tools that work well in your particular environment. Everyone has a different environment; everyone has favorite tools that you have investments in. So you bring these tools and connect them to Keptn, because Keptn then takes your configuration and your use case, automates the configuration of your tools, connects them, and provides the use cases completely as a self-service. And it does it through a declarative approach. All the configuration files are persisted, stored, and versioned in Git. Everything is centered around service-level objectives, or SLOs. Every action Keptn takes is validated to make sure it doesn't break anything and that you are still within your SLOs. And the whole communication from Keptn to your different tools is based on the CloudEvents standard. So everything is standards-based. There are no proprietary integrations. This makes it easy to extend, easy to bring your tools, and also easy to swap tools without having to update manual, custom, proprietary integrations. All right, before I go into the demo, a quick architecture overview. The architecture was really driven by the new requirements that we've seen. Remember, we have seen pipelines and automation scripts that grew too fast because they mixed information about processes, tooling, target platforms, and environments.
There was also no clear separation of concerns about what the developer should do, what the DevOps engineer should do, and what a site reliability engineer should do. We packed everything together, and this, I think, was the fundamental problem of most of the approaches we have today. So what we said is: in the end, we have processes that we want to automate, but without the hard dependencies on the tooling. If you have a process on the left with hard-coded dependencies, why not just break these things apart? Why not remove these hard dependencies and say, hey, we have a process that we want to automate: maybe build, prepare, deploy, test, notify, rollback, whatever you do when automating certain processes in your delivery and operations. On the right, you have your tools, or as I like to call them, capabilities, because you may have one or more tools that provide a certain capability in a certain environment. So if you have the process on the left and the capabilities on the right, and we have a process orchestrator, then we need some way for them to communicate, and this is where eventing comes in. Keptn uses an event-based model: just as when we break monolithic applications into smaller services we use eventing to connect them, we do the same thing here. We allow you to define the process, and as we execute the process, Keptn will send the right event at the right moment to say, for instance: hey, I need somebody that has the capability to deploy container number one in dev with a blue-green deployment strategy. Then you may have one or two capabilities. Maybe you have Helm that could do it, maybe you have a Jenkins pipeline that could do it, or you have Spinnaker. These tools can say: yes, I can do it, because I'm certified and I have all the config files that are needed for that environment.
Let me do it. And when it's done, the tool sends back that the job was successfully done, or maybe failed, who knows. And then Keptn can continue with the workflow. So really, what we did is we asked which events we need, and what capabilities we need on the right side, and then we connected them through eventing. From 10,000 feet, the way this looks is: you install Keptn on Kubernetes. You install the so-called control plane on a cluster that manages all the workflows and all the logic I just explained. We're using NATS as the eventing engine. Now, in order to use Keptn, somebody needs to say which processes, which workflows, which sequences Keptn should actually orchestrate and automate. This is what we call the application plane. You specify what type of process it is: a delivery process, a remediation process, a testing process. You declare this in our config files; we call them shipyard and remediation files. Shipyard covers everything related to continuous delivery until it ends up in production, and remediation covers all the remediation tasks in production. The nice thing is, because we have a clear separation of concerns between the process definition and the tooling and capabilities, you can even have a different team define and install the execution plane, either on the same cluster or on different clusters. We just introduced Keptn 0.8, which now finally has the capability to install the execution plane on all of your different target systems. And then this team can say which tools they want to use in a given target environment, and they install these capabilities. They're listening to these CloudEvents, so it's all based on standards, and once they receive one, they execute the action and respond.
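To make the eventing part concrete: every message Keptn exchanges is a CloudEvent. A sketch of what such an event can look like follows; the field values are made up for illustration, and the exact event types and data fields depend on your Keptn version:

```json
{
  "specversion": "1.0",
  "type": "sh.keptn.event.dev.delivery.triggered",
  "source": "https://github.com/keptn/keptn/cli",
  "id": "f2b878d3-03c0-4e8f-bc3f-454bc1b3d79d",
  "contenttype": "application/json",
  "data": {
    "project": "myproject",
    "service": "myservice",
    "stage": "dev",
    "configurationChange": {
      "values": { "image": "docker.io/myorg/myservice:1.2.0" }
    }
  }
}
```

Any tool that understands this standard envelope can pick up the job and answer with the matching started and finished events.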
Which means, at the end, the real beneficiary is the user, the dev, the ops, the SRE, who can then say: I have a new artifact and I want Keptn to run an automated process for me, let's say test automation or even delivery. Keptn then starts sending the events according to your process definition. This triggers the right tooling in your execution plane. These tools do the action and report back whether something is good or not good. The nice thing is, you can now easily change the process without having to think about which tooling integrations you need to worry about or might break. But you can also change the tooling without thinking about the process, right? You can say: I'm swapping from, let's say, a Jenkins pipeline that used to do my deployments to using Helm natively. Or you may switch from JMeter as a testing tool to something like Neotys. Or you switch from one monitoring tool that gives you the observability data to another. And the nice thing is, you don't have these integrations hard-coded anymore. It's all process definitions and tool capabilities, connected through events. So I wanna now go into my first demo and show you a little bit of Keptn. And by the way, as I said, in about a week or so we do a live webinar with more live demos of Keptn. So, just to let you know, I've installed Keptn on an EKS cluster. This is a standard installation where I have the control and execution plane installed; you see a couple of pods here. I also have my Keptn CLI authenticated against my Keptn environment. And I can now do things like, let me just do this here: history | grep artifact. I wanna kick off a new deployment, and I'm too lazy to remember all of this, to be honest with you.
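For reference, the command I keep grepping out of my shell history looks roughly like this; project, service, and image names are from my demo setup, so treat them as placeholders:

```shell
# Tell Keptn there is a new artifact for a given project/service;
# Keptn then kicks off the delivery sequence defined in the shipyard
keptn send event new-artifact \
  --project=simpleproject \
  --service=simplenode \
  --image=docker.io/grabnerandi/simplenodeservice \
  --tag=4.0.0
```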
That's why what I run now says: Keptn, please, I have a new artifact for you for a particular Keptn project and service, here's my new image, and now you go off. Now, while this runs, I wanna show you a little bit of what actually happens behind the scenes. So here is my Keptn installation, and here is my Keptn 0.7 project. Keptn internally holds a config repo for everything it does. For every project, you get a config repo, and you can also specify an upstream Git remote; this here is my GitHub repository. What you can see in the main branch is my shipyard file. This is my process definition. This is where I say: Keptn, I want you to provide me three stages, dev, staging, and prod. You can give it different types of metadata to change the opinionated workflow that Keptn has: what type of deployment should happen, what type of testing, what type of approval, what type of remediation. Now, what you see here is a shipyard file for Keptn version 0.7; 0.8 was just released as I'm recording this. I will show you how this changed in 0.8, because in 0.8 you are more flexible about what should happen in a stage, but I start with 0.7 here because it gets the point across of what Keptn is doing. So this is what I specified. That's my whole pipeline code, so to speak. Now, what else do I have? For every individual stage, Keptn created a branch for me. If I go into the dev branch, this is where I have all of my supporting configuration files for the individual tools and capabilities so that they can do their job. For instance, I have my JMeter scripts in here because I'm using JMeter. So when JMeter is triggered later on, it can access this config repo for dev and say: okay, what are my files? What is my configuration?
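The shipyard file I just described can look like this; this is a sketch in the 0.7-era format, and the exact strategy names depend on your Keptn version:

```yaml
# shipyard.yaml: declares the stages Keptn should orchestrate, plus
# metadata that tweaks Keptn's opinionated workflow per stage
stages:
  - name: "dev"
    deployment_strategy: "direct"          # deploy straight in
    test_strategy: "functional"
  - name: "staging"
    deployment_strategy: "blue_green_service"
    test_strategy: "performance"
  - name: "production"
    approval_strategy:
      pass: "manual"                       # wait for a human before promoting
      warning: "manual"
    deployment_strategy: "blue_green_service"
```

That handful of lines is the whole "pipeline code"; everything else is derived from it.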
I can either specify files on a global scale for a stage, or for an individual service, because a project in Keptn typically contains multiple microservices that you wanna deploy. So you can have more specific files for a particular service, maybe specific test files for one service. So I have global files for everything, some basic tests, and then specific ones. Now, what do you also see here? I have my Helm folder where I have my Helm charts. And interestingly enough, two minutes ago something was changed here, because remember, I triggered off my deployment. I said keptn send event new-artifact for this project, for this service, my simplenode service: here is a new version. And one of the things that Keptn does, the way I specified it: Keptn will first send an event saying, hey, Andi wants to change the version, so take this version information and update it in the files where it's necessary. So the first thing that actually happens is a version change, and it made the update here, which is nice because I don't have to do it, you don't have to do it. You can also do it through your regular GitOps approach, where you change your configurations in that Git repo and then trigger the rest of the Keptn workflow. But this, I think, is pretty neat. So we have everything in here: dev, and we also have prod and staging. For every stage, you have your different files. Now let's go back to Keptn, into this project. So this is my Keptn 0.7 project, and if I click on environment, I have my dev, staging, and prod. This is just the visualization of the shipyard file that I showed you earlier. I can also see what is currently deployed in each stage. I have only one service onboarded, that's my simplenode, so I can click on it, and it seems build number four was already deployed.
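To recap the layout of that config repo: the main branch holds the process definition, and each stage branch holds the tool-specific files, globally per stage and optionally per service. Roughly like this; the file and service names are from my demo, so treat them as examples:

```
main branch:     shipyard.yaml              # the process definition
dev branch:      jmeter/basiccheck.jmx      # stage-wide test scripts
                 helm/                      # Helm charts for deployment
                 simplenode/jmeter/load.jmx # service-specific overrides
staging branch:  (same structure, staging config)
prod branch:     (same structure, prod config)
```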
If I go back to staging, this is probably still build number three, right? So build number four was in dev and build number three is in staging. And because I sent Keptn the new artifact, Keptn will now go through the whole process until, hopefully, it ends up in prod. And you will see here, I actually didn't clean up my environment from some previous demos. I had a couple of builds and runs earlier that made it almost all the way into prod. Because I specified in my shipyard file that I wanna have direct promotion from dev to staging if a build is good, but from staging to production I always wanna have a manual approval. This is why these are waiting here, and now I get the overview of my SLIs and my SLOs and can then make a go or no-go decision. So these are some old test runs that I never approved, which is why they're still lingering around. Now, what's interesting: this gives me an overview of what is currently deployed in which stage, but if I click on services, then I see in the list all of my previous attempts from my previous demos when I ran deployments. They can be triggered through the CLI, through a webhook, as part of a Git action, whatever you want. If I go to build number four now and click on it, on the right side you actually see all these events that I talked about earlier. Remember, in my animation I said Keptn is sending events and with this triggering the capabilities, and once they pick up the job, they say yes, I'm doing it, and then they send back a message once they're done. So this is really neat, because I see exactly what is happening: deployment, tests finished, quality gates enforced, and then it goes on into the next stage.
Now let me switch to build number three, which I ran a little earlier, because here I see a little more: in this case the build was promoted all the way from dev into staging, and in staging we also ran some JMeter tests and did some more extensive quality-gate evaluations until it ended up waiting in prod for approval. So I can actually now, finally, kick this off and push build number three into prod. I mean, build number four is already on its way, but now I'm good with this. So this is a quick overview. What you should take away is that in Keptn, everything is declarative, meaning you declare what kind of process you wanna automate; we have the shipyard file, for instance. You then add all of your configuration files for your specific tools and capabilities into that Git repository for a particular stage: either for the overall stage, meaning all your test files for that stage, or specific ones for a particular service. And then Keptn orchestrates everything for you. Now, Keptn also has a very rich API. We have a Swagger UI here where you can explore the API, and this is where you can, for instance, trigger an event from the outside. You can create your projects and services; you can fully automate Keptn. There are a lot of different options here to access everything in the Git repository, upload files, download files, up to you. And you may ask: okay, but how do I build a new service? How do I extend it? Well, services are basically listeners for events. If you go to keptn-sandbox, that's actually a great way to first see what type of services we have. We have keptn-sandbox, we have keptn-contrib, and then we have the core Keptn project. This is where we have the locust-service, the litmus-service; you see there's a lot of stuff already built. The monaco-service, the GitOps operator: we have a lot of things here already.
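Since a Keptn service is essentially a listener that consumes a triggered CloudEvent and answers with started and finished events, here is a minimal sketch of that pattern in Python. This is not the official template (that one is written in Go), and the event types and data fields are simplified for illustration:

```python
import uuid

def handle_triggered(event: dict) -> dict:
    """Consume a *.triggered CloudEvent and build the matching *.finished event."""
    # e.g. "sh.keptn.event.deployment.triggered" -> task name "deployment"
    task = event["type"].split(".")[-2]
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "type": f"sh.keptn.event.{task}.finished",
        "source": "my-demo-service",          # hypothetical service name
        "data": {
            "project": event["data"]["project"],
            "service": event["data"]["service"],
            "stage": event["data"]["stage"],
            "status": "succeeded",            # or "errored" if the action failed
            "result": "pass",
        },
    }

# A triggered event as the control plane might send it (simplified)
triggered = {
    "specversion": "1.0",
    "id": "1234",
    "type": "sh.keptn.event.deployment.triggered",
    "source": "shipyard-controller",
    "data": {"project": "demo", "service": "simplenode", "stage": "dev"},
}
finished = handle_triggered(triggered)
print(finished["type"])  # sh.keptn.event.deployment.finished
```

The real templates wire this handler to the eventing layer for you, so you mostly fill in the action itself.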
If you wanna write your own service and see how easy it is to build an integration based on these standards, just take this template; it's a Go template, and you can get started. We already have videos for that, so I'm not spending my time on it here. So what you've seen is a quick overview of how Keptn works. Now, the nice thing is Keptn can easily be integrated into existing pipelines and existing tooling. That was also our goal. We don't wanna replace everything; we wanna extend it. We wanna automate things that are currently hard to automate. One of the examples is Patrick: they're using GitLab CI for building their containers and pushing them to the registry, and then they're kicking off Keptn, and Keptn is doing the delivery for them. So this was the overview of a key major use case. Now, a very important piece of Keptn is that everything Keptn does is always validated against important data: SLOs, service-level objectives. Every time you deploy, every time you run a test, every time you do remediation, we are using SLOs. The reason we had to do this is that this was a problem a lot of people are facing, and it was also one of the problems I highlighted in the beginning: a lot of people are trying to automate data-based decisions as part of the delivery process. And I know how many people have tried to bake this into their existing pipelines, whatever tools those are, to make a go or no-go decision. But it's really hard, because we no longer have only unit tests or functional tests to look at. Especially as you're asked to run more performance tests and more chaos tests, you need to bring in observability data from tools like OpenTelemetry or your APM solutions. And there's so much data, and it's really hard to analyze it from build to build, from deployment to deployment. It is all possible, but it's not easy.
So what we then said is: we want to tackle this problem, and we want to make it core to Keptn. And for this, we looked at Google's SRE practice. SRE stands for site reliability engineering; I guess I'm not telling you anything new. For those for whom it is new, it's very simple, actually. You have SLIs, your service-level indicators: a metric that you can measure that is important to you, like the error rate of login requests. Then you specify your objective for this metric. For instance, you want to make sure that the login error rate stays below 2% over a 30-day period, especially in production. These are the things that you define. And then SLAs, probably more well-known, describe what happens if you miss your SLOs: you may have a legal contract, you may have some obligation, or you may lose users, whatever that is. In the end, Google did a great job of advocating for this principle as part of the site reliability engineering practices: great videos, great books. I like the tagline: SLIs drive SLOs, which inform SLAs. Now, we thought it's great that more and more organizations are looking into using SLOs as part of their production deployment and production monitoring. You can use SLOs for individual services and applications, for different types of metrics, and you use the error budget status to make decisions on whether or not to deploy. But we thought: why not take the same concept and use it for everything we do, from when you create your first container image, until it's deployed in dev, when you run your tests? Why not use the same concept of looking at metrics and validating whether they are within your expectations? And this is why we bring in Keptn quality gates as a core component, based on the concept of SLIs and SLOs.
Metrics are compared against objectives, and then Keptn analyzes the metrics that are important to you with every commit, with every build, and makes a go or no-go decision. Now, these might be different metrics and different thresholds than you have in production; I understand that. This is also where you typically use regression detection between builds, because you wanna know: did the new build maybe increase CPU consumption by 20%, or are you making 50 new database calls to the backend? This is something that you wanna flag. These might not be SLOs that are interesting for you in production. Well, they might be. But what I'm saying here is that we allow you to also specify different SLIs and SLOs as part of quality gates. So, at a very high level, how this works: you specify SLIs in Keptn, what metrics you want from whatever tool and data source. It could be Prometheus, could be Dynatrace, could be Wavefront, could be any of the other monitoring tools. Then you specify your SLOs, where you can say: I'm expecting this metric to be within a certain range, or I don't want this metric to go above a certain baseline, looking back at previous builds. So you can do absolute and relative comparisons. We analyze every single value and grade it, and then we come up with a total score; and you can also say what your objective is for the overall score. We always normalize it between zero and 100. So if build number one comes along and everything is green, then great, and Keptn will tell you you're good to go: 100%. If build two comes along and it seems you are slower on response time and failure rate, then you're getting penalized, getting 75%, and you can decide whether that's still good to go, yes or no.
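The scoring idea can be sketched in a few lines. This is my own simplification for illustration, not Keptn's actual implementation: each objective is graded, grades are combined into a 0 to 100 score, and the total is mapped to pass, warning, or fail:

```python
def grade_objective(value: float, pass_limit: float, warn_limit: float) -> float:
    """Full points within the pass criteria, half within warning, else zero."""
    if value <= pass_limit:
        return 1.0
    if value <= warn_limit:
        return 0.5
    return 0.0

def total_score(objectives: list) -> float:
    """Weighted average of all objective grades, normalized to 0..100."""
    total_weight = sum(o.get("weight", 1) for o in objectives)
    earned = sum(grade_objective(o["value"], o["pass"], o["warn"]) * o.get("weight", 1)
                 for o in objectives)
    return 100.0 * earned / total_weight

# Build two from the example above: response time slipped into warning,
# failure rate is still fine
build2 = [
    {"value": 620.0, "pass": 600.0, "warn": 800.0},  # response time p95 (ms)
    {"value": 1.0,   "pass": 2.0,   "warn": 3.0},    # failure rate (%)
]
score = total_score(build2)   # (0.5 + 1.0) / 2 * 100 = 75.0
verdict = "pass" if score >= 90 else "warning" if score >= 75 else "fail"
```

With those (made-up) thresholds, build two lands exactly on the 75% from the story, and you decide whether a warning is still good to go.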
If you're trying to fix this problem, and build number three comes along, and all of a sudden you fixed the response time and the failure rate, but now you have an increase in the number of backend login service calls from one to two, and you didn't allow any of that to happen because your SLO defines an allowed increase of 0%, then you're getting penalized, and this would stop your pipeline. Which is great, because you immediately get that feedback. And then build number four comes along, everything is green, and now we're good to go. So this is how it looks in Excel. Now, this is how it looks in Keptn, the way Keptn treats SLIs and SLOs. You specify your indicators in SLI YAML files; I don't wanna start a debate on YAML versus JSON now. You basically say: these are the metrics, and you put the query language next to them for the particular tool that you're using. Then you specify your SLOs in a separate file. We made the strategic decision to separate the data source definition, the SLI definition, from the SLOs, because this also allows you to easily swap monitoring tools while still maintaining your SLO definitions. So if you have those, you can ask Keptn: please evaluate. Because this is also a valid use case: you can say, Keptn, the only thing I want you to do is evaluate my performance metrics, my SLIs and SLOs. Then Keptn will send an event saying: hey, which tools can give me these SLIs? Here are all the definitions. Whatever tool you've connected can then report the values. Keptn takes these values, scores every single one based on the SLOs, and comes up with a total score that is translated into pass, warning, or fail. Now let me quickly go back into Keptn to show you how we did this with Dynatrace. This also works with Prometheus and others; I'm just using Dynatrace because this is where my day job is, so I'm familiar with that tool.
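To make that file split concrete, here is a sketch of the two files. The SLI file maps metric names to the query language of whatever data source you use (the query here is a made-up Prometheus-style example); the SLO file defines the objectives against those names:

```yaml
# sli.yaml: what to measure, and how to query it from your monitoring tool
spec_version: "1.0"
indicators:
  response_time_p95: "histogram_quantile(0.95, http_request_duration_seconds_bucket{job='simplenode'})"
  error_rate: "sum(rate(http_requests_failed_total[3m])) / sum(rate(http_requests_total[3m]))"
---
# slo.yaml: the objectives, scored against those indicators
spec_version: "1.0"
comparison:
  compare_with: "several_results"    # relative comparison against previous runs
  number_of_comparison_results: 3
  aggregate_function: "avg"
objectives:
  - sli: response_time_p95
    pass:
      - criteria:
          - "<=+10%"                 # relative: at most 10% slower than baseline
          - "<600"                   # absolute: under 600 ms
    warning:
      - criteria:
          - "<=800"
    weight: 1
  - sli: error_rate
    pass:
      - criteria:
          - "<=2%"
total_score:
  pass: "90%"
  warning: "75%"
```

Because the queries live only in the SLI file, swapping the monitoring tool means rewriting that one file while the SLO file stays untouched.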
In Dynatrace, we allow you to simply build a dashboard, and then Keptn will automate all of this. Now let me go back to my Keptn instance. So remember, earlier we deployed build number four. Let's see where build number four is now. Yeah, build number four made it all the way through; it's now waiting in prod. But let me show you this here: this is the SLI and SLO definition. Now remember, I told you that normally you would go into your Git repo, and if I go to staging, in my case I'm using Dynatrace as the monitoring tool, so here's where I specify my SLIs. I could go in and write all of my queries in the Dynatrace query language. So I say, hey, there's an SLI called process memory, and this is how you query it. You can do all that, and you can then also specify your SLOs in the YAML, just as I showed you earlier. Here are all the SLOs, and I'm sure somewhere in there is the process memory with pass, warning, weight, and so on and so forth. So you can do this, and while this is great, we wanted to make it a little easier, because not everybody is there yet to do everything as code from scratch. We said, for our integration here: if you're using Dynatrace, you can also just build a dashboard. Meaning, I have an observability platform, and I build a dashboard the way I normally would. You build a dashboard, you put all your metrics on it, and you typically have an idea of what the metric should look like, what range you expect. That's the same thing that I did here. I put in all of the metrics that are important to me, and then additionally, if I zoom in here a little bit, instead of just looking at them and deciding what value I'm expecting, I can specify my rules: pass should be faster than 600 milliseconds, and it should not slow down by more than 10%. I can do this on service-level metrics and on transaction metrics.
I can also do this on my process metrics — memory, whatever I have, even event loop tick frequency, because in my case I have a Node.js app. So you put this in here, and it's just a convenience thing. What Keptn does with it: it takes this dashboard, which becomes the source of truth, and generates the SLI and SLO YAML out of it, so that internally Keptn can process it the same way as with all the other monitoring tools. And then, back in my Keptn Bridge, I have the results for every single metric for every single run. I can look at them, and I can also click on the chart and see things over time, which is really nice.

Now I also want to quickly highlight Keptn 0.8. As I mentioned, 0.8 was just released, and I'm really happy about it because there are some really cool new capabilities in there. You have a nicer way of visualizing the stages, so you can easily click through and focus on the sequences here for the SLO validation. We now get a nice overview table — this was missing earlier: what's the SLI, what's the value, what are the pass and warning criteria, what's the result, what's the score, how much does it contribute to the 100 points? It's all here. And obviously you still have the charts, as I just showed you. You can also ignore runs in case you had a major issue that you're aware of, so that it doesn't pollute your baseline. So really cool things are possible now, especially the nicer visualization.

All right. What this means is that SLO evaluation is core to Keptn — we always evaluate our SLOs — but you can also use it standalone. And I have to admit, this is often the first use case people start with. They say: I already have my pipeline, I already deploy — like Christian here, with GitLab — and I already kick off some tests, but I have not yet automated my test validation. So I want to use Keptn for that.
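As a hedged sketch of that standalone quality-gate use: the job name, project, service and stage below are hypothetical, and the command follows the 0.8-era `keptn trigger evaluation` CLI; your pipeline structure will differ. A GitLab job calling into Keptn might look roughly like:

```yaml
# .gitlab-ci.yml fragment — after your existing deploy and test jobs,
# ask Keptn to score the run against the SLOs and wait for the verdict
quality-gate:
  stage: validate
  script:
    # evaluate the last 5 minutes of SLIs; --watch blocks until the
    # evaluation.finished event comes back with pass / warning / fail
    - keptn trigger evaluation
        --project=sockshop --service=carts --stage=staging
        --timeframe=5m --watch
```

The pipeline job then succeeds or fails based on the evaluation result Keptn brings back, which is exactly the handshake described next.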
So in this case, from your existing — let's say GitLab — pipeline, you trigger the Keptn evaluation, and Keptn brings back the result.

All right, last point. I've shown you a lot about data-driven delivery and data-driven validation. The last concept is data-driven operations. We know that a lot of people struggle with auto-remediation, and we wanted Keptn to focus there too, through a feature we call closed-loop remediation. Similar to orchestrating processes for delivery, we can orchestrate the processes you want to trigger in response to a problem in production. For instance, if your monitoring tool alerts you that the conversion rate dropped and the root cause is CPU pressure, then you can specify — similar to the shipyard file I showed you earlier, but now in a remediation file — the steps you would execute as remediation.

Again, Keptn treats these actions just like the delivery process. It sends out an event saying: who has the capability to execute this particular action on that system? Because this is what I want to automate. So the problem comes in, and Keptn says: well, the first step is scaling up — so please scale up, whatever tool handles that action. And remember, for auto-remediation we also validate: we validate the SLIs and the SLOs, and also what we call BLOs, your business-level objectives, because in production you typically bring in some end-user metrics as well — what's the impact on the end user? Based on that, Keptn decides: yes, the first remediation action brought the system back to a healthy state, great. If not, run the next action. And if none of that solves it, in the end you can still escalate. Really cool — but, I know, also really scary.
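A hedged sketch of such a remediation file, loosely following the Keptn remediation spec of that era — the problem type, action names and values here are illustrative, not from the demo:

```yaml
apiVersion: spec.keptn.sh/0.1.4
kind: Remediation
metadata:
  name: remediation-conversion-rate
spec:
  remediations:
    - problemType: "CPU pressure"      # matched against the incoming alert
      actionsOnOpen:
        - name: scale-up
          action: scaling              # executed by whichever service claims it
          description: "Add one replica, then re-validate the SLOs"
          value: "1"
        - name: toggle-feature
          action: featuretoggle        # next step if scaling does not help
          description: "Disable the expensive promotion feature"
          value: "off"
```

After each action, Keptn re-runs the SLO (and BLO) evaluation; only if the system is still unhealthy does it move on to the next action in the list, and after the last one it escalates.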
A lot of people don't trust this in production the first time they run it. This is why we are partnering here — we've seen a lot of movement, and it's also why it's great that Jürgen will be with us next week in the live webinar: integrating Keptn with chaos engineering. Keptn can trigger your performance test to run some load against your system, and at the same time trigger your chaos test — be it Litmus, be it Gremlin, whatever you use. The first thing this does is validate that your alerting and monitoring work correctly. Then you can use it to create, refine, validate and battle-test your remediation actions. Because in the end, this is a package: you're pushing your code through, and the code now also comes with remediation actions. The combination of both, and the orchestration Keptn provides, really aims to keep the system healthy even under chaotic conditions. So Keptn can be used here for what I like to call test-driven operations, because that's really what it is: I want to test-drive the operational code that will be executed later, in case chaos really strikes, having already battle-tested it here.

So, to wrap it up: Keptn is different things for different people, but what it really does is let you painlessly automate tasks around delivery and operations. You pick your use case, you bring your configuration, you pick your tools — Keptn does the rest. Keptn 0.8 was just released, and a major milestone is that we now have multi-cluster support: we can install the control plane that manages and controls the processes, and it can communicate with one or many execution planes that then actually do the execution.
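The sequences that the control plane orchestrates live in the shipyard file; here is a hedged sketch in the shipyard 0.2.0 format, with illustrative stage, sequence and task names (not the exact demo project):

```yaml
apiVersion: spec.keptn.sh/0.2.0
kind: Shipyard
metadata:
  name: shipyard-sockshop
spec:
  stages:
    - name: staging
      sequences:
        - name: delivery          # you define your own sequences and tasks
          tasks:
            - name: deployment
            - name: test
            - name: evaluation    # the SLO quality gate shown earlier
            - name: release
    - name: production
      sequences:
        - name: delivery
          triggeredOn:            # chaining: runs after staging delivery passes
            - event: staging.delivery.finished
          tasks:
            - name: deployment
            - name: release
```

Each task is again just an event that Keptn sends out, and whichever connected tool claims that capability executes it — on whatever execution plane it runs.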
Christian is actually one of our early adopters there: he uses Keptn, triggered from GitLab, to do multi-stage deployment into the different environments — doing the deployment, doing the testing, then eventually the promotion, plus the automated monitoring. And a cool thing — I showed you this briefly in the demo — is that we moved to a different shipyard model. We now have shipyard version 0.2.0, which allows you to be more explicit about what should happen in each stage. In previous versions we were very — not opportunistic, opinionated, that's the right word — we were very opinionated about what happens in a stage. Now we give you more freedom: you can define your own tasks and sequences, and you can say which sequence should trigger when and what should happen after which sequence. It gives you more flexibility.

So have a look at Keptn 0.8. The best way to get started is to go to the tutorials — make sure you choose the right version — and if you have any more questions, feel free to reach out to us. Make sure to follow us on Twitter, visit our website, join us on Slack. And make sure to also join us live at the CNCF webinar, where we talk about Keptn; you can ask all your questions, and we'll walk through the product. Thank you so much. Happy SRE-ing, happy scaling from a small project to large enterprise scale. Thank you.