We are super excited about our talk today. I'm here with Mike Beemer, who is a maintainer of OpenFeature, and myself, Johannes, a maintainer of the Keptn open source project. Why are we excited? Because today we want to bring together OpenFeature, Keptn, and OpenTelemetry to set up a release environment that allows us to release features into production in a very safe manner. For our talk, we have prepared a couple of sections. Mike is going to explain why it's important to adopt feature flagging. Afterwards, he'll present more details on OpenFeature. Then I will jump in and talk about Keptn. In the demo, we'll bring together Keptn, OpenFeature, and OpenTelemetry to release a new feature in a hands-off manner. And finally, we'll do a recap. All right, Mike, let's get started.

All right, thanks, Johannes. Just a quick introduction to feature flagging. Basically, it's a software technique that allows you to modify system behavior without requiring a redeployment. It's really powerful, and it's used in a lot of ways. Very commonly, it's used as a release toggle: you put new functionality behind a feature flag and enable it when you're ready. You can use it for experimentation, so A/B testing or A/B/n testing, where you run an experiment, collect data, analyze the results, and then choose the best of the options. You can also use it as an ops toggle, which is another way to look at feature flags. Typically, you would think of a feature flag as something that you add to your code, use until the feature is stable, and then remove. An ops toggle, in contrast, stays in place for basically the life of the project. You would use it in situations where you need to disable some behavior, maybe to shed load from your system or because a third-party service is down; it's basically a flag used in operations mode. And the last one is a context-aware toggle.
That's something that's going to be really important for the demo today. With a context-aware toggle, you can pass in contextual information, like which user is logged in or what their geo is, and then make flag decisions based on that metadata. One of the big benefits of feature flags is that you decouple feature releases from deployments. Traditionally, you write your software, do your deployment, and roll it out to production, and all of the features are immediately available. With feature flags, you can do a more granular release: ship a feature already disabled, then roll it out to a subset of users until you feel it's safe. That's what we're doing in this example. Imagine we're in production with one new feature flag. We enable it only for a product manager and a synthetic test, while our customers still see the old behavior. That allows us to try the feature out, make sure it has the intended behavior, and then roll it out safely.

So, moving on to OpenFeature. OpenFeature is an open-standard, vendor-neutral feature flagging specification. We have broad industry support, and a lot of interesting thought leaders in the space are involved in the project. We have SDKs available in many popular languages: right now Java, .NET, JavaScript/TypeScript, and Go, with many more on the way. If your favorite language isn't supported, it's likely coming soon, or we would love your help building it. We also have a developer-first, cloud-native implementation, which will show up in the demo as well.
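The context-aware toggle pattern described above can be sketched in code. The following is a minimal, self-contained TypeScript sketch, not the real OpenFeature SDK; the flag name, provider logic, and context fields are illustrative stand-ins for what a real provider would evaluate.

```typescript
// Minimal sketch of context-aware flag evaluation (not the real OpenFeature SDK).
// A "provider" decides the flag value based on the evaluation context it is given.

type EvaluationContext = { userAgent?: string; country?: string };

interface Provider {
  resolveBooleanValue(flagKey: string, defaultValue: boolean, ctx: EvaluationContext): boolean;
}

// Hypothetical provider: enables the new-service flag only for k6 test traffic.
const demoProvider: Provider = {
  resolveBooleanValue(flagKey, defaultValue, ctx) {
    if (flagKey === "use-remote-fib-service") {
      return (ctx.userAgent ?? "").includes("k6");
    }
    return defaultValue; // unknown flag: fall back to the code-level default
  },
};

class Client {
  constructor(private provider: Provider) {}
  getBooleanValue(flagKey: string, defaultValue: boolean, ctx: EvaluationContext): boolean {
    try {
      return this.provider.resolveBooleanValue(flagKey, defaultValue, ctx);
    } catch {
      return defaultValue; // evaluation errors must never break the application
    }
  }
}

const client = new Client(demoProvider);
// Synthetic k6 traffic sees the new feature; a regular browser does not.
console.log(client.getBooleanValue("use-remote-fib-service", false, { userAgent: "k6/0.40" }));
console.log(client.getBooleanValue("use-remote-fib-service", false, { userAgent: "Mozilla/5.0" }));
```

The key point is that the application only asks for a boolean and supplies context; all targeting logic lives behind the provider, so it can change without a redeployment.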
It's a way to try out a lot of these feature flagging concepts without having to register for a vendor account or build anything from scratch. On the right, you can see where OpenFeature fits into the overall architecture. You have your application, and in yellow you see the OpenFeature SDK. A very important concept here is the provider: the piece of code that performs the flag evaluations. That could be anything: a vendor, your homemade solution, or one of the open source offerings we provide. OpenFeature was also recently accepted as a sandbox project in the CNCF.

So why develop a standard? One of the main benefits is a consistent API for developers. If you're getting started with a homemade solution, you don't have to come up with your own patterns; you can just leverage ours. If you're moving between vendors, or from a homemade solution to a commercial one, the APIs stay the same. Because of that, the community can build really nice tooling around it, which is really beneficial, and as the community keeps growing, more interesting use cases will emerge. It also lowers adoption risk: if you're getting started with feature flagging, you don't have code-level lock-in that makes it tricky to move between providers. And from a Dynatrace perspective, because there's a single SDK, we can capture interesting telemetry data points on flag evaluations and surface them in monitoring solutions. I also want to show you that it's not just OpenFeature by itself.
We're working with a lot of companies that are pretty well known, some names you may recognize, and we have a lot more on the way. These are the providers available now; just to highlight a few: CloudBees, LaunchDarkly, Split, and many others, plus open source projects like GoFeatureFlag and flagd. Speaking of flagd, that's the one we'll be using in the demo today. It's a spec-compliant implementation, built with the cloud and Kubernetes in mind, and it follows a UNIX philosophy: it does one thing and does it really well. I'll hand it over to Johannes now.

All right, let's talk about Keptn. Keptn became an incubating CNCF project this year, which made us very proud. It's basically known as an application life-cycle orchestrator. Now you might ask yourself, what can I get from Keptn? To answer that, it helps to understand which problem Keptn solves, and for that I want to jump back about three years. As we kicked off the project, we asked quite a lot of end users: what are actually your biggest troubles and challenges when it comes to cloud-native delivery? The answer we got was, basically, that they are constantly faced with problems keeping their pipelines up and running. One example I show you here is Christian, a senior engineer. He told us that he has to deal with 2,800 projects, and for those he needs a thousand pipelines that he has to keep running. The numbers are high, but the really bad part comes when you take a look at the pipelines themselves, because those pipelines are quite heavy and quite long in terms of lines of code. The engineers and developers also baked in quite a lot of tooling, which made the pipelines really complicated to maintain and run over time.
This can escalate really quickly and become almost impossible to maintain. That was when we thought there needs to be a solution that can deal with this problem. To explain what Keptn solves for you, I'm showing here a visualization of the pipeline that Christian gave us. Basically, the pipeline contained a bunch of tasks that should be executed, and that's totally fair, as we do want to execute certain tasks and steps. However, those tasks were strongly coupled to the underlying tooling. In other words, there was a strong, hard dependency between what we want to execute and how we are going to execute it. Here is where Keptn jumps in: we proposed to break apart the process from the underlying tooling. The way Keptn works, it basically introduces an eventing layer right in between. Through this layer, Keptn sends out events, and the tooling picks up those events and executes the tasks it is designed for. This brings in quite a lot of flexibility, because I can easily exchange a tool without touching the process defined in the pipeline. The events are based on CloudEvents and carry the data required to execute the task, but no tool-specific details. With that approach, you have the process definition on the one hand, separated from the actual tools that then execute the process. And with a well-defined process, you can share it across your organization: each team follows the same process but can use different tools for executing it. You can also reuse this process across environments: whether it's development, staging, or even production, you can run the same process. All right. Also important to understand is that Keptn comes with two concepts.
First, there's the concept of the shipyard, which allows you to declare how the environment should look. A shipyard is shown here in the middle, where you can see that we have defined an environment with two stages, a development stage and a staging stage. Within each stage there is a delivery process, which consists of four tasks to execute. Important here is that a shipyard configuration contains no reference to a tool, because that is extracted into what we call the uniform. The uniform is where the tool configuration lives and where the execution of the process is defined. Given this paradigm, it's super easy to separate the tooling from the process and really focus on the concerns that should be in focus.

All right, now back to the problem we had at the beginning. As I said, it was really difficult for quite a lot of the end users we talked to to keep their automation up and running due to its complexity. With Keptn, they could reduce the automation code by 90%, and they got a clean separation of process and tooling. Additionally, Keptn also comes with a set of best practices built in, like SLO validations. Also important to mention: Keptn does not try to replace your tooling. It's more an additional layer on top of it, which takes over the orchestration and the responsibility of executing those tools as they should run. All right, now let's get over to the demo, and Michael will tell us what we are doing.

All right, thanks, Johannes. So the situation here is that we have a hypothetical company named Fiber. They are in the very lucrative Fibonacci-as-a-service industry, and it's rapidly growing. They basically need to rearchitect their system in order to scale to the massive demand they're experiencing. If we look at their current architecture, it's quite simple: the user makes a request.
They have basically a monolith application; it does the processing and sends the response back. They've decided it makes sense to decouple this a little bit. So we'll keep the application and introduce a dedicated Fibonacci calculation service, which will allow us to scale to the demand we need. The way we want to do that is to build the Fibonacci microservice first, obviously, but put it behind a feature flag. At first the feature will be disabled, but it will be enabled for the automated tests. That allows us to test it before users see the impact. First, we run it through a staging environment, using Keptn to orchestrate. If that all looks good, we move on to the production environment, and if everything passes, we enable the feature.

Quickly, on the right here, you can see what the feature flag evaluation itself looks like. If you look at "use remote fib service", that's the flag identifier; it uniquely identifies the flag in the system. The "false" represents the default fallback value: just in case something goes wrong, that's the behavior you get. And we're using the user agent as the context that we provide to the feature flag. In our situation we're using k6 (we'll get into that in a bit), and if the user agent contains k6, you see the new feature; otherwise it's disabled.

All right, demo time. As Mike said, we are now going to deploy a new version of our microservice. But before doing so, let's take a look at our awesome Fibonacci service that our end users are willing to pay for. It's this one. When they go there, they can run this functionality to derive the Fibonacci number of 40.
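As an aside, the flag just described might look roughly like this in flagd's configuration format. This is a sketch: the flag key, variant names, and context attribute are assumptions based on the talk, and the targeting rule uses flagd's JsonLogic-style syntax, where `in` on two strings is a substring check.

```json
{
  "flags": {
    "use-remote-fib-service": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off",
      "targeting": {
        "if": [
          { "in": ["k6", { "var": "userAgent" }] },
          "on",
          "off"
        ]
      }
    }
  }
}
```

With this configuration, every evaluation resolves to the `off` variant unless the supplied user agent contains `k6`, which is exactly the "enabled only for synthetic tests" setup used in the demo.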
Now, as Mike said, we want to update this particular microservice. To do so, I jump over to my IDE, where I have everything prepared, and I trigger a new Keptn delivery, meaning that I'm now giving Keptn the responsibility for delivering this new microservice. All right, everything got started, and now Keptn is taking over the orchestration. We can see this by going to Keptn. When I refresh the screen, we can see that the delivery kicked in and we are now deploying a new version of this microservice in staging. Going there, we see that the deployment already finished; we used Helm for the deployment. Now the tests are executed. As Mike said before, we're running k6 tests here, and they are configured to exercise the new feature behind the feature flag, which stays hidden from everybody else. The tests should finish in a couple of seconds; I promise this won't take too long. Here we go: the tests are finished, and the new microservice is also released in the staging environment, meaning that everybody who has access to staging can now use this new version. In the meanwhile, we have also triggered the delivery into production. Instead of just waiting for the test results, I want to explain one more thing. Important to mention is that each Keptn project has a Git repository behind it where the configuration is stored. When we take a closer look, we find one folder for our Fiber service; this is the service we are currently updating. In this folder, we see that there is a Helm chart, a regular one, and there is also a folder for the k6 tests. While we were taking a look there, let's jump back to Keptn. And here we can see that the delivery failed. Mike, what happened?
Let's dig into the problem itself, because we also got a Slack notification that the delivery in production failed. We can take a look into Jaeger to figure out what actually went wrong. All right, perfect. Since we're using OpenFeature with an OpenTelemetry hook, we're collecting flag metadata on the trace, so we can look in and see what happened here. In this case, we're just going to look at the Fibonacci requests, and we'll see that all of them are failing. If real users had been hitting the site, we would have seen successes here, but the tests themselves were configured in a way where they would fail. Looking through here real quick, unfortunately it looks like the call to the remote service must have had a configuration issue. Even though it passed in the lower environment, there must have been a misconfiguration in production. So I think we should go ahead and fix that.

I think I made a mistake here: I forgot to update the password before I started the demonstration. But this is a fix we can do really quickly, because, as I said, we have the configuration stored in the Git repository. I just go there, and in the deployment manifest there is the place where we configure our highly secure password. Here I made the mistake of using the wrong one. With this update of the config, we now need to redeploy our new microservice in production, and for that I'm again using Keptn. In this case, I'm targeting production directly, since everything went well in staging before. Let's execute this again. We can see that, as before, the deployment in prod is ongoing; as I said, it's Helm that does the deployment. And now the k6 tests kick in.
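While the tests run, it's worth sketching how the OpenTelemetry hook mentioned a moment ago works: it taps into the flag evaluation life cycle and attaches the evaluation result to the active trace, so a failing request in Jaeger shows exactly which flag variant it saw. The following is a minimal, self-contained illustration; the `Span` type and attribute names are stand-ins, not the real OpenFeature or OpenTelemetry APIs.

```typescript
// Illustrative sketch of a telemetry hook on flag evaluation (not the real APIs).
// The hook runs after each evaluation and records the outcome as span attributes.

type Span = { attributes: Record<string, string | boolean> };

interface EvaluationDetails {
  flagKey: string;
  value: boolean;
  errorMessage?: string;
}

// Hypothetical "after evaluation" hook: copies the flag details onto the span.
function afterEvaluation(span: Span, details: EvaluationDetails): void {
  span.attributes["feature_flag.key"] = details.flagKey;
  span.attributes["feature_flag.value"] = details.value;
  if (details.errorMessage) {
    span.attributes["feature_flag.error"] = details.errorMessage;
  }
}

const span: Span = { attributes: {} };
afterEvaluation(span, { flagKey: "use-remote-fib-service", value: true });
console.log(span.attributes);
```

Because the hook sits in the SDK rather than in application code, every flag evaluation is annotated automatically, which is what made the production failure easy to diagnose in the trace.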
All right, since the tests take a few seconds, I'm going to quickly peek behind the curtain and see how these feature flags were configured. Remember, the "use remote fib service" feature flag is the one we were using, and you'll see in here that it was off for everyone, except if you happened to have k6 as your user agent. This is something that can be done in a lot of feature flagging tools; this is just how flagd chooses to represent the configuration. But it's an interesting and pretty powerful pattern to be able to target user properties to make these decisions in your feature flagging system.

If we check back here, the tests are still running, so give it one more second. Hopefully that was the last configuration issue we had. If we're good to go there, the next step is that Keptn is going to open up a pull request; hopefully we see this in just a few seconds. If the tests pass, it opens a pull request that enables the feature for everyone. That's what you're seeing here: this is "enable feature", and it's just pending now. If I go back to our pull requests, we should see that we have a new one. There we go, perfect; that just took a second. Cool. So what we see here is that Keptn went ahead and opened a pull request. It says, basically: everything looks good, the tests all passed, we're ready to enable this feature for everyone. If we review the diff real quick, we'll see all it's doing is removing that targeting rule. We don't need it anymore, and we're enabling the feature for everyone. If this is what we want, we can go ahead and approve it, and once that's good to go, we can merge it.
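Conceptually, the diff in that pull request boils down to something like the following sketch of a flagd-style flag definition. The exact file contents weren't shown in full, so treat the key names as assumptions: the targeting rule is deleted and the default variant is flipped on, which enables the feature for all users.

```diff
 "use-remote-fib-service": {
   "state": "ENABLED",
   "variants": { "on": true, "off": false },
-  "defaultVariant": "off",
-  "targeting": {
-    "if": [{ "in": ["k6", { "var": "userAgent" }] }, "on", "off"]
-  }
+  "defaultVariant": "on"
 }
```

Because the change is just configuration in the Git repository, merging the pull request is the entire release step; no redeployment is needed.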
If we go back to Keptn, we'll see in just a second that an event was sent from GitHub saying we've merged the pull request, and Keptn will continue with the feature release for everyone. There we go: we see that everything's green and we're good to go. I think we're going back to the summary now.

All right, let's go through a quick recap. We went into our staging environment first and orchestrated everything with Keptn. We did a deployment using Helm and kicked off some tests using k6. Those tests, again, had the feature flag enabled only for themselves. Everything looked good, so we went ahead, enabled the feature for everyone, and released it into the wild. In production, keep in mind, this is the same code; there were no code changes, just a configuration change between the two environments. The deployment went fine as expected. When the tests kicked off, we did have a failure, and that failure happened to be a configuration issue. We were notified through Slack, looked at it in OpenTelemetry, and in Jaeger specifically saw that it was just a 401: we had a password issue. We corrected it, and then everything went through just fine. The core idea is that we made a pretty major architectural change, and configuration typos happen. So this can be a way to introduce significant functionality or architectural changes in a safe manner, even in production. And with that, let's let Johannes do a quick summary of Keptn, and then we'll wrap it up. All right.
Yeah, the main key takeaway from this presentation is: whenever you have problems or challenges with complex automation pipelines, take a look at Keptn. It's an orchestration layer that helps you separate the tooling from the process, so you have clear concerns and can focus on what you want to focus on. And when it comes to OpenFeature: with OpenFeature, we're enabling standardized, vendor-agnostic feature flagging. It's a way to take advantage of these really interesting and powerful concepts in a safe way that works across vendors and even your homemade solutions. If either of those projects is interesting, which I'm sure they are, please go ahead and scan these QR codes or visit our websites. Both projects also have Twitter accounts, so we appreciate any support you give there. Both projects have very active communities, so definitely feel free to get involved; we'd love to have you. And if you have any thoughts on the session, please scan that QR code and let us know, and feel free to chat with us after the session. We definitely appreciate your time, and safe travels going home. Thanks everyone.