Hello everyone, welcome to the talk by Gunnar Grosch on after CI/CD, there is now continuous configuration. So we are delighted that Gunnar can join us today. So without any further delay, over to you Gunnar. Thank you very much and thank you for having me today. My name is Gunnar, I am a developer advocate at Amazon Web Services, and in the last decade, the movement towards CI/CD has been really transformational for getting value out to customers quickly. But in recent years, there have been new processes and tooling towards using configuration post deployment in the form of feature flags, operational config and other runtime configuration. So in this session, we look at how continually adjusting configuration to update and tune your code in production is a powerful, fast and safe way to deploy value to customers. And we're going to look at how Amazon uses continuous configuration tools at scale to move fast and ensure maximum availability of services. So there we go. All right. So in the early days of Amazon.com, making changes to the e-commerce code base was complex. It was slow and it was fraught with danger. Even a single change would require a complete redeploy of a single monolithic executable. And the executable in question was over one gigabyte in size and it took over a day to compile. So there was so much time between making a change in the code base and then seeing the impact in production that it was really difficult to correlate the two. So over the years as Amazon scaled and grew, this slow software deployment process was really a bottleneck that limited the pace at which we could deliver features and application improvements for customers. So to solve this, Amazon broke down that monolith, moving towards a service-oriented design. So this empowered the teams to ship features faster and more frequently. And it also reduced the blast radius of changes to all of the different individual services.
So with this shift of ownership came the need for individual teams to manage the deployment and operation of their services, resulting in the creation of many streamlined processes and procedures similar to the CI/CD pipelines we use today. So let's just take a quick look at CI/CD. So continuous integration, as I think many of you know, is a software development practice where developers regularly merge their code changes into a central repository, after which automated builds and tests are run. So continuous integration most often refers to the build or integration stage of the software release process. And it entails both an automation component, which could be the build service, and a cultural component, learning to integrate frequently. And the key goals of continuous integration are to find and address bugs quicker, improve software quality, and reduce the time it takes to validate and release new software updates. So why was continuous integration needed? Well, in the past, developers on a team might work in isolation for an extended period of time and only merge their changes to the master branch once the work was completed. And this made merging code changes difficult and really time consuming. It also resulted in bugs accumulating for a long time without correction. And these factors made it harder to deliver updates to customers quickly. So continuous integration works by having developers frequently commit to a shared repository, using Git, for instance. And prior to each commit, developers typically run local unit tests on their code as an extra verification layer before integrating. A continuous integration service then automatically builds and runs unit tests on the new code changes to immediately surface any errors. Then we have continuous deployment and continuous delivery. And those are software development practices where code changes are automatically prepared for release to production.
So as a pillar of modern application development, continuous delivery and continuous deployment expand upon continuous integration by deploying all of these code changes to a testing environment and/or a production environment after the build stage. So when properly implemented, developers will always have a deployment-ready build artifact that has passed through a standardized test process. So continuous delivery and deployment lets developers automate testing beyond just unit tests, so they can verify application updates across multiple dimensions before deploying to customers. And these tests might include UI testing, load testing, integration testing, API reliability testing and so on. So it helps developers more thoroughly validate updates and preemptively discover issues. With continuous delivery, every code change is built, tested and then pushed to a non-production testing or staging environment. And there can be multiple parallel test stages before production deployment. And the difference between continuous delivery and continuous deployment is the presence of a manual approval to update to production. With continuous deployment, deployment to production happens automatically without this explicit approval. So this is a very simplified look at the stages of the software release process. We start off with the source stage, and often that's the last time that the developers see their code. They then put their code into remote repositories for code reviews and so on. Then we have the build stage where we'll do unit tests and linting and create our artifacts, and then we'll deploy into multiple stages and then into production. And we're always deploying the same artifact into all of our environments. So in the testing stage, we're going to run integration tests, performance tests and so on. Everything we need to make sure that the application is ready to go to production. And then we go into production.
We want to deploy with the least amount of downtime possible. So looking at this from what we looked at previously, this is basically the continuous integration phase. We then have the continuous delivery cycle with that manual approval. And then we have continuous deployment. And this is where Amazon then moved into having processes that were similar to this. But then release is just the start of a software project. At this point, configuration was still managed directly in the application's code base, making all changes very slow and still requiring a service to be redeployed or restarted to adopt a new configuration. So it was quite clear at Amazon that while CI/CD practices and technologies solved many of the deployment challenges I explained early on, such as taking hours to do deployments, there was still a lack of ability to modify application behavior quickly, especially in response to a dynamic live environment. And that capability is necessary to achieve the availability requirements for a system like Amazon.com. So how about a new way of thinking about this? We start off with continuous integration, we then add continuous delivery or continuous deployment, but then something new: continuous configuration. And continuous configuration enables software teams to have dials and switches that allow them to change the runtime behavior of their application. So just as Amazon evolved, external developers have been using application configuration to change how their software behaves for quite a long time. But generally, this configuration is static in nature, and companies remain stuck using antiquated techniques that don't really allow them to best serve their customers. The configuration is loaded as an application initializes and then remains the same until the app restarts. Alternatively, some customers have hacked together ways to pull these configuration details from other places, be it Amazon S3 or a database entry somewhere.
And while this can work for some, these custom solution implementations often face constraints around latency, scalability, ease of use, and a lot more. So true dynamic configuration is different. The source of configuration truth lives in an independent configuration management system, and it's pulled by the consuming application. These are configuration values that an engineer anticipates will need to be changed easily in the future or might vary according to specific system conditions, even if they don't know exactly when that will be. And these values often fall into two groups: those that modify operational behavior of an application, such as throttling limits, connection limits, or logging verbosity, and those that control FAC, feature access control. And that includes feature flags, A/B testing, and user allow or deny lists. So continuous configuration is the process of updating these dynamic configuration values during runtime, all without deploying new code or restarting the app. And it's also the practice of rolling out these changes in a controlled way using a deployment strategy across your application fleet. So if we start off with CI/CD and we have those practices in place, we now want to move to CI/CD/CC, with CC for continuous configuration. So let's look at some different use cases for continuous configuration. Well, a very common one is what's called feature flags. Code is deployed to production, but it's hidden behind a feature flag configuration. Once a feature is ready, it can then be released to small cohorts first, for instance, internal users. App and new feature performance is measured to understand the impact of new features. And features are slowly rolled out to more users over time. And if something goes wrong, teams can then immediately roll back features that they have changed or released with the help of continuous configuration. And all of this without a restart or code changes.
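To make those two groups of dynamic values concrete, a configuration document could look something like this sketch. The keys and values here are illustrative assumptions, not a required schema:

```json
{
  "operational": {
    "throttleLimitPerSecond": 100,
    "maxDatabaseConnections": 50,
    "logLevel": "INFO"
  },
  "featureAccessControl": {
    "newCheckoutFlowEnabled": false,
    "betaUserAllowList": ["user-123", "user-456"],
    "abTestCohortSplit": 0.5
  }
}
```

The point of continuous configuration is that any of these values can be changed and rolled out at runtime, without a code deployment or restart.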
So a very common use case for this capability with feature flags is delivering an application change for an event occurring at a precise time, where a configuration change in a CC, or continuous configuration, system is much smaller, quicker and more reliable than trying to relaunch an entire application or rely on, say, a DNS switchover. As an example, a customer who produces a TV show could use a continuous configuration system to manage audience participation, so that voting user interfaces can be displayed inside of mobile applications at the very moment that the TV host announces that voting is now open on the latest reality show, and closed for all users as soon as the period has completed. Next, we have what's called operational flags. App teams and operational teams coordinate ahead of time to build throttling values into configuration. And as environmental factors change in your production environments, the operational team can then dial this throttling up and down. An example of this is if CPU alarms go off due to traffic spikes, the operational team can then throttle down non-essential background tasks to handle these traffic spikes in your production environment. And once again, all of this is done without any restarts or without any code changes needing to be deployed. Then we have what's called security flags. Here too, application teams and operational teams can coordinate ahead of time to build in flags to control logging verbosity. So during a suspected production incident, teams can then dynamically increase the verbosity of data collected about, for instance, a suspected bad actor. And teams are able to increase logging until the incident is over, then return to normal levels. And once again, with continuous configuration, everything is done without a restart or without any code changes.
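As a minimal sketch of that security flag use case, an application could map a dynamic configuration value onto its runtime logger level each time it pulls fresh config. The `logLevel` key is an assumption for illustration:

```python
import logging

def apply_log_level(config):
    # Translate a dynamic configuration value into the runtime logger
    # level, so operators can raise verbosity during an incident and
    # lower it again afterwards, all without a redeploy or restart.
    name = str(config.get("logLevel", "INFO")).upper()
    level = getattr(logging, name, logging.INFO)
    logging.getLogger().setLevel(level)
    return level
```

Re-applying this whenever the configuration is refreshed means flipping `logLevel` to `DEBUG` in the configuration system is all it takes to turn up logging fleet-wide.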
So there are some dangers with continuous configuration, because just like in code, we can have typos, and they can cause outages. And if a config is rolled out too quickly, it can have a large blast radius. So having the option to be able to roll back confidently is really critical for using continuous configuration in a safe way. Think about how we work with CI/CD today. If there are issues in our tests or in our deployments, we do a rollback. And the same applies for CC, continuous configuration. And thinking about these use cases that I just mentioned, feature flags, operational flags and security flags, you might think that you could simply use a database switch or some in-memory flag to solve these problems. But our experience at AWS and at Amazon is that writing configuration systems is complicated, particularly at scale. One of the things we learned by analyzing our internal correction of error reports, COEs, over the last years is that configuration changes cause outages at about the same rate as code changes. And when it comes to resolving outages, manual configuration changes are being made when stress levels are high, further increasing the risk of errors in changes or attempted rollbacks. So in the moment where changes are the most critical, validation systems and guardrails help give you that surgical precision when your hand is probably the shakiest. So at AWS, we use something called AppConfig when we launch services. When a new service is in development, it will often be in production before it's actually released. And internal users, some partners and customers are granted access to provide feedback or produce launch materials. So we add these users via an allow list managed by a continuous configuration system. When the service launches, such as following an announcement at, for instance, re:Invent, coming up in a couple of weeks, we change the configuration so the service is available to all accounts and not just those on the allow list.
And then as new features enter preview, it's significantly easier to manage a single production deployment with an allow list for beta users compared to managing a separate stable beta build of the application concurrently. So AWS AppConfig was launched publicly in 2019 for all AWS customers to use, but we've used it internally since 2016. And today, in all of Amazon, it is used by over 4,000 different service teams. It makes configuration changes safe and quick to make, and it also allows us to make updates at scale. Think about that service release example I just talked about, where we're able to make these changes to roll out an entirely new AWS service to all of our millions of customers around the world. And this is really what allows us to use that practice, CI/CD/CC. One of the features of AppConfig is what's called validators, which allow us to prevent errors. They allow any syntactic or semantic check to be made before the configuration is deployed, and that lets us prevent those errors. As I mentioned, just as with code, it's easy to make typos, but using validators helps us to avoid those. Even the smallest configuration change can of course cause a massive outage, so it's important to validate the configuration. And many of the configuration changes are made when stress levels are high, just when we need to launch something at the top of the keynote, for instance. And many engineers or product managers might need some type of baby proofing when they want to make those changes. And then another feature is the ability to limit blast radius with the help of monitoring and rollback. With AppConfig and continuous configuration, we can monitor and then do automatic rollback, and we can have a customer-controlled bake time where we're monitoring the deployment to make sure that it actually works as intended.
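An AppConfig validator can be a JSON Schema document or a Lambda function. As a sketch, a schema validating the kind of feature flag configuration used later in the demo might look like this; the property names are assumptions for illustration:

```json
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "enabled": { "type": "boolean" },
    "messageOption": { "type": "string", "minLength": 1 }
  },
  "required": ["enabled"],
  "additionalProperties": false
}
```

With a validator like this attached to the configuration profile, a typo such as `"enabled": "flase"` or an unexpected extra key is rejected before the deployment ever starts.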
And then if something happens, we can do an automatic rollback when an alarm is triggered, and that limits our blast radius. So configuration should have monitoring tools, of course, in the same way as our code. And if a configuration change needs to be rolled back, it should be rolled back immediately. And any application which has configuration is really a use case for AWS AppConfig. This could be allow lists, and it could be block lists. It can be any type of private APIs, for instance, where you want to make changes to the URL for an API and roll out that change to your application. Or it might be URIs for resources, something that can change quite regularly in our experience and in the way that we build systems. And using continuous configuration with, for instance, AppConfig is great for A/B testing, to be able to roll out changes to smaller groups and allow them to test something. And it also helps us when we're promoting configuration, moving from that development stage into our quality assurance stage, and then finally into production. And a quite common use case is the verbosity of logging. As I mentioned before, for instance, when we want to increase logging when there is a security incident, or if there are any other types of incidents, we can quickly make those changes in our applications using continuous configuration. And the final one that I've mentioned several times now is feature flags. That's probably one of the most common use cases for continuous configuration. So I want to show you a quick example of how continuous configuration works for us by having a look at AppConfig. So switching over to a live demo now, I want you all to cross your fingers and let's hope that the demo gods are with us. So this is the AWS console, something you might be familiar with. I'm going to look at my monitor next to the camera now, so sorry if I'm looking away.
And I've set up a simple application demo, which is basically a serverless API that just returns a value, to make it as simple as possible. But there are a million different use cases for this, so think about how you could implement it in whatever system you're building. I'm going to switch to VS Code. And this API is built as a serverless API. So I'm using an API Gateway for us to be able to call that API, and it is backed by one single AWS Lambda function. And it's a basic hello world example. So this is the app. All right, a simple handler that just returns hello from Lambda. Let's check to make sure that it actually works by calling that URL. And yeah, it's returning hello from Lambda. Cool. But I want to use continuous configuration with the help of AppConfig to change the behavior of my application. And I'm doing that in a very simple way. First off, let's have a look at the template for my application. I'm building this using infrastructure as code with the help of AWS SAM, our serverless framework for building serverless applications. And in that, I've defined my serverless function, my AWS Lambda function, and pointed out the handler, that is, where the code is. Then I'm adding AWS AppConfig to this. And I'm doing that with the help of something called Lambda extensions. So it is basically adding what's called a layer to my application, which then contains all of the logic for fetching the configuration, which makes it easy for me. So after adding that layer, I've just added a policy to be able to fetch the configuration. That means that my Lambda function is allowed to fetch that configuration. And then in the actual Lambda function, the application code, I'm making just a simple call to that Lambda extension, which is hosted on localhost on that Lambda function, fetching the configuration using a simple HTTP call to the URL for my configuration.
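A heavily hedged sketch of what such a SAM template could look like: the extension layer ARN is region- and version-specific (the value below is a placeholder, see the AppConfig documentation for the real per-region ARNs), the runtime is an assumption since the demo doesn't state one, and the resource and path names are made up for illustration:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  HelloFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      Layers:
        # AWS AppConfig Lambda extension layer; the real ARN differs
        # per region and extension version -- this one is a placeholder.
        - arn:aws:lambda:REGION:ACCOUNT_ID:layer:AWS-AppConfig-Extension:VERSION
      Policies:
        # Allow the function (via the extension) to pull its configuration
        - Statement:
            - Effect: Allow
              Action:
                - appconfig:StartConfigurationSession
                - appconfig:GetLatestConfiguration
              Resource: '*'
      Events:
        Api:
          Type: Api
          Properties:
            Path: /hello
            Method: get
```

The layer does the fetching and local caching; the function code only ever talks to the extension over localhost.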
When that configuration is received, I can then use it in my handler. So let's have a quick look at what the configuration looks like. In AppConfig, I've created what's called a configuration profile, which is my config. And this is a super simple example, JSON containing just two items. First off, whether it's enabled or not. Let's start with version one. In this case, it is set to enabled false. And then I have another item to set a message option. And then I'm using this in my Lambda function. So first off, I'm checking, is my option enabled or not? If it's not enabled, it should return the message hello from Lambda, which was the one we saw when we first looked at that API. If it is enabled, we should instead return hello from and then the message option from that configuration. So as I've said several times, a super simple example of how we can use this. But this is basically a feature flag: true or false, whether this feature should be enabled, and then we have some additional parameters, in this case the message option. So checking my API again, it's working. Let's then make a change to this. So I'm creating a new version of my configuration, setting it to true. And let's set the message option to India, creating that new version. Version three now exists. And I can then deploy this into my production environment in this case. So starting a new deployment, choosing version three. And then I can choose the deployment strategy. And the deployment strategy is basically how this will be rolled out. And there are different preset deployment strategies available here. You can create your own, whatever is fitting for you. And it's basically how you will roll out this configuration change. It might be that you want to do it in a linear way, in an exponential way. You might want to roll out to a small subset first and then to the rest, and so on. And you can set this up and use it any way you wish.
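A minimal sketch of what the demo's handler logic could look like in Python, assuming the AppConfig Lambda extension is attached and serving on its default local port (2772); the application, environment and profile names in the URL are placeholders, and `messageOption` mirrors the config key described above:

```python
import json
import urllib.request

# The AppConfig Lambda extension serves the configuration on localhost;
# the application/environment/profile names here are placeholders.
CONFIG_URL = ("http://localhost:2772/applications/demo-app"
              "/environments/production/configurations/my-config")

def build_message(config):
    # The feature flag: when disabled, fall back to the default greeting.
    if config.get("enabled"):
        return "Hello from " + config.get("messageOption", "Lambda")
    return "Hello from Lambda"

def handler(event, context):
    # Each invocation reads the (locally cached) configuration, so a new
    # configuration deployment changes behavior without redeploying code.
    with urllib.request.urlopen(CONFIG_URL) as response:
        config = json.loads(response.read())
    return {"statusCode": 200, "body": build_message(config)}
```

Deploying a config version with `"enabled": true` and a message option flips the response from the default greeting to the configured one, with no code change and no restart.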
And besides that deployment and the time it takes to deploy, you are also able to set what's called a bake time. I mentioned that earlier as well. And the bake time is the time after deployment during which the deployment will be monitored for any errors. And if errors happen, we are able to roll back. So first off, I'm just going to run this in demo mode, meaning that the deployment time is zero minutes and the bake time is zero minutes, so it should switch over straight away. Starting the deployment. And it's already complete, since it was zero minutes. So now if I switch back to my API, it has changed to hello from Agile India. And so far, using that, well, you might once again think, well, I can do that using environment variables or I can do it using a database change or something like that. But when we start using features like rolling out deployments using the different deployment strategies, that's when a configuration system like AppConfig really makes sense. So let's do it again. And this time, I want to show you that for my environment, I have an alarm set up. And the alarm in this case is a simple CloudWatch alarm, which exists here. And it basically checks for errors. And this could be any alarm. Think about what's important for you. It could be a system metric alarm or it could perhaps be a more business-oriented metric alarm. For instance, for an e-commerce site, the number of purchases per second or minute is perhaps a very important metric for you. So when rolling out a feature, maybe that's what you should check. Are people still purchasing with the same frequency as before that configuration change? But in this case, it's just checking my Lambda function for errors. And if it receives an error, it will then go into the alarm state. So back to AppConfig. And let's then start a new deployment. Let's roll back to my first version, version one.
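A custom deployment strategy like the ones described here can also be created from the AWS CLI rather than the console. A sketch with illustrative values roughly matching the demo's strategy, linear growth of 50% per interval over a one-minute deployment, followed by one minute of bake time:

```shell
# Create an AppConfig deployment strategy: roll out to 50% of targets,
# then the rest, over 1 minute, then monitor for 1 minute of bake time.
aws appconfig create-deployment-strategy \
    --name "demo-linear-50" \
    --deployment-duration-in-minutes 1 \
    --growth-type LINEAR \
    --growth-factor 50 \
    --final-bake-time-in-minutes 1 \
    --replicate-to NONE
```

In production you would typically stretch the duration and bake time out considerably, so problems surface while only a fraction of the fleet has the new configuration.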
And now I'm instead going to use a different deployment strategy, still a quick one for this demo. This one has a one-minute deployment time and then a one-minute bake time. And it uses a linear growth type. So it will start with 50% of users receiving this new version, and then the other 50%. So basically after 30 seconds, 50% will have it, and after another 30 seconds, everyone will have this new version. So let's start that deployment. And remember now that we have that monitoring in place. All right. So far, we're still at the old version. This is the part of the demo where I just wait for the deployment to happen. And hopefully there aren't any errors, so this won't roll back. 30 seconds is a long time while doing a demo. So 50% is complete, and we should soon be at 100% complete. There is a bit of a caching mechanism in that AppConfig extension that I'm using. There we go. So now it switched back to version one. Hello from Lambda. So now a final example of this. Let's deploy this new version once again as soon as our bake time is over. And now I'm going to trigger that alarm to make sure that it actually works the way that I told you. So we wait for the bake time to finish. And since I can't really create an error in my application in an easy way, what I'm going to do is use the AWS CLI. Let me find that window and switch to it. And a simple CLI command allows me to set this. I have too many windows, so I need to just copy that command. There we go. All right. Hopefully it's done now. One minute, still not over. There we go. Now it's complete. So starting the final phase, switching back to version three, meaning that it will enable that feature flag and it will change to Agile India again. But this time I'm going to trigger the alarm. So let's start the deployment and let's imagine that this is my very important business website where we're monitoring to make sure that things work as intended. Starting the deployment.
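The CLI command used here to simulate a failure is presumably CloudWatch's set-alarm-state; a sketch with a placeholder alarm name:

```shell
# Force the CloudWatch alarm into the ALARM state to simulate an
# application error while the configuration deployment is baking.
aws cloudwatch set-alarm-state \
    --alarm-name "demo-lambda-errors" \
    --state-value ALARM \
    --state-reason "Simulating an application error for the demo"
```

Note that the alarm will fall back to the state its metric dictates on the next evaluation, which is exactly what makes this a convenient way to test the rollback path.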
It's now going to start deploying. And I'm going to make use of the AWS CLI to set the alarm state for my alarm to the ALARM value. Checking CloudWatch. It is now in the alarm state. Something went wrong with my application. Did I not use the right alarm? There we go. All right. We can now see that the deployment was stopped due to a CloudWatch alarm. Something went wrong with my application, so it's going to roll back that configuration change. And it does that pretty much straight away, as soon as the alarm is picked up by AppConfig in this case. So it's rolling back to hello from Lambda, which was our version one. So that was a very quick example of how we can use it. This was using a super simple serverless API using API Gateway and a Lambda function. But you can use this with EC2, you can use it with basically any compute platform, and you can roll out these configuration changes in an easy way. And let me switch back here. All right. So looking forward a bit then, in much the way that CI/CD reduced the fear of deployment, continuous configuration reduces the fear of changing configuration. So by exercising this capability regularly, we build confidence and can start to enable automation. And systems like AppConfig build in this automation. So as you saw, I created a CloudWatch alarm that performs a health check on the application, and you can have AppConfig monitor that. And when the alarm went into the alarm state, well, it rolled back that deployment. And for me, I think this is really where continuous configuration starts to change the way that we're able to operate. It makes it really easy and safe to change configuration. And these are big words, but we're actually able to democratize configuration, because you don't have to be a software engineer to make changes to applications.
It's really a benefit of these continuous configuration systems and that practice that pretty much anyone, with the protective guardrails in place, can self-serve and make changes to application behavior. It can be a junior engineer scaling up the number of database connections. It can be a product manager that wants to add customers to a beta service, for instance, or maybe even a UX designer that wants to modify an A/B test's cohort split. So this democratizing of ownership helps to remove operational engineers as a bottleneck for changes in production environments. And we have this flow: we move from configuring to validating, we then deploy, and we monitor that deployment. So some additional resources: if you want to check out AWS AppConfig and some documentation for it, it's available on screen. And there's also a great blog post on continuous configuration written by our good friend and CTO Werner Vogels that's also available. And with that, I want to thank you all for watching. We've looked at how to continually adjust configuration to update and tune your code in production and how that is a powerful, fast, and safe way to deploy value to customers. If you have any questions besides what might be in the Q&A, do reach out to me on Twitter at Gunnar Grosch as shown on screen, or I'm happy to connect on LinkedIn as well. So thank you very much for watching. Thank you, Gunnar. And there are a few questions in the Q&A box, so we can pick them up one by one. Okay, so the first one is: if I have multiple tests in a prod environment, how do I review all the config values in one place? Also, how do I ensure, as my code artifacts move from one environment to another, that the same config moves as well? Well, first off, it depends. It's always a boring answer, but it depends on how your environment is set up across accounts and so on.
But you can, and I didn't show that in AppConfig right now, but you're able to set up your different environments there as well, so you can use the same configuration and point it towards different environments. So you still have that central control of your configuration if you wish. What was the second part of the question? Yeah, it was also how do I ensure, as my code artifacts move from one environment to another, that the same config moves from one to another? Yeah, and you can do that in pretty much the same way, but you have it stored in a central location. The type of configuration I did right now with AppConfig uses what's called a hosted configuration, but you can use, for instance, an S3 bucket or you can use Parameter Store as a source for your configuration and just point to that using AppConfig. So in that way, you are able to control that you use the same artifacts for all of your different environments. Thank you, Gunnar. The next one is: any reason to have a timing strategy where the deployed AppConfig flag will be available after one to 20 minutes? Can you repeat that one? Any reason to have a timing strategy where the deployed AppConfig flag will be available after one to 20 minutes? Okay, we can take this question later. Why is there such a long time for the deployment? Well, I would say it is part of that safety mechanism. Rolling it out over time means that we are more quickly able to pick up if there are errors. So the examples I did were, well, immediate or after one minute, but in a production environment, I would say that it's very common to make these changes over time, so over up to 20 minutes like you wrote in the question. I think that that's very common, and especially if you're doing a gradual rollout, so some users are starting to get the new configuration, because then you're able to pick up on any errors quickly without it affecting everyone.
If you do a quick rollout, then everyone will be affected by that configuration change quickly and perhaps face errors. Thank you. The next one is: in your experience, is the idea of implementing feature flags coming from developers or product managers? That's a good question. I think it depends on the organization. My experience, my experience from AWS at least, is that it's something that both want. Developers at AWS want to be able to create their features and have them in production without them being accessible by everyone, so that we're able to test them in production in a safe way. And product managers want feature flags because they want to be able to roll out the feature now. They want to be able to control exactly when it should be there, not have to wait for deployment times, for instance. The next one is: if I am working with an application where scaling is not critical, then CC may not be needed. Is that correct? I always try to think, when building whatever I'm building, that you should always plan for scale and start using good practices, no matter if it's CI/CD or if it's CC. I think that's something you should use if you're able to, because hopefully whatever you're building might need to scale in the future, or it might be that you want to make use of certain features, for instance, like feature flags or making changes to things like logging verbosity. So really, no matter the scale, I think that these types of configuration management systems are really helpful. There's also the democratization part of it, that it makes it easier for other people to have access to make configuration changes without having access to the code. Okay, the next one is: what are your views on Dhall configuration and the like? I've actually not read up on that. I've heard it mentioned, but I don't really have enough knowledge to have a view or an opinion on it, but I'm going to put that on my reading list and make sure to do so for sure.
Okay, thank you, Gunnar, and thanks for sharing your experience with us today.