Well, good afternoon. My name is Christopher Lane and my co-presenter today is Alex Crane. We are both members of the Enterprise Architecture team at Chick-fil-A, and we're going to talk about GitOps in the real world, improving the developer experience. This is based on our experience of managing Chick-fil-A's customer-facing digital properties.

So first, a little bit about who Chick-fil-A is. We are a privately owned quick-service restaurant company based in Atlanta, Georgia. Chick-fil-A began with a simple idea from our founder, Truett Cathy, to fill the needs of his customers: the Chick-fil-A chicken sandwich, very famously. We've grown from that idea to more than 2,400 restaurants, mostly in the US, but we've opened locations in Canada and Puerto Rico in the last few years. Our per-restaurant scale is significant: we have by far the highest per-restaurant sales in the QSR industry. More than 20% of revenue flows through our digital channels. Our mobile app, Chick-fil-A One, and Chick-fil-A.com drive the vast majority of these sales, and we've seen significant growth in these channels since March 2020, when the impacts of COVID started in North America.

So I want to talk a little bit about what our back end looks like for our digital properties. We call that back end DXE, for digital experience. This is the number of requests through DXE on a given day. There's nothing special about this day; I just picked a recent one. All the times are US Eastern, but the pattern probably isn't terribly surprising. There's the ramp-up during breakfast, a spike at the peak of the lunch rush, and finally another smaller peak during dinner. If we're running a national or regional promotion, these numbers can easily double. But on average we hit about 300,000 requests per minute at peak, we average about 120,000 requests per minute over the day, and we service about 330 million requests total.
So the DXE team behind these services is composed of a bunch of developers, both internal staff and external contractors. Together this team pushes thousands of commits and opens hundreds of PRs every single day. I really like this GitHub Insights graph because it looks like we have no issues, which of course we do; we just track them elsewhere. But our application platform is entirely based on GitOps, so all of our developers need to traverse this process to get their code deployed. We have no blackout times where we prevent deployments, so we need this process to flow smoothly at all times. Alex is going to explore a little more about what we discovered using GitOps at our scale, but I think it's helpful to set a little context.

So let's talk a little bit about what the DXE architecture looks like and the GitOps process behind it. This is the high-level DXE architecture, and it's relatively straightforward. Requests from CFA One, Chick-fil-A.com, and other digital properties are routed to our services running in EKS. We have clusters in both us-east-1 and us-west-2, though the latter is mostly for failover. The setup is the same in both regions, and we regularly route all of our traffic to us-west-2 to make sure there aren't any hidden dependencies in there. Our services are mostly Java Spring Boot applications, with some mixture of Go and Python apps. Apps are backed by DynamoDB global tables. The data in Dynamo is streamed to a set of Aurora databases for real-time analytics. The app data from both Dynamo and the analytics databases is streamed to our S3-backed data lake, and the digitally born orders, the orders that start in Chick-fil-A.com and CFA One, are combined with the traditional orders so that our analytics teams have a complete picture of what's going on for any given day.

So this is the architecture of our application platform; this is what is running all of the DXE services. We think of CAP as our CFA-flavored distribution of Kubernetes.
It's composed of a series of layers that collectively form the complete platform. This certainly isn't the entire list of components of CAP (the slide would basically be unreadable if we added them all), but this is how we think of CAP in this layered approach. The layers are meant to be composable, so platform teams are free to add or remove components to taste. If they don't want to worry about it, CAP has opinionated defaults and you'll get those opinionated defaults, but it does allow flexibility for teams to switch things out, say the GitOps operator for a custom or different one, things of that nature.

The base layer is EKS. On top of that is the sys layer, which includes things like IAM Roles for Service Accounts, ExternalDNS for managing Route 53 entries, the AWS Load Balancer Controller to manage the ALBs that form our front door, the Cluster Autoscaler, and a whole suite of metrics providers for both the cluster and services. On top of that is our core layer. This includes things like Argo CD, which is really the hub of the entire GitOps process; the Prometheus and Grafana stack, which we run via the Prometheus Operator; Thanos, which we use for managing long-term storage of those metrics; and finally a tool we're very excited about, SpeedScale, which we use for capturing traffic in one environment and replaying it in another. The SpeedScale folks are here today, if you'd like to talk to them a little bit, but we're excited about that. And then the top layer is our app layer.

We draw a pretty distinct line between what CAP manages and what our dev teams manage: CAP manages the EKS, sys, and core layers, and the dev teams manage the app layer. So what do the dev teams need to do to actually deploy? This is our deployment process at a high level. There's a lot going on here, so let's walk through it and see what it's composed of.
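To make the "Argo CD as the hub" idea concrete, here is a minimal sketch of an Argo CD Application manifest that continuously syncs a path in a Git repo to a cluster. The repo URL, paths, and names are illustrative assumptions, not CFA's actual configuration:

```yaml
# Hypothetical Argo CD Application: watch a path in a Git repo and keep the
# cluster in sync with it. All names/URLs here are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/manifests.git  # assumed repo
    targetRevision: main
    path: apps/example-api
  destination:
    server: https://kubernetes.default.svc
    namespace: example-api
  syncPolicy:
    automated:
      prune: true     # remove resources deleted from Git
      selfHeal: true  # revert manual drift back to the Git state
```

With `automated` sync enabled, any commit to the watched path is applied to the cluster without a manual sync step, which is the behavior a GitOps hub like this relies on.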
So the process starts, as always, with a developer pushing a commit to the mainline development branch. This triggers the dev workflow in GitHub Actions; we moved over to GitHub Actions last year from Jenkins. The app container is built, all tests are run (we'll assume a happy path here and that they all pass), and the image is pushed to Artifactory. Next, the workflow pulls the base manifests for the app from our Kustomize repos. Our Kustomize repos are a set of repos that we've developed that contain our standard manifests for various application types: Java API, Go API, Python API, React app, all of those are stored within this set of Kustomize repos. The base manifests are merged with any app-specific overlays found in the app repo at this point. Developers are free to patch anything in the standard manifests, but we offer sane defaults that work for most use cases. This provides an easy on-ramp for developers to get started quickly and increase their Kubernetes knowledge incrementally, changing things as they need to without having to wade through a gigantic pile of YAML. The complete merged manifests are then committed to a special repo we call the Atlas. This is the repo that Argo CD is synced to; it's watching for changes on a branch of the Atlas, and anytime there's a change, Argo pulls down those changes and applies them.

So, that's a lot of stuff, and I'm going to hand it over now to Alex to discuss what we've learned using this approach at scale.

Thanks, Christopher. I will probably skip through the slides a little bit, since I think we're still going to run up against a similar time stop that we had before the interlude. But here's one quick slide. This is the simple version: you've got users contributing stuff to the Git repo, you have what I put up here as a black box (hopefully you do actually know what is occurring in there), and then applying it to your target environment.
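The "base manifests merged with app-specific overlays" step above is done with Kustomize in the real pipeline; as a simplified illustration of the idea only, here is a sketch of how an overlay patch can override selected fields of a standard base manifest while leaving the defaults intact (all names and values are made up for the example):

```python
# Simplified illustration of the "base manifest + app overlay" merge idea.
# The real pipeline uses Kustomize; this sketch only shows the principle:
# an app overlay patches just the fields it cares about.
def merge(base: dict, overlay: dict) -> dict:
    """Recursively apply overlay values on top of the base manifest."""
    result = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

# A standard base Deployment fragment with sane defaults...
base = {
    "kind": "Deployment",
    "spec": {
        "replicas": 2,
        "template": {"spec": {"containers": [{"name": "app", "image": "placeholder"}]}},
    },
}
# ...and an app-specific overlay that only patches what it needs.
overlay = {"spec": {"replicas": 5}}

merged = merge(base, overlay)
print(merged["spec"]["replicas"])  # overlay wins: 5
print(merged["kind"])              # untouched defaults remain: Deployment
```

The developer's overlay stays tiny, which is the "easy on-ramp" point: you only write the YAML you actually want to change.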
So, Christopher kind of showed the real version, right? We sell this to directors, et cetera: this is what it's going to look like. And then we know it's actually going to look like what Christopher showed. So I'll jump right into some pain points and then go to a couple of examples that show these things.

So, it's GitOps, right? And actually, the earlier panel was great; it covered like 50% of what we wanted to cover in this slide deck, so that was fantastic. So it's YAML, and it's in Git. We know what's there, it's auditable, all of those great things: you can merge it, you can see what changed, see all of that. However, think about users going in to edit stuff. At least for us, we have a big gap in terms of the different users who need to deploy stuff or change a config. Some of them are people who do it constantly, every day, and they know all of it. And then you've got others who deploy less frequently; they don't remember what all the options and field values are for all of the different pieces of YAML. Today's deployment YAML is similar: it has a lot of different choices, but depending on what you've added to your cluster, those choices live in a lot of separate documentation.

So, do you do pull request validation and testing? That catches some of it, but it's not terribly proactive for the user: they do it, they submit it, then they find out. It's almost like the mainframe days, but better and faster, hopefully. Still, it's a little bit delayed. So what are some solutions? IDE plug-ins, right? You can do all that linting and front-load all of that, and that's good. But some of the users still aren't IDE people, right?
I mean, think about it: today, some people are command-line people, some people like web UIs, some love doing it in the IDE. And the other piece there, as was mentioned in the panel, is that AWS, Azure, Google, Argo CD, these groups put a lot of effort into good user experience: directed flows, what needs to be in a menu to help users figure out what should be in those files or in the system. We kind of lose that, right? We frequently throw that out the window when we do GitOps.

So here's an example with the Argo CD UI. Most of the issues we raised before are addressed in it, but you lose the audit history of what's here unless Argo separately and fully implements that. You also lose recreatability. Think about anyone who's had a team that, it turned out, created a database or similar through a web UI in a cloud provider, and later on you need to go recreate it and you have no idea what that state was.

With GitOps, we fix those two problems by doing this in Git: we know the audit trail, we can recreate it, we can re-pull it. But what are the field values that I can put here? Which ones are not listed in some basic template? Where are the things that would be autocompleted, like the clusters to be targeted, et cetera? Where are you looking those up? You're looking them up separately and filling them out manually, which is ripe for misspellings, typos, et cetera. Usually you go ahead, you get it as close as you think, you submit it, and then you wait for things to explode, hopefully in the pull request, but sometimes it gets all the way to the cluster before you find out that thing is exploding. And hopefully that's also not scary; hopefully your automated tooling for rollbacks, et cetera, makes that a clean and transparent process.
But as a dev, I hate waiting five or ten minutes sometimes to find out that the thing I deployed actually exploded and is now waiting for a production fix. So, getting to some proposed solutions. One I went ahead and crossed out entirely: abandon GitOps in favor of some new model. We just went to GitOps and I really don't want to change. No, I think there's a lot of great value found in GitOps, in the auditability, in what's there. And for those of us who are deep in the weeds of some of this stuff, it's really nice being able to be in the guts of it and not pulled out into some menu.

But think how great it would be if we had UI editing with Git as the data store. If I jump back a few slides real quick, we see in this top right-hand corner, it might be too small to see on the deck, but it says "edit as YAML", right? A number of cloud providers and other tools in various places let you see the YAML that backs your thing, or the JSON that backs your object. However, your best way to use that as a dev today would be: go fill out the form, then click "edit as YAML", and then copy and paste that over into the Git repo. How nice would it be if the UI loaded that from Git as the data store when you went to that menu, and when you hit save, pushed it back as a pull request so that it could be reviewed? You'd still hit all that great pipelining stuff that's done behind the scenes, but you'd enable those users who are more UI or visual based, without boxing out those who do need to go edit something more directly in the repo. And then one other interesting option there is IDE plug-ins, and also the command line; those both would and could function similarly to the UI piece.
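The "save becomes a pull request" idea can be sketched as a small translation step: the UI save handler doesn't apply anything directly, it packages the edited YAML into a branch, a commit, and a pull-request request body. Everything here is a hypothetical illustration (the repo, file path, and branch-naming scheme are invented); the `pull_request` dict is shaped like the body GitHub's REST API expects for opening a PR, and no network calls are made in this sketch:

```python
# Sketch: turn a UI "save" into the data needed to open a pull request,
# so the edit still flows through review and the existing pipeline.
# Repo, path, and branch convention are illustrative assumptions.
import json

def build_pr_request(repo: str, file_path: str, new_yaml: str, user: str) -> dict:
    """Package an edited manifest as a branch + commit + PR payload."""
    branch = f"ui-edit/{user}/{file_path.replace('/', '-')}"
    return {
        "branch": branch,
        "commit": {
            "message": f"UI edit of {file_path} by {user}",
            "path": file_path,
            "content": new_yaml,
        },
        # Shaped like the body of POST /repos/{owner}/{repo}/pulls
        "pull_request": {
            "title": f"Config change: {file_path}",
            "head": branch,
            "base": "main",
            "body": f"Automated PR from a UI edit by {user}.",
        },
    }

req = build_pr_request(
    "example-org/atlas",                 # hypothetical repo
    "apps/example-api/deploy.yaml",      # hypothetical file
    "spec:\n  replicas: 5\n",
    "adeveloper",
)
print(json.dumps(req["pull_request"], indent=2))
```

The key design point is that the UI never writes to the mainline branch: every save lands as a reviewable PR, so the audit trail and the validation pipeline are preserved.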
But the core part of that is that after you've made your changes and done your validation, it gets committed to Git and is able to follow all the rest of those deployment practices and validations.

Yeah. So we always like to finish with a quote from our founder, Truett Cathy, and we picked this one: "No goal is too high if we climb with care and confidence." This is a quote that's particularly meaningful to the DXE team; we actually named one of our conferences Climb with Care. But please check us out at our tech blog, techposts.medium.com, the Chick-fil-A tech blog, as well as some of our open source projects at github.com/chick-fil-a. And as always, we're hiring. So thank you, guys. We appreciate the time.