All right, we are ready to get started. We have an incredible panel with amazing experience, a whole lot of people from really interesting companies with really interesting use cases. Some of them very famous and well-known, some of them less so, and I'm hoping we can draw those stories out. We are going to do some Q&A during this; we do want it to be somewhat of a live session. Now, there is one trick with that: we're maxed out on microphones because we have six people on stage, so this one has to double as the audience microphone. So what we'll do, as people have questions, is maybe Hannah can hang out up here-ish, and if you have a question you can come up and stand by Hannah and we'll pass the microphone over to you. Either I'll run it over, or Scott can run. Can you be the relay runner? Perfect. OK, great. So I'd like to start with some introductions, and since some of you are very well-known, maybe we can keep it to 30-ish seconds. Cornelia, you want to start? Sure, I will start. I'm Cornelia Davis. I have spent the last decade or so working on developer platforms. I was at Pivotal working on Cloud Foundry, and then I was at Weaveworks, which was really just an extension of that agenda, which is to continue the DevOps agenda into the space of GitOps. I've just recently moved, and I am at Amazon doing Alexa artificial intelligence stuff these days. Perfect. Mae? Mae Large. I'm from State Farm, an architect manager, primarily responsible for the GitOps topic. Excellent. David? I'm David Lewis. I am a lead site reliability engineer at Starbucks. I primarily work on the product team, and we developed a tool on Kubernetes to enable application engineers and applications across the technology organization to move and deploy a lot faster, and we use GitOps very heavily for that. Excellent. Perfect. Christopher? Christopher Lane with Chick-fil-A.
I'm an enterprise architect working primarily with our customer technology group. That's all customer-facing applications at Chick-fil-A: the mobile app, the dot-com site, the point-of-sale heads. That's all backed by our unified digital commerce platform, which is based entirely on GitOps and K8s. Excellent. And Mike? Hi, Dan. My name is Mike Bowen. I work at BlackRock as a senior principal software engineer, and I also run our open source program office for the firm. I focus primarily on the Kubernetes platform and platform services, and I work within our SRE core platform team. Excellent. So I did a quick survey of our panel here before we started, and we have roughly equal representation of Argo CD and Flux users, and one of them is using both. But you'll have to find out which ones; see if you can guess. I think you'll be pleasantly surprised. So one of the first things I wanted to talk about is the announcement of the GitOps 1.0 standard and this idea of having a standard. Now, none of you had the benefit of a standard, but there were a lot of guideposts put out there by people like Alexis Richardson and others who were writing about it. So does anybody have a strong feeling about something they ran into when they were adopting GitOps that really surprised them, that was a big shocker? I can throw it at somebody specifically, but it's better if you just jump in. I have one. Yeah, go. So one of them is the perception switch from more of a CI-ops-driven world with manual processes. In a highly regulated firm, moving to things that are more autonomous and a little higher velocity, where you're trusting more declarative code and describing policies and regulation in a way that facilitates the GitOps journey, I wouldn't say it was a challenge, but it was an opportunity, one that took a little bit longer to realize. Yeah. Did you have a challenge with people who said, well, we're already doing infrastructure as code,
so why is this any different? Did that come up at all? It did, in pockets. It's a big firm, so it came up in pockets, but seeing is believing. The more we sat down and worked with people, had conversations, showed them the real value, showed them where we were getting the gains, where we still had some rough edges, where maybe some higher-order tools or integration with our existing tools was needed, it was really about having that conversation. Making sure we were collaborating and showing that it wasn't one ring to rule them all; it was a new tool in our toolchain that we had to help everybody understand and see the value of. And once they did, the adoption went very fast. Oh, excellent. And Mae, I see you nodding along. Were there some surprises at State Farm as well? It's actually very exciting, because prior to the principles, it's almost like you were handed a clean sandbox and it was up to you to be creative with it. But we're also very intentional about sharing out whatever we're learning, because yes, we have a large community within State Farm, but we all understand there's a larger GitOps community that we're all a part of. Yeah, I see. There is something I would add, because I think there are two principles, and one of them is not expressed quite this directly in the principles, but the two of them come together in a very interesting way: the notion that the automation is convergent. And that is back to the point that you just made. From a principles perspective, we want to express the principles, but if you take them just one step further and you say declarative configuration, what does that imply? It implies that all of the automation tied to a Git action is convergent. And that's very, very different.
And that is in fact what allows the other thing, which I think still surprises a lot of people, which is that when you're doing GitOps, Git is the interface for operations. What I still see a lot of people doing, and you mentioned it as well, is infrastructure as code: I have infrastructure as code, so of course I'm putting it in a source control system. But if, after you've checked it in, you go and execute a script somewhere else, or you go to some user interface which now draws things from that Git repository, then you're not doing GitOps all the way. Git is the interface for operations. You have to shift that mindset from, oh, I've still got my consoles, be it an AWS console or a vSphere console or whatever. No, no, no. Your console now is Git, whether you're using the Git command line or GitHub or GitLab; it doesn't matter. So it's the combination of those two things that I think still causes a lot of surprise. Well, I think that makes adoption a challenge too, because there's a long-standing default, especially within operations, to do things manually, whether that's through a command line or going into a portal or a control plane of some kind and executing something.
And a lot of products and technologies built today even use the word GitOps, but when you get down to it, you start going into the control plane and executing something, and a lot of times those actions apply changes directly to the service or infrastructure. And you go, that's not it; you're bypassing all the rules that GitOps helps you put in place. Those rules are there for policy enforcement, compliance, security, and review, and they're supposed to be integrated with tools like CI to ensure that what you're putting into production or into that infrastructure is not only declared by the code that's in Git, but that it can pass through all of those gates. That way you avoid what I now refer to as the Facebook effect, where you accidentally take everything down, which is not to say that they did it that way, but I think it's a really good reminder of why GitOps is so important and why some of the tools we build into GitOps are so important. You can't just use Git to drive configuration into a system; you need some of these other tools to be part of it. Yeah. Famously, AWS and Cloudflare both had very well-known outages that happened because of people making manual configuration changes. One of the other interesting things that's been happening with GitOps, which I think was unexpected for a lot of people, is the rise of edge GitOps, this idea of having sites and satellites and using GitOps in those locations. Chick-fil-A very famously came out two years ago with a talk about how they had a Kubernetes cluster in every store, and it kind of blew everybody's minds. I think Starbucks has followed a similar pattern, and I don't know if State Farm is doing something similar. But for the edge use case, was there something unique about GitOps that really brought you there?
Yeah, so managing 2,600-plus clusters kind of drove us towards that. I don't think there's an effective way to do it other than a GitOps process, especially when you have a fleet of clusters you're managing. It's not just one cluster and its components, it's every single one, and rolling changes out has to happen in a controlled fashion. So you use Git as the centralized tool to do that, with well-known processes that all developers can immediately pick up, because it's scary to run things at the edge in the restaurant. You can't replace them easily and quickly like you can in the cloud. You have to get it right, and that involves quite a few gates you have to go through, and check-offs, to make it all work effectively. So I think GitOps is the enabling process for managing a fleet of that size. Every single one of them is a little data center. I had assumed you were driven by the fact that no one was willing to deploy on Sundays, so you needed an automated way to handle that. So there's this thing we deal with at Starbucks, and I know Chick-fil-A deals with the same thing. When you have all these stores, any kind of configuration change a lot of times requires what we refer to as a truck roll. I don't know if they call them that at Chick-fil-A as well, but truck rolls are expensive, because they require a technician to go out there and install a piece of hardware or software to make a change. And we have 10,000 stores just in the United States alone. Any time you want to deploy a new application that requires a new piece of hardware, that is really expensive, and it takes a long time to roll out. So the idea is to bring it all together; Kubernetes really makes sense in those situations, because if you can run all your applications on just a small subset of hardware, and deploying to that location doesn't require a truck roll, that's the ideal scenario.
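The controlled, gated fleet rollout the panelists describe can be pictured as a simple wave scheduler. This is a minimal sketch under stated assumptions, not anyone's actual tooling: `deploy` and `healthy` are hypothetical callables standing in for whatever the platform really uses, and the wave size, jitter, and halt-on-failure policy are illustrative choices.

```python
import random
import time


def rollout_in_waves(stores, deploy, healthy, wave_size=50, jitter_s=30):
    """Push a change to a fleet of edge sites in controlled waves.

    `stores` is a list of site identifiers; `deploy` and `healthy` are
    callables supplied by the platform (both hypothetical here). Each
    deploy is jittered so sites don't all pull artifacts at once, and
    the rollout halts if any wave fails its health check.
    """
    for i in range(0, len(stores), wave_size):
        wave = stores[i:i + wave_size]
        for store in wave:
            time.sleep(random.uniform(0, jitter_s))  # spread artifact pulls
            deploy(store)
        if not all(healthy(s) for s in wave):
            raise RuntimeError(f"wave starting at index {i} failed; halting rollout")
```

The key design point, which comes up again in the bandwidth discussion below, is that sites don't all go at once: waves bound the blast radius, and jitter keeps a wave from stampeding the artifact repository.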
Have you had to deal with essentially DDoSing your own artifact repositories when you do a deployment? Because you have so many sites trying to pull those artifacts at once that it creates a bottleneck? We've had that problem. The restaurants have limited bandwidth, so even pulling down images at the restaurant can be challenging, and we've had to really widen our bandwidth constraints to let us do things faster. And even pulling logs and metrics, how you stream that, you have to think very, very carefully about how you pull all of that from the restaurants, because you can swamp the network and take down sales. But I want to nitpick something here, because you said all at once. And yes, if they're all going at once, you can have this problem. But the whole point, I think, is that you achieve a certain amount of autonomy, where not all of the stores need to go at once. That's the beauty and the magic of distributed systems: when you've done them right, you don't have these bottlenecks. So I'm sitting here thinking, how do we build that into the tooling that supports GitOps, so that we can take advantage of this inherent autonomy that comes with distributed systems done the right way? How can we do that so it's not just, oh, we've got to up the bandwidth, but maybe there are ways we can navigate through that hierarchy of distributed systems to balance the load. Interesting. Yeah, good point. We are going to have some time for questions, so if you have some, come line up over by Hannah, and I know we have some coming over Slack. Before we go to those, Mae, I wanted to hear about some of the unique use cases you have at State Farm.
What are some of the unique challenges you've had bringing in GitOps, and what are the unique drivers from your business that have demanded it? Yeah, for sure. Opposite to the many clusters you all maintain, we introduced GitOps into a well-established, large multi-tenant cluster, super large, with production workloads. So there were obviously relationships that had to be figured out, and part of the excitement, like I talked about earlier, is that there are teams and roles forming as a result of an organizational change like adopting GitOps. Beyond that, what we landed on, and the video is still out there from GitOps Days in June, was introducing Flux multi-tenancy. And that was such an interesting endeavor, really gaining the momentum and embracing GitOps in, like I said, a well-established, large cluster with lots of already-running production workloads. Oh, excellent, thank you so much. So I know we have a few questions from the audience in the Slack. Are we on the same channel, or do you have to come grab the mic? Hello, no, I'm good. Okay, go ahead. What's one of the questions from the Slack? So we have two questions, and they're about managing configuration. Carlos asked: I've sold the organization on GitOps being the best practice to manage configuration on Kubernetes, but the problem now is that a lot of people don't know how to use Git. They're not used to it, because their primary job role isn't programming with Git. I guess he's looking for some advice on that. Yeah, did any of you run into this, where suddenly you discovered that some of your engineers didn't actually know how to use Git? Which is concerning. It is, in many ways. Yeah, go ahead. So we ran into this problem, and it's also related to just understanding Kubernetes in general and how it works.
Some of our dev teams have widely varying experience in our clusters, so we spent a lot of time running a series of classes out of enterprise architecture. We have a 101 where we introduce the very basics, a 201 class and a 301 class, and then we followed those up with 401s and 501s. We just kept repeating ourselves, walking through the steps, and making sure there was a very easy on-ramp for folks, and that we weren't trying to do anything super fancy right out of the gate. We went through a lot of training to get the teams up to speed. Mike, at that question you gave me a look like, no, all of our engineers know how to use Git. Training, training, training, then reinforce the training with more training, because there are varying degrees, right? Especially when you get into Kubernetes, which is fraught with complexity. It doesn't make hard things easy; it makes hard things more efficient, more distributed, and it facilitates a better path using GitOps as the delivery mechanism too. So not just people who knew how to use Git, but good Git practices, understanding all the Git semantics, understanding everything there is to know about SHAs. One of the big values we've gotten out of it is the ability to roll back, and roll back very, very quickly. Over a thousand apps across 150 clusters; two short years ago we had five clusters, a handful, right? So we've scaled up very, very quickly. But that configuration: nested configuration, remote bases with Kustomize and Helm dependencies, whether it's an add-in platform-service-level thing versus a tenant application thing. We also support single-tenant versus multi-tenant; we've had both, because clusters are given privacy, right? Certain apps cannot run with others, very specifically and intentionally. So in that model, we don't have a series of snowflakes, we have two snowflakes, right?
One's single-tenant, one's multi-tenant, but the way we reason about that is through those add-in layers, right? Through those segments, as we call them, that go into every cluster based on its composition. But back to training: with the configuration and getting all of that set up, because all of that is table stakes, the foundation, it took some iterations, training, and reinforcing, and making sure that where we could teach a convention, that was good, but where we could enforce a pattern, that was better. I would say enforcing the pattern is really, really important to back up the training, because if an engineer is given the opportunity for a back door, they might take it. So make sure you've got your safeties in place, go to read-only permissions for your portals, do things like that, so that the only path for them to take is through Git. The paved road. We talk about conformed apps: if your app is conformed, it's going to go into the cluster fine and it's going to work. The golden path. Yeah. We have some other questions. Hannah, go ahead. This is from one of our audience members. Hello, how's it going? So this is a follow-on from the previous question. Developers love Git, but what about everyone else? I'm finding myself in a situation now where someone wants to use a GUI or another tool, and I go, well, you know, in read-only mode, sure. I don't want it writing to the cluster in a non-declared way. Presumably new tools are coming online that are adopting GitOps principles, new GUI-based tools. But is that always right? I guess I'm wondering about performance, about interposing Git in between rather than writing directly to the database or directly to whatever. Is that the model? Is that where we're going, where all these tools start to use Git on the back end so that we can still do GitOps? So I can answer that. We do that a lot, actually.
We create a lot of self-service tools for our engineers, onboarding portals and whatnot. They act as an API, retrieving information from our customer, and then writing to Git, and that Git code then drives the configuration. And what I would like to see a lot of products and vendors do, when they're developing a nice portal, is that instead of making a configuration change that applies directly to the infrastructure or cluster, the portal writes to Git, which can then flow through your own set of tools and workflows, like CI and policy validation and compliance and that sort of thing, and it applies to the infrastructure at that point. So I really do like the idea of a UI writing to Git as a means to use GitOps. Another question? That one's on. So, going off that as well, we have a lot of operations where we want to make bulk changes and then query, and Git is not always the best language or mechanism to query things. Say we want to see how many of our applications are using version XYZ of their image, and we can't go directly to the clusters, because we might not have access to them; the GitOps engine is the only thing with access. How do you all provide visibility into Git, and how do you make bulk changes across all your Git repos, which seem to be rapidly expanding every time we introduce a new microservice? So I'll answer the first question, and somebody can answer the second. For the first, with regards to having visibility into what versions are out there, we use monitoring for that. There are lots of tools; we use Datadog, New Relic, things like that. That's what we use to gather the metrics on what versions things are on, what state they're in, and so on. We don't actually query Git for that kind of thing. So I would just say, make sure you have really good monitoring in place for those kinds of things.
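The portal pattern described just above, a self-service UI that records the desired change in Git instead of touching the cluster, can be sketched in a few lines. This is a hedged illustration, not anyone's actual tooling: the repo layout (`apps/<app>/image.txt`) and the function name are invented for the example, and a real system would open a pull request so the change passes review, CI, and policy gates before a GitOps controller applies it.

```python
import subprocess
from pathlib import Path


def request_image_update(repo_dir, app, new_image):
    """Hypothetical portal backend: record a desired-state change in Git.

    Instead of calling the cluster API, edit the app's manifest in a
    local clone of the config repo and commit it, so the change flows
    through the normal gates before any controller applies it.
    """
    manifest = Path(repo_dir) / "apps" / app / "image.txt"  # illustrative layout
    manifest.parent.mkdir(parents=True, exist_ok=True)
    manifest.write_text(new_image + "\n")
    subprocess.run(["git", "-C", repo_dir, "add", str(manifest)], check=True)
    subprocess.run(
        ["git", "-C", repo_dir, "commit", "-m", f"portal: {app} -> {new_image}"],
        check=True,
    )
```

The point of the design is that the portal's write path and a developer's `git commit` are the same path, so policy enforcement only has to guard one door.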
Again, I'll plug Argo CD Autopilot. If you haven't tried it, it's an open source project that provides a structured way of making changes across environments, and it uses Kustomize. And you can get in trouble with Kustomize, because you can make changes to your base layer that aren't necessarily evident where your overlays are, and that can lead to some issues. But if you wanted to make 500 changes at once, for good or for ill, you can do it, right? So that's one to call out. Mae? For question one, we rolled out our own framework. It's the don't-mind-us, we're-just-listening type of deal. Because with a code change, from the moment you push it remotely until the time it's realized in prod, there are lots of steps that happen, and we watch that. We capture all those milestones, those events: your testing, scans, chaos tests, performance, all the levels of testing, right up until it ultimately gets applied to production. We set up all that infrastructure in AWS, again a proprietary, homegrown solution, but we harvest that vast amount of data to increase our comfort that this is good to go, this can go to prod. For your second question about large changes, we just take advantage of windows to do that. We primarily use Terraform, and we've gotten really good at writing sweeping changes. We'll tell more at our session; I don't want to steal Priyanka Ravi's thunder here. Are you talking about update windows? Yeah, yeah. Oh, another thing that I would add, and this is something that Weaveworks is working on, is the notion of a profile.
A profile is something you can think of essentially as a template, and you can, in fact, think of these as a hierarchy of profiles, where profiles can be built from other profiles. What that allows you to do is take the base things that are going to be common across all of your clusters, across all of your different environments, capture them in a kind of base profile, and then have lower-level profiles where there are some differences, and of course it leverages things like Kustomize for those differences. So there is emerging this hierarchical model where, instead of putting everything in a single Git repository (and by the way, these profiles are Git repositories themselves), you essentially have base Git repositories and then specialized Git repositories. The things you do want to roll out at scale, you don't have to go update in a bunch of different Git repositories; you update them in a root Git repository and it filters down for you. So that's something Weaveworks has in the works. On another topic: when we actually started the GitOps Working Group, we had discussions with AWS and with Azure, who were both founding members, and I remember asking them, why are you interested in pushing a GitOps standard? We at Codefresh know why we're interested; we make and sell tools that do GitOps, you know? And the answer I heard was, well, we want a standard that we can hold up against our tools and ask, are we building cloud components that are compliant, that can meet this standard, that are going to work with this? So now that you work at AWS, I was hoping you could give a very hard and public commitment that everything in AWS will reach that standard. I'm sure you're ready to do that.
Well, my big cop-out is that I don't actually work for AWS; I work for Alexa, so I'm not at liberty to make any statement like that, and even if I were at AWS, I wouldn't. But the thing I will tell you, which is not unique to AWS but something Amazon has across the board, is that we are customer obsessed. And I can tell you that the reason AWS participated in the GitOps Working Group right from the beginning was because this serves customers. It is valuable; it was clear that this would make lives better for customers, and that's why we engaged. So you heard it here: if something isn't GitOps friendly on AWS, just ask Alexa. And if enough people do it, it'll show up in the error logs and Cornelia will float it to everybody. Hannah, I think we have one more question. Yep, sure. We had a question asking: GitOps practitioners have identified configuration management at scale as an unsolved problem in the community. How do you handle this, and what more can the industry do to solve this problem? And she was referring to things like Kustomize, Helm, et cetera, that manage YAML configuration files. Similar to the last question. Any thoughts on things you'd like to see come out, or ideas you've had, or successes you've had in doing massive configuration management? When you have tens of thousands of YAML files, I can see that getting frustrating. And while you're thinking about it, one of the patterns I like to advocate for, though I think the tooling is still not amazing for it yet, is the idea of generated configuration: having a repo with some baseline configuration that generates and writes the entire rendered configuration to another repo. I especially like this with Helm, and Scott and I were debating the pros and cons of that approach between sessions. Anybody had any successes around big configuration management? So that's essentially what we do.
We have a set of base manifests for every single application type that we have. So for a Python API or a Java API, we have what we consider sane defaults, and if your app is conformed, it should just work in most use cases. But the trick is that it's most use cases; the other use cases you have to account for in your tooling, and allow dev teams to, you know, zig where the other folks zag, and where you draw that line, I think, is tricky. Awesome. Well, thank you, everyone. We're out of time on this panel, but please chase these people down as they leave the room; I'm sure they would love more questions from you. Thank you so much. I mean, there's an incredible amount of experience up here. You are the pioneers of GitOps. You've been doing it for a few years, at massive scale, with massive success, and it really lights the way for all the rest of us. So thank you for that. Thank you. Thank you.