Hi friends, my name's Lee. The most important thing on this slide is that I'm Filipino, honestly. I love seeing any Filipino people show up, so come say hi if you wanna talk culture or anything. I come from the Kubernetes and Flux projects and I work with VMware. Thanks for coming today. I have a little story to tell you about how frustrating it is to configure an application's routing. So, when you wanna get your cool app running on your sweet domain for your company, you've gotta go and do everybody's favorite thing: configure DNS, right? coolapp.company.com. When I configure DNS, I have my cool name that I want everyone to use, but DNS's job is to hand somebody an IP address behind the scenes. Where do those IPs come from? They come from a bunch of servers that we wanna point people to. So naturally, we have some information coming from one system that we gotta pull over into some other system. And this is an infrastructure engineer's favorite problem. It's the thing we love and despise most, the source of pain, right? Now, here's the fun thing. Now that we got our DNS set up, it points to our servers, but the servers on our edge don't just serve one app, right? They serve lots of them, and this name is for just one app, so we gotta go set up the virtual host. Look at that: the same fun problem, just in the reverse direction. And then, once we got our app servers and our traffic is routing and people are happy, now we wanna deploy a new version. So we gotta do a blue-green deploy, which means we got a bunch of information going both ways again, information we gotta share back and forth between the app server backends and the NGINX frontends. We gotta shift some traffic around, stuff is changing all the time, and it kinda feels like we might need to add a robot here. So, sweet, this is 10 years ago, I'm using Puppet, but my organization pays for Infoblox.
That is our DNS provider, so I gotta write a script, use some curl. I'm gonna make my special-purpose software robot to go update DNS when I change my NGINX frontends. And then we're gonna implement some service discovery. It's gonna be sweet: I got some templating, I'm gonna make some JSON files out of my Ruby template with its .erb file extension, doing a little bit of loop programming over my YAML struct. Now I got another special-purpose robot. But my company has like six domains. So I gotta go talk to John, who manages DNS over there, just to make sure, because sometimes we break stuff. My special-purpose robot is not the highest-quality piece of software; it just helps us get the job done. But he wants me to file a ticket every time we run the script. Cool, sounds good. Now, this story of pain is not new. I know you're over there thinking, Lee, you're talking like this was yesterday; I just did this last week. But the cool thing is our tools have evolved with us, right? So nowadays, if you wanna use Kubernetes (this is a simplification of the picture), maybe we don't gotta build any special-purpose software robots anymore. We can use Kubernetes with ExternalDNS, and now it almost doesn't matter who our DNS provider is, whether we pay for Infoblox or we're using name.com, whatever. And then Kubernetes has got our back. We don't gotta worry anymore about what the IP addresses are; it's got its own service discovery system built into the middle of it. Cool, great. So we get this fast, consistent, distributed computer that has this fancy programming model based off of promise theory. It's got these unprecedented capabilities. It's gonna let us stitch together networking. We got compute APIs. We wanna do some application-specific APIs: give me a Deployment, give me a StatefulSet. You wanna manage some configs? You wanna deal with some secrets? Let's split up those APIs.
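A sketch of what that kind of special-purpose robot looked like, assuming a generic Infoblox-style HTTP API; the hostname, credentials, endpoint path, and record values here are all made up for illustration. It builds the payload and prints the curl call as a dry run instead of sending anything:

```shell
#!/bin/sh
# Hypothetical special-purpose robot: push an A record to the DNS
# provider's HTTP API whenever the NGINX frontend changes.
# Host, credentials, API path, and record values are all illustrative.
WAPI_HOST="infoblox.example.com"
RECORD_NAME="coolapp.company.com"
FRONTEND_IP="203.0.113.10"

# Build the JSON payload the API would expect.
PAYLOAD="{\"name\": \"${RECORD_NAME}\", \"ipv4addr\": \"${FRONTEND_IP}\"}"

# Dry run: print the curl invocation instead of executing it, so the
# script is safe to inspect before wiring in real credentials.
echo curl -s -u "dnsadmin:changeme" \
  -H "Content-Type: application/json" \
  -X POST "https://${WAPI_HOST}/wapi/v2.12/record:a" \
  -d "${PAYLOAD}"
```

And remember, every run of something like this also meant a ticket for John, which is exactly the coordination overhead the rest of this talk is about.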
But we'll give you the same API machinery and a promise-based programming model. Cool, okay. So Kubernetes gives us this really cool way to talk, right? We got the big distributed computer, it can do all of the things, and now everybody wants to share it. Well, GitOps is the thing that Kubernetes is begging for. GitOps is the actual talking to each other. You see, because Kubernetes is not a forum, and nobody's logging into Kubernetes going, yo, what's up, Miranda? I wanted to go and change this resource; I thought that you owned this part, right? We gotta do that talking with each other somewhere else, somewhere that's not Kubernetes, but using Kubernetes underneath. That's GitOps. So, this talk is called People > Process > GitOps. Those are actually greater-than signs, right? Or maybe they're kind of a left-to-right way of thinking: we wanna think about the people and process first. We used to say this all the time at places like DevOpsDays or maybe your local city meetup: people, process, tools. The tools don't have any value in themselves. We gotta go and do something with them, and the whole doing something is incentivized by the fact that we work with a bunch of people, trying to serve a bunch of people, and the value of the things actually comes from the people. So: people, process, GitOps. Why did I think it's important to take 15 minutes of your time to talk about this? If you look around at the schedule, if you go and talk to the sponsors here, we got a lot of focus on tooling here at GitOps Days. But I don't wanna see a new community of practitioners making the same mistakes, ignoring the same lessons that we learned literally 10 years ago with the DevOps movement. 2014, that's when the first State of DevOps survey went out. So, let's bring this programming model. Mic's getting a little hot here.
But let's bring the programming model that Kubernetes is inviting us to, and pull it into our place of collaboration. This is not something new, right? GitOps feels a lot like what we used to do with configuration management. The part that is new is all of the stable APIs, all of the interoperability, all of the standard interfaces that allow you to access different infrastructure vendors; all of the application concerns, the policy concerns, the compute concerns, the capacity management concerns, in the same place with the same language, instead of a bunch of different formats on many different servers in your configuration management fleet. So that's the new part, but this is not new. What can we learn, then, from all of our DevOps ancestors who came before us? Well, obviously we gotta have a lot of Kubernetes clusters, right? And you're like, you get a cluster, you get a cluster, your dev team's hiding a cluster under the desk. And so the picture of your organization kind of looks like this again. You got some robots around doing the special things that your organization needs to accomplish, and now you got some things to hand off to people. And it's starting to feel like maybe you're doing some of the same stuff you did before, stuff that feels a little inefficient. So maybe our community of practitioners needs a little bit of a reminder, or maybe even a primer. Maybe we're not even reviewing; we got some new people here who never thought about what it means to be successful in DevOps. And I have good news for you: there are actual things that you can measure. We've done literally 11 years, I think, of studying what these metrics look like in organizations that perform well at the task of software delivery. The organization that came up with this stuff, they've been slurped up by Google; they're called DORA, DevOps Research and Assessment.
Puppet uses the same methodology in their State of DevOps survey as well. These top four were the first metrics that we started with, and then recently, in 2021, we added this fifth one. So: deployment frequency. High-performing organizations, in the task of software delivery, deploy frequently. Seems simple. Okay, how frequently? Well, maybe for your organization, 10 times a day is frequent. Maybe you're working on a small application and you're releasing three times a week. But regardless, you kind of want to pay attention to how quickly you can get changes out. Now, beyond just deployment frequency, there's this little tweak: what does lead time mean? When we decide that we want to make a change, and then we get into the implementation phase, and then we go and talk to all of the people, whether it's project managers or folks who own a particular part of the system that we need to interoperate with, there's a lead time for how that change gets into the system. We gotta listen to that as well. When we deploy fast, we're going to have good lead times. It means: let's make these times smaller. Now, then something goes wrong. High-performing organizations delivering software restore service quickly. Pretty simple statement, right? Naturally, you want to be successful. You want to provide value to your customers and actually turn that into money, or organizational impact, or whatever you are looking at as your function for success, your organizational mission. You've got to restore service to the people that service matters to. Now, we're changing the system all the time. We want to make sure that as we inject change into the system, we're not necessarily always injecting failure; that's change failure rate. And lastly, what we find is that people who are running software at scale, serving lots of people, do so reliably.
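Two of those numbers are easy to start measuring from data you probably already have. A minimal sketch, assuming a made-up deploy log where each line is `commit_epoch,deploy_epoch` for one deploy (the timestamps are invented so the arithmetic is easy to follow):

```shell
#!/bin/sh
# Toy deploy log: one line per deploy, as commit_epoch,deploy_epoch.
cat > /tmp/deploys.csv <<'EOF'
1000,4600
2000,5600
3000,6600
EOF

# Deployment frequency: how many deploys landed in the log's window.
DEPLOYS=$(awk 'END { print NR }' /tmp/deploys.csv)

# Mean lead time: average of (deploy - commit), converted to minutes.
MEAN_LEAD=$(awk -F, '{ sum += ($2 - $1) } END { printf "%d", sum / NR / 60 }' /tmp/deploys.csv)

echo "deploys=${DEPLOYS} mean_lead_minutes=${MEAN_LEAD}"
```

Real lead time starts at the decision to change, not the commit, so a commit-to-deploy number like this is a floor, not the whole story; but it's a floor you can trend week over week.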
Now, there's one interesting thing here, and it actually comes out in the surprises from the 2022 survey, which is very recent stuff. Reliability is only helpful in improving software delivery performance for organizations who are already performing well on all of the other four characteristics. So it's no good to be reliable but not able to change your software frequently, with good lead times and an active ability to restore service when something bad happens. Let's look at these important lessons in the context of some relatable situations. We know we've got to measure on these five points, but what can we do about it in actual practice? The first thing that's probably most important to focus on, say you're a new practitioner here, or you're with a company that's starting to scale out their GitOps solution: you're also adopting a lot of these other CI and CD tools. Maybe you're pulling CircleCI or something into your workflow. I just came out of a session where people were talking a lot about CDEvents. So you're focused on all this tooling, you're gluing all of these things together, and everyone's like, oh, CI/CD, let's do it. But CI/CD is not one thing, right? We have continuous integration, which is the process of testing stuff, actually asserting that your software probably does what it's supposed to do. This is the beginning of the test pyramid. We have continuous delivery, which is actually packaging your software up into a release, maybe making some assertions that the software would behave properly as a unit, a unit that has a version number and a place where we store it. Now we can actually point and say: we want that artifact to run. Maybe it even has database migrations and stuff bundled with it. So: continuous integration, continuous delivery. Only now do we get to continuous deployment, which is usually the part where we're really thinking a lot about GitOps.
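One way to picture those three stages: each one is an independently runnable entry point, instead of one shell script that does everything. The function names, version numbers, and echo bodies here are placeholders, not a real pipeline; the shape is the point.

```shell
#!/bin/sh
# Placeholder pipeline stages; each stands alone, so you can run any
# stage without the others. All names and values are illustrative.

ci_test() {
  # Continuous integration: run the test suite against a commit.
  echo "testing commit $1"
}

cd_release() {
  # Continuous delivery: package a versioned, storable artifact.
  echo "releasing artifact coolapp:$1"
}

cd_deploy() {
  # Continuous deployment: point production at an EXISTING artifact.
  # It takes a release version, not a commit, so you can roll back to
  # a known-good release without re-running tests or rebuilding.
  echo "deploying coolapp:$1"
}

ci_test abc123
cd_release 1.4.2
cd_deploy 1.4.1   # recover by redeploying the previous release
```

The detail that matters is the last one: deploy takes a release, not a commit, which is what makes fast recovery possible.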
GitOps is the place where we collaborate with other people to have our communal group of assertions, our understanding of the history and the current state of what we want our distributed computers to do, and to then apply that state in the deployment. Continuous deployment: GitOps is mostly about continuous deployment, but you can put other stuff into it if you want. So we want to decouple these things. If you cannot test and release separately, you should fix that. You should be able to cut a release and run the testing phase separately from each other. Usually you'll find that people have coupled these into the same shell script, so be careful of this. Even more important, we need to be able to deploy whenever we want. Remember that? Deploy frequently. We should be able to take a software artifact and say: this is what should be running now. If you cannot do that, decoupled from your test and release process, you might have some issue recovering in a timely manner from a production defect. Here's another question: maybe you don't need a staging environment mandated on every single change, right? If we're trying to deploy frequently and we actually have good ways of talking with each other about what we're doing, well, let me let you in on a little secret. Your fancy distributed computer is really, really good at doing what's called a rolling update. And the rolling update is supposed to validate that the thing started correctly. It's gonna make another ReplicaSet, it's gonna start moving your traffic over. Hopefully you've seen my zero-downtime talk, or read somebody's blog post about how you need a preStop lifecycle hook that sleeps so you drain traffic; it doesn't do all that by default. So there's a bunch of little gotchas. Hopefully you didn't name your ConfigMap wrong.
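That preStop gotcha, as a manifest sketch. The deployment name, image, and the 15-second sleep are illustrative values; the point is that the terminating pod keeps serving in-flight traffic while the endpoints update, because the rolling update won't drain for you by default. The script just emits the manifest so it can be inspected (or piped to `kubectl apply -f -`):

```shell
#!/bin/sh
# Emit an illustrative Deployment manifest with a preStop drain hook.
MANIFEST=$(cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coolapp
spec:
  selector:
    matchLabels:
      app: coolapp
  template:
    metadata:
      labels:
        app: coolapp
    spec:
      containers:
      - name: coolapp
        image: registry.example.com/coolapp:1.4.2
        lifecycle:
          preStop:
            exec:
              # Sleep before SIGTERM so endpoints update and traffic
              # drains away from the terminating pod first.
              command: ["sleep", "15"]
EOF
)
echo "$MANIFEST"
```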
But as long as you've got these API validations up front, maybe you're linting your Kubernetes configs and doing a reasonable review, you should be able to tell: oh, this change is not that risky. We're doing a small thing in our software; let's get it out as fast as possible so we can go and make more small changes. You put a staging environment in the middle of this, you add at least five minutes into your cycle time. And that's gonna affect your lead time. It's gonna affect your deployment frequency, because you're affecting how quickly people can work with each other. And here's the deal: you mess with deployment frequency and lead time, and people are gonna start batching changes into bigger blocks. And what do we get with bigger batch size? More risk. Keeping changes small might even help you lower your change failure rate. And so you see how, in approaching these problems, we wanna think about how they affect our key metrics, and then measure them again at the end of the day. Really, the tools that we use affect the way that we work with each other. They affect our organizational habits, our way of developing a mental model with each other. And again, going back to where the value comes from: the people that you work with and the people that you serve. That's the only reason our jobs are important. It's the only thing we're really working on; that's what mission is derived from. So maybe in our habits and the way that we treat each other, the way that we interact in an email, the way that we respond to somebody in a pull request, maybe that's where the important stuff is. So let's remember these metrics. Let's keep in mind, when we're implementing new tools, we've got lessons to learn from what we've done before; half of it's the same, half of it's new. So my name's Lee, hit me up.