Good morning, good afternoon, or good evening, wherever you happen to be joining me from. I'm so delighted you're here. We are going to spend the next 35 minutes or so talking about GitOps. I'm betting that most of you have some idea of what GitOps is, but I'm betting that there's more to it than you might be thinking about, and that's what I want to show you in the next half hour or so.

By way of a brief introduction, my name is Cornelia Davis. I'm the CTO at Weaveworks. I've been in this industry for a long time, about 30 years. My background is in development. I've always been a developer. There's a little hint there that says I wasn't ops, but I can definitely consider myself an ops person now as well. I've been working in web architecture for, gosh, well over 15 years, and cloud native for nearly a decade, although I'll definitely concede to you that we didn't always call it cloud native. We certainly didn't call it cloud native a decade ago. A lot of that experience comes from a long time working with Cloud Foundry, which was a cloud native application platform that's been out there for quite some time. I've been doing Kubernetes for nearly four years, which doesn't make me a veteran, but it doesn't make me a total newbie either. The other thing that I'll mention is that I'm the author of a book called Cloud Native Patterns, which is targeted at the application developer and architect and goes over all of the patterns that are required to make software work really well in the cloud, that being software that works well in a highly distributed environment that is experiencing constant change.

Now, that notion of cloud native is very relevant to the GitOps area. Rather than focusing on the development of software, the software patterns themselves, what I want to do with GitOps is talk about cloud native from an operational perspective. So come along.

I'm not going to spend a lot of time going into the details of the benefits that come from GitOps. We will talk
about a few of those things as we go along. But we're not doing GitOps for GitOps' sake. There are certain benefits that we want to realize, and those benefits really center around things like security, repeatability, recovery from failure, productivity gains so that we can do things over and over, reducing toil, all of those types of things. And the way that we do GitOps is going to have a direct impact on the benefits that we get to enjoy.

Now, GitOps is, if you will, a set of practices and patterns, and then of course tools that help you implement those practices and patterns, that really give you that DevOps environment that has come to dominate the way that we think about delivering digital assets these days. So given that we want to achieve certain benefits, the question then is: how do you GitOps? What is GitOps and how do we do it?

Now, I'm betting that most of you have an impression of GitOps that goes something like this. We're going to store code and, increasingly, configuration in something like Git. After all, Git is the first part of the name GitOps. Doing things in Git is going to cause some automation to happen, and that automation is going to configure some type of a runtime environment. GitOps, of course, is applicable to the applications that are running in that runtime environment, but you'll see as we go along that it's also very applicable to the configuration of that runtime environment, and to the standing up and maintenance of that runtime environment as well.

And of course, one of the whole points of GitOps is that the primary user interface for this GitOps process is all the way back on the left-hand side of this diagram, on the Git side. So the UX is actually using Git. Now, that doesn't preclude other user experiences, but Git is one of the central, most significant ways that we can interact with this GitOps environment. So rather than humans touching the right-hand side of the slide, we're touching the
left-hand side.

So this is, if you will, the ten-thousand-foot view of GitOps. Is this indeed what GitOps is? The answer is yes, but details do matter. How we actually put those pieces together, and most notably how we do the automation that is in the center of that diagram, is going to have a direct impact on the level of benefits that we enjoy from our GitOps practices.

Now, I have a confession to make. I have been known to call myself a propeller head before, and I am a technologist. In the next 30 minutes or so, I'm not going to present to you the principles of GitOps. We're going to derive them. So yes, I'm a mathematician at heart, and in mathematics we do derivations. But if you're not a mathematician, please stick with me, because I think you'll enjoy this even if we don't, you know, do proofs at that level.

So I'm going to kick things off by going back to a version of the diagram that we just saw a moment ago. It's perhaps a little bit more detailed, but it does launch us on our journey of this derivation. This is a pattern that I'm sure most of you are familiar with. We've got the developer and the DevOps engineer. They might be one and the same person, or they might be two different individuals, and they're checking their source code into a repository. They have well-established CI practices that, these days, because we're operating in a containerized environment, are generating container images. Increasingly, we're also storing the application configuration in Git. And then, as we alluded to in the previous slide, we're going to use some automation to draw those pieces together and get them running in some type of a runtime environment. And this is KubeCon, so of course we're going to talk about that runtime environment being Kubernetes.

Now, if we take a look at this and we think about that automation, how do we achieve the automation that I'm alluding to, that's right in the middle there? Well, we've already got
some automation in this picture, right? Well-understood and well-established automation, and that is the CI system. So in fact, that might be the first place that we look, to say: can I just add to that CI process? After all, continuous integration is extremely well understood. We've been innovating in that space and maturing it, and we have a deep understanding. It's been more than a decade, probably closer to two decades, that we've been putting some type of continuous integration process in place. And we've achieved all sorts of benefits, from higher quality and shorter time and repeatability to the reduction of toil and developer productivity and all of that. So it would be natural for us to think: all right, let's go ahead and just insert deployment as a part of that automation.

But I'm going to argue to you that continuous integration is not the same as continuous delivery and deployment, and that there are reasons to separate those out. Those reasons include things like this: if you are running in a regulated environment, you know that you are dealing with separation of concerns. Developers release code. They are the ones that write the code and release it. But then operators do the deployment.
They've got the keys to the production environment. They've got a set of rigorous processes that they're trained in and that are intentionally not given to the developers, again because the separation of concerns ensures certain levels of security that are difficult to achieve in other ways.

Another thing is that you're very likely to take the artifact that you're creating in your continuous integration process and deploy it to many different environments. So there's already an inherent need for decoupling, so that we can go to those different environments. And another reason for decoupling is to be able to recreate a deployment, because we've had some type of a failure for example, and that shouldn't require a new build to be created.

So those are some of the reasons why we want to decouple these things. Rather than having this picture, what we want to do is draw that deployment back out of continuous integration. One of the things I'm very fond of saying is that there's no "CICD". It's not one word. There's CI and there's CD, and there's absolute benefit in separating those, and that's what we see in this picture.
So we no longer have it in the CI process.

Now, that CD process is great. One of the benefits that I didn't mention on the previous slide is this: if you are including deployment as a part of your CI process, that generally means that your CI process is being given the keys to that deployment environment. So you've got the keys, or you've got the credentials, those types of things, which represents a security boundary. Even if you're not worried about the developers, just having that system, the CI system, is now an attack surface. And it is not at all uncommon for us to hear stories where a breach happened because somebody was able to make it into the CI environment and through that get into the runtime environment.

But beyond that, now we say: okay, the credentials aren't in the CI system, and we can constrain them to being in the CD system. We can say that the operators are the ones that are running the CD system, so we've got the separation of concerns. What happens if I'm now doing that deployment to a bunch of different environments: dev, staging, and prod? Now I'm back to needing the keys to dev, staging, and prod, all from this centralized, hub-and-spoke CD system. So I'm back to struggling with this security concern.

You can also imagine that there might be deployments where we're not talking about three targets. We might be talking about hundreds of factory floors, so we're starting to talk about edge deployments. Or you might have thousands of coffee shops or sandwich shops. Or you might have tens of thousands of cell towers. Yes, we're starting to see Kubernetes running in cell towers. So now, in addition to those security challenges, I've also got challenges of network connectivity. I've got challenges of scale: how many environments can the centralized CD system deal with?

Now there's an easy answer.
There's an easy solution to this, and that is to just spin things around. What we do then is move the CD process from being a centralized thing to being out in the runtime environments themselves. So rather than pushing configuration out to those runtime environments, what we're going to do is pull the configuration into those runtime environments from the repositories. We'll see some patterns in just a little bit on how we can manage that at scale.

Before we go to those patterns, I have to ask the question: if we're pulling, then how do we know when we need to pull? The answer is that we don't have to. We certainly can use events. We could have something like a git push event trigger that type of reconciliation. But then what happens if the network's down when that trigger happens? You've got to deal with that. Well, the good news is that we already have a pattern that we know works in this particular scenario. We don't have to know when to reconcile, because we have reconciliation loops that run forever. And this is the core pattern that Kubernetes really brought mainstream.

Now, this reconciliation pattern, I have to say, existed before Kubernetes. It existed in Cloud Foundry, for example, where we applied it to the orchestration, to the running of container images. Those container images originally predated Docker, so it was well before OCI, and we built our own reconciliation engine in Cloud Foundry. Now, Cloud Foundry applied it to this very specific use case. No question that Kubernetes applied it more broadly. Kubernetes created this platform, if you will, and we're going to see that play prominently through the rest of the presentation.

So we've got this notion that, inside of these Kubernetes clusters, we can just run these reconciliation loops that are going to constantly be drawing in the configuration from those repositories. So we have the desired state in the repository. The reconciler is watching the actual state of the system and
it's doing what it needs to do to bring those two things together. That basic reconciliation pattern is extraordinarily powerful, and it enables a whole bunch of other interesting patterns, patterns that I like to call GitOps patterns or cloud native operational patterns. So let's talk through some of those now.

The first one, well, actually this is not the first pattern. This is more like, you know, maybe the third pattern, given that we talked about pull and we talked about reconciliation. Those are certainly two important patterns. But let's talk about this next pattern, the pattern that I call drift detection and remediation.

So we've done this great GitOps process where we have put our declarative configuration in Git. We have a reconciler that is constantly drawing the latest things in from Git into the runtime environment. And somebody comes along and does a kubectl apply. This is the modern-day equivalent of SSHing into a box. One of the benefits, of course, of storing configuration in Git is that Git has a version history. So if we were to have some type of a catastrophic event and we needed to recreate our environment, well, so long as there hasn't been any drift from what we have in the declared configuration, we're golden. But again, here somebody has done the modern-day equivalent of SSHing, and we have drifted from that desired state as recorded in the system.

So what do we do? Well, that reconciler is constantly running. What's happened now is that the actual state has diverged from the desired state, and the reconciliation loop can do some type of remediation. Now, what is the remediation that you want? Well, it can actually go back and say: you know what? I'm going to undo that change.
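
The drift detection and "undo that change" step described above can be sketched in a few lines. This is a minimal illustration, not how any particular GitOps tool is implemented: the `State` type and the `detect_drift` and `reconcile` names are hypothetical, state is simplified to a dictionary of resource name to spec, and a real reconciler would run this pass continuously against the Kubernetes API rather than once against in-memory data.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified model: resource name -> spec string.
State = Dict[str, str]


def detect_drift(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, name) pairs needed to make actual match desired."""
    actions: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired: State, actual: State) -> State:
    """One pass of the loop: revert any drift so that actual equals desired."""
    for action, name in detect_drift(desired, actual):
        if action == "delete":
            del actual[name]
        else:
            # create and update both copy the declared spec into the runtime
            actual[name] = desired[name]
    return actual


# Desired state as declared in Git (illustrative values).
desired = {"web": "my-image:1.0"}
# Someone changed the cluster by hand, so the actual state has drifted.
actual = {"web": "my-image:1.0-hotfix", "debug-pod": "busybox"}

drift = detect_drift(desired, actual)
reconcile(desired, actual)
```

After the pass, `actual` matches `desired` again and a second call to `detect_drift` returns an empty list, which is why the loop can safely run forever.
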
I'm going to go back to the state that was represented in the Git repository. Or you might decide that you in fact want to just alert on that, and you want to let somebody know that something's changed. That is your prerogative, and that's what you are going to tool into what I'd like to start calling a GitOps pipeline: stringing together these types of behaviors into the patterns that you need for your organization.

All right, so that's one pattern. Let's talk about another one, which I like to call image update automation. I've made a little bit of space in the middle of this slide to show you that what we have is just a basic application deployment. We've got some YAML. It's stored in the application configuration repository. We have an image that's stored in an image registry. And of course the YAML points to that particular image, my image version 1.0 0.0.1.

Now, somebody checks in new source code, and the CI process creates a new version of that container image. Now what do we do? Of course, I as an operator could go in and update that YAML configuration, and I can check the new version of that YAML configuration into the Git repository. And through the magic of the reconciler that I talked about in the previous examples, that will get deployed out into the runtime environment. But do I need to do that as a human, by hand?
The answer is no. How about we throw a reconciler in there that is constantly watching that container image registry, and it does the update in the YAML itself? It's just an update in the copy of the YAML that's held within that reconciler environment. Then I can invoke another reconciler that is going to have the effect of pushing that into the Git repository. Now, it could do the commit and the push immediately, or you could decide in your GitOps pipeline that what you want to do is actually generate a pull request. That pull request goes into the repository, somebody approves it, or maybe it's automatically approved. And then what happens? Well, this is the pattern that we already know and love: we have the reconciler that's running inside the runtime environment, which pulls the latest version in and has the effect of deploying the updated version of that image. So that's another pattern.

Now, you might notice that there are a number of reconcilers here. You're going to see where we're going with those reconcilers, and how we stitch all those things together, in just a little bit.

But let's talk about another pattern, and that is environment customizations. Remember, we talked about this earlier. We talked about the fact that I have dev, staging, and prod, and they're all going to be deploying this application. And while a lot of the application deployment configuration is going to be exactly the same, there are going to be some differences across those. So how do we deal with that?
Again, let me make a little space in this diagram, and I'm going to talk very briefly about Kustomize, with a K. Kustomize allows us to do this. It allows us to store in the application configuration, first of all, some configuration that has a base, so it has some commonality. This deployment, my deployment, is going to go into dev, staging, and prod. But then it also allows me to apply overlays on top of that, essentially overrides on some of the basic things that are in that base. Now, those overrides can be handled by reconcilers, so we can actually apply these customizations as a part of our GitOps pipeline, and those Kustomize reconcilers can operate independently in each of these different runtime environments. So doing environment customizations is also an important pattern.

All right. So we had started here, and those patterns that I just described to you really are addressing this. They're saying: all right, there's a whole bunch of processes, there's a whole bunch of detail that's required in this process of taking these application configurations and these images from the image registries and composing them together into the actual declarative state that's going to be running in my runtime environment. And I like to call that set of controllers that we've just been talking about, that set of reconcilers, delivery controllers.

But hang on a second. I said this is about deployment, and I don't know that delivery of configuration manifests really encompasses all of deployment. So have we achieved deployment?
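
The base-plus-overlay idea behind the environment customizations described above can be illustrated with a small recursive merge. This is only a sketch of the concept: it is not Kustomize's file format or its actual patching semantics, and the `apply_overlay` function, the dictionary layout, and the replica counts are all hypothetical values chosen for illustration.

```python
import copy
from typing import Any, Dict


def apply_overlay(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config: the base with overlay values merged on top, recursively."""
    result = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them key by key.
            result[key] = apply_overlay(result[key], value)
        else:
            # Otherwise the overlay value simply overrides the base value.
            result[key] = copy.deepcopy(value)
    return result


# The common configuration shared by every environment (illustrative).
base = {"name": "my-deployment", "spec": {"replicas": 1, "image": "my-image"}}

# Per-environment overrides (hypothetical values).
overlays = {
    "dev": {},
    "staging": {"spec": {"replicas": 2}},
    "prod": {"spec": {"replicas": 5}},
}

# Each environment renders its own final configuration independently.
rendered = {env: apply_overlay(base, patch) for env, patch in overlays.items()}
```

Because each environment only needs the base and its own overlay, the rendering can happen inside each runtime environment separately, which matches the point that the reconcilers operate independently per environment.
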
Well, we're getting there, but we're not quite there yet. So let's look at one more pattern.

Now, if we remember, we were here. Recall that in image update automation we had this pattern. We had reconcilers that were watching the registries and updating the references in the YAML. Then we had reconcilers that were interfacing with Git to do things like pull requests, or maybe automatically committing things into the Git repository. And then of course we had the reconcilers that were drawing in those configurations, doing the composition, using Kustomize like I showed in the last example, and then drawing those into the runtime environment.

Now, this is a little bit too coarse-grained. We need to get a little bit more detailed here. Technically, we weren't getting all the way to the running of those applications. What we were effectively doing with those reconcilers is getting to the point where we had drawn that configuration into Kubernetes, into etcd. So now we've landed that in etcd. The next step, of course, to get all the way to runtime, is that we have running pods in the end.

Now, how did we get those running pods? Well, you might say that's Kubernetes. And of course it is, but how does Kubernetes do it? It does it with reconcilers, right? It has a reconciler for Deployments. It has a reconciler for ReplicaSets. It has a reconciler for DaemonSets, and so on. So it is Kubernetes that has these, what I call runtime controllers, that complete the picture. The real magic happens when we start to draw together all of these different reconcilers, all of these different controllers, across this entire spectrum.

All right. So Kubernetes has these runtime controllers. But as I suggested earlier, Kubernetes isn't just about runtime controllers for pods. It's actually a platform for reconciliation loops. So what if you want different deployment behavior, for example? What can you do with that?
Well, what if you want canary-style rolling upgrades? What if you want blue-green? What if you want A/B testing, where you're actually going to run two versions of something in parallel and use the metrics coming out of those to make decisions on, you know, who gets what and where the traffic goes? So, different deployment scenarios. How does that work?

Well, what you can do is, you guessed it, you can have a reconciler. Kubernetes allows you to extend the API, so you can create a reconciler that, for example, recognizes when a deployment is happening and provides some additional logic on top of that.

Now, as it happens, one of the projects that we have worked on here at Weaveworks is a project called Flagger. It is totally open source, and it does exactly that. It allows you to define a deployment strategy: progressive delivery, as James Governor coined that term. It allows you to select one of these release strategies (canaries, A/B testing, blue-green), and then it interfaces with ingress, either plain vanilla ingress like NGINX, or a service mesh. It interfaces with the ingress or the service mesh to do the traffic routing, to provide exactly the types of release strategies that you're going for. And how is that implemented? Of course, it is implemented as a set of controllers that run in Kubernetes. This project is open source. You can find it there at the URL.

I won't go more into the details, but what you can see there, and really the point that I'm getting to, is that not only are we going to leverage the runtime controllers that are in Kubernetes, we can also extend things with runtime controllers ourselves.

Okay. That's a lot. If we go back to this picture that I just showed you a moment ago, there are reconcilers all over this. How can I possibly manage that?
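
To make the canary idea from the progressive delivery discussion above concrete, here is a minimal sketch of metric-driven traffic shifting. It is not Flagger's API or algorithm: the `run_canary` function, the step size, the maximum weight, and the success-rate threshold are all hypothetical, and the metric is supplied by a plain callback rather than by an ingress or service mesh.

```python
from typing import Callable, List


def run_canary(
    success_rate: Callable[[int], float],
    step: int = 10,
    max_weight: int = 50,
    threshold: float = 0.99,
) -> List[int]:
    """Shift traffic to the canary in steps, checking a metric after each step.

    Returns the sequence of canary traffic weights that were applied. The
    sequence ends with 100 if the canary is promoted, or 0 if it is rolled back.
    """
    weights: List[int] = []
    weight = 0
    while weight < max_weight:
        weight += step
        weights.append(weight)
        # Decide based on the metric observed at this traffic weight.
        if success_rate(weight) < threshold:
            weights.append(0)  # roll back: send all traffic to the old version
            return weights
    weights.append(100)  # promote: send all traffic to the new version
    return weights


# A healthy canary is promoted after reaching the maximum canary weight.
healthy = run_canary(lambda weight: 1.0)

# A canary that starts failing at 30% of traffic is rolled back.
failing = run_canary(lambda weight: 0.5 if weight >= 30 else 1.0)
```

The design point is that the decision at every step comes from observed metrics rather than from a human, which is what lets a controller run this logic on top of an ordinary deployment.
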
Well, the first thing that I want to show you is that, in fact, this is a version of the very first, very simple diagram that I started with at the beginning of the presentation. Let me show that to you again. We're storing some declarative configuration in Git. Then we have a set of delivery controllers that facilitate the delivery, with whatever pipeline I want, into etcd. Then I have a set of runtime controllers that actually achieve getting that running in production. Remember: we store code and configuration, we've got automation, and we've got a runtime environment. Oh, and of course the user interface for this entire system is all the way on the left-hand side, with Git. So it is, if you will, that very simple picture that I showed at the beginning. But hopefully you're getting the idea now that the way that we do the automation in the middle has to be cloud native. Cloud native operations is what GitOps is all about.

Let me put a bow on this now. I like to think of GitOps as this set of delivery controllers. I haven't been referring to the delivery controllers by name, but some of the ones that I referred to earlier in our derivation were things that we call the source controller, or the Kustomize controller, or the image update automation controller. On the runtime side, we have the ones that are baked into Kubernetes, like the ReplicaSet controller or the DaemonSet controller. Flagger is an extended controller. And yes, Cluster API controllers: Cluster API and GitOps are a match made in heaven. They work beautifully together. So you take those controllers, and then you stitch them together into the pipelines that are relevant for your organization and for your needs. And that's what draws together GitOps.

So to go back to this picture, we've now completed this entire cycle. We've got continuous integration on the far left. We talked about the delivery controllers.
We added on the runtime controllers. You draw all of that together, and you get GitOps. So GitOps is not just about delivery. GitOps is in fact the combination of continuous delivery (delivery controllers) and continuous operations (runtime controllers). And when you put those things together, that's where a lot of magic happens.

Now, I had promised you that we were going to derive principles. We were going to go back to first principles. Oftentimes when I've spoken about GitOps in the past, I've started by presenting these principles, but today the whole idea was to maybe challenge your assumptions on what GitOps is and to derive those principles from scratch. Let me now summarize and review the principles that we've just gone over.

I'm going to start with the bookends. On the left-hand side, we have declarative configuration. That's YAML, for example. On the right-hand side, we have the software agents, so the controllers, for example Deployment and ReplicaSet. These are two foundational principles that Kubernetes brought to the party. Without those two foundational principles, we wouldn't have GitOps as we have it today. That really seeded the ability to get this started.

So we have those two bookends, and what we've done with GitOps is add two more elements in the middle. We've said: if I've got that declarative configuration, and I now add storing that declarative configuration in a version-controlled, immutable store that has semantics like Git, then I achieve benefits like rolling forward rather than rolling back. I can always get back to an earlier configuration that I know worked, or if I have a disaster, I can recover from that disaster very quickly. So that is another magical element.

Now, yes, I also mentioned several times that it is the user interface. It doesn't have to be the user interface, and there are some very interesting other user experiences that don't necessarily involve going in and
creating a pull request by hand. But some of those abstractions, those Git semantics, are important and very, very heavily leveraged. So I like to say that Git plays the role of both a potential user experience and a distributed data store that is immutable and has version history.

And then finally, the third thing, the third column over there, is that we have this set of delivery controllers that automatically apply, but that also allow me to build the pipelines that I need to apply those patterns across the entire GitOps practice. And that's what we've done: we've just derived those four key GitOps principles.

Now, I won't go over this in detail, but you've heard me talk about these GitOps patterns, or as I also like to call them, cloud native operational patterns. Over the years, as we got better and better at microservice architectures, we learned a whole suite of patterns that were those cloud native patterns. In fact, that's what I cover in my book: things like retries and circuit breakers and service discovery and those types of things. What we're doing now with cloud native operations and with GitOps is deriving these sets of patterns that are going to become ubiquitous in the coming years and are going to be applied for great gain.

So with that, I thank you for your attention, and I hope you enjoy the rest of the conference. Thank you so much. Have a great time.