Hello and welcome to GitOps for multi-cloud applications. My name is William Chia. I'm part of the product marketing team at GitLab, and together with my colleague, Cesar, we're here to talk to you about GitOps. You can find me at several places around the web. While you're watching this talk, feel free to send me a tweet and I'll be happy to respond to any questions or observations you have. I'd love to connect with you on LinkedIn as well. So as you look at this, we all know the landscape of software development is changing. When we look back on how software development has evolved, we can remember the days of waterfall development. Maybe some of us in large enterprises are even still doing some of this, where we're delivering software like it's hardware. That causes a lot of challenges. Innovations like Agile and DevOps have allowed us to move a lot faster while also being a lot safer: deploying with less risk more frequently, and being able to easily roll back changes. And the latest evolution, cloud-native applications, is about dynamic environments, ones that scale up and down automatically based on load, where things can shift and change, and different users logged into the same application can have a completely different experience based on their region or what they've interacted with. In order to manage all this complexity, we need an infrastructure automation practice that's able to keep up with it. And that's where GitOps comes in. In this talk today, we're gonna tell you what GitOps is, why GitOps is important, and we'll give you a hands-on demo so you can see step by step how you can do GitOps yourself. So to get GitOps: what is GitOps? GitOps is an operational framework that takes DevOps best practices, things we've used for application development for years like version control, collaboration, compliance and CI/CD, and applies them to infrastructure automation.
In essence, GitOps has three major components. These are infrastructure as code; merge requests, or what we call MRs (some Git platforms call them PRs, pull requests; at GitLab, we call it a merge request), and the MR is gonna be our agent of change; and automation: we're gonna use CI/CD, but essentially we wanna automate our changes. When you combine these together, you get the practice of GitOps. Let me step through each of these briefly: infrastructure as code, merge requests as your agent of change, and CI/CD. Infrastructure as code has been around for a long time, and when I'm thinking about GitOps, I like to think not just about infrastructure as code; sometimes I'll just say "X as code," because this could be your infrastructure or configuration, it could be policy as code or security as code. Any type of operations that you're doing that you can define in a text-based format, or define as code, all of a sudden becomes very, very powerful. All of a sudden you can version it, you can collaborate on it using standard development collaboration tools like Git that software developers have enjoyed, but now we can use them for operations. And so "X as code" is a powerful construct. In particular, more and more we're not just writing procedural code (you can do GitOps with procedural code), but more and more we're using declarative tools, tools that allow us to simply describe a desired state, and our system is gonna enact that. Tools like Kubernetes and a lot of modern infrastructure as code tools, Terraform and others, allow you to work in a declarative fashion, and this is what really starts to unearth some power.
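To make "declarative" concrete, here is a minimal Terraform sketch; the resource type, names, and values are illustrative, not taken from the talk. The point is that the file states only the desired end state, and the tool works out the steps to get there:

```hcl
# Illustrative declarative definition: we describe the desired end state only.
# Terraform compares it to reality and computes the steps to converge.
resource "google_container_cluster" "primary" {
  name               = "demo-cluster"  # hypothetical cluster name
  location           = "us-central1"
  initial_node_count = 2               # a desired count, not a procedure
}
```

Running `terraform apply` against a file like this repeatedly converges the real infrastructure toward the declared state, which is what makes declarative tools such a natural fit for a Git-driven workflow.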
Of course, as I mentioned, this is stored in Git version control, and so all of the Git tooling and standard practice that engineers and developers are familiar with, you get to leverage that with your operations when you're doing infrastructure as code. But just having your infrastructure stored in a Git repo, that's not really enough to get the collaborative power of GitOps; you need to be using the merge request as your agent of change. The way this works is that the merge request or the pull request serves as a gate for any changes that go out. So you would have a main branch that is tied to, let's say, an environment. We could call it the production branch, the main branch, the master branch, the trunk branch, or it could be related to, let's say, a staging environment or any other type of development or testing environment. Whatever environment you have, that branch represents the state of that environment, and whenever you wanna make changes to it, you create a new feature branch the same way an app dev team would for a software development feature, but in this case it's a proposal for infrastructure changes. That way you can collaborate on that branch, and when it gets merged back into that main or trunk branch, that enacts the change in that environment. What this allows you to do, through that merge request, is code review, collaboration, approvals, and managing all of your compliance. So there are a lot of powerful elements here: these collaboration capabilities, being able to do code review and collaborate between developers, security practitioners, and operations engineers, that's what's really powerful about GitOps. And certainly important is this automated reconciliation, or CI/CD. We're gonna use CI/CD as a type of reconciliation loop. You can imagine that anytime the state of the environment is out of sync with what we've defined as our declarative state in our Git repository, we want those to be matched up.
So the CI/CD can run every time you merge to master, let's say, or it can run on a timer. And what this allows you to do is tackle things like configuration drift. Sometimes you configure an environment and then, for whatever reason, whether it's manual changes or errors, it just drifts away from your configuration. The next time that CI/CD runs, the next time that automation loop runs, it goes and gets the latest state, the source of truth, from your Git repo, and it updates. Now this could happen in two different ways when you're using GitOps; there are two types of GitOps. It could be agent-based GitOps, where you have an agent that runs within your infrastructure. This is particularly popular when you use Kubernetes: you can have an agent running inside of the Kubernetes cluster, and that agent continually pulls in configuration from the Git repo. Or it could be push-based, where you have that CI/CD running and it pushes into the cluster. There are advantages and disadvantages to both. The core component here is that changes are implemented automatically. We are no longer making manual changes to infrastructure; all of the state of the infrastructure is stored in Git, and all of those changes are applied by our automation tool, by our CI/CD. So the GitOps flow ends up looking very similar to a software development flow. You would create an issue: we need to expand the number of nodes in our node pool, or we need to lessen resources for a particular service because it doesn't get as much traffic and we want to save on cost, or we need to investigate because something went down. Based on that issue and the defined problem that we want to go tackle, you create a merge request with a branch, you run that automation, and then when it's merged back into master you have your running production environment. So some folks might ask, well, this sounds a lot like infrastructure as code.
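A push-based reconciliation loop like the one described above can be sketched as a GitLab CI/CD pipeline. This is a minimal, assumed configuration, not the exact file from the demo; the stage names, image, and branch rule are illustrative:

```yaml
# Illustrative .gitlab-ci.yml sketch of a push-based GitOps pipeline.
stages: [validate, plan, apply]

image:
  name: hashicorp/terraform:light
  entrypoint: [""]

validate:
  stage: validate
  script:
    - terraform init
    - terraform validate

plan:
  stage: plan
  script:
    - terraform init
    - terraform plan -out=tfplan   # reviewers inspect this output in the MR
  artifacts:
    paths: [tfplan]

apply:
  stage: apply
  script:
    - terraform init
    - terraform apply -auto-approve tfplan
  rules:
    - if: $CI_COMMIT_BRANCH == "master"   # only reconcile from the main branch
```

The key design point is that `apply` only runs on the branch that represents the environment, so merging the MR is what enacts the change.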
What's the difference? Well, with infrastructure as code, the code may or may not be version controlled, right? You can take a YAML file, or you can have a text-based definition in Linux. We've been using .conf files for a long time, and that .conf file, that configuration file, can live on the server: you SSH to the server and you update that text file. You could call that infrastructure as code, but that's not GitOps. You're not storing that code in a Git repository. You're not using all the power of Git version control to do things like roll back, right? Or roll forward. This is particularly powerful. Let's say something goes wrong, and in the middle of a firefight you just need to get that infrastructure up and going as soon as possible. But then later on, you want to do some introspection, some forensics, and figure out what went wrong. Well, if you just have the code stored in a text file somewhere, it's been updated and the state and history have been lost. But if you're doing GitOps and it's stored in Git, you can actually see what the state of that infrastructure was before you made the changes. You could spin up another environment and do your forensics and introspection there. Same thing with infrastructure as code: those changes may or may not go through any type of code review or approval process, but with GitOps, we use the merge request as the change agent. This is so much faster and so much more automated than having your change management meeting, right? Rather than having to sit around and discuss what goes in and what goes out and what's on the calendar, everything is just automated. You have your automated tests. A lot of times the security approvals can be automated, so you can still have compliance, but it's super lightweight and it's automated when you're doing GitOps. And finally, as I've kind of already said, changes could be applied in many ways if you're just doing infrastructure as code.
You can SSH to the server; changes may or may not be automated. But when we're doing GitOps, we're absolutely automating our infrastructure. So in a nutshell, you could say that GitOps is infrastructure as code done right. If you're doing it with these best practices, we're gonna call that GitOps. So why is GitOps important? Well, you get a lot of benefits. Self-documenting environments: the code is there, and you can now share knowledge among your teams. When somebody's clicking in a GUI, how do they tell other people what they did or how they did it? And as I mentioned, maybe you wanna duplicate that environment, let's say to spin up another one, or to do forensics after a firefight, as I mentioned before. You get these version control benefits, the ability to roll back and roll forward. Sometimes the app version depends on the infrastructure configuration and vice versa. Let's say, for example, you have a certain memory leak in the application, so the application takes a lot of RAM and a lot of compute power, and the developers go and fix that. So with the new version of the application, we can have more lightweight infrastructure running that same service; it doesn't require as many resources. But all of a sudden we find out there was a bug in that new version of the code, so we actually have to roll back the application to the previous version, which didn't have this really terrible customer-facing bug, but did still have the memory leak. It wasn't as efficient. Well, if you don't have the ability to roll back your infrastructure at the same time, and you deploy that older application where there are not enough resources, you're gonna crash the whole environment. So this ability to version your infrastructure alongside your application, and have the two move together depending on the needs of each, is really powerful, and it helps with misconfiguration of the infrastructure.
And this can really, really help your mean time to recovery. When things go down, having it in version control sometimes just means being able to quickly roll back to the last known good config; we roll back, we get up and going quickly, and then we can go back and try to figure out what's going on. We have these automation benefits: we can deploy faster, more often, with less risk. I've already talked a little bit about configuration drift, a key benefit of GitOps. And there are these security and compliance benefits. Especially at a lot of enterprises, or a lot of businesses and organizations in regulated industries, you need to have a set of compliance policies that you follow, and this can mean permissions to your particular environment. This is compliance and collaboration, and you can have both. Sometimes you have only compliance, where you lock everything down and only a few people have access to the infrastructure tools and can make infrastructure changes, and you're constrained by your compliance. But with GitOps, you're unleashed within your compliance: anybody can propose a change via an MR, because it's just code, right? Anybody in the org, any developer, any engineer can propose an infrastructure change, but potentially only a few people have the access to merge that change. And so that's how you can stay compliant and have collaboration. You can use all of the permissioning of Git; you get that for free, in effect. You get change process compliance as well, because you have an audit log of all your changes in your Git repo. Really, really powerful. So with that, we've talked about what GitOps is at a high level and why GitOps is important. At this point, I hope you are excited to see a hands-on demo, and I would like to pass it on to my colleague, Cesar. Thank you. Thank you, William. My name is Cesar Saavedra and I'm a technical marketing manager at GitLab.
Here you can see my social handles. Please feel free to reach out to me via any of them. So I'm going to be covering first a push-based GitOps demo for the CI/CD component, which we've been calling agentless, and then later on we'll do the pull-based one. In this scenario, I'm going to be talking about three users. Sasha is a developer in this organization, Devon is a DevOps engineer, and Sydney is a database administrator. Sasha is going to be requesting a database provisioning from Sydney, and Sydney is going to loop Devon into the whole collaboration effort. So here you see the GitOps project is structured into two different sub-projects. This is Sasha's view, or console. There is a set of projects under the infrastructure group and another set of projects under the application directory. This is a separation of duties and concerns within this project, and it optimizes the work and collaboration among the stakeholders that will be working towards a solution within the MR. So let's go to Sasha's issue; I've pre-created one already. It's about her needing to have a database provisioned. Sasha created it and assigned it to Sydney, the database administrator, and here are the details of this request and its requirements. Notice also that this is already in Sydney's work list, her board of to-do items. So let's go here to Sydney, and when she opens her boards, you'll see that the item is already in her to-do board. So let's go into the issue itself, and what Sydney's gonna do here is read over the problem and start a merge request; other competing solutions call this a PR. The merge request is where all the collaboration is gonna happen among the stakeholders. Here she's gonna loop in Devon, the DevOps engineer, just to make sure that everything is gonna be okay with the infrastructure, with respect to the Kubernetes environments that Devon oversees.
Now she's gonna open the Web IDE, navigate to the new file option, and create a new Terraform file for the creation of this database. In this case she's using Terraform, and she's gonna paste in the Terraform file to create the MySQL database. She also checks that she's part of the code owners file, which basically says that anything under this AWS project, she needs to approve. So she's gonna commit the changes to that newly created branch. This kicks off a pipeline with a merge review job and a plan stage, which is a stage where stakeholders will have a chance to review all the changes before they are rolled out to production. So now let's switch to Sasha's console. She's going to refresh her page and notice that there's been some activity in the MR, so she's gonna go into it. She's gonna notice that there is a Terraform plan that has been run as part of the review step. So she's going to go ahead and expand that artifact and click to view the full log. Here she checks the Terraform plan output and ensures that everything is good for the creation of the database, right there. She checks that all the parameters are correct, and everything looks good. So she's going back to the MR now, and she's gonna go into the changes. This shows all the changes that have happened to the infrastructure as code; in this case it's Terraform, but it could be CloudFormation or any other type of technology for doing this. And then here she's going to add a comment, an inline suggestion actually, for increasing the allocated storage from five gigabytes to ten. Again, this is showing how easy it is to collaborate within an MR. All right, so let's switch to Devon. This is Devon's console. He's the DevOps engineer that oversees the Kubernetes clusters, among other things, and he's gonna navigate to the AWS project.
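The talk doesn't show the file's contents, but a Terraform definition for a MySQL RDS instance like the one described might look roughly like this. All values are illustrative assumptions, and the hard-coded credentials are deliberately the kind of thing the review process catches:

```hcl
# Illustrative sketch of a MySQL RDS definition like the one in the demo.
resource "aws_db_instance" "app_db" {
  identifier          = "app-db"
  engine              = "mysql"
  engine_version      = "8.0"           # assumed version
  instance_class      = "db.t3.micro"   # assumed size
  allocated_storage   = 5               # the value later bumped to 10 in review
  db_name             = "appdb"
  username            = "admin"         # hard-coded: a reviewer would flag this
  password            = "changeme123"   # insecure on purpose: review catches it
  skip_final_snapshot = true
}
```

Because the file lives in the MR's branch, every one of these attributes is reviewable line by line before anything touches AWS.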
He's been looped into the MR, remember, by Sydney. He's gonna go into the MR and into the changes. This is something that concerns him, and he's going to participate in the collaboration as well. He's gonna make a quick suggestion here about updating the database to its latest version; so he's making an inline suggestion, like Sasha did. He's also gonna make an inline comment here about the username: it should not be hard-coded, it should be parameterized. And also an inline comment about the password: it should not be there, that's insecure, and it should really be a masked variable within the project. The last thing Devon's gonna do is go back to the MR and make a general comment to Sydney about the EKS cluster needing to be scaled up to two nodes to accommodate this change for the Java microservice. So here, let's go back to Sydney's boards. Oh yeah, of course, we forgot to move the issue to the doing board. Sydney's going back to the MR now. She's going to start reviewing all the review threads that have been added to the MR by the other stakeholders. So she's gonna go ahead and apply the inline suggestion from Devon, and that's with a click of a button. Same thing with the allocated storage inline suggestion from Sasha: she clicks to apply the suggestion, and that resolves the thread automatically. Then she's going to review Devon's comments about parameterizing the username and making a masked variable out of the password. Before she can resolve those two comments, she's gonna go to the actual code and open it in the Web IDE, and she's going to make the changes that resolve Devon's comments about the username and password. What she's gonna do is parameterize both of them, and then she's gonna commit the changes to the branch.
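The parameterization described above might look roughly like this in Terraform; the variable names are assumptions for illustration. The actual values would be supplied outside the repo, for example as GitLab CI/CD variables, with the password marked as masked:

```hcl
# Illustrative parameterization: credentials come from variables, not the repo.
variable "db_username" {
  type = string
}

variable "db_password" {
  type      = string
  sensitive = true   # keeps the value out of plan output and logs
}

resource "aws_db_instance" "app_db" {
  # ... other attributes unchanged ...
  username = var.db_username
  password = var.db_password   # supplied via a masked CI/CD variable
}
```

This way the Git history stays clean of secrets while the infrastructure definition remains fully declarative.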
She's gonna go back to the MR and reply to Devon's comments about the username and password, as part of the collaboration that the MR enables them to do. The last thing she's gonna do, now that the threads are resolved, is add a comment asking Devon to create an issue to increase the number of nodes in the EKS cluster during the next sprint. Now, since Sydney is part of the code owners file, she needs to approve the MR, which she does now. Code owners outlines the exact users and groups that own certain files and paths in the repo, and it streamlines the merge request approval process. Now we're going back to Sasha's console. The MR requires two approvers, so Sasha's going to approve the MR as well. And with this, she can mark the MR as ready before she merges it. Now, notice there are some unresolved threads. If you would like the merge to be dependent on all the threads being resolved, you can set that up in GitLab as well, but in this case I don't have that set up, so I can merge with unresolved threads. So now Sasha merges the MR, which is going to push the changes to production: it's going to fire up a pipeline with stages that will apply all the infrastructure changes, in this case the creation of a MySQL database, out to production. You can see the pipeline there has been launched, and this pipeline has stages that will validate the configuration file, plan, and then apply to production. So let's go to the RDS service on AWS. And there is a database coming up there. Let's go into it and make sure that the configuration matches what the MR merged into the repo. So the DB name is app DB, and let's visit the configuration tab of the database. The engine version is the correct one per the configuration in the repo.
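The code owners mechanism mentioned above is itself just a file in the repo. A sketch of what it might contain for this demo follows; the path and username are assumptions, not taken from the recording:

```text
# CODEOWNERS (illustrative)
# Any change under the aws/ path requires approval from the DBA.
aws/  @sydney
```

GitLab reads this file and automatically adds the listed owners as required approvers whenever an MR touches the matching paths, which is how the approval gate stays enforced without any manual process.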
And the storage is 10 gigabytes, which was increased during the collaboration among the stakeholders in the MR. So here the database is up and running, and it's available now. All right, let's go back to Devon's console. Sydney had asked him to create an issue to take care of increasing the number of nodes in EKS in the next sprint. For brevity and for the sake of time, let's just make the change directly in the repo, because I have another demo to show you later. So this is the cluster itself on Amazon; as you can see, it has only one node running, with an instance ID. So let's go directly to the Git repo and update the minimum size of the cluster from one to two, and let's commit the changes. All right, so we've committed to the mainline, the master branch, and that's going to fire off another pipeline here. You've seen this before: it's gonna do the validate, the plan and the apply. And once it completes, you should see a second instance deployed to the cluster. And there you go: now we have two nodes in the cluster up and running on AWS. One more thing we can change here: Sydney notices that deletion protection is actually disabled, so this database could be deleted at any time by anyone with the right privileges. What she wants to do, as a database administrator, is turn this on so that the database is not deleted accidentally. So let's go back, again for brevity and for the sake of time, directly to the configuration file in Terraform and make the change to the appropriate field, which is that one: deletion protection. So let's edit the file, change deletion protection from false to true, and then commit the changes. One more time, a pipeline is going to be launched that will go through the steps of pushing this out to the running infrastructure.
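Both of the edits described above are one-line changes to the Terraform files. Sketched out (the block names and fields are assumptions; the exact structure of the demo's repo isn't shown), they would look something like:

```hcl
# Illustrative: the two one-line GitOps changes from the demo.

# 1. Scale the EKS node group from one node to two (field names assumed).
resource "aws_eks_node_group" "app_nodes" {
  # ... other attributes unchanged ...
  scaling_config {
    min_size     = 2   # was 1: adds a second node to the cluster
    desired_size = 2
    max_size     = 2
  }
}

# 2. Guard the database against accidental deletion.
resource "aws_db_instance" "app_db" {
  # ... other attributes unchanged ...
  deletion_protection = true   # was false
}
```

In both cases the commit to master is the entire operational action; the pipeline's validate, plan, and apply stages do the rest.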
So let's go to the AWS database console, and you can see the deletion protection is disabled. So let's refresh. Well, let's wait; let's check if the pipeline is done. Now it's finished, and we can go back to AWS and do a refresh on the screen to make sure that the deletion protection has been enabled. And there you go. So the configuration has been successfully pushed out to the infrastructure. You've seen the MR as the agent of change and the push-based approach to CI/CD for infrastructure; we've been able to do GitOps in this first demo. One more thing I'd like to show you: the title of this session is multi-cloud applications, and so far we've only been using one cloud. So I just wanna show you that these microservices run on other clouds as well. There are a few of them, three actually. So here you can see the clusters. We do have an EKS cluster, which is what you've been seeing so far, and we also have a GKE cluster that is also running for us. This is the configuration for the EKS cluster, which you can see through GitLab's Kubernetes integrations; you can very quickly spin up and destroy Kubernetes clusters. And here you can see the GKE configuration, called gitops-csaavedra-gke. Now let's go to the applications group, where we have a Python application, a microservice, and let me show you how it's running on GKE; this is a second cloud. So we go to the environments dashboard. We can see that there are two environments running for that microservice: the DAST default environment, which is the dynamic application security testing environment, and this is just a simple microservice that says "hello from Python." There's also a production environment that is also running on GKE, and it also says "hello from Python." So that microservice is running on GKE. This other one, called Spring MVC JPA, is running on EKS. So we go to the environments dashboard for it, and you can see two environments.
There's a staging environment and a production environment. You can quickly access the one in staging, and then through this live environment link we can open the one in production. This is actually a running product inventory microservice. We can just add an entry there, and you can see that it's now saved to the database, that MySQL database that was provisioned earlier. So not only does doing GitOps with GitLab foster collaboration, it also helps you with compliance and audit, and with automated reconciliation of your infrastructure updates, which results in a higher fidelity of your infrastructure in production. Although you can use push-based CI/CD with GitOps, you also have the pull-based approach to CI/CD. And why is this needed? Many organizations cannot open their clusters to the internet, so the pull-based approach is good for them. Here is a high-level architecture of our solution. We use an agent that is deployed to the cluster, and there's a server side as well, called KAS, the Kubernetes Agent Server, that is listening on the GitLab side. The agent connects to the server side and waits for requests to process. Here is more detail about the agent server and what it does: it authenticates the agents that are running on the cluster, it fetches configuration for the agents from the corresponding repo, and then it keeps polling for incoming communication. This is a high-level relationship diagram of the agent and the server-side process, and this is a workflow of the whole process: the agent checks with the server every so many seconds to see if there are any updates to the configuration of the project. We use the GitOps Engine open source project for this implementation; the Argo CD folks and the Flux folks came together and are now collaborating on this GitOps Engine project. So let's go over what we call the pull-based GitOps demo.
Here I've already set up a GitLab instance on EKS. Let's go and sign in to it; this is the password for the root user of this GitLab instance. So let's log on as root, and we're gonna create two new projects. The first one is gonna be what we call the GitOps project, and this is the project that is going to be observed by the Kubernetes agent server running on GitLab; it's gonna be observed for changes in the configuration of the infrastructure. In this GitOps project, we're going to create a file called manifest.yaml, and it's gonna be empty to start with; you'll understand in a minute why it's empty. Then we need to create a second project, which is gonna be called Kubernetes agent, and this holds the configuration of the agent that will be running on the Kubernetes cluster. We also need to create a directory under .gitlab slash agents, and the third directory is the name of the actual agent that is gonna be running in the Kubernetes cluster; in this case it's called agent one. And then here we're going to give it a config.yaml with the configuration of that agent, which we can copy from the documentation. This configuration YAML is telling the agent which project to observe; in this case it's the root GitOps project that it is going to keep track of. Now, before we deploy, there's actually one more step: we need to register the agent with GitLab. So we have to log on to the Rails console, and the way we get there is through the runner: we log into the runner pod and then enter the Rails console, and here we're creating the project record, the agent, and a token for it. We're gonna copy this token, because we're gonna need it when we configure the agent in the cluster itself.
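The agent's config.yaml is small. A sketch of its shape follows, based on the configuration format documented for the GitLab agent around this time; the project path is an assumption matching the demo's root-owned GitOps project:

```yaml
# .gitlab/agents/agent-1/config.yaml (illustrative)
# Tells the agent which project's manifests to watch and reconcile.
gitops:
  manifest_projects:
    - id: "root/gitops-project"   # assumed path of the observed project
```

The directory layout matters here: the file's location under `.gitlab/agents/<agent-name>/` is what ties this configuration to the agent named agent-1.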
So here we're creating the Kubernetes secret for the agent, holding the token that the server side will use to authenticate it, and this is the resources YAML that basically describes the configuration of the agent that we'll be deploying to the Kubernetes cluster. The communication between the agent and the server side is gonna use WebSockets; in this case the endpoint happens to be the domain of the GitLab instance. Now we're going to go ahead and create the pod for the agent itself in Kubernetes, and then you can see the agent is now running in the cluster. Now we're gonna go to the GitOps project, the project that is being observed for changes, and we're going to go ahead and edit it. We're gonna paste something from the documentation; it's just a manifest file that contains an Nginx deployment, and it's gonna be deployed under the same namespace as the GitLab agent. That's Nginx right there, and there are two replicas. Once we save this and it's committed to the mainline, the server side will detect that update and communicate it to the agent; the agent is actually polling, and as soon as it detects the configuration change it will update the cluster. As you can see here, the agent has already deployed the two instances of Nginx. And now, so that you can see how modifications can be immediately detected by the agent, we're gonna increase the replicas to three. The agent is gonna detect this change and go ahead and act upon it, instantiating another pod with a third Nginx, which is right there, and now we have three.
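The manifest.yaml pasted into the GitOps project is essentially a standard Nginx Deployment. A sketch along these lines (the namespace and image tag are assumptions) shows why bumping one number is all it takes:

```yaml
# manifest.yaml (illustrative): the desired state the agent reconciles.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: gitlab-agent   # assumed: same namespace the agent runs in
spec:
  replicas: 2               # changing this to 3 is what spawns the third pod
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21   # assumed tag
          ports:
            - containerPort: 80
```

Because the agent pulls this file from Git and applies it, editing `replicas` in the web UI and committing is the entire deployment workflow; no kubectl access to the cluster is needed.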
So this is a quick demo of the GitLab Kubernetes agent. Here are the clusters running on GKE, and if we review the dashboards, we can look at the dashboard of the server side first, which is called KAS, the Kubernetes agent server. If we go into the logs, you can see the different events that took place. First, it's listening on port 5005; remember, that's the WebSockets port that the server and the agent are talking on. Here it's detecting modifications to the GitOps project, which is the one being modified, and it's also checking the identity of the agent running on the cluster itself. And this is the agent log: the agent talking to KAS, which is running on the server side. This is the agent running on the Kubernetes cluster itself, for the case where organizations cannot make their clusters available on the internet. This agent is the one running on that cluster, talking to KAS, the Kubernetes agent server running on GitLab; KAS authenticates it and then informs the agent of any updates that have taken place on the GitLab side, and the agent gets those updates and applies them to the cluster on which it's running. And here are the different log events for the agent itself running on the Kubernetes cluster. So again, this is a demo of pull-based CI/CD used in GitOps. You can use push-based and pull-based; these two are not exclusive of each other, they are actually complementary, so you can use the push-based approach together with the pull-based approach in combination. It just depends on what your needs are; different customers have different needs, and GitLab has you covered in either case. So that's all we have. Thank you so much, and until next time.