At the start of the day, I mentioned that day two would be about the how: how do we get started? How do we go about leveraging this DevOps platform to become more productive right away, right now? Well, if you're interested in infrastructure automation and GitOps, I'm really excited to introduce this next session for you, in which our sponsor, Amazon Web Services, will start with an overview of GitOps, and then they will dive directly down into the details you need to use GitOps in GitLab to deploy to AWS. Let's watch. Hello. Hello. Good morning. Good evening. And welcome to this session to learn about GitOps on AWS and building a resilient distribution with GitLab. I hope you're all having an awesome day with the amazing session lineup here at GitLab Commit 2021. We have a lot to cover in the next 20 minutes or so, so let's get started right away. A quick introduction about me: I'm a specialist solutions architect here at AWS with over 16 years of hands-on experience in application and infrastructure development. I'm really passionate about building large-scale solutions for developers and enterprise customers. I've worn multiple hats ranging from software developer, architect, and engineering manager to product manager, across a variety of industries ranging from financial services to retail. When I'm not building with my customers and partners, I love spending time running behind my three-year-old daughter, indoors and outdoors whenever Seattle weather permits. Today, I'm going to talk about GitOps. We will first cover the challenges that led the community to think about GitOps and then see what exactly GitOps is. We will then dive deep into the architecture specifics of delivering a resilient distribution using GitLab's GitOps solution on AWS. We will finally see the solution in action and leave some time towards the end for questions and answers. Let's get started.
Let's first see the problem statements which led the community to think about GitOps. With the evolution of microservices and application modernization, teams are becoming more distributed, and the line between application ownership and infrastructure ownership is blurring. Additionally, cloud infrastructure can be created with just a few lines of code, making it easier than ever before. Because of this, customers need a way to manage application code and infrastructure in a collaborative way, a way that ensures infrastructure is consistent as accounts and teams scale. With the evolution of containerization, it has become even easier to apply the same concepts of infrastructure as code to standardize, optimize, promote, and enforce application-specific deployments, which have historically been taxed with the underlying needs of operationalizing applications. With good intentions, of course, these taxes, which went hand in hand with any application's production readiness assessment, were related to: A, automating processes such as risk, change, and incident management; B, providing a standardized set of tools and runtime environments to comply with regulatory needs; C, providing an outstanding user experience to promote application innovation and provide telemetry information related to metrics, tracing, logging, and so on; and lastly, automating application credential management in terms of Kubernetes secrets, certificates, service roles, and bindings. Now, however you choose to implement this, it all comes down to the same semantic functions. You need to map deployments to commits, which is about using Git and understanding that when you build a container from a repository, you know what version it is. You need to manage the versioned deployments. You need to configure granular access, if at all possible without too much additional overhead, so that the versioned deployments get into the specific service you intend to deploy to.
Also, you need to configure and deploy the application and its interactions with other applications, or even with AWS managed services, for example, an application using an RDS database with secrets stored in AWS Secrets Manager. And that leads you to a lot of implementation details of the DevOps methodology. But DevOps isn't a product or a piece of software that you buy. It's a philosophical approach for bringing ownership of the code closer and more into the wheelhouse of the developers that actually write it. And these are all the aspects and the things that need to be automated when practicing DevOps. Between infrastructure as code, service code, Dockerfiles, and Kubernetes manifest files, someone ends up owning a big piece of code that automates everything between pushing code to Git and getting it all the way down to a Kubernetes cluster. The practice of defining infrastructure as code is the key to realizing the benefits of moving to the cloud. However, defining infrastructure as code is just the beginning. Applications themselves need to solve complex problems, requiring complex infrastructure. Managing change across these complex applications is the key to being successful with infrastructure as code. To do this, we apply DevOps practices to infrastructure, treating it the same way we treat application code. The most effective change management is achieved by firms that emphasize a high degree of testing and deployment automation, a high degree of automated risk mitigation, less rigid and much less manual approval processes, writing changes as code essentially, allowing employees more scope to influence change management, and lastly DevOps processes and culture. Many of the things we have talked about so far that are key to DevOps are also key to GitOps. For example, about seven years ago, there was a security bug in the OpenSSL cryptography library, which is a widely used implementation of the Transport Layer Security protocol.
It was called Heartbleed and was introduced into the software in 2012. Prior to infrastructure as code, fixing this would require you to search all your systems one by one and apply patches wherever needed. With infrastructure as code, you wouldn't need to search at all, because there is a single definition file requiring a single patch. If that infrastructure as code file was managed through GitOps, teams could simply roll back the changes to an earlier state, because they would know exactly when that bug was introduced into the code base. That sounds like the right moment to introduce GitOps. We at AWS are committed to making life simpler by providing tools and technologies which let you focus on the business logic that differentiates you and is the value proposition for your users. Unless there is a good reason to write and own all the code yourself, lean on systems and platforms like GitLab or AWS, and run workloads using Kubernetes on EKS. Really, when you see the list of things that you need to automate the entire workflow end-to-end, you arrive at a logical grouping called GitOps. The foundational best practice of GitOps is combining infrastructure as code, hosted in Git with version control, with automated continuous integration to bring scalability, consistency, and automation to infrastructure provisioning and management. This practice has many different names but is increasingly called GitOps, and it is designed to eliminate any out-of-band changes to an application's infrastructure. Infrastructure as code is the practice of defining your application's infrastructure in a code format, often as a template or file, and automating the provisioning of your infrastructure to match that file definition. With GitOps, the infrastructure definition file is hosted in a Git repository, and each commit flows through a merge request process, just like application software code, and through CI/CD automation.
The four guiding principles behind creating GitOps tooling are: firstly, a system managed by GitOps must have its desired state expressed declaratively as data, in a format writable and readable by both humans and machines. Secondly, the desired state is stored in a way that supports versioning and immutability of versions and retains a complete version history. Thirdly, software agents continuously and automatically compare a system's actual state to its desired state. And fourthly, if the actual state and the desired state differ for any reason, automated actions are triggered to reconcile them on a timely basis. The heart of a typical GitOps system is a Kubernetes operator. Operators are software extensions to Kubernetes that make use of custom resources to manage applications and their components. Operators follow Kubernetes principles, notably the control loop. A control loop is a non-terminating loop that regulates the state of a system. Overall, it is the responsibility of the Kubernetes operator, in this case the GitLab Kubernetes agent, which is deployed onto your workload Kubernetes cluster, or the workload EKS cluster, to ensure the state of the workload matches what is defined in the Git repository. The operator runs in a non-terminating loop which ensures desired state consistency. If any drift is detected, it will remediate and reset the state back to what was specified in the Git repository. Customers typically have more than one environment for dev, test, and production purposes, all separated by distinct EKS clusters, VPCs, or even different AWS accounts. At re:Invent last year, AWS announced preview availability of EKS Anywhere, which will push the implementation of EKS clusters beyond the boundaries of AWS accounts or regions, meaning an EKS cluster by itself can spread across geolocations, potentially near the edge, closer to customers. Most CI/CD tools available today use a push-based model.
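The control loop described above can be sketched in a few lines of shell. This is a toy illustration, not the GitLab agent's implementation: both states are stubbed with plain strings, where a real operator would read the desired state from the Git repository and the actual state from the Kubernetes API, and would converge them with the equivalent of a `kubectl apply`.

```shell
# Toy sketch of one iteration of a GitOps control loop. In a real operator
# this comparison runs in a non-terminating loop against live cluster state.

desired="replicas=3"   # what the manifest in the Git repository declares
actual="replicas=2"    # what is currently running in the cluster

# Compare actual state to desired state; on drift, reconcile.
if [ "$actual" != "$desired" ]; then
  echo "drift detected: $actual -> converging to $desired"
  actual="$desired"    # a real agent would apply the manifests here
else
  echo "cluster in sync"
fi

echo "final state: $actual"
```

The point of the pattern is that reconciliation is driven by the declared state in Git, not by an imperative deployment script, so any out-of-band change to the cluster is detected and reverted on the next pass of the loop.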
A push-based pipeline means that code starts with the continuous integration system and then continues its path through a series of encoded scripts to push the changes to the EKS cluster. The reason you don't want to use your CI/CD system as the basis of your deployments, specifically when scaling software deployments to distributed clusters spanning geolocations, possibly even edge locations, is the potential to expose credentials outside of your cluster. While it is possible to secure your CI scripts, you're still working outside the trusted domain of your cluster, which is not recommended. In a pull pipeline, a Kubernetes operator deploys new images from inside the cluster. The operator notices when a new image has been pushed to the registry. Convergence of the cluster state is triggered when a new image is pushed to the registry: the manifest file is automatically updated and the new image is deployed to the cluster. With a pipeline that pulls an image from the repository, your cluster credentials are never exposed outside of your production environment. Now let's dive deep into the architecture of GitLab's GitOps approach. The core component of GitLab's approach is the GitLab Kubernetes agent, which is installed on your EKS cluster and makes bidirectional gRPC calls to the Kubernetes agent server. If you are using GitLab's software-as-a-service solution, the Kubernetes agent server, to be specific, is managed for you; if you are using a managed or self-managed installation of GitLab on AWS, then you additionally need to install the Kubernetes agent server along with your GitLab installation. In this architecture, the GitLab Kubernetes agent periodically fetches configuration from the Kubernetes agent server, spawning a goroutine for each configured repository. Each goroutine makes a streaming gRPC call.
The Kubernetes agent server accepts these requests and checks whether the agent is authorized to access the specified GitLab repository. If authorized, it polls Gitaly for repository updates and sends the latest manifest to the agent. For those who are not familiar with Gitaly: Gitaly provides high-level RPC access to Git repositories. It is used by GitLab to read and write Git data. Before each poll, the Kubernetes agent server verifies with GitLab that the agent token is still valid. When the Kubernetes agent deployed on the cluster receives an updated manifest, it performs a synchronization using the Kubernetes cli-utils library. If a repository is removed from the list, the agent stops the gRPC calls to that repository. With that, what's the fun without seeing all of this in action? Let's get on to a demo right away. Let me switch my screens. The first step is to create a repository which hosts our Kubernetes manifest files. In this demo, I'm going to use a sample nginx deployment manifest file and call the repository application-project. The next step is to create an agent configuration repository. I'm calling this repository kubernetes-agent and naming the agent agent1. The config file, which contains information about the repositories the agent should sync, has to be saved at a specific path, the .gitlab/agents/agent1 folder, where agent1 is the name of the agent I'm specifying. Next up, as I'm using GitLab SaaS, we need to create a GitLab Rails agent record to associate the agent with the configuration repository project. What we are trying to do is essentially create a record which will associate the agent with a configuration repository project. Creating this record will also create a secret needed to configure the agent which will be deployed to the workload EKS cluster. You can create this agent with the GraphQL Explorer. I'm going to use the create agent mutation.
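For reference, the two files created in this step might look like the following sketches. The names, labels, and project path are placeholders, and the agent config schema shown (`gitops.manifest_projects`) is the one the GitLab agent documented around the time of this talk, so check the current GitLab agent documentation before relying on it.

```yaml
# application-project repository: a sample nginx Deployment manifest
# (names, labels, and image tag are illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          ports:
            - containerPort: 80
```

```yaml
# kubernetes-agent repository: .gitlab/agents/agent1/config.yaml
# "your-group/application-project" is a placeholder for the full path
# of the manifest repository the agent should sync.
gitops:
  manifest_projects:
    - id: your-group/application-project
```

The directory name under `.gitlab/agents/` must match the agent name, which is how the agent server finds the right configuration for agent1.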
The inputs that are required to be passed are the project path, which is the path to my configuration repository, and the name which I have used. In this case, my configuration project resides at my specific location here, kubernetes-agent, and agent1 is the agent name which I'm specifying. Once you execute that, you will receive the output with a specific cluster agent ID. It's best to save this ID because we will need it in the subsequent step. The next step is to create a token. For that, we will use a create token mutation. The create token mutation requires the ID of the cluster agent that was created. I will copy that and put it inside the cluster agent ID field. Once I run this mutation, I will receive the secret. We will need this specific secret in the subsequent step, so let us save the output here. The next step is to run a specific command which will install the agent inside my Kubernetes cluster. I have a Kubernetes cluster here. For test purposes, I will get all the namespaces so that it lists them. We will see I have created a namespace called gitlab-kubernetes-agent. Let me just increase the font size so that it is clear to everyone. Let's see the resources which are inside the namespace which I have created: kubectl get all inside this namespace. You see that it currently does not have anything. The next step is for me to execute a command which will create the GitLab agent resources. The command looks something like this. In this command, you will notice a few of the inputs that I have passed, the agent token to be very specific. This agent token needs to match what we received in the previous step. The Kubernetes agent server address, as I am using GitLab SaaS, has to be exactly like this. I am going to deploy the agent in this namespace. If you want to see exactly the resources which it creates, you can just use this.
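The two mutations used in this part of the demo can be sketched as below. The mutation and field names (`createClusterAgent`, `clusterAgentTokenCreate`) are as I recall GitLab's GraphQL API at the time of this talk, and the project path and agent ID are placeholders; verify the exact schema against the current GitLab GraphQL API reference before using it.

```graphql
# Step 1: create the cluster agent record, associating the agent name
# with the configuration repository project. Save the returned ID.
mutation createAgent {
  createClusterAgent(
    input: { projectPath: "your-group/kubernetes-agent", name: "agent1" }
  ) {
    clusterAgent {
      id
    }
    errors
  }
}

# Step 2: create a token for that agent, passing the ID from step 1.
# The returned secret is shown only once, so save it for the install step.
mutation createToken {
  clusterAgentTokenCreate(
    input: { clusterAgentId: "gid://gitlab/Clusters::Agent/<id-from-step-1>" }
  ) {
    secret
    errors
  }
}
```

Both can be run from the GraphQL Explorer in GitLab while signed in to the account that owns the configuration project.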
You will see that this specific command will create pretty much everything which is required to run a deployment of the GitLab agent into your cluster. It will create the deployment, the necessary secret, service account, role, cluster role, cluster role binding, and so on. Let us run this with kubectl apply so that it actually applies this to the cluster. Let's wait for a minute or so while it creates everything. Let's see: kubectl get all in the gitlab-kubernetes-agent namespace. We will see that it created the pods, the deployment, and pretty much everything. We also see now that the specific deployment that we actually wanted to deploy has already started coming up. Let's examine what exactly happened here. If we do kubectl logs on the GitLab agent pod, let me just copy it here, I just want to see the logs of this agent, you will see that when this agent ran, it started syncing the repository from the master branch. It picked up the configuration that we specified and started to sync the repository which we mentioned there. When it synced the repository, it fetched the entire configuration, it fetched the entire repository, and then created the resources which are specified in the repository. If we look here, my repository specifies two replicas of nginx, and it deployed exactly those two replicas. Let's check once again and see all the resources which are specified here. It has two replicas of my deployment. Now, just for sanity, let's modify the number of replicas to make it three and see what exactly happens, just to see how much time it takes to deploy to the cluster. Once I have committed this, it has already gone in. Let's see, it is still in the process of syncing. Let's see the logs here. We'll see that it started syncing the repository, and now we have three replicas up and running. That is the way to achieve GitOps: I made a change in the repository.
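The install and verification steps walked through above might look like the following commands. These are a reference sketch, not a runnable script: they assume a live EKS cluster, Docker, and GitLab SaaS, and the agent image path, deployment name, and manifest name shown are assumptions based on the demo and GitLab's documentation of that era, so check the current install instructions.

```
# Generate the agent manifests from the token and apply them to the cluster
# (image path and flags as documented around 2021; verify against current docs).
docker run --pull=always --rm \
  registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/cli:stable generate \
  --agent-token=<secret-from-token-mutation> \
  --kas-address=wss://kas.gitlab.com \
  --namespace gitlab-kubernetes-agent | kubectl apply -f -

# Confirm the agent and its resources came up.
kubectl get all -n gitlab-kubernetes-agent

# Watch the agent logs to see it sync the manifest repository
# ("gitlab-agent" is an assumed deployment name).
kubectl logs -n gitlab-kubernetes-agent deployment/gitlab-agent -f

# Confirm the nginx deployment matches the replica count declared in Git.
kubectl get deployment nginx-deployment -o jsonpath='{.spec.replicas}'
```

After committing the replica change to the manifest repository, rerunning the last command should show the new count once the agent's next sync completes.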
Typically, you'll follow the entire process: you'll create a merge request, you'll have a peer review, and so on and so forth. In this demo, as soon as the code is merged into the main branch, it is the responsibility of the agent to sync up the entire repository with your Kubernetes cluster. With this, we are at the end of our talk and our session. Thank you so much for joining me today, and I'm looking forward to any questions that you may have. Thank you, everyone. Goodbye.