Okay, my name is González Cerdo. I'm currently working at Iberus as a software engineer, on a team that is implementing all the Argo CD part for the ministries, so I'm part of the DevOps team there. So, how did it all start? This is how some conversations with customers start: "Hey, we hear a lot about Kubernetes, we are using it, only one branch" — and then it goes, "Yeah, but maybe our team is not ready for this." So, this was the legacy model, how it started. It was just a single branch: no PRs, no approvals. They had conflicts, because code reviews were limited to "hey, please check my commit" on the single branch. So they had a lot of conflicts, with all the developers pushing lots of changes. And about the Argo CD part itself: there is no Argo CD in the legacy implementation. They worked with Azure DevOps, but pushing the changes to an ECS cluster instead of Kubernetes. So they worked in an automated way where the image was pushed to an ECR registry, an image definition was pushed to an S3 bucket, and it was deployed to ECS. But the creation of the tasks themselves on ECS, the services, all the configuration, the environment variables, all that stuff was done by the sysadmin team. This was one of the main bottlenecks at the ministry: all the interaction was done through a ticketing and mailing system. So if you needed some kind of change on the port of the application, on environment variables, or any change to improve the application, you needed to talk with the sysadmin team via mail and with the team manager. It was really slow: you had an infinite loop of mailing, and it was pretty slow to maintain the applications. So, the first thing we modified — and this is pretty important because it's the base of GitOps CD itself — is that we moved from the single branch with no pull requests, no approvals, nothing else, to a Gitflow-based model.
Maybe it's not the best model, maybe not the worst, but it was one of the models already used by some of the development teams at the ministry — and at the ministry there are tens of development teams, so we needed to specify one model for all teams working across the platform. We enabled pull requests, at least for code review. We moved to a multi-branch model, one branch per environment, which we will see later on Kubernetes with Argo CD. We disabled — this was a really problematic issue, for example — direct pushes to the main branches. Then we modified the pipelines to work in a different way: instead of having to modify the task definitions and all that stuff manually, contacting the sysadmin team via ticketing, we implemented the build pipeline itself. We also improved the image analysis and the source code analysis, in a more automatic way. That is all the CI part; on the Argo CD part, which is more important — the CD part — we changed from a single repo with one branch to multiple branches on the source code repo, plus another repo for the manifests themselves. We will see that we changed at first to multiple folders in the same repo, and later to multiple repos, one per environment. And the S3 bucket with ECS was replaced by Argo CD with Kubernetes instead. Okay, so at first we started with a small file, an image definition pushed to ECS, and we changed all of that to prepare the base templating for Argo CD itself, okay? We started with Kustomize and later moved to Helm; we will talk more about it in the deep dive. So about this — I don't know, maybe all of you know Kustomize — we started with a single repo separated into different folders.
So you have one overlay for each environment, and the base manifests are in another folder. When you need to apply a change for a specific environment, you modify the overlay itself. For example, for development, we modify the overlay and it gets merged with the base manifests in the base folder, okay? So you modify that file per environment, and this was done in an automated way, as we will see. Okay, about image versioning: when we were on ECS, we always used the latest tag, so image tracking was near to impossible, because all the images were latest across all the tasks and services. We changed it to use the build ID from the pipelines, so it's pretty clear what you are deploying. To modify it automatically on the Argo manifests, on the Kustomize side, we use the Kustomize tool itself: you can set the image tag from the pipeline without any manual interaction, and you can also modify, for example, some of the variables that are applied in the ConfigMap, okay? Later, Argo CD merges them, and the Argo CD configuration was pretty simple with that: Argo CD was pointing to the repo — only one Kustomize-based repo — referencing the main branch, because we used only one branch for the manifests themselves (unlike the source code, where we used many), plus the path to deploy to the right environment. Each application deployed for an environment has a prefix on its deployment, for example development, preproduction, and so on. Okay, so after working with Kustomize, there was a change — we moved to Helm, because as we will see it gives us more flexible templating for this ad hoc solution. So we changed from a single repo with multiple folders, one per environment, to one repo per environment in the naming.
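As a rough sketch of the setup described above — repo names, paths, and image names here are illustrative, not the ministry's actual ones — the pipeline bumps the tag in the overlay with `kustomize edit set image`, and an Argo CD Application points at that overlay path on the main branch:

```yaml
# Hypothetical overlay file: overlays/dev/kustomization.yaml
# The CI pipeline runs something like:
#   kustomize edit set image myapp=registry.example.com/myapp:<buildId>
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base            # base manifests merged with this overlay
images:
  - name: myapp
    newName: registry.example.com/myapp
    newTag: "20240101.3"  # build ID written by the pipeline
---
# Hypothetical Argo CD Application pointing at the dev overlay
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dev-myapp          # environment prefix on the deployment name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/myapp-manifests.git
    targetRevision: main   # single branch for manifests
    path: overlays/dev     # path selects the environment
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp-dev
  syncPolicy:
    automated: {}
```

The two documents would live in separate files in practice; they are shown together here only to make the repo-branch-path relationship visible.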
As you can see, there is the app repo itself, the GitOps one — this is the Kustomize/manifest one — and one per environment. The templating itself was pretty simple, because it's the typical chart structure that we will see later, and on the chart itself we added some useful metadata for us: the typical fields implemented on all charts, plus some extra metadata for the deployment on the cluster. Okay, so about this. At first, the Helm charts were modified only in the main stack by the pipeline, but we improved it a bit more, because developers, for example, need to modify the secrets and the ConfigMap itself to embed them in the application's main manifest. That was pretty hard, because they needed to add the value, the secretRef, all that stuff. To avoid that, we use helpers. We enabled helpers so that, as you can see in the left image, developers only need to add a config section and a secret version that points the secrets to the secret provider running on Kubernetes, and the config for the environment variables is listed later in the application manifest. As you can see here, we have the application with the helpers included, so developers only need to add some extra lines in the values themselves. It was pretty fast, so it's simpler. In the helpers file itself, as you can see, we use the typical variables to avoid writing the full path of the configuration from the values every time, and some extra functions that are the typical ones from Helm. We moved to this solution because it was one of the requirements from the client, and it's a better approach for packaging the applications. And this is important, because in the other model, the legacy model, all the administration of the tasks themselves on the cluster was done by the sysadmin team.
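A minimal sketch of that pattern — file layout and key names are illustrative, not the actual chart — is that developers only touch a small section of `values.yaml`:

```yaml
# Hypothetical values.yaml section a developer edits
config:                    # rendered into env vars by a helper
  DB_HOST: db.myapp.svc
  LOG_LEVEL: info
secrets:                   # references to secrets already on the cluster
  - name: myapp-secrets
    key: DB_PASSWORD
```

A named template in `_helpers.tpl` (for example a `define` block iterating over `.Values.config` with `range`) then expands these keys into container `env` entries in the Deployment, so developers never have to edit the main manifest or write `secretRef` blocks by hand.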
In this case, the developers have full independence to modify the manifests by themselves. They can contact us, but they can make the changes without any problem; they are free to do that. And not only that: we mainly deploy the application part itself, but we also deploy multiple core services — for example, for the application part we deploy the databases, with the typical master/slave cluster, and that kind of core services used by many applications. Another pretty important point about the deployment of the application: we created not just one repo for the GitOps part. There is one GitOps repo for the application — the one we have seen, referencing a specific environment — and another one, managed by the sysadmin team, that contains all the base for the application: the namespace where the application will be deployed, the resource quotas, the limits, and the service account restricted to a specific role. So it's defined in two repos, and the base one is matched with the application itself to deploy the whole core of the application. So, let's talk about requests and limits. Requests and limits are important concepts in Kubernetes, because they allow us to control the amount of resources — CPU and memory — that your containers can consume. Requests determine how many resources a container will receive at scheduling time: when you define the request, Kubernetes only schedules the container on a node that can provide the resources it needs. If you set the request too high, the containers will request more resources than they need; if you set it too low, you will have problems — your application just won't work. And the limits represent the highest possible value that a container is permitted to consume before it's restricted, so this ensures the container never goes over that value.
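As a quick illustration of the concept just described — the numbers here are made up, not a recommendation — requests and limits are set per container in the pod spec:

```yaml
# Hypothetical container spec with requests and limits
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: registry.example.com/myapp:20240101.3
      resources:
        requests:      # used by the scheduler to pick a node
          cpu: 250m
          memory: 256Mi
        limits:        # hard cap: CPU is throttled, exceeding memory gets the container OOM-killed
          cpu: 500m
          memory: 512Mi
```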
It's important to set a limit, because if you don't have one, you can cause a snowball effect: if your application has a memory leak, it starts to grow and grow, and it can affect the node and even the cluster. So it's important to set limits. How do we calculate those limits? First, we need a monitoring tool like Prometheus. Then we calculate the average CPU and memory, we set the requests and limits, and we keep monitoring as the application continues being developed, because its behavior can change; we tweak these values again, and then we keep monitoring. So it's always the same cycle: monitor, tweak, and check the thresholds are in the right place. Okay, now more related to the Argo CD part itself: one of the requirements raised by the client is that they have multiple websites — most of them Drupal, plus some custom ad hoc solutions. In the old model they had some downtime, and it was not transparent to the user. So they required a deployment system with no downtime, transparent to the user, and easy to roll back. For that, we implemented a mix between an ad hoc solution — because it was a requirement — and the power of Argo CD. We created a multi-pipeline system: one pipeline builds the image, which is the conventional CI part, and another one creates a database backup, deploying the new database alongside the old one, to create a blue-green model, okay? This was done automatically, by modifying the manifests of the job that does the dump and the import of the database.
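The blue-green mechanics described here map naturally onto an Argo Rollouts resource; a minimal sketch — names are illustrative, and `autoPromotionEnabled: false` is what provides the manual approval gate mentioned next — could look like:

```yaml
# Hypothetical Argo Rollouts blue-green strategy for one of the sites
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: drupal-site
spec:
  replicas: 2
  selector:
    matchLabels:
      app: drupal-site
  template:
    metadata:
      labels:
        app: drupal-site
    spec:
      containers:
        - name: drupal
          image: registry.example.com/drupal-site:20240101.3
  strategy:
    blueGreen:
      activeService: drupal-site-active    # serves real user traffic
      previewService: drupal-site-preview  # QA validates the new version here
      autoPromotionEnabled: false          # switch only after manual promotion
```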
So it does the dump and the import; later, via a webhook, it calls a CD pipeline that modifies the manifest of the application we will deploy. The changes are detected by Argo CD itself, so it deploys the preview service, and the QA team can check that preview service to see if there is any error, any bug, or anything like that. We have a manual step — it was a requirement — that the approval for the blue-green switch must be done manually. So the pipeline waits with a button for the confirmation, and if you don't press that button, the Argo Rollout rolls back to the previous version. If you do press the button and apply the change, another job uses the rollouts client to call Argo Rollouts to do the switch — but you need to approve it. If you don't approve it, the application rolls back automatically, because the promotion is denied: the job rolls back automatically to the old version. So, to monitor our environment we use Falco. It's a CNCF incubating project, cloud native and open source, an open standard for runtime security for hosts, containers, Kubernetes, and cloud. Falco relies on rules. With these rules, we can detect suspicious MySQL queries, or a shell running in a container and changing the configuration files of our application. This is an example of a MySQL rule: when there is a process named mysql and someone tries to select from the MySQL users table, it generates an output with priority warning. In our dashboard we can see the output, and we can filter by tag, by rule, by priority. To monitor, we use Prometheus. It's the de facto standard for cloud native monitoring: it's open source, it has an active and large community, and it has lots of integrations. It also has a really powerful query language. For the deployment, you can use the official Helm chart — it's straightforward.
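A Falco rule along the lines of the one described — the exact condition and output fields here are illustrative, not the rule from the talk — would look roughly like this:

```yaml
# Hypothetical Falco rule flagging reads of the mysql.user table
- rule: Query MySQL users table
  desc: Detect a SELECT against the mysql.user table
  condition: proc.name = mysql and proc.cmdline contains "mysql.user"
  output: "Suspicious query on mysql.user (user=%user.name command=%proc.cmdline)"
  priority: WARNING
  tags: [database, mysql]
```

The `output` line is what shows up in the dashboard, and the `tags` and `priority` fields are what make the filtering mentioned above possible.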
It includes the role-based access control configuration, and you can choose between different flavors depending on whether you are going to production or it's just a development environment. Here we use Prometheus service discovery for the targets, so it discovers out-of-the-box targets like Kubernetes and DNS. We can also create a ServiceMonitor for Argo CD, so we can get the metrics from the different parts of Argo CD. To monitor our applications we use exporters, we create jobs, and then we visualize all the data using Grafana, which is the de facto standard visualization tool for Prometheus. Once we have the metrics, we can create and push alerts to Alertmanager — so we can create an alert for when an app goes out of sync, when our memory is getting full, or when a disk is getting full. This is an example of a ServiceMonitor that we created for Argo CD: through the operator, it automatically creates a scrape job for Prometheus and scrapes the metrics. Then you can use a Grafana dashboard and monitor how many clusters you are managing with Argo CD, how many applications, and you can see if you have an application with errors. Using the metrics, you can also create alerts like: hey, tell me if my application is out of sync, or tell me if we have an application with a phase of Error or Failed. Now, GitOps and Git security practices. The Git repository is the source of truth, so if you manually modify anything in the Kubernetes configuration, Argo CD will automatically revert it to the previous version, which is in Git. It's important to have separate code and configuration repositories, so if you modify something in the Kubernetes stack, you don't have to touch your application code. Likewise, we must separate the environments into different repositories, for the same reason: I don't have to change anything in development when I change something in production. Branch rules are important as well, as they prevent someone from deleting the main branch or pushing directly to main.
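As an illustration of the two pieces just described — the label selectors depend on how Argo CD is installed, so treat the names as assumptions — the ServiceMonitor plus an out-of-sync alert could look like:

```yaml
# Hypothetical ServiceMonitor scraping Argo CD metrics via the Prometheus operator
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
  namespace: argocd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics
  endpoints:
    - port: metrics
---
# Hypothetical alert on applications that stay out of sync
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-alerts
  namespace: argocd
spec:
  groups:
    - name: argocd
      rules:
        - alert: ArgoAppOutOfSync
          expr: argocd_app_info{sync_status!="Synced"} == 1
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Application {{ $labels.name }} is out of sync"
```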
So then here come pull requests with approvals: you create a pull request, someone on your team must approve it, and they can just say, hey, this pull request is not good, you have a typo, or whatever. And once you open a pull request, you must have automated checks — checks like unit tests, linting, security tests — so your application meets a minimum standard before it can be merged. Okay, and about the results and benefits: well, at first it was pretty hard, because the ministry — all the ministries — were on a legacy model and didn't want to change anything; they were happy using the console, pressing buttons. So at first it was hard, but in the end the benefits are really huge, because all the applications that we are implementing with Argo CD are now deployed automatically. Developers only need to develop, and that's all — maybe some modifications in a Helm manifest, but that's all, because the rest of the modifications are applied automatically, with Helm or Kustomize. We improved the versioning: they used latest, and we changed to semantic versioning, with everything deployed on the cluster by Argo CD and the image modification done automatically as well. The development teams are more independent: they don't need to interact with the sysadmin team via ticketing or mailing, they can make the changes on their own. This is pretty important, because they can do whatever they want with their application within the defined limits. And on the after-deployment part, we improved the monitoring and the logging, for example — before, to check your application, you also needed to contact the sysadmin team — so we improved that too. I think those are the key points.
And to sum up: legacy applications don't scale, so there is a moment when you have to take a step further and adopt GitOps. But this is not an easy path, as it brings relevant changes to the development and deployment process; it requires changes to the infrastructure, new tools, new technologies, so the process can sometimes be hard on the team if the team is legacy focused. Sometimes you also deal with potential bugs in the applications and tools you're trying to use. And to end: effective monitoring, we think, is a critical factor to ensure the success of a project. So that's all, thank you for coming. Any questions?

Q: Thanks for presenting, first of all. Regarding Kustomize and what you were showing: I noticed that you were hardcoding the secrets in, like, the values file and such. How do you do secret injection in your deployment pipelines, so you don't have to put your secrets in clear text in your Git repository?

A: The secrets themselves are managed by the sysadmin team in this case, so they provide us the final keys that point to the secrets living on the cluster itself. Only that part has to be done manually in the values: pointing to the reference that the sysadmin team gives us.

Host: We can do one more.

Q: Hi. Why did you go with lots of repositories instead of using a monorepo with code owners, let's say, and how do you manage that amount of repositories?

A: Yeah, it's a bit hard, but it was imposed by our client: they wanted it that way because they think it is more secure. It was an imposition — it's not our decision in this case — and it's a bit hard to manage all of them. We preferred the Kustomize approach at first, but this way is safer for the development teams, because sometimes they modified the overlays of other environments and it was chaos. In this setup they point only to the right repository.

Q: Okay, thanks.

Host: All right, that's all the time we have for questions.
Let's give them a round of applause. Thank you. Thank you.