Hi everybody. Welcome to our talk. My name is Joaquin Rodriguez. I'm a software engineer at Microsoft, and I live in Austin, Texas.

I'm Priyanka Ravi. I also go by Pinky, and I am a developer experience engineer at Weaveworks. We're the ones that created the Flux project and then donated it to the CNCF. I'm also a member of the GitOps Working Group, so come join us and chat GitOps anytime you want.

Okay, so today we'll be talking about cluster add-ons. We will give an introduction to what cluster add-ons are. We'll also talk about some of the challenges around them, how GitOps is a solution for maintaining them, and how Argo, Flux, and Flamingo can help us scale them. We'll show a solution diagram, do a quick demo, and then talk about scaling and some other things.

So to get started, what are cluster add-ons? You might have heard this term. Cluster add-ons are tools, applications, or services that expand the functionality of Kubernetes. So you can think about a vanilla Kubernetes cluster, and cluster add-ons are enhancements to that cluster that allow you to do really cool things. Like I'm mentioning here, they are not part of the core Kubernetes system, but they provide essential capabilities.

When you think about add-ons, there are different types. For example, monitoring and logging: think about Grafana, Prometheus, Thanos. Networking and communication: think about the service meshes, Istio, et cetera. Security: you have your policies, OPA, Kyverno. And storage: an example could be Rook.

Why are these important? They extend the capabilities of Kubernetes. They adapt to different use cases, so depending on what you're doing, you might not need certain add-ons. It all really depends.
For example, if you're trying to solve a problem around security, then you want to focus on security-related add-ons, or if you care about insights, of course, you will be looking into the insights add-ons, such as monitoring and logging.

When you're implementing add-ons, there are some things that you want to consider. The first is compatibility. You want to make sure that the add-on is compatible with your version of Kubernetes, or, if two add-ons work with each other, that their versions match. Of course, you want to have a plan for how you're going to maintain and update these add-ons. And then there's the resource overhead. If I'm deploying an add-on in a cluster, how much CPU do I need? How much memory do I need? How much is it going to cost? Those are things that you need to consider when you're deploying add-ons. Of course, as with any software lifecycle, you install them, you upgrade them, you scale them, and then at one point or another, you decommission them.

Cluster add-ons are awesome, but there are some challenges that follow. The first one is maintaining a fleet of clusters to ensure that they are operating efficiently. I think it's self-explanatory. You need to make sure that you have consistent configuration. Let's say you have a fleet of production clusters that have certain requirements, and another fleet of non-production clusters that have certain requirements. You need to be consistent across each fleet so that you don't have potential issues. Of course, you want to manage the add-ons' dependencies efficiently, and you also want to visualize how these operations are working. You want some powerful UI that will help you visualize how these add-ons are being deployed across your fleet. Of course, scaling is hard, so you need to ...
We'll be talking about scaling in a little bit, but it's something that we want to bring up here: scaling is definitely a challenge. Now, I'm going to turn it over to Pinky, and she's going to talk about how GitOps in general can help us tackle these issues.

Yeah, so for anyone that's new to GitOps or any of the tools, I'm just going to go over GitOps and then the tools, and then give you a brief overview of Flamingo. GitOps is an operating model for cloud-native applications such as Kubernetes. It uses a version control system as the single source of truth, most commonly Git, although there are other sources you can use, like OCI registries. It enables continuous delivery through automated deployment, monitoring, and management driven by that version control system. There's an audit trail, everything's locked down, and you manage your infrastructure and applications declaratively, which has a lot of benefits. You can see everything in code, you can tell exactly what's deployed, and it's reusable.

Now, let's talk about Argo. Argo is a fantastic tool for GitOps as well, and one of its benefits is a really awesome application dashboard. It's a powerful, real-time UI that provides a holistic view of your applications and your resources. There's health monitoring and configuration drift detection as well, so you can detect and get notified when applications become unhealthy or, for some reason, get out of sync. It's also multi-cluster and multi-tenant, so you can create sandboxes and establish guardrails across multiple clusters and namespaces using projects, which are an object in Argo. There are also advanced deployment patterns. It supports complex, pipeline-like deployments using pre-sync and post-sync hooks and sync waves, so you can set up checks before and after your deployment. And then it's also highly extensible.
You can customize resource actions, integrate any config management tool, and also extend the UI. And it integrates really easily into your existing environment: the REST and gRPC APIs and the CLI enable seamless integration with your existing tools.

And then I'm going to explain Flux as well. Flux is GitOps for apps and infrastructure. You just push to Git, and it does the rest. It's declarative, automated, and auditable. It's also designed with security in mind. It's a pull rather than push model, so it works with the least amount of privileges. It adheres to Kubernetes security policies and integrates tightly with security tools and best practices. You can find out more about our security considerations in our docs.

There's also another tool called Flagger. Using Flux and Flagger together, you can deploy apps with canaries, feature flags, and A/B rollouts. Flux can also manage any Kubernetes resource, and infrastructure and workload dependency management is built in. It can even push back to Git for you with automated container image updates, such as image scanning and patching. And you can describe the entire desired state of your system in Git. This includes apps, configuration, dashboards, monitoring, and everything else you're doing. You use YAML to enforce conformance to the declared system, and you don't need to run kubectl because all changes are synced automatically. Everything is controlled through pull requests, so your Git history provides a sequence of transactions, allowing you to recover state from any snapshot.

We also say it's multi-cluster, multi-tenancy, and multi-everything. You can use one Kubernetes cluster to manage apps in either the same or other clusters, spin up additional clusters, and manage clusters, including lifecycles and fleets. It also works with any Kubernetes and all common tooling. It works with your Git providers, such as GitHub, GitLab, and Bitbucket.
You can even use S3-compatible buckets as a source, as well as OCI registries. And it works with Kustomize, Helm, Harbor, custom webhooks, notifications, and other things like that. And we also say dashboards love Flux. There are a multitude of options: there are different Flux UIs out there, including the one I'm about to talk about, and hosted cloud offerings from your cloud vendor. There's a thriving ecosystem of integrations and products built on top of it.

All right. So now let's talk about Flamingo. Flamingo is a tool that was created by Weaveworks as well, to allow you to use both Argo and Flux. Flamingo is the Flux Subsystem for Argo. Both are really awesome options, and they each have different pros. With Flamingo, you can get the best of both. It's a container image that can be used as a drop-in replacement for the equivalent Argo CD version to visualize and manage Flux workloads alongside your existing Argo CD workloads. It's drop-in and non-invasive, so it's a very easy component to start using if you're already using Argo or Flux. And that's the GitHub link if you want to check it out.

So why? What do you gain from this? Why even try it? We have talked to different end users and got some feedback on why they're doing this and why they like Flamingo. For Argo, the pros for them were the UI, the scalability, the cluster management, the centralized control plane, and the pre-sync and post-sync validation I mentioned earlier. With Flux, they wanted to take advantage of the Helm lifecycle: dependsOn, rollbacks, upgrades, dynamic config, Helm values, Helm hooks, and retries. And there's also another use case, which is the Terraform controller.
So if you want to manage your Terraform deployments, that's another reason you could use the Flamingo UI. All right. And I'm going to pass it back to Joaquin.

Thanks, Pinky. So today, we will be showing a very small example of how you can combine these tools. I'm going to walk you through a quick diagram, and after that I will show you a GitHub page where I put some instructions on how to set this up if you want to try it yourself. Like Pinky was saying, when you install Flamingo, essentially you're installing Argo. I mean, you can think of Flamingo as Argo. The only difference is that Flamingo has some extra extensions that allow you to visualize Flux resources within Argo. So if I use the word Flamingo or the word Argo, I'm basically using them interchangeably.

So back to the diagram. Let's start with a vanilla Kubernetes cluster. We don't have anything there. We just call it a management cluster, and I'm going to install Flamingo on it. After that, you can think of personas. Let's say we have a cluster admin. I think it's kind of small on the slide, and I apologize for that. The cluster admin is responsible for adding new target clusters into this Flamingo implementation, and we're registering these clusters with the help of Kyverno. This is just one approach. You can do it many ways. You can use the Argo API. You can use the Argo CLI. But for the purposes of the demo, we're using Kyverno, and we're also using vclusters.

After that, we create an application set. This application set, if I can see it correctly, talks to a Git repo, an add-on repo that contains the configuration for our add-ons. One more thing that we're doing: when we register a new cluster using Kyverno, we're also setting cluster labels that tell us which add-ons we can enable in a fleet of clusters.
So for example, if you have a fleet of clusters for prod and you want, I don't know, cert-manager, Kubernetes Dashboard, and OPA enabled, you set those in your cluster labels, and likewise for the non-prod fleet. I'll be showing that in a little bit, if that doesn't really make sense yet. Then that application set will generate a few applications that target different clusters. Right now, just for simplicity, I put only one cluster in the diagram, but you get the idea. You can have more than one. Also, like I was saying, for the demo we're using vclusters.

Now, in the target cluster, we also have Flux installed, the source controller. That way, you can install different applications in that target cluster using Flux. Just like Pinky was saying, there are some features that we can use in Flux and some features that we can use in Argo, so we're combining them. The target cluster will also query an OCI registry to get some artifacts and install those applications or add-ons.

Going back to our personas, we have an add-on owner who is responsible for maintaining the configuration of the add-ons in a Git repo. That way, the cluster admin is responsible for the administration of the cluster, and the add-on owner is responsible for the configuration of the add-on. And then at the end, we have a user, which can be anybody really, and that user has a unified UI to see the deployments that Argo, or Flamingo, is doing, and also Flux.

So let's jump into the demo. Okay. The first thing I'm going to show you is the Flamingo UI. You might be familiar with this already. It looks just like Argo, with the only difference being that it supports Flux resources. The first thing I will show you is the clusters that we have registered already. You can see that I have a fleet of non-prod and prod.
And if I open a production cluster, you can see that we have cluster labels, and that some of these add-ons are already enabled. For example, I have Kubernetes Dashboard set to true and cert-manager set to false. So in my production fleet, Kubernetes Dashboard is automatically going to be installed, and cert-manager will not be installed unless I set it to true, which is what we're going to do right now.

So I already have a PR. Oh, and by the way, this is the repo in which I explain how to do the setup. I'm not going to walk through every single step, but there's pretty good documentation, and you can always reach out to us. So this is the repo if you want to check it out later. But I already have a pull request, and basically what I'm doing is enabling my cluster labels. I'm saying I want you to install OPA, and I want you to install cert-manager. So I'm going to review, and I cannot approve. I think I changed something here. Okay. Oh, there we go. Merge. Okay. Sorry about that.

This is going to happen in the background, but what I'm trying to explain here is that once these labels are set, Kyverno is automatically going to change those cluster labels to true, and then we do some magic with application sets that lets us install these add-ons automatically across the fleet of clusters.

Another thing I'm doing: I'm going to add a new cluster for non-prod. I'll show the code in a second, but what I want is to install a new vcluster, and I'm calling it non-prod East US. Let me merge that pull request as well. I'm doing these first because it takes a little while to propagate, and I don't want to do this at the end and then have us wait. So okay.
So now that's done, let me go back to Flamingo. Right now I have projects called workload non-prod and workload prod. If we look at the non-prod one, you can see that we already have a few add-ons enabled by default, like the dashboard, but this dashboard is actually being installed using Flux, using the HelmRelease object. So you can see here that we have our Kubernetes Dashboard installed.

Let me go back to the code. I'm going to show you the policy from Kyverno. Why did I use Kyverno? It's super simple for demo purposes, but it's also a very powerful tool. I highly recommend it if you haven't used it. What this policy does is, every time I install a new vcluster in my management cluster, Kyverno will be like, ah, you just installed a new cluster. Let me get that secret from you, and I'm going to register it in Argo. That way Argo can automatically manage that secret for you, and you can deploy things to that target cluster. Down here, if you look at the labels that we're generating with Kyverno, you can see that I already have some labels set to true.

Once that's triggered, we have a cluster add-ons non-prod application set, in which we list the components that we typically manage: the dashboard, OPA, and cert-manager. Then we have a template, which you'll recognize if you're familiar with application sets. In here, for my cluster selector, basically I'm saying I want you to match the clusters that are enabled by Kyverno, I want you to select the clusters that are also in the non-prod fleet, and I want you to give me the components that are set to true. Based on that, it will generate the application and install that component in our target cluster. So you can see here is the template. I created a few dummy repos that manage the values for these add-ons, so that's why it has this placeholder.
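The registration and selection steps described here can be sketched in plain Python. This is only an illustration of the logic, not the Kyverno policy or the ApplicationSet from the demo repo. The label keys `registered-by`, `fleet`, and `enable-<component>`, the `argocd` namespace, the placeholder server URL, and the function names are all assumptions made for the example. The one detail taken from Argo CD itself is that cluster Secrets are identified by the `argocd.argoproj.io/secret-type: cluster` label.

```python
from typing import Dict, List

# Components the application set lists, as named in the talk.
COMPONENTS = ["kubernetes-dashboard", "opa", "cert-manager"]


def register_cluster(name: str, fleet: str, addons: Dict[str, bool]) -> dict:
    """Build an Argo CD-style cluster Secret, roughly what a registration policy might generate."""
    labels = {
        # Argo CD recognises cluster Secrets by this label.
        "argocd.argoproj.io/secret-type": "cluster",
        # Hypothetical label keys, chosen for this sketch only.
        "registered-by": "kyverno",
        "fleet": fleet,
    }
    for component in COMPONENTS:
        # Kubernetes label values are strings, so booleans become "true"/"false".
        enabled = addons.get(component, False)
        labels[f"enable-{component}"] = "true" if enabled else "false"
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": f"cluster-{name}", "namespace": "argocd", "labels": labels},
        # Placeholder server URL; connection credentials are omitted from the sketch.
        "stringData": {"name": name, "server": f"https://{name}.example.internal"},
    }


def generate_applications(secrets: List[dict], fleet: str) -> List[dict]:
    """Emit one application per (cluster, component) pair that passes all three checks."""
    apps = []
    for secret in secrets:
        labels = secret["metadata"]["labels"]
        # Checks 1 and 2: registered by the policy, and in the requested fleet.
        if labels.get("registered-by") != "kyverno" or labels.get("fleet") != fleet:
            continue
        cluster = secret["stringData"]["name"]
        for component in COMPONENTS:
            # Check 3: the add-on label on this cluster is set to "true".
            if labels.get(f"enable-{component}") == "true":
                apps.append({
                    "name": f"{component}-{cluster}",
                    "cluster": cluster,
                    "component": component,
                })
    return apps


if __name__ == "__main__":
    secrets = [
        register_cluster("non-prod-east-us", "non-prod",
                         {"kubernetes-dashboard": True, "opa": True, "cert-manager": True}),
        register_cluster("prod-east-us-1", "prod", {"kubernetes-dashboard": True}),
    ]
    for app in generate_applications(secrets, "non-prod"):
        print(app["name"])
```

Flipping a label from false to true on a cluster and running the selection again is the same effect as the pull request in the demo: the next pass produces an extra application for that cluster.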
And yeah, so once you do that, just like I was showing a little while ago, it will automatically install that component to the fleet that you're targeting.

What else? Oh, I also have an application set for vcluster that shows how I create my vclusters. It's nothing more than a Helm install. I didn't want to make it too complicated. I just wanted to install a few vclusters using Helm, and this is how they're getting installed, nothing too fancy.

By now, if I'm not wrong, we should have the new cluster already deployed, which is right here. You can see non-prod East US 2, and you can see that the labels are there, set to true. cert-manager is also true, and I think OPA is somewhere here. OPA is set to true. We also enabled these in the production fleet, if I remember correctly. So let me go back to production. I can pick any of them. And now you can see that we have cert-manager set to true, et cetera. If I go back to my applications and target my production fleet, you can see that five minutes ago, OPA was installed in production East US, oh, East US 1, sorry. Oh, yeah, that's fine, because we're installing these across the whole fleet. So we have that, and then we have cert-manager as well, installed five minutes ago in production South.

So that's it for the demo. I'll go back to the slides, but first, again, this is the repo. If you're interested in setting this up, I have instructions. This is set up to install automatically on AKS, but in theory you should be able to do it with EKS, Google, or just vanilla Kubernetes. You just have to tweak it a bit. Or you can always reach out to me if you have any questions.

Okay, so let me go back to the slides. An issue with what I just showed you is scale. Obviously, with any application, there are limits. By default, with Argo or Flamingo, we're not going to be able to handle 10,000 clusters and 40,000 applications. It will break.
So you have to come up with creative ways to scale. If you look at the Argo documentation, they have really good guidance on how you can scale. But here are two options that you can consider. In option one, you have a management cluster, and then on each cluster you can have Flamingo installed. You can do it that way, and then you just have to figure out how you target this. Do you want to split by production workloads and non-production workloads, by region, by zone, by team? So that's one way. Another way is to shard it, which is option two. Basically, you have to set up some sort of orchestrator that will manage that sharding, and after that you can target different clusters. But yeah, I highly recommend looking at the Argo documentation when you want to scale. That being said, I'm going to turn it over to Pinky.

So I'm just going to go over some key takeaways. One key takeaway is that cluster add-ons are powerful components that enhance the Kubernetes ecosystem. We really wanted to show how impactful they can be for your deployments. Managing cluster add-ons at scale is hard, and we acknowledge that, so we wanted to show how it can be done. With the right tools, though, you can tame this complexity. And keep in mind, there's no single right solution for all cases. This is a use case that we've seen, so it really all depends on your use case and what you're trying to do. That's why there are many options. There's no silver bullet that will fix it all, right? It's just one option. Yeah.

We also wanted to say a special thanks to Tebow and Kuldip. They really helped us out with this presentation a lot, so we wanted to give them a special shout-out. And this is a QR code that you can scan to leave us feedback on our talk. We really appreciate it. Also, the top link at the bottom is the link to the live environment that Joaquin's been showing.
And then the bottom one is the link to his GitHub repo, where you can follow the steps and make it happen yourself.

Yeah, so there's something I was going to mention earlier: if you're interested in looking at the Flamingo UI from the demo, it's flamingo.cubelab.pro. It's set to read-only, so you can go in there and take a look around. You won't break it. I just honestly want to... The demo's over. Yeah, the demo's over, so it's all good. But yeah, so that's it.

Yeah. And I'll be at the Flux booth, so you can come talk to me if you have any questions about Flamingo or Flux or anything like that. We're both on the CNCF Slack if you want to reach out to us there, and we're also on LinkedIn. There are also Flux and Argo channels in the CNCF Slack if you have questions about those, respectively. Thank you, guys. Do y'all have any questions? There's a mic, I think. Yeah, yeah, sure. I think we have a little time. I didn't know there was a mic there. Yeah.

Great. As someone who works with Flux and Argo, great, love this. Do you need to migrate to a Flamingo API, or can you just bring over the traditional Flux Helm releases that you already have?

So that's a really good question. Can you do this with an existing Argo instance? Is that what you said? Which one do you already have? Sorry.

I have both.

You can.

So I do a lot of Flux CD Helm releases. Do I need to do anything to convert them for Flamingo, or does it just pull them right in?

So it's a drop-in. You can install it, and if you go to the Flamingo documentation, they have, okay. Yeah, let's just do that. Yeah, maybe we can get in a session and just talk. Yeah, so.