Hello, I hope you're all having a nice time at GitOpsCon so far. Welcome to my talk, and thanks for being here and giving me a portion of your time. My talk is on observability with GitOps, specifically Flux, because that's the project I work on. A little bit about myself: my name is Somtochi Onyekwere. I'm a developer experience engineer at Weaveworks, based in Nigeria. I'm a Flux maintainer and I love open source. You can connect with me on Twitter and GitHub at Somtochi, that's my handle.
Flux is at GitOpsCon: we have the Flux booth and a couple of talks, so you can scan this to see our times, and you can always come by the Flux booth. Weaveworks is the company that created Flux, and it's who I work for. They're also the creators of other open source projects like Flagger, Cortex, and a bunch of others. So you can always scan the QR code if you want to reach out to us, or come by and say hi.
At the start of GitOpsCon they already went through this, so I'll just run through it quickly for anyone who is not familiar. We have the GitOps principles: your system is described in Git and is version controlled. Flux sits at principles three and four: it is the software agent residing in your cluster, continuously reconciling the state you've defined in Git onto your cluster. So you store your YAMLs in Git, and Flux, which is a set of controllers running on your cluster, pulls your changes and applies them. Flux extends Kubernetes using custom resource definitions and operators. I'll move quickly through what each controller does, just so we're familiar. The source controller pulls the YAMLs that are stored in Git. It doesn't actually have to be Git; that's just the most popular version control system.
You could store them in an S3 bucket or in OCI repositories. I saw the talk before the break was on OCI; recently we've added support for storing your manifests in an OCI image, and Flux is able to pull that. The kustomize and helm controllers are what I would refer to as appliers; Kustomize and Helm are popular in the Kubernetes ecosystem. So if you had the source controller pull Helm charts or Helm repositories, the helm controller would be what applies them on your cluster. The kustomize controller does the same for Kustomize, and it also works if you just have plain YAMLs. The notification controller, which I would say is the observability part and is what we're going to be looking at in this talk, is what sends out notifications to you. It supports a bunch of providers: it can send notifications to Slack, Discord, MS Teams, and a bunch of others. Then lastly, the image reflector controller and image automation controller are able to monitor your container registries and update your YAMLs in Git when they see a new tag, so they write back to Git.
Of course, we're all familiar with the benefits of GitOps. It helps us iterate faster and deploy faster. It gives us a commit log. But when you remove the human component from the system, which of course makes it faster and more reliable, it increases the need for observability. You need to know what's happening and when something goes wrong so that you can quickly step in and fix it. So you need visibility; you need to be able to monitor these systems. Observability involves logs, traces, and metrics. Because Flux applies things for you, you need to know if something is wrong or if it was successful: was it able to apply things successfully? So Flux strives to provide these things so that you have insight into what it is doing on the cluster.
So what I'll be showcasing here is the notification controller and structured logging. All of the Flux controllers align with controller-runtime logging, so you can easily ingest and filter the logs. We also export Prometheus metrics; Prometheus is one of the most popular observability solutions in the Kubernetes ecosystem. We also have Grafana dashboards for visuals, since you'd want to see what's happening quickly on a dashboard.
I've included a link to the demo I'm about to show. Here's a quick run-through of what's in the repository. I've already set it up because of time; I'll be walking through the setup, but it's basically already done. There's a Terraform folder for spinning up a GKE cluster, but this demo works with any Kubernetes cluster. We also use Pixie at the end. You don't have to, but note that Pixie doesn't work with Docker Desktop and hence won't work on a kind cluster. We have the clusters folder, which is the part of the repository that Flux watches and applies on the cluster. And we have the infra folder, which contains the configuration for the different things we're going to be installing on the cluster: additions like Tailscale, Pixie, and kube-prometheus-stack. I've prepared slides for anyone who wants to go step by step after this talk, so if you want to go through the whole thing, you can always do that.
Like I said, observability is important. You just want to do a git push and have your things applied, but you also want to know: did it get deployed? Is anything wrong? That's the basis. With the notification controller, we have various providers that you can send notifications to. They are not limited to what's listed here; there are actually a lot more. There's Discord, Slack, Google Chat, MS Teams, and a lot more.
You could also have a commit status notification sent to Git providers like GitHub and GitLab: that green or red check beside a commit that shows whether it deployed successfully. That might also be helpful; that might be what suits your needs.
So, time for the demo. I'm mostly going to be showing y'all YAML, so bear with me. This is GitOps. So I've already... Sorry, could you increase your screen size there? Okay, thanks. Is that working? One second. Okay, so check. Okay, so I have the... I can't even set up my slides. I think it's probably...
Yeah, so the Terraform folder creates a GKE cluster, and I'll be showcasing some features of Flux in the process of setting up notifications, for anyone who hasn't come across them. We're also setting up a GCP KMS key ring. This will be used to encrypt the secrets that we're going to create for Flux. You don't want to store your secrets in the clear; we want to encrypt them so that they are safe while doing GitOps. So you could terraform apply what we have here. Sorry, let me just shift this around. Okay. I've run flux bootstrap to install Flux on the cluster against the cluster repository, and we have the infra folder that contains kube-prometheus-stack.
Here's a quick example of what we've done to set up the notifications. We have a couple of CRDs for the notification controller: the Alert CRD and the Provider. Alerts let you pick which of the Flux objects you want to monitor. This is basically what it looks like. You're telling it that you want to gather events from... So basically, when the controllers do something, they send out events to the notification controller. If you don't create this, nothing happens: it's receiving the events, but you haven't instructed it to forward them.
So this tells it to watch different kinds, like GitRepository, OCIRepository, Kustomization, and HelmRepository. You could specify a particular namespace or a particular name, but the wildcard tells it: for any of these kinds, regardless of the name, alert on them. You also have to reference a provider. Since it can send to a bunch of different providers, you have to tell it which of them to send the events to. Here we are creating a provider of type slack, and we are referencing a secret that holds the Slack URL. You don't have to make it a secret; if you don't think your Slack URLs are super secret, you could add the address here directly. But I've noticed that if you push it to Git, especially to a public repository, GitHub and Slack have this thing where they invalidate the webhooks. So what I'm doing is creating a secret and referencing that instead.
This is what my secret looks like. It looks like this because I have encrypted it with the key ring, the crypto key that I created with GCP KMS. So basically SOPS is going to... I've also included this modification to my Flux deployment, telling it that it should decrypt using the SOPS provider. I don't know if anyone here is familiar with workload identity, where you don't have to download a service account key. You basically just annotate the Kubernetes service account with the service account on GCP, and it's able to make calls. I created this cluster with workload identity enabled so that it's easy and you don't have to do the whole thing with keys. Decryption in Flux doesn't just work with GCP KMS; it works with the popular cloud providers, like AWS KMS and Azure Key Vault. And if you'd like to manage your keys yourself, you could use GPG or age.
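For readers without the screen, here is a minimal sketch of the Alert and Provider pair described above, written as Python dictionaries and printed as JSON (a valid manifest format alongside YAML). The resource names, the channel, and the `v1beta1` API version are assumptions for illustration, not copied from the demo repository; check the API version against your Flux release.

```python
import json

# Assumption: the notification API version depends on the installed Flux release.
API_VERSION = "notification.toolkit.fluxcd.io/v1beta1"


def slack_provider(name, secret_name, channel):
    """Provider of type slack that reads its webhook URL from a Secret."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Provider",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "type": "slack",
            "channel": channel,
            # The referenced Secret carries the webhook URL under "address".
            "secretRef": {"name": secret_name},
        },
    }


def alert_for_kinds(name, provider_name, kinds, severity="info"):
    """Alert that forwards events from every object of the given kinds."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Alert",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "providerRef": {"name": provider_name},
            "eventSeverity": severity,
            # "*" is the wildcard: match any object name of that kind.
            "eventSources": [{"kind": kind, "name": "*"} for kind in kinds],
        },
    }


# Hypothetical names: "slack", "slack-url", "general", "flux-events".
provider = slack_provider("slack", "slack-url", "general")
flux_alert = alert_for_kinds(
    "flux-events",
    provider["metadata"]["name"],
    ["GitRepository", "OCIRepository", "Kustomization", "HelmRepository"],
)

for manifest in (provider, flux_alert):
    print(json.dumps(manifest, indent=2))
```

To route alerts per team, the same two helpers could be called once per team with a different Secret and a narrower list of event sources.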
You would instead put a secretRef here so that Flux knows what key to use to decrypt the secrets. Okay. So... Can I make any changes? Okay. I think that's most of it. If you are using a different type, this would change to reflect what you have.
I'm going to quickly create an app just so that we are able to get some notifications. You can see that the alerts have already been applied, and they are initialized. I'm just going to do this quickly. This is Stefan's podinfo repository; it's pretty popular. He has created an OCI image with the manifests to deploy it. So I'm telling Flux to pull the manifests from the OCI image, and this will apply them on the cluster. I think we're good. Was it already created? Let's see. Okay, it was already created. So I'm just going to put an error in there so that we get something. Because Flux reconciles at a particular interval, for the demo we have to force the reconciliation.
This is the VS Code extension. Sometimes we have people who say they are fine with GitOps and they want developers to take control of deploying their apps, but they don't want them to have to interface with YAMLs and so on. This is a great way to do that. There's a feature coming that gives people fields they can fill in to create Git sources. So if your developers don't want to deal with YAMLs, or they're not familiar with Kubernetes, you could have them fill in the fields. I'm going to reconcile the flux-system Kustomization so that it can pull the most recent... the flux-system Git repository. Oh, we're getting some notifications here. So basically, this is what we want.
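As a rough sketch of the two pieces just described, pulling manifests from an OCI image and optionally decrypting with SOPS, the manifests could look like the following Python dictionaries. The OCI URL, tag, object names, intervals, and API versions are assumptions for illustration; the demo repository may use different values.

```python
import json


def oci_repository(name, url, tag, interval="5m"):
    """Source that pulls Kubernetes manifests packaged as an OCI artifact."""
    return {
        "apiVersion": "source.toolkit.fluxcd.io/v1beta2",  # assumed version
        "kind": "OCIRepository",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {"interval": interval, "url": url, "ref": {"tag": tag}},
    }


def kustomization(name, source_name, decrypt=False, key_secret=None):
    """Applier for the pulled manifests, optionally decrypting with SOPS."""
    spec = {
        "interval": "5m",
        "prune": True,
        "path": "./",
        "sourceRef": {"kind": "OCIRepository", "name": source_name},
    }
    if decrypt:
        spec["decryption"] = {"provider": "sops"}
        # With a cloud KMS and workload identity no key Secret is needed;
        # with self-managed GPG or age keys, point Flux at the Secret.
        if key_secret is not None:
            spec["decryption"]["secretRef"] = {"name": key_secret}
    return {
        "apiVersion": "kustomize.toolkit.fluxcd.io/v1beta2",  # assumed version
        "kind": "Kustomization",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": spec,
    }


# Assumed URL and tag for the podinfo manifests artifact.
source = oci_repository(
    "podinfo", "oci://ghcr.io/stefanprodan/manifests/podinfo", "latest"
)
app = kustomization("podinfo", "podinfo")
# Hypothetical Secret name "sops-age" for the self-managed key case.
secrets = kustomization("secrets", "podinfo", decrypt=True, key_secret="sops-age")

for manifest in (source, app, secrets):
    print(json.dumps(manifest, indent=2))
```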
If something is not working... okay, yes, because it's not right; we didn't put in the right thing. If something is not working, we want to get alerts that say, hey, you need to check this out. You could configure it to send different resources to different Slack channels. Maybe you want to section it out by teams, so alerts from a particular namespace go to a particular team. For that you could create two different providers with different URLs. I'm just going to resolve this so that we get the notification that it is working okay, and we'll trigger the reconcile.
Another thing, and I'm not sure anyone here has used it, is that the notification controller allows you to define something called receivers. You create a receiver, and it will generate a unique URL for it. If you have something like a Service of type LoadBalancer that's exposed outside the cluster, you could give that URL to GitHub. Normally, when you make a push, you're waiting for Flux's reconciliation interval. Instead, as soon as you make the push, GitHub sends a POST request to the notification controller, and it triggers the reconcile for you. So you don't have to do this manually all over again. So this should be working fine now. I'm just going to... This is OCI. The extension is having some issues. Yeah, so it's stored; we can see that it has stored it okay here. Yes, it sent us an update. It wasn't working before, but now we've received an update.
Sometimes the notification controller might not be what you're looking for, or it might be insufficient. It's not a fully built-out observability solution. So Flux also exports Prometheus metrics, which most of us are familiar with, at a particular endpoint, so that if you want to scrape them and get more visibility into what's happening, you can. I have installed kube-prometheus-stack.
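To make the receiver idea above concrete, here is a minimal sketch of a GitHub Receiver manifest as a Python dictionary. The object names and API version are assumptions. The `webhook_path` helper reflects my understanding of how the Flux documentation describes the generated path (a SHA-256 over token, name, and namespace); treat it as illustrative and read the real path from the Receiver's status rather than computing it yourself.

```python
import hashlib
import json


def github_receiver(name, token_secret, git_repository):
    """Receiver that reconciles a GitRepository when GitHub sends a push event."""
    return {
        "apiVersion": "notification.toolkit.fluxcd.io/v1beta1",  # assumed version
        "kind": "Receiver",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "type": "github",
            "events": ["ping", "push"],
            # Secret holding the shared token that GitHub signs payloads with.
            "secretRef": {"name": token_secret},
            "resources": [{"kind": "GitRepository", "name": git_repository}],
        },
    }


def webhook_path(token, name, namespace):
    """Assumed derivation of the unique URL path; verify against the docs."""
    digest = hashlib.sha256((token + name + namespace).encode()).hexdigest()
    return "/hook/" + digest


# Hypothetical names and token, for illustration only.
receiver = github_receiver("github-receiver", "webhook-token", "flux-system")
path = webhook_path("example-token", "github-receiver", "flux-system")

print(json.dumps(receiver, indent=2))
print(path)
```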
It's a Helm chart that bundles the Prometheus Operator and Grafana, and Alertmanager too. So this is where it is: kube-prometheus-stack. I'm telling Flux to install the Helm release kube-prometheus-stack, giving it some values. I'm giving it the Slack URL from a secret. Flux allows you to provide your Helm values from Secrets or ConfigMaps. You could also inline them in the YAML if that's what you need. Here we're enabling Alertmanager and Prometheus. The Flux team also provides some YAMLs to help you set up kube-prometheus-stack. They include dashboards for Grafana so you can visualize the metrics, and this PodMonitor configures Prometheus to monitor the flux-system namespace, the various controllers under Flux. I've already installed that.
I've also installed Tailscale as a subnet router; I was supposed to show you this, but there's no time. With it I'm able to access internal IPs from my laptop. Tailscale enables you to do that; you can check it out in the repository in your own time. So I'm able to use in-cluster URLs. If you see this, I'm using svc.cluster.local. Here's my Tailscale dashboard, where I've added this subnet router. So I'm able to get a direct connection to my cluster without exposing the services. I can use this URL even though I don't have any ingresses or anything fancy set up on my cluster. Hopefully it should work. Okay, yes. It's taking a bit. We'll also take a look at the Grafana dashboards. Internet issues.
Okay, the last tool that I'll be looking at today is Pixie. Pixie is an eBPF monitoring tool. Basically, it doesn't require you to install anything extra alongside your applications. It gathers its metrics using eBPF probes in the kernel.
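Going back to the kube-prometheus-stack release described above, here is a minimal sketch of a HelmRelease that mixes inline values with a value pulled from a Secret. The repository name, Secret name and key, the `targetPath`, and the API version are assumptions for illustration; the demo repository may wire the Slack URL differently.

```python
import json


def kube_prometheus_stack_release(slack_secret):
    """HelmRelease with inline values plus one value read from a Secret."""
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",  # assumed version
        "kind": "HelmRelease",
        "metadata": {"name": "kube-prometheus-stack", "namespace": "monitoring"},
        "spec": {
            "interval": "1h",
            "chart": {
                "spec": {
                    "chart": "kube-prometheus-stack",
                    # Hypothetical HelmRepository object name.
                    "sourceRef": {
                        "kind": "HelmRepository",
                        "name": "prometheus-community",
                    },
                }
            },
            # Values merged in from a Secret; a ConfigMap works the same way.
            "valuesFrom": [
                {
                    "kind": "Secret",
                    "name": slack_secret,
                    "valuesKey": "address",
                    # Assumed location of the Slack URL in the chart values.
                    "targetPath": "alertmanager.config.global.slack_api_url",
                }
            ],
            # Values inlined directly in the manifest.
            "values": {
                "alertmanager": {"enabled": True},
                "prometheus": {"enabled": True},
            },
        },
    }


release = kube_prometheus_stack_release("slack-url")
print(json.dumps(release, indent=2))
```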
So it's able to glean information from your syscalls, the syscalls that your application is making, and make that visible to you. You don't need to inject a sidecar into your application or something to route your network traffic through.
The metrics that Flux exports start with gotk. They're supposed to be coming up. Okay, let me just check that everything is running well here. Seems to be running okay. Okay, I'll look for the first... It seems the network is taking a minute. I'm trying to showcase the Flux dashboards.
And lastly, Pixie. Pixie has a really nice UI. This is basically a map, because Flux pods call out to each other. The kustomize controller... Okay, I'll just round up. So the kustomize controller connects to the source controller, and so on. Pixie is helpful for seeing whether you have any networking issues; it's able to surface some of that. You can see the duration that your HTTP requests take and things like that, so it can be really helpful in debugging.
My time is almost up. Okay, so I'll just show the dashboards. I also had an alert... You can also create more complex alerting rules. I had one set up. Maybe you don't want to get your alerts immediately, because the notification controller forwards events immediately, and you might expect your applications to be in a transient state where they are not ready. So you don't want to be alerted immediately. You could use Alertmanager to configure these alerts. This one is really short, but maybe you'd only want to be alerted if your Kustomizations have been in a not-ready state, which is basically Ready=False, for over an hour. If it's been one minute, or it just happened, you're expecting that something is still getting set up: it's fine, I don't want an alert for this.
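The "only alert after it has been not ready for a while" idea can be sketched in a few lines of Python. This is not how Prometheus or Alertmanager is implemented; it is a simplified illustration, similar in spirit to the `for` clause of a Prometheus alerting rule, and the function name and sample format are hypothetical.

```python
from datetime import datetime, timedelta


def should_alert(samples, hold):
    """Decide whether to alert, given (timestamp, ready) samples sorted by time.

    Alerts only if the most recent run of not-ready samples has lasted at
    least `hold`; any ready sample resets the clock.
    """
    not_ready_since = None
    for timestamp, ready in samples:
        if ready:
            not_ready_since = None  # recovered, so forget the earlier failure
        elif not_ready_since is None:
            not_ready_since = timestamp  # start of a new not-ready run
    if not_ready_since is None:
        return False
    return samples[-1][0] - not_ready_since >= hold


start = datetime(2022, 1, 1, 12, 0)
one_hour = timedelta(hours=1)

# Briefly not ready while something is being set up, then healthy again.
transient = [
    (start, False),
    (start + timedelta(minutes=1), False),
    (start + timedelta(minutes=2), True),
]

# Not ready at every check for 70 minutes.
stuck = [(start + timedelta(minutes=10 * i), False) for i in range(8)]

print(should_alert(transient, one_hour))
print(should_alert(stuck, one_hour))
```

The trade-off is the usual one: a longer hold period means fewer noisy alerts but a slower signal when something is really stuck.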
So you could use Alertmanager for a more complex setup. The network is still loading. I would have loved to show off the Flux dashboard. In the meantime, while it's loading, any questions? So this is what the Flux dashboard looks like. The controllers... I'll go to cluster... This one shows you the health of the controllers. There's one called Flux Cluster Stats, which gives you insight into your reconciliations and how long they're taking. Maybe you're concerned; you want your applications deployed as quickly as possible. How long is Flux taking to deploy the different applications?
So I think this is most of it. We explored the VS Code extension, the notification controller, kube-prometheus-stack, Tailscale, and Pixie. In summary, observability is very important. With Flux, you can pick and choose what you need. If you think that the notification controller is sufficient and you just need quick updates on Slack or whatever chat provider you're using, you can use that. You can also layer on more complex observability solutions, like Prometheus, Grafana, and so forth. I've included various links to the different repositories if you want to check them out later. And I think my time is almost up. Come by the Flux booth, I'd love to chat. Thank you.