Hi, hello, everyone. I hope you're having a great time at this observability day. It's grown a lot since we started it; I think the first one was in 2022 in Detroit, I believe. It's super amazing to see it growing bigger and bigger. In this talk, we are really excited to introduce you to the topic of Prometheus Kubernetes operators. But it won't be a standard Prometheus operator talk, by the way, for two reasons. First, we'll actually introduce you to two operators. One is the official Prometheus operator, which we all love to use and have fun with. And we'll also introduce you to the newer Google managed Prometheus operator, GMP for short, which has some pros and cons. The second reason this is a bit of a unique talk is that we'll focus on the metric collection use case on Kubernetes clusters. So only scraping, maybe aggregating, and sending this data straight to a remote endpoint, be it another Prometheus, Thanos, Cortex, or just a vendor. So a somewhat simplified concept, but still not easy to grasp. In the next 20 minutes, I really hope you'll learn, first, why operators are really useful in this collection use case; then how to configure and use both the Prometheus operator and the GMP operator, and when to use which. And finally, you'll see, hopefully, a working live demo of both running on a cluster.

But before that, a short introduction. I have Max with me. Hello, everyone. My name is Mohamed Amin; I also go by Max. I am a software engineer at Google, and I'm a GMP operator maintainer. I was also an open source mentor for a few programs like GSoC. A fun fact about me is that I have 10 patents. I used to be a closed-source developer, but it's more fun to develop in the open.

Nice, thank you. I don't have any patents. But I'm working with Max; I'm actually leading the Google Cloud Managed Service for Prometheus team. I maintain Prometheus, I co-started the Thanos project, and generally I love working in open source; I maintain various libraries as well. I'm also active in the CNCF TAG Observability; join us if you're interested in observability. And finally, I also wrote a book called Efficient Go. It's about Go programming and optimizations. And actually, tomorrow I will be giving away and signing a couple of dozen copies for whoever comes first. So save the date if you want a book on how to optimize your software. You're welcome.

But let's focus on today's discussion. In our experience, the most pragmatic way of designing your Prometheus collection pipeline, or your metric experience in general, is what we call the collection use case, or sometimes agent mode. As you might know, Prometheus is a very powerful project. It's a single binary that packs multiple metric functionalities into one process. It scrapes, it compacts, it stores, it gives you alerting, it gives you querying. There's so much packed into this binary; simple and convenient. However, in more dynamic environments like Kubernetes, especially when a hybrid story comes into play, where you have multiple Kubernetes clusters but also other environments, maybe other cloud providers, it can be quite beneficial to partition the collection logic from the rest. And the best part of scoping your design down to the collection pattern is that tons of complexity moves somewhere else: maybe to some other team, maybe to a vendor that you pay, or you can still do it on your own.
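To make that concrete, here is a minimal sketch of what such a collection-only setup can look like with plain Prometheus running in agent mode; the job name and URLs are illustrative assumptions, not from the talk:

```yaml
# prometheus.yml; a collection-only sketch.
# Run with: prometheus --enable-feature=agent --config.file=prometheus.yml
# Agent mode keeps scraping and remote write, but disables local querying,
# alerting, and long-term storage.
scrape_configs:
  - job_name: my-app                      # hypothetical application job
    static_configs:
      - targets: ["my-app:8080"]          # hypothetical scrape target
remote_write:
  - url: http://remote-backend:9090/api/v1/write  # Prometheus, Thanos, Cortex, or a vendor
```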
And on Kubernetes, what you are left with are really stateless processes, be it Prometheuses, which only scrape metrics from your applications, possibly add some metadata, group them up, filter, and then stream those metrics to a remote backend of your choice. And in the CNCF, we have lots of options for that. Prometheus itself can actually be your remote backend, and you will see that in the live demo. But there's also Cortex and the Thanos project, and there are many, many other vendors and projects as well.

So do we need an operator in this case? Because we've simplified things. Before I answer this question, a small disclaimer: I'm generally not the greatest fan of Kubernetes operators. I think we've built too many operators for too-simple operations. We did some interesting stuff where an operator was actually creating and scheduling CRDs, resources of other operators, and those operators also scheduled CRDs of yet other operators: nested operators. It's getting a little out of control. There are certain things you can do in your application instead of writing an operator, especially if it's stateless software. So we kind of went overboard. And each operator in your cluster is an additional problem: it introduces resource overhead, another point of failure, another lag or delay in whatever operations you are doing. So it's additional complexity on top of the already difficult systems we run. Despite all of that, I think the answer here is yes: a Kubernetes operator for collecting Prometheus metrics makes sense. And I will tell you why.

The first reason an operator is useful is scrape configuration. As you might know, Prometheus requires scraping to be configured in one big YAML file. Each application can be encoded as individual jobs, maybe, and you filter and enrich this with relabeling, which is super powerful. And once you apply this configuration, maybe mount a ConfigMap or something, your Prometheus knows what to scrape. Done. There are three problems with that. First, you have to understand relabeling and really understand this YAML configuration, which may be complex for you to use; you have other systems to learn as well. Second, if there are multiple teams on a single cluster, which we see a lot because Kubernetes comes with multi-tenancy semantics like namespaces, you have a problem: suddenly multiple teams, multiple people responsible for different applications, come in and change one single file. There will be conflicts; there will be one team breaking monitoring for another team. So this can be problematic. The final problem is that to debug any issues with this, you have to really understand the Prometheus UI and what is happening. Maybe you go to the Prometheus UI, but then you have to have read access to this very critical process that collects your metrics, which is sensitive. And you also break multi-tenancy, because you suddenly see all the pods running everywhere on that page. So there are lots of reasons for this to not work well. How can operators help here? Well, there is a common pattern where the operator exposes a custom resource definition that allows scrape configuration in a fine-grained way, for example per application, per team, per pod.
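For contrast, this is roughly the kind of raw, relabeling-heavy scrape configuration that such CRDs replace; a sketch of a typical pod-discovery job, where the scrape annotation convention is an assumption:

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                          # discover every pod via the Kubernetes API
    relabel_configs:
      # Keep only pods opted in via an annotation (illustrative convention).
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Copy discovery metadata into stable target labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

Every team editing rules like these in one shared file is exactly the conflict described above.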
And this is super useful, because then you can have a separate file per team that literally specifies, per application, where the application's metrics endpoint is, how to select the application, how to filter those metrics, how to remove or add labels, and maybe how to authorize Prometheus to scrape those metrics in a secure environment. So generally, that's useful. On top of that, there is the possibility to propagate some status of this scraping, like whether it was successful and how many targets were actually discovered, back into the custom resource. So for example, in the image you see the status field, and you can immediately debug things without going to the UI, without exposing maybe sensitive information, and so on. So generally, it's beautiful, because it's much easier to use Prometheus and run Prometheus collection in a shared environment.

And finally, especially on Kubernetes, collection ideally scales, right? It scales with the amount of metrics you have, with the amount of applications you have. The beautiful part about using Prometheus for collection only is that it's suddenly stateless. So you can totally create another Prometheus and tear it down dynamically, as long as there is a grace period for the remaining samples to be sent. But that's solved for you in the Prometheus project. So generally, you can scale out and back in very dynamically, to save costs and for availability. While operators are not strictly required for this, they help: they can serve as a controller for this autoscaling, and at least simplify the configuration of the somewhat complex hashmod sharding setup that we have in Prometheus. You literally hide that behind, say, specifying the number of shards you want, and you don't need to think about how to configure the complex relabeling for hashmod, the consistent-hashing style of load balancing the scrape targets.

So to sum up, you should look at either the Prometheus operator or the GMP operator if you want to simplify your collection configuration, especially if you have multiple teams, multiple tenants, and when you are new to this, or you want scaling of the Prometheus collection to be as easy as possible.

Now let's quickly walk over those two operators. Let's start with the Prometheus operator. I'm grateful to have had the privilege to work with the team at Red Hat who created the Prometheus operator from the start. It was actually one of the first Kubernetes operators, if not the first one. It was a beautiful journey; I learned a lot from it. So let's walk through it. For collection purposes, we can look at just two things. First, there is the Prometheus custom resource, or the PrometheusAgent custom resource. Both allow you to specify your Prometheus deployment, and we can scope that to collection. For example, we can specify its name and how many shards we want. Arthur and Nicholas had a talk today about how to make this autoscalable, so that's also an option, incoming or already possible as an experimental feature in the Prometheus operator. And you specify the remote write URL, and that's really it. Make sure to set the podMonitor selectors and namespace selectors, even if they're empty, because otherwise they will not select any custom resources for scraping. So once you apply it, you have three Prometheuses running with hashmod sharding, sending data to the remote backend.
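A minimal sketch of such a resource, assuming the PrometheusAgent CRD; the name, shard count, and URL are illustrative, and the empty selectors are the detail called out above:

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: PrometheusAgent
metadata:
  name: collector                          # hypothetical name
spec:
  shards: 3                                # hashmod-sharded scraping across three agents
  remoteWrite:
    - url: http://remote-backend:9090/api/v1/write
  # Empty selectors match all PodMonitors; leaving them out selects nothing.
  podMonitorSelector: {}
  podMonitorNamespaceSelector: {}
```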
Now for target scrape configuration, you again have PodMonitors, for example. There are also ServiceMonitors, but I will focus on the PodMonitor here. You literally specify which pods it selects and which monitoring properties they have: essentially where their metrics endpoint is, what scrape interval you want for those metrics, and maybe what additional metadata or labels you want to add or remove. So that's all possible. Now let's talk about the GMP operator.

All right, let's talk about the GMP operator. So what is GMP? It stands for Google Cloud Managed Service for Prometheus; we call it GMP for short. It's similar to the Prometheus operator in that it offers a managed Prometheus experience. It allows you to manage your Prometheus instances, and it also allows you to deploy managed rule evaluators and Alertmanager. It was developed by Google, and the GMP operator has a different set of CRDs from the Prometheus operator; I'll show them to you guys now. And it's fully open source and available on GitHub.

So GMP is different from the Prometheus operator because, unlike the Prometheus operator, which deploys Prometheus instances as a Deployment, the GMP operator deploys them as a DaemonSet, which means for every node you have, you run a copy of Prometheus. Whenever a new node is added, a new Prometheus instance is added; whenever you remove a node, a Prometheus instance is removed. Each node has a dedicated Prometheus for scraping metrics on that node, and each Prometheus filters its targets to ensure that only targets on that node are scraped. And because each one is limited to a single node, the number of metrics is naturally constrained by the capacity of the node.

So let's talk about our first CR, called OperatorConfig. It's very similar to the Prometheus and PrometheusAgent CRDs that Bartek just talked about. It's a singleton that lives on your cluster, a top-level place where you can set special configuration for all your Prometheus instances. Some of that is remote write, like you can see here, or collection settings: what kind of compression you want for your metrics, what kind of filters you want to use, et cetera.

Our main CR, which is similar to the PodMonitor that Bartek talked about, is called PodMonitoring. And PodMonitoring has strict namespace tenancy. That means it can only scrape workloads that exist in a certain namespace; you can't scrape targets in other namespaces. As you can see in this example, namespace one is highlighted in bold and we're scraping app A. So app A on nodes one and two is getting scraped, but on node three, app A is in a different namespace, so it's not getting scraped. The next one is called ClusterPodMonitoring. It's the same thing as PodMonitoring, except it's cluster-scoped. This allows you to scrape apps all over your cluster, regardless of namespace. It's an improvement over PodMonitoring, because if you have many similar apps that exist in different namespaces, you don't need to create a PodMonitoring for each of them. Also, if you're an admin, this is a very powerful tool if you want to query across namespaces and look at what your apps are doing.

All right, let's talk about optimizing the GMP operator for the DaemonSet deployment. All the GMP CRDs are tailored for a DaemonSet, and the operator also optimizes the Prometheus configuration for DaemonSets. This is the complex engineering that we're hiding from the user.
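Before we get to those optimizations, here is a sketch of the two scrape CRs just described; the app label, port, and interval are illustrative:

```yaml
# Namespace-scoped: only scrapes matching pods in its own namespace.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: app-a
  namespace: namespace-one                 # hypothetical namespace from the example
spec:
  selector:
    matchLabels:
      app: app-a
  endpoints:
    - port: metrics
      interval: 30s
---
# Cluster-scoped: same shape, but matches pods in every namespace.
apiVersion: monitoring.googleapis.com/v1
kind: ClusterPodMonitoring
metadata:
  name: app-a-everywhere
spec:
  selector:
    matchLabels:
      app: app-a
  endpoints:
    - port: metrics
      interval: 30s
```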
Some of these technical challenges involved utilizing the Kubernetes API watch cache, which keeps an index of all the pods that exist under each node name; this is a cheap way for Prometheus to know what pods exist on the node it currently resides on, and it all happens without impacting Kubernetes performance. For the same reason, we don't need ServiceMonitor abstractions like the Prometheus operator has. If you're interested in more details, you can watch the talk by Danny Clark, called Stateless Collection for Stateful Data Collection, from KubeCon '22.

An upcoming feature we have is called secrets management. This feature allows the GMP operator to handle your secrets without exposing them to a Prometheus instance that doesn't need access to that secret. This is a really challenging engineering problem, and it's very hard to implement in a secure and efficient way. We have some ideas on how to implement it, so stay tuned.

All right, demo time. For this demo, we're gonna show you how easy it is to deploy the GMP operator and the Prometheus operator, we're gonna have a sample app that we scrape metrics from, and we're gonna do all of that at the same time. Pretty ambitious. All right, let me mirror my screen. So I have a clean cluster running right now, and it has three nodes. So the GMP operator is gonna deploy three collectors, three Prometheus instances, and for the Prometheus operator we can configure that number. Sorry, first let's configure a metric source. We're gonna deploy this app called Avalanche. Avalanche is basically an app that emits metrics, scaling them over time. And let's configure a Prometheus as the remote-write backend that we can write metrics to.

All right, so this is how you install the Prometheus operator. A bundle.yaml is provided for you there on GitHub; you can just apply it right away. The next step is to configure the Prometheus CRD that Bartek was telling you about. As you can see here, we're gonna have two shards, we're gonna have a remote write, and we're also gonna add an external label called operator: prometheus-operator, to mark that this came from the Prometheus operator, a Prometheus instance the Prometheus operator created. All right, so we're gonna apply that and, yeah, here we go. We see the operator came up, and the two shards of Prometheus are coming up as we speak.

All right, let's also install the GMP operator. This is how we set up the CRDs for it; this is all provided to you on GitHub, and it's the first thing you run. The second thing is to install the operator itself, also provided on GitHub. Then, similar to what we just did for the Prometheus operator, where we configured the PrometheusAgent CRD, let's configure the OperatorConfig, which is its equivalent for GMP. We have the same thing: a remote write to the same endpoint, and we add an external label operator: gmp, so we can tell this came from a Prometheus created by GMP. You don't need sharding; we don't have that, because we run on every node. All right, let's apply it. As you can see, remember we had three nodes, so three collectors are automatically created, and you have the operator. So yeah, that's how easy it is to start the Prometheus operator and GMP.

All right, let's do some interesting stuff: let's start scraping. This is a PodMonitor; Bartek introduced us to it. We're just scraping this endpoint, metrics, every 15 seconds, and our app is called metric-source.
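A sketch of that PodMonitor, together with the GMP PodMonitoring applied right after it in the demo; the app label and port name are assumptions based on the description:

```yaml
# Prometheus operator: PodMonitor for the demo app.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: metric-source
spec:
  selector:
    matchLabels:
      app: metric-source                   # hypothetical label on the Avalanche pods
  podMetricsEndpoints:
    - port: metrics
      interval: 15s
---
# GMP operator: the equivalent PodMonitoring, landing in the default namespace.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: metric-source
spec:
  selector:
    matchLabels:
      app: metric-source
  endpoints:
    - port: metrics
      interval: 15s
```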
So let's apply that, and let's do the same thing for the GMP operator. So we have the PodMonitoring for GMP: same endpoint, metrics, every 15 seconds. PodMonitoring is namespace-scoped, but since there's no namespace specified here, it's gonna use the default. All right, let's apply that as well. And let's see: all our apps are running. All right, let's try to find it in Prometheus. All right, so click to execute. You can see the Prometheus operator metrics coming up; there are two Avalanches that we're using, and the GMP operator is starting up also. We started the Prometheus operator first, so as we query, the GMP operator data should show up too. And you can see that I'm aggregating by operator: the Prometheus operator has that external label, and the GMP operator has its own external label as well. Can you explain what this backend is, the receiver, the Prometheus receiver? Yeah, so our backend is essentially a Prometheus receiver, and we are sending it these write requests via remote write. And so it's gonna come up slowly over the next couple of minutes. But yeah, all right, Bartek will give us the TL;DR.

Amazing, thanks for the demo; it went almost smoothly. So you have two choices, and you're probably asking, okay, which one should you choose? It's not that one is better than the other; there are essentially pros and cons, so let's walk through them very quickly. Generally, the Prometheus operator really focuses on the hashmod deployment model. That was the first deployment model the Prometheus operator offered, and it's really good at it. And it has benefits. One benefit of hashmod, for example: when you have a Prometheus per node as a DaemonSet, if a big target happens to be on one node, you might have a problem; you have to vertically scale that node's Prometheus. But in practice, you know, we found that's a good trade-off to have. It's also worth mentioning that the Prometheus operator is working on more deployment models. In fact, a DaemonSet-based deployment model is incoming, and we are helping contribute it to the Prometheus operator, so it will come eventually. However, the CRDs will still be tailored for everything else; they're not gonna be fully tailored for DaemonSets.

Which is where another point comes in. The Prometheus operator is really good at running Prometheus, not only for collection but for all the other stuff too, literally as a full metrics backend. So you can scale Prometheuses on Kubernetes for high availability; you can have queriers, storage, alerting, all of this, combined with Thanos processes and so on. It's smooth at that stage. For the GMP operator, it's a little different, because we only support a DaemonSet, intentionally, so we can have simpler CRDs, simpler PodMonitoring, and so on. You don't have a lot of choices; you know what you're getting, and it's optimized for that model. Finally, the Prometheus operator has all the knobs you might need from Prometheus: literally whatever Prometheus offers as flags, the Prometheus operator offers as well. For the GMP operator, we don't expose all the options. We have an opinionated, Google-opinionated version of the configuration, which, in our opinion, is what you should need to know about. So it's much simpler. But if you're a more advanced Prometheus user, then maybe you'll be missing some option.
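As an example of that opinionated surface, the demo's OperatorConfig boils down to roughly this sketch. The singleton must be the config object in the gmp-public namespace; the external label is from the demo, while the field for a custom remote-write endpoint may differ across prometheus-engine versions, so it is only hinted at in a comment:

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  namespace: gmp-public
  name: config                             # GMP expects exactly this singleton
collection:
  externalLabels:
    operator: gmp                          # lets the shared backend tell the series apart
  # The demo also pointed collection at a custom remote-write endpoint here;
  # check the OperatorConfig reference for your prometheus-engine version
  # before copying this, as the exact field may differ across releases.
```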
And still, it's open source, so let us know; maybe there is some important option to add. But we keep it limited so it's easier for users, which has some trade-offs, again. The Prometheus operator has ServiceMonitors; it's not easily possible to support a ServiceMonitor in a DaemonSet deployment, although we could try to hack it through. So, yeah, we dropped it. And to be honest, to be fair, it's really not that big a change: you just switch to selecting pods instead of selecting your services, so it's not that big a deal. The GMP operator, on the other hand, has target status propagation. It essentially reports back into the PodMonitoring CRD what is actually being scraped, what targets were discovered, and so on. You have to enable this feature manually in the OperatorConfig, but it's worthwhile.

So, the TL;DR: the Prometheus operator has more options and is more useful if you want local querying capabilities; the GMP operator is perhaps a simpler stack, and it's also enabled by default in GKE. What I mean is that when you create a new GKE cluster in Google Cloud, this is already installed, and you can literally start using PodMonitorings and see your metrics in Google Cloud Monitoring, or you can configure a custom remote write endpoint to send them to whatever else: another cloud, or Thanos, Cortex, and so on. With that, that's all we wanted to cover. Thank you so much, and yeah, we're open for questions. I think there was a mic somewhere for questions.

First of all, thanks for the presentation, awesome. My first question is about the PodMonitor in the Prometheus operator: you can do relabeling and metric relabeling there as well. Are you able to support the relabeling config on the pod monitor provided by your operator? Yeah, so you have metric relabeling, so you can do whatever you want after the metrics are scraped. But relabeling that happens before metrics are scraped, the relabeling used for service discovery, we locked that down, and we offer a simplified, opinionated filtering and a way of adding and removing labels. So literally you have add-label and remove-label. That simplifies the logic, but of course you cannot do all the magic that you can do with full relabeling.

Then I have a, I don't know how to name that type of question, but the OpenTelemetry operator, which we discussed, has a target allocator that discovers the Prometheus CRs by default, so if you have any PodMonitor, it will scrape it automatically. Are you planning to release something like a Google OpenTelemetry operator supporting those CRDs instead of the traditional Prometheus versions? Yeah, that's a good question. I don't know what I can share, but we are definitely working on some OpenTelemetry managed experience, in some opinionated way, so stay tuned; there will be something. And the last question: the Prometheus operator provides Grafana integration, and there's an option to deploy ConfigMaps with the actual JSON files of your dashboards. Do you support this, or are you planning to support it? No, again, we are only collection-focused, which makes this operator much simpler and easier to maintain for everyone. So no, it's only collection, yeah. All right, thanks, thanks. Thank you.
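Tying back to the relabeling question: on a Prometheus operator PodMonitor, post-scrape metric relabeling looks roughly like this sketch, with the drop rule purely illustrative. GMP's PodMonitoring, as described above, locks down the pre-scrape service discovery relabeling and instead offers the simplified add/remove-label filtering:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: metric-source
spec:
  selector:
    matchLabels:
      app: metric-source                   # hypothetical label, as in the demo sketch
  podMetricsEndpoints:
    - port: metrics
      interval: 15s
      # Applied after the scrape, to the resulting samples.
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: "go_gc_.*"                # example: drop noisy Go GC series
          action: drop
```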
I think we have to finish, but one more thing: we have a ContribFest, which is essentially a workshop, on Friday, where we will literally be with you at your laptop. We'll deploy the Prometheus operator, or the GMP operator if you want, and we'll try to stress it. So come, and we'll help you run it on your own laptop, yeah. Thank you. Thank you.