All right, with a slight delay, welcome to the Prometheus introduction. Unfortunately, I am not Julius; Julius couldn't make it due to reasons. So I'll have to do some PowerPoint karaoke for you. My name is Matthias, I'm an engineer at SoundCloud. Let me make my notes bigger so I can read them. I work on the production engineering team, which is the successor of the team that originally built Prometheus.

In the first part of the session, I will give a very brief introduction to Prometheus. This will not be an in-depth guide; we have about 30 minutes to get through everything, so it will just be a teaser. Show of hands, actually: who here is not yet using Prometheus? Ah, there's a few here and there. All right, for the rest of you this is going to be a very brief refresher, and then we'll get to Björn, who is going to walk us through some news and updates. With the delay, we may not get to Q&A, but the Prometheus booth in Pavilion 2 is always staffed. You can always find us there and ask questions.

Prometheus is a metrics-based monitoring system. We created it at the time because there was nothing that was able to monitor our growing zoo of microservices. We were already deploying to a dynamic, container-based platform, and we really needed more detail than what Nagios was giving us, which was: is it up? Prometheus is not trying to cover all your bases. It is a metrics monitoring system, and you will typically combine it with other systems for log collection and tracing.

So, again, a brief refresher: we'll take a quick look at how you build a monitoring system with Prometheus. Prometheus is a pull-based system, so everything starts with your web app or your API server, whatever it is that you are running. You add one of the client libraries to instrument your application. We have one in almost any language, and if we don't, someone else has built it, which is kind of neat, or you can build your own. The output format is really not that difficult. The instrumentation in the client itself is very lightweight because, and that's a big difference from push-based monitoring systems, your application doesn't put anything on the network at the point of an event happening, at the point of an HTTP request happening. We're not trying to send metrics anywhere; all we're doing is incrementing some numbers in memory. So it's blazing fast on the hot path of your application. The current values of these counters are then exposed on an HTTP endpoint that can be scraped asynchronously.

Now, you don't control all the software that you run; hopefully not everything you run is something you've written yourself. There are a lot of interesting metrics in software that you don't control and that doesn't have native Prometheus or OpenMetrics support. More and more things do, but if they don't, they need a little help from a friend. We call this friend the exporter, and we'll talk a little more about exporters later today.

So, in the pull architecture, the Prometheus server is the thing that reaches out to all the things that you want to monitor. In this example we have only one, but you can have as many as you want, and they may be interested in different subsets. It doesn't matter to the instrumented application whether it is being scraped at all, by how many Prometheus servers, or how often.
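As a hedged sketch of what that instrumentation looks like with the Go client library, client_golang (the metric and label names are made up for illustration):

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A counter vector: one time series per path/status combination.
var httpRequests = promauto.NewCounterVec(prometheus.CounterOpts{
	Name: "http_requests_total",
	Help: "Total HTTP requests handled.",
}, []string{"path", "status"})

func handler(w http.ResponseWriter, r *http.Request) {
	// On the hot path we only increment an in-memory number;
	// nothing is sent over the network here.
	httpRequests.WithLabelValues(r.URL.Path, "200").Inc()
	w.Write([]byte("ok"))
}

func main() {
	http.HandleFunc("/", handler)
	// Current counter values are exposed here and scraped asynchronously.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```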
That means there is very little to configure in all of your microservices. Instead, Prometheus integrates with various runtimes and mechanisms, or custom mechanisms that you can build, to find all the possible targets, and it has a rule language for filtering them and shuffling around the metadata you have about them. For example, when you're monitoring Kubernetes pods, Prometheus typically just fetches a list of all the pods, and then this language filters it down for you. If you're using the Prometheus Operator, it generates that configuration for you, which is kind of neat, because it can get very verbose and slightly tricky, but it's there.

You probably want to look at your metrics; that's kind of the point. There is a web UI in Prometheus, but I would really only recommend it for exploration and debugging. It is not a dashboard. We used to have our own dashboard builder, but then Grafana came around and we said: all right, this is so much better, just use Grafana. It's the tool of choice. However, everything is API-driven, so you can totally build your own automation, your own visualization, whatever you want on top of Prometheus.

Finally, of course, in monitoring, if there is a problem, we do want to tell someone about it, and this is where Alertmanager comes in. Alertmanager handles grouping alerts together so that you're not overwhelmed by a million fine-grained alerts happening at once, and that is totally a thing that happens. We have a lot of alerts where we just say: tell me if any of the endpoints on any of my instances has an error rate larger than 10%, or 1%. If you have 1,000 instances and 1,000 endpoints, that's a million potential alerts. That can happen. Alertmanager will, if you configure it that way, group them back together and say: you have a problem, here are a million things that are wrong, you may want to look into them. Before this, we used to have pager storms where, in an incident, one person would just sit there and acknowledge pages in PagerDuty because there was so much coming in. This is much easier to handle with Alertmanager. And Alertmanager lets you configure routing rules to determine who should be notified and how.

All right, so next I want to walk you through some of the key points of why we think Prometheus is worth considering. First of all, the data model. The data model is simple and rich at the same time. The fundamental building block of Prometheus metrics is the time series: some identifier, and then a series of timestamps and values. Often these timestamps are at a regular interval, but they don't have to be, and the values change over time. All the timestamps are just integers, and all the values are float64s, so far. That will change in the future, but fundamentally this is what's happening: it's a series of "at this time, the value was this float".

What does the identifier look like? An identifier typically has a metric name and a set of labels with label values. It's somewhat up to you, when you design your instrumentation, how you split the two: what you put into the name and what you don't. As a rule of thumb, we say something should be a label if there is some sensible way that you would want to aggregate it away.
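The example on the slide isn't in the transcript, but a hedged reconstruction in the text exposition format might look like this (all names made up):

```
# Each unique name/label combination is one time series;
# the number at the end is the current sample value.
http_requests_total{path="/api/tracks",status="200",instance="web-1:8080"} 1027
http_requests_total{path="/api/tracks",status="500",instance="web-1:8080"} 3
# CPU time gets its own metric name rather than a label value:
process_cpu_seconds_total{instance="web-1:8080"} 84.5
```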
So in this example, you might be interested in this single time series in particular, or maybe you want to see all the status 200s across all the paths on all your instances. That's why the path is a label here, whereas you wouldn't want to mix together HTTP requests and CPU time used. That makes no sense, so those two should have different names. That's the rough guideline we can give there.

Frequently we use counters like this. They represent the total number of events since some unspecified point in time. This can be the time your instance started, but it doesn't have to be; it is absolutely valid to start your counter at a random number that you just pulled out of your hat. Typically counters start at zero and just count up over time until the instance restarts, but it doesn't have to be this way. This lets us be flexible on scrape intervals: we scrape every 15 seconds, you may scrape every minute or every three seconds, and you may have different Prometheus servers scraping the same target.

We never really look at the absolute value of a counter as such. We always look at how much it changed over an interval. The nice thing about this is that we can be very flexible about how often we actually scrape, or whether a scrape is missed. Something happened, Prometheus was restarting? It doesn't matter, we don't lose track of the counting, unless the counter itself also resets in that interval, in which case that information is of course lost. But for the most part we can be very flexible: we can have multiple Prometheus servers scrape the same counter at different intervals, and it all still works out. As long as we scrape often enough, we can reconstruct how many requests happened in any one-minute, ten-minute, or seven-and-a-half-minute interval if you really want to. That's why we use these ever-incrementing counters rather than just recording the rate at any point in time. We'll see how that plays out in a moment.

I really want to give you just a quick glimpse of the query language; it can't be a full tutorial. The query language is by far the most interesting thing to wrap your head around, so this is just a sneak peek. The most important hurdle to overcome is trying to think in SQL. If you've done SQL before, some of it seems familiar, and it seems like you can kind of translate between the two: don't. That's not a helpful way to think about PromQL. Think of PromQL as managing multiple time series, matching them up, and working with them.

First of all, you can do math on individual time series. In this case, we're just dividing by one billion and filtering out the results larger than 100 afterwards. This happens individually on each time series: we first divide the filesystem size in bytes by one billion so we get gigabytes, and we only show the filesystems that are larger than 100 gigabytes.

Then we can do something more complicated. We're back to HTTP requests. There are a few things happening in this query, and I want to go through them quickly, one by one. You already heard that we like to keep total counts. The rate function here is what looks at the time window.
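The queries on the slides aren't reproduced in the transcript; hedged reconstructions of what's being described here, including the per-path variant discussed below, might look like this (the metric name and the 5m window are assumptions):

```promql
# Overall error ratio, a single number:
  sum(rate(http_requests_total{status="500"}[5m]))
/
  sum(rate(http_requests_total[5m]))

# Per-path error ratio, preserving the path dimension:
  sum by (path) (rate(http_requests_total{status="500"}[5m]))
/
  sum by (path) (rate(http_requests_total[5m]))
```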
The rate looks at the difference between the counter's value at the start and at the end of that window. Plus, it does some magic to compensate for counter resets in the meantime, and it understands how to interpolate values if we're asking for a time window that doesn't quite align with when we scraped. And just like the scalar calculation we just looked at, this happens individually for every time series. So we compute the rate separately for every path, for every instance, for every status, except in the filtered expression, where we only do it for the series with status 500. So we don't do unnecessary calculation, but we do calculate the rate individually for each series, because they will have reset at different points in time. We can't turn this around and sum first, then take the rate; the individual counter resets would get lost. Then we sum over them, which in this case just gives us a single number on each side, so we can divide those numbers and get a result. Our error rate here is about 2.9%. Not good, not bad.

What should we do if we try to divide metrics that still have labels? Here we are summing things up, but we are not summing away the path dimension. We preserve the path dimension and then divide the two. The division again happens one by one, for each of the series that fell out of the sum by: we divide each of them individually. Homework for you: think about what should happen if a path only appears on one side of the division.

Alerting. Again, I can't go into any of the details here. The very, very short story is: you write an expression, like this one, that only returns data when there is a problem that you want to alert someone about. Alerts have labels and annotations. Labels help Alertmanager sort out the routing; annotations help humans sort out the resolution.

Efficiency. Prometheus in itself does not scale horizontally. That's a deliberate choice: we want to keep it simple, and we want Prometheus to be the very last thing that breaks, so we want Prometheus to not itself be a distributed system. In practice, you can actually go quite a long way with a single Prometheus. The storage and collection engines are quite efficient. We have many Prometheus servers with millions of active time series; 10 million active series is no problem if you give it a good, but not outrageous, amount of memory. And a million samples per second on a single server: no problem. Memory is really your limiting factor, because you need to keep about two hours' worth of samples in memory. The good news is that compression is very good, thanks to Björn, so we only need about one to two bytes per sample on average for typical workloads. Retention time defaults to something relatively low, about two weeks. In practice, we have not found an upper limit yet; as much as your storage can handle, we haven't found the point where Prometheus really breaks. So it's really just a question of how big a disk you are willing to give it. And if that's not enough, there are related projects that add scaling and aggregation across many Prometheus servers: Thanos, Cortex, and Mimir all help with this.
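To make the alerting point from a moment ago concrete before we move on: a minimal sketch of a rule file in the standard Prometheus rules format, with made-up names, threshold, and durations, might look like this:

```yaml
groups:
  - name: example
    rules:
      - alert: HighErrorRate
        # Only returns data, and thus fires, when there is a problem.
        expr: |
          sum by (path) (rate(http_requests_total{status="500"}[5m]))
            / sum by (path) (rate(http_requests_total[5m])) > 0.01
        for: 10m
        labels:
          severity: page          # used by Alertmanager for routing
        annotations:
          summary: "Error ratio above 1% on path {{ $labels.path }}"
```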
Circling back to exporters: not everything speaks Prometheus yet. Exporters help you bridge this gap, from other metrics systems that you may already be using, from system-specific metrics like the Linux kernel's, or MySQL's, or whatever database you're using. Or you can DIY it, because an exporter only needs to return numbers when asked over HTTP. Exporters are much simpler to write than you think, and I would encourage everyone to write an exporter just for fun. The result of having exporters is that Prometheus itself doesn't have to understand every possible piece of software in the world. It's very easy to integrate it with whatever you already have without having Prometheus itself understand everything.

So, in conclusion: if you're not using Prometheus yet, but you're interested enough to watch this, you should give it a try. And if you're already using it, you should use it more. And with that, I'm going to hand it over to Björn for the deep dive, or more like a dip dive.

Yeah, I hope I'm on now. So Julien was supposed to be here; he is missing for the same reason as Julius. Now you can guess what that reason is. I'm still here. We wanted to show off that, for me, this is a true multi-stakeholder project: we are four people, all from different companies. Originally myself, Julius, and Matthias all worked for SoundCloud; that's kind of the common origin, but only Matthias is still at SoundCloud. I'm at a company called Grafana Labs, you might have heard of them. Julius is running his own gig, and Julien came later to the project. I think he's the most active person in the Prometheus project right now, and he is running a new thing, O11y, also quite interesting. So, he can't make it.

This is called the deep dive section. It's a bit weird that, in contrast to the olden days, we now have a deep dive and an intro session in the same 30 minutes, which is very short, and they're aimed at completely different audiences, right? But perhaps you're all somewhere in between and you actually like this. So I decided not to dive deep into one particular topic, but to just give you a flashlight tour of what's new. Julien is also giving a talk; he wanted to give a longer version of this at Prometheus Day, but he couldn't make it, so he will record it and share it via his blog or Twitter account. Totally look out for that; it will be a nicer version of this. This is essentially just pointers, looking at more or less the last 12 months, and only at Prometheus itself, the Prometheus server, which is only one component of the huge ecosystem. So there's even more than that.

Let's go through a few things. In version 2.25, we embraced feature flags. This is actually a bit longer ago than a year, but I like to give it an honorary mention here, because feature flags, pretty much like in Kubernetes, are an idea we should have had way earlier. They allow us to be a bit less risk-averse with new features. Half of the world is using Prometheus, so we are fairly conservative in terms of stability and not breaking people. Now we have feature flags. One reason to have them is to be really explicit about experimental features. We had experimental features before; we just said: we have this feature, don't use it in production, it will change at any time. But people used them in production anyway, because they often didn't even know a feature was experimental. Now you have to switch it on with a feature flag, which marks it really clearly. That is one reason to use them.
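Concretely, you opt in on the command line. The --enable-feature flag is the actual mechanism; the two PromQL flags shown here were real early examples, though which flags exist depends on your version:

```sh
# Opt in to experimental features explicitly at startup:
prometheus --config.file=prometheus.yml \
  --enable-feature=promql-at-modifier,promql-negative-offset
```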
Also, we can use a feature flag to introduce a breaking change without needing a new major release, because you switch it on deliberately and then you get the new, breaking behavior, but by default it stays off. If we like the feature, then when we have a major release, like in 10 years or so when we have Prometheus 3, we can make the flag a no-op and the behavior is just on forever. Similarly, if an experimental feature that is not breaking turns out fine, we just make it the default, and the feature flag becomes a no-op as well; it will still work, it will just warn you. And the final reason to use a feature flag is when you really want to change the behavior in a way that wouldn't even go into the next version; it's just a different mode of Prometheus. If that feature becomes stable, it becomes a normal flag that changes behavior. I have examples for each of these among the new features here.

So, what's new in PromQL? By now the changes look kind of incremental, but if you use them, they're kind of great. The first thing is here: we allow negative offsets now. Offset allows you to query something in the past; now you can query stuff in the future. Predicting the future, yay! Of course that doesn't work, even for Prometheus, but sometimes you run historical queries. Say you are interested in the temperature over the last month, and how it compared to the average around the evaluation time: then, from your point in the past, you look into the future, and you use a negative offset.

Similarly, we have an @ modifier, where you can pin parts of your query to a certain point in time. There are even helpers: end(), for example, is the end of the range if you're running a range query. This is pretty special, but it enables a number of things. This query, for example, finally solves the surprisingly hard problem of how to draw exactly the top five lines in, say, traffic. It's a bit more complicated; there is a link here, also clickable if you look at these slides as a PDF, to a blog post explaining it. This busts caches, though: query caches could previously assume that you never look into the future. So we thought it might be a breaking change, but in the end we decided it wasn't, so it became stable and is now the default. You can just use it in newer versions.

Trigonometry happened, finally. It's a bit of a problem that people want all kinds of functions, and we don't want to implement a million functions in PromQL, but this finally made it in. So you have a sine function, cosine, even atan2, which needs two arguments, so it's actually a binary operator and uses the PromQL label matching. It's a bit weird, but if you really need it, you will appreciate it. This happened in 2.31. There's also a function to convert ordinary degrees into radians, which are the default unit, and people use Prometheus to monitor wind turbines, which is pretty cool. present_over_time is also a nice little thing that tells you whether there is a sample at all in a time range; there are use cases for that, though most of you won't have run into them, I guess.
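As a hedged sketch of those two features in use (metric names made up; the exact top-five recipe is the subject of the linked blog post):

```promql
# Negative offset: from a point in the past, look one day "ahead".
avg_over_time(outside_temperature_celsius[1h] offset -1d)

# @ modifier: graph only the five series that are top-5 at the end
# of the graphed range, pinned there with end():
  sum by (path) (rate(http_requests_total[5m]))
and
  topk(5, sum by (path) (rate(http_requests_total[5m] @ end())))
```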
So, this is a biggie: agent mode. It's also an example of a feature flag that will become just a normal flag, because it simply changes how Prometheus works. There's some irony here. Matthias mentioned that Prometheus is designed to be the last thing standing in your environment, because it was meant to enable you to monitor your systems without relying on a monitoring provider. But nowadays everyone wants to use cloud providers for their monitoring. You can still do both with Prometheus. It's also ironic that all the cloud metrics providers ingest Prometheus metrics now; Prometheus was created to kill them, right? And now we all love each other again. So you can mix and match, but if you're really relying on your cloud provider, it feels like: why should I run a fully fledged Prometheus setup? Now you can just slim it down with agent mode. You have no local storage, only the WAL; you have no evaluation engine; it all consumes fewer resources. And the only thing it does is send everything via remote write to your metrics provider. It could actually be quite cool. A lot of our customers are doing this. I wouldn't do it, but I'm also very conservative, SRE-ish.

Anyway, the remote-write receiver: that's kind of the flip side of that coin. At some point we realized we needed just ten lines of code to let Prometheus ingest its own remote-write protocol. This turns Prometheus into a push-based monitoring system, which should never happen, right? That's why it was behind a feature flag initially: we were really concerned people would abuse this and assume Prometheus is now a push-based monitoring system. I still don't recommend it for production, but it's just nice if you want to ingest something that comes in over remote write in a toy or test setup, if you want to play with something, or if you have just a tiny amount of metrics; then it might just work. So it's now stable, no longer behind the feature flag, but please don't try to ingest humongous amounts of production metrics this way.

Okay, what else do we have? This one was great fun. We had an almost decade-long discussion about whether we should expand environment variables in our config. Many programs do that, but Prometheus has this purity ambition that there is one and only one way to configure things, and that is your config file. If you really want this, you can write a wrapper script that expands the environment variables and then writes out the config file. But so many people wanted it, so we had a long, long, long discussion, and finally we settled on: it's allowed in the external-labels section, and only there, which is the most requested use case. Personally, I would have gone further, but I don't want to reopen that discussion; it kind of works again. Why is it still behind a feature flag? Because somebody might have had a dollar-curly-brace in their config, not meant as an environment variable but as the literal thing. So it is, strictly speaking, a breaking change. It's unlikely that somebody is doing this, but with, I don't know, trillions of Prometheus users, it might happen. So it stays behind a feature flag until Prometheus 3, which will happen in 2031. I don't know; there is no timeline.
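A hedged sketch of the one place where expansion is allowed (the feature flag is expand-external-labels; the label and variable names are made up):

```yaml
global:
  external_labels:
    # Expanded only when Prometheus is started with
    # --enable-feature=expand-external-labels:
    region: ${REGION}
```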
Okay, this is an interesting one, but I will cut it really short because we are short on time. This is all for downstream distributions and library users; normal users will notice nothing of it, but Julien did a lot of work here, and there's a little Twitter storm about it. He will certainly talk about it in his longer version of this talk. Essentially: you can finally use the Prometheus code base more or less properly as a Go module. The UI, which has a million React dependencies, is packaged as a kind of pre-compiled source package, so it can hopefully be packaged properly in Debian. And we published our protocol buffers on, what's it called, I hope I get that right: buf.build, right?

And what was the middle one? I can't see it in my notes here. Ah, plugins, that's the coolest thing, right? There is now a build-time plugin system. The Prometheus binary grows and grows because it supports a huge number of service discovery mechanisms, and every service discovery mechanism comes with a huge client library: Azure, GCP, whatever. This is the main reason why the Prometheus binary is big. Most people don't care, but some do. So you can now switch all of that off at build time, and then you have a very slim, nice, small Prometheus binary.

All right, what's coming? Very quickly: the most important thing, in my view, is the new sparse high-resolution histograms. Very cool stuff. I have worked on it most of the time; my heart and soul went into it, so I'm clearly biased. But Ganesh, who also worked on this, will give a very, very cool talk later today; please all go there. Alertmanager will also see more love now: the current maintainer has more time, and my colleague Josh will also work on it. He gives a talk here too, a pretty cool example of something outside the Prometheus server itself. If you want to look, we have loads of design docs in the pipeline; a lot will happen, and you can see this in our documentation. We also want a more formalized proposal process for new stuff. Another new thing, important to Julien, who will maintain it: 2.37 will be a long-term-support version, where you get security updates for a longer time. Also very cool. And I think that should be it, right? Yeah, thank you very much. Do we have...