So welcome to the Prometheus update. I'm going to do the usual show of hands. Who knows what Prometheus actually is? OK. This is way more than in 2017 and 2018, when there was a full room and almost no one knew what it was. So this is really good. Who's using it, just a little bit? And who's using it in production? Nice. This is way more than a few years ago. This is really nice.

For those of you who don't know, the structure we always have with the Prometheus update is a short intro, then a bit of a deeper dive, and we try to get through it as quickly as possible so we have enough time for questions at the end. That way you can ask whatever you want to ask, without us prescribing the content and babbling until the very end.

The very quick version of what Prometheus is: it is a metrics engine. Metrics are numerical data which you can store about your systems. It might be temperature, it might be how many requests your web server gets, how many sales you do through your web shop, things like these. You would normally instrument your applications. There is a variety of instrumentation libraries and also auto-instrumentation; you can use OpenTelemetry, or you can use the Prometheus client libraries, hooking into your applications and automatically exposing the data which matters to you. Prometheus is the part which does the metrics collection and storage, and basically all the computation on the data once it is in. You can query the data, you can alert, you can dashboard, everything through one single language. For those deeper into math, it is a functional language for doing vector math on your observability data. In simpler terms, you can do math on your data at scale, and this is part of why Prometheus is so successful: it was the first time those advanced querying mechanisms were available to anyone as open source.

You can use it for everything. If you still have a data center, you can use it from your diesel generators down to your microservices and everything in between. That is also very different from previous generations of observability and monitoring tools and APM, where you're usually limited to just network, or just infrastructure, or just applications. Here you can do your whole stack with one single solution. And it is the absolute default for Kubernetes, for etcd, for everything. Everything which does anything with metrics within Cloud Native speaks the Prometheus exposition format, because that's the standard across all of this.
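To give you a feel for it, this is roughly what a scraped /metrics page looks like in that exposition format; the metric name and numbers are invented for illustration:

    # HELP http_requests_total Total HTTP requests handled.
    # TYPE http_requests_total counter
    http_requests_total{path="/checkout",status="200"} 1027
    http_requests_total{path="/checkout",status="500"} 3

Each line is one sample of one time series; the labels in braces are what PromQL later slices and aggregates over.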
So, a short history. Prometheus was started 11 years ago and fully open sourced nine years ago. We joined the CNCF as the second project after Kubernetes; it was just called CNCF, it was not even called KubeCon when we joined, so that early. We released version 2 in 2017 and are currently working towards version 3. And we were also the second project to graduate within the CNCF. If you want to watch the documentary about Prometheus, there's a QR code and the link; I hope this works here. I'll just give people a moment. Okay, there's one more. Okay. CNCF spent a bit of money on making this, and it's nice.

A little bit about our growth. I also work at Grafana Labs, and through Grafana we can actually see how many instances of Prometheus are being queried, or how many Grafana instances have a Prometheus backend defined. And this is one of the very, very few numbers we as the Prometheus team have about actual adoption of Prometheus, because except for GitHub stars and such, there is no way to know who's using it. For example, here at the project booth, lots of people came up to me: hey, we've been using Prometheus for years and we're successful with it. A few years ago, no one in China was using Prometheus. We don't see any of this, because we don't have any installation numbers except for the ones through Grafana Labs. So here you can see the growth over the years. We have a little tradition of updating those numbers once a year at PromCon, which is starting today, German time, in Berlin; I'm flying back there tonight. That is why I'm showing this updated number, because it's once per year that we make it public.

There's a ton of contributors, and we just passed 50k stars, I believe this morning, but I couldn't update the image on this Chromebook. If anyone here wants to join and help with the development of Prometheus, you're more than welcome. Kubernetes has like a thousand people and it's the largest project, and Prometheus is second in pretty much everything except contributors. We have 20 people, and maybe 10 of them actively do work. So we have really, really few people. On the positive side, anyone who joins, anyone who wants to contribute, can make an absolutely outsized impact on open source by joining one of the non-Kubernetes projects, ideally Prometheus. We are always working to expand the team, but having more people would be even better, for obvious reasons. Again, if you are interested in any of this, just contact us. You'll find the Prometheus developers and Prometheus users mailing lists if you search for them, or search for Prometheus community and you'll find all the links, or just talk to me afterwards. We really need more hands to do work.

So how does all of this actually work? Let's say you have your web application, maybe an API server, a bunch of microservices. You would take a client library, either a Prometheus client library or an OpenTelemetry instrumentation library, put it directly into your application code, and start exposing data from that. There's also the concept of exporters, which are basically reverse proxies translating from various different languages and systems into something which Prometheus understands. We do this for efficiency reasons: by doing the translation at the far edge, on the things we scrape, it's a lot easier to horizontally scale that load. If we did it centrally, we would have massive computation problems at the center, so we push the work out towards the workloads at the edge. So you have those exporters, and they just translate between, for example, what your file system on Linux looks like, or MySQL, or whatever, and Prometheus. Then Prometheus comes along and scrapes all this data. Scraping basically just means it comes by, takes stuff from a web page, and stores it. That's it. So how does it know where to look for monitoring data? Easy. There's a system called service discovery, which is a way of telling Prometheus what it should be getting its data from.
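As a minimal sketch, a prometheus.yml could combine the Kubernetes and file-based discovery flavors I'll describe in a second roughly like this; the job names and file path are made up:

    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod            # discover every pod in the cluster
      - job_name: datacenter-hosts
        file_sd_configs:
          - files:
              - /etc/prometheus/targets/*.yaml    # free-form inventory files

Prometheus then scrapes whatever both discovery mechanisms return, and keeps up as targets come and go.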
If you're using Kubernetes, and most of you probably are, there is a direct integration between Kubernetes and Prometheus. You literally just tell Kubernetes there is a Prometheus which is allowed to talk to you, and you tell Prometheus there is a Kubernetes you should be talking to, and they start talking to each other. Everything running on that Kubernetes cluster automatically starts being scraped by Prometheus, and you don't have to do anything. You restart 10,000 pods and everything just happens for you; nothing manual. That's part of the power of Prometheus. If you're using any of the major or mid-sized cloud providers, we have integrations for all of them, so whatever you're using, as long as you tell the systems they're allowed to talk to each other, it just works. We also have Consul, and literally dozens of other service discovery integrations. And we have a file-based service discovery mechanism, where you just have a YAML file and can list targets in free form. We see this very often in physical infrastructure, where you often don't have a proper inventory database, so people just have files, and that's what gets scraped. All of this works, and you can combine it however you want.

Then, obviously, you want to see what's happening. So you would have your Grafana, or the Prometheus web UI. You can also run automation against your Prometheus, all written in this one language, PromQL, the functional language for vector math. And you have something called the Alertmanager. This is the overview slide, so this is where you can take a picture if you want. So, the Alertmanager. Part of what Prometheus does for you is a lot of computation, and you can run really advanced queries, not just "am I over 90% disk" or something. You can also ask: based on the trend of the last 24 hours, what do the next five hours look like, and would I be above 90% or 95% or 99% in that timeframe? So you can do predictions into the future, which is really useful for system capacity like disks, or for TLS certificates, and things like these. You can predict what will happen; you basically predict an outage, you predict something failing, and then you can alert before it actually happens. So you don't have to scramble once something is already broken: you get told before it breaks. Very, very powerful. And obviously you can then look at things in your different front ends and also drive automation based on the alerts.
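To sketch what such a predictive alert can look like: this rule, built on the standard predict_linear function and the node_exporter's filesystem metric, fires when the six-hour trend says a filesystem runs out of space within four hours; the lookback, horizon, and 30-minute wait are arbitrary example choices:

    groups:
      - name: capacity
        rules:
          - alert: DiskWillFillSoon
            # Extrapolate the last 6h of free-space history 4h into
            # the future; alert if the prediction drops below zero.
            expr: predict_linear(node_filesystem_avail_bytes[6h], 4 * 3600) < 0
            for: 30m
            labels:
              severity: warning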
So let's look at a few things which are new from the last few months or so, basically this year. To be clear, this is only about the main Prometheus server; there is also the agent, there are the various client libraries, and so on. This is only looking at Prometheus proper.

First and foremost, and most important: we have native histograms. For the last 11 years, our histograms were very, very coarse. They did what they needed to do, but they were not easy to work with. If you knew precisely what you were doing and how your system behaved, you could use them very effectively, but usually you have to iterate towards a better understanding of what your system is doing before you get proper histogram buckets, which is honestly not ideal. What we now have are so-called native histograms. If you look at the left side versus the right side, you see there's a lot more resolution in your data: for basically the same cost, even lower cost, you get much higher resolution, so you know precisely where your latency spikes are. For example this red line here: on the right-hand side you see precisely where a lot of the queries are coming in, whereas on the left side it's so broad you don't really see what's happening. This is really, really powerful. You can also have fun with this: this image is drawn within Prometheus and Grafana through a native histogram, just because we can, and it was fun. A bit of a tech demo. We also once re-encoded one of our developer summit videos into this system and uploaded it. Again, because we can.

We recently had massive reductions in memory usage. If you're still using an older version of Prometheus, you basically want to update to the new version tonight, because you will literally save about 50% of your memory. This is huge. We saw it in really large clusters at Grafana, we saw it at various other users of Prometheus, and throughout the user base we see roughly 50% memory reduction. And the larger your workloads, the larger your savings.

We also have something which improves your quality of life deep at night when alerts are firing. With the new keep_firing_for attribute, you can now say how long an alert should keep firing even after its condition has stopped being true. Let's say you have something really important, like the error rate of the payment processing in your online shop; without payment processing you don't get paid. Say this error rate goes above a threshold for a really short time and then drops again. You get one quick alert, and when you look, it's already gone. That's not nice. So you can say: keep firing for at least 15 minutes, so you see that this happened. Also, if you have something which keeps crossing back and forth over your alerting condition, the alert goes away, comes back, goes away, comes back, and you keep getting new alerts. By saying keep firing for 20 minutes, anything which comes and goes just stays there, you don't get new alerts all the time, and your phone doesn't explode.
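A sketch of the payment example as a rule, sitting in a rule group like the one shown earlier; the metric name and the 5% threshold are invented:

    - alert: PaymentErrorRateHigh
      expr: |
        sum(rate(payment_requests_total{status="error"}[5m]))
        / sum(rate(payment_requests_total[5m])) > 0.05
      # Stay in the firing state for 20 more minutes even when the
      # error rate dips back under the threshold, to avoid flapping.
      keep_firing_for: 20m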
We also have an OTLP endpoint: we now support native ingestion of the OTLP format. We still recommend using the Prometheus exposition format and scraping, for a variety of performance, data preciseness, and stability guarantees, but we acknowledge that some people just prefer push and OTLP. So we have the OTLP endpoint, and we are currently working towards becoming the best backend for anything OTLP, for the simple reason that we are the cloud native default and, very bluntly, we want to stay the cloud native default.

A few things which are coming. We are working towards a completely new UI for the Alertmanager. We want to improve metric metadata, so that, for example, the type of a metric and when a counter was created are not only held in memory within Prometheus but also written to storage. Then if you restart your Prometheus, you still see the same data and it doesn't just go away.

Exemplar improvements. That's a good point: who here knows what exemplars are? Great. Exemplars are basically the ID of a trace, whether a distributed trace or a classic trace does not matter, and/or of a span, and you put those IDs directly onto your metrics. So you don't have to search through huge amounts of trace data in your tracing backend to find something interesting. You see: okay, this is a high error rate, this is a high-latency bucket, this is something which is wrong, and through the exemplar you jump directly into a trace already knowing what you're looking for. Instead of looking at 20 traces, discarding 19 of them, and having to think everything through every single time, you jump directly to the one thing where you know you have the problem. It just saves you a lot of time. Prometheus was the first thing to actually support this; by now Loki and Grafana and Mimir and such support it too, and OpenTelemetry is working towards supporting exemplars. We want to enable you to work even better with exemplars, and also to persist them, so they're not just in memory: when you restart, you still have your exemplar data retained.
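For reference, this is roughly what an exemplar looks like attached to a histogram bucket in the OpenMetrics text format; the trace ID, values, and timestamp are invented:

    # TYPE http_request_duration_seconds histogram
    http_request_duration_seconds_bucket{le="0.25"} 12 # {trace_id="4bf92f3577b34da6"} 0.23 1718000000.0

Everything after the # is the exemplar: its labels, the observed value, and when it was taken.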
Also coming: a more efficient remote write engine. Remote write is how a Prometheus server sends data to other Prometheus servers, or to other storage backends like Thanos, Mimir, VictoriaMetrics, all of those.

And, as I said earlier, we need you. This is actually the wrong slide; sorry, I should have deleted the other one. But we need you: help us with code. For example, here we have a perfect item which no one is currently working on. Again, if you want to get involved, if you want to get started with the development of Prometheus, please talk to us. Just catch me afterwards, or send an email to the Prometheus developers or Prometheus users mailing list and say: hey, I want to try and work on stuff. Through LFX mentorships you can even get support from LFX to work on things. We are open to pretty much everything; we just need more hands.

And that's it. I don't know how much time I have left, but that's it from me, and now we have questions, hopefully. So, who has questions? I don't bite, promise.

Hi, thank you for sharing, it was really insightful. I have a question about the memory reduction by 50%. How did that happen? That's a lot of memory saved.

I don't know the technical details, to be honest. Bryan, a co-worker of mine, just looked at things for a very long time and went: okay, that doesn't look right. Unless I'm mistaken, he reworked how we store the labels in memory and found a way to de-duplicate, because we basically had the same lookup table twice, and it was the majority of our usage. But I don't know the details. You already have my email, so if you poke me, I can send you the PR and you can read through it and look things up. Thank you. No worries. Any other questions?

Hi, thanks for your sharing. I have one problem when using Prometheus for monitoring. As we know, Prometheus is a pull-based system: your target is there, your metrics are there, and the Prometheus server pulls the metrics from your endpoint. But in daily use, the metrics, especially legacy metrics, won't be garbage collected automatically. For example, in our use case we have many tenants, each tenant has its ID, and we have QPS monitoring per tenant. On top of that, each tenant has revisions of its deployment, and as tenants come and go, those revisions pile up over time. So after, say, seven days, the exposed metrics have kept growing, and sometimes when you pull the metrics from an endpoint, the payload is more than 100 megabytes. That's a real burden for me, and I wonder: is there any solution for this scenario? Thank you.

So, if you have 100 megabytes of metrics, you should probably look at making the way you create metrics more efficient, and find ways to reduce them. For some of that data, if it's super high cardinality, it might actually make more sense to store it in logs and do the analysis later. That being said, there are a few approaches. There are storage backends for Prometheus which are built for more scalability, first and foremost Thanos and Mimir, which scale horizontally so you can store much more data if you want to. The other thing, and I'm not quite certain what you asked: you said it isn't garbage collected. Did you mean it is not garbage collected on the metrics endpoint, or within Prometheus?

The endpoint itself. For example, if I upgrade a deployment from revision six to revision seven, then we don't need the metrics for revision six anymore.

Then you need to look at your metrics endpoint and remove everything which doesn't exist anymore, because Prometheus doesn't care: as long as it sees a series, it takes the data in. But once you remove the data from your metrics endpoint, Prometheus doesn't see it anymore and it is not put into storage anymore. So as time passes, you just reduce the number of metrics you expose. The active data set becomes smaller, but if you look back into the past, you still see the old data.

So it still knows that 10 weeks ago I had revisions one to three? Yes. So we as the business need to take care of garbage collecting the legacy metrics ourselves, right?

To put it more positively: your metrics endpoint should only expose data which is currently useful. If you still expose data which has no use anymore, Prometheus cannot notice that.

Yeah, I know. But actually, identifying the legacy metrics is not that trivial, because there are many series. We need to identify that, okay, this tenant has upgraded its deployment, so we need to drop the legacy data, and there are many, many other scenarios; we need to identify each case and handle it.

That's a bit of a control plane issue, and all control planes have this fundamental problem. We can also talk more later if you want; I think there are more questions for the public part. Okay, thank you, thank you. You can catch me later if you want.
Thank you for sharing, Richard. I think Prometheus defined a really good de facto protocol for businesses to share data and telemetry. But one question I have: when we do this monitoring at scale, the performance and scalability of the TSDB is not so good. Google, for example, built more specialized systems to handle these things. Do you have any suggestions for how to handle this data at scale?

So, it depends on how you define scale. If you run vanilla Prometheus, our recommendation is to not go over a hundred million active series; this is roughly the point beyond which we say you shouldn't be going with Prometheus. I know of people who ran with much more than a hundred million active series and it worked, but it needed constant massaging, to be honest. Once you cross this boundary, it's better to use the dedicated storage backends for Prometheus. Prometheus itself is very much geared around: I have a limited set of data and I need to create alerts from it, and this is what we optimize for, up to that rough hundred million. With things like Thanos, Cortex, or Mimir, you can ingest at much higher scale; you can go into the billions of active series. So you easily get 10x the scalability with those backends, and this is by design. With my Prometheus hat on: we don't want to deal with one billion active series in a single Prometheus.

First, thanks for sharing. I'm very interested in what you said about what's coming with remote write version two. Can you describe in detail how it improves performance and bandwidth? And in the future, will it be possible to customize the serialization protocol? Protobuf still has a performance downside.

Do you mean protobuf for remote write, or protobuf for scraping data? Remote write. So, what remote write currently does is very inefficient, because it was initially just a proof of concept, and then it worked so well that we kept using it. Very fundamentally, we send the same label sets all the time: we send the same data about the same labels again and again and again. This takes up a lot of space, because labels are very large compared to the sample values themselves, which are obviously very small. So the biggest improvement is simply: don't send them as often.

That reminds me of HTTP/2, where you compress the headers.

Not quite, but also not entirely different; maybe a little bit. There are other things, and there we are back at streaming and HTTP/2. Remote write version one is completely stateless: you send data, and you don't have to care about what you sent before or after. Remote write version two is going to have limited state, because we also have scalability concerns, so we need to find good trade-offs. But where we know what we already sent, we don't have to send the whole raw data again; we can serialize from the in-memory format, because we know what the other side already has. Those are the two main improvements. For more details, you need to look at the actual PRs.

Because I don't see any blog posts or documents yet. It's not done yet. No, it's not done yet, but we are having the Prometheus Dev Summit this Saturday, German time, in Berlin. On the Prometheus developers mailing list there's a link where you can join online if you want, and you can listen in and participate and everything. This is fully open. We have this once a month online, and just this Saturday we have it, once a year, in person. Okay, thank you very much. This is a very exciting new feature.
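For context, remote write is a small block in prometheus.yml, and a single server can already fan out to several sinks at once, which is also the answer to the two-tier question coming up below. A rough sketch with made-up URLs; the second sink assumes a Prometheus running with its remote-write receiver enabled:

    remote_write:
      # Hypothetical long-term storage backend (Thanos, Mimir, and so on).
      - url: https://mimir.example.com/api/v1/push
      # Hypothetical second-tier Prometheus with remote-write receiving enabled.
      - url: https://prometheus-tier2.example.com/api/v1/write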
I have a follow-up to the questions above. In some storage backends for Prometheus, such as VictoriaMetrics, we can see that they manually optimized the deserialization of protobuf. So protobuf maybe has poor performance on the deserialization side. Do you think letting users customize their remote write protocol is a good idea?

So, it's open source; you can customize this already today if you want to. But there's more to it. VictoriaMetrics is deliberately not compatible with Prometheus. Any way you look at it, whether at the remote write, the storage, the query language, the query engine, it is deliberately not compatible with Prometheus, which is a completely fine engineering decision to make. VictoriaMetrics has many nice properties and some really good technology; I'm not saying anything negative here. I'm just saying they have different design goals than we as the Prometheus project and the Prometheus ecosystem. We value stability and we value long-term guarantees above everything, which maybe not everyone does, and again, I'm not saying it's a bad choice; it is a different choice, which is valid if you make it deliberately. I didn't look at their implementation of remote write, so I can't tell you precisely what they do. I know that in storage, for example, they lose precision on purpose, because it is more efficient, and again, that's a fine trade-off if made on purpose. But it means the data is never going to be the same as in a Prometheus-compatible system; if you need precision, say in very large numbers, you just don't get it. And I don't know what they do in remote write. All that being said, it's open source. If you want to change things in Prometheus, just do it. If you want to adapt your Prometheus installation to the adapted format of VictoriaMetrics, just do it. Thank you.

Any other questions? We still have time. Yes, there's one.

Thank you. I have a use case. I deployed a two-tiered Prometheus setup: I have a testing and a production cluster, each cluster has a Prometheus instance running, and I use my second tier to store the metrics collected from all clusters. With your v2 remote write, am I allowed to write on the local cluster and also push the metrics across to my second-tier Prometheus? Is that doable?

This is already doable today; you can already do it with version one. You can send to more than one storage backend if you want, and done. It's more efficient with version two, but with version one it already works. I have never tried it, that's why I was curious. No, it just works. Okay, thank you. Of course.

Any more questions? Now is your chance. Again, I optimized for question time. I can also talk about other Prometheus things, but now is your chance to ask what you wanted to know. We can also wrap up early and you can have a break if you want. Three, two, one. Okay. Oh, there's one.

Thanks for your excellent presentation. I'm looking forward to contributing to Prometheus. Could you give some general recommendations for a new contributor?

Yeah. The first thing is: there is a Prometheus developers mailing list, prometheus-developers at googlegroups.com.
And the first thing I would do is send an email there and say: hi, I just want to get involved. If you have specific interests, put them into your email: I want to work on the frontend, I want to work on the query engine, though maybe not the query engine as a first thing, but whatever specific thing you want to work on. If you're just open to anything, say: I'm just looking to contribute, can you give me some suggestions? Then it depends on how well you know Go, how well you're able to read the code base yourself versus how much help you need to understand it, things like these. You open up the conversation with the maintainers, and together we find a way for you to start contributing, something which is a good fit as a first project, a first issue, a first PR within Prometheus. So that's one way.

The other way is if you say: I want to do X. For example, today I talked to someone who wanted to put gRPC or protobuf scraping back into Prometheus. We had it in version one; we don't currently have it. And he said: I want to do this specific thing. Great, then write to the mailing list and say: I have an interest in implementing this specific feature, which I already know I want to do. Can you help me get started? The next step is usually a design document. If it's a small thing, you can write it in a GitHub issue and we discuss it there. If it's a bit larger, write a one-pager or so, send it to the group, and say: this is the problem, I looked at this option, this option, and this option, I believe this one is the best for these reasons, and I would like to implement it. Does anyone have concerns, any tips, any help? Then you have the discussion phase, where people actually talk about what you want to do and how. And then you go do it. From there it's just normal GitHub: you open a PR, you get a review, and either it's merged, or you fix something and go through the cycle until it's merged, and then you do the next thing.

Thanks a lot. I also contribute to Kubernetes, and sometimes when I submit PRs, the reviewers are busy, so they can't review my code in time. Prometheus is also a famous project and maybe some reviewers are busy as well. How can I make sure my PRs go through smoothly and quickly?

You can never be sure, to be fair; I wouldn't be honest if I told you everything always goes through. But there are a few tricks. First, as I said earlier, Kubernetes has many, many, many more people than Prometheus, which is both a good and a bad thing when it comes to new contributors. They have more people who can help you, but you build less of an immediate relationship with core maintainers. Maybe the person you build your initial relationship with has only limited knowledge of a certain subsystem, so when they hit an issue, they need to ask someone else, who needs to ask someone else. That chain doesn't exist in Prometheus: you talk directly to the person who knows this thing best in the whole world, which is very convenient. That said, people can be busy, and maybe you don't get a reply for a month; they get sick or something.
But there are ways to improve your chance of success and get things through more quickly. Talk about it before you actually start doing the work, in particular when you're new. Maybe someone already had a plan; maybe someone already thought of an approach and just didn't have time; maybe there is some issue you didn't even think about, and they can tell you: hey, this exists as a concern, think about this one too. Things like how the code should look, how the formatting should be, what functions to use: all of this is better discussed upfront, so you do the work once, rather than doing some work and then everyone having more work in review. Put differently, the best way to get your PRs merged quickly is to put as much thought as possible in ahead of time: into writing the documentation, writing the issue, having the discussion, and then, when you file the PR, writing a detailed description of what you did and why. That makes it quicker for the person reviewing your code to think it through. And send small PRs, not huge ones, because what you're optimizing for is making it quick for the reviewer. If they have two PRs to review, one takes 20 minutes and one takes three hours, which one are they going to pick up in their coffee break? Yeah, the short one. That's the best way. Thanks for the clarification, thanks. No worries. Any more questions? Yeah?

By the way, on the previous question: Kubernetes has something like the KEP, the Kubernetes Enhancement Proposal, and then your feature goes to alpha, then beta, then GA. You don't need to create something like this? It's much simpler, right?

It is much simpler. We have a system of design documents, and for larger things we usually require that someone writes one. But that is for large things, which we sometimes discuss for over a year, because they are so large and because our stability guarantees in Prometheus are so hard: monitoring must work, it just must work. So we are very, very careful in what we do, and for a bigger change we sometimes literally take years just to make certain it's correct. For normal stuff, you would usually be looking at weeks to months of sending something in, going back and forth, it becomes part of a release candidate, and out it goes. There is no three-step process or anything. Much quicker.

Okay, thank you very much. I was just wondering whether there is a formal process, because that usually takes a lot of time, like the Kubernetes Enhancement Proposal, where the feature goes alpha, then beta, and only a release later general availability. Okay, thank you.

Yeah, and that's part of the problem of Kubernetes having so many people: you need all this overhead. Again, Prometheus has roughly 20 maintainers and roughly half of them actively do work, so you have a group of 10 to 12 people and you talk to them directly. It's a very quick process. Yeah, sure, thank you.

More questions? We still have up to 20 minutes, probably more like 15 so as not to block the next speaker. Okay, going once, going twice. Thank you very much. Thank you.