Hello, my name is Björn. I work for a company called Grafana Labs, which you might have heard of, but for much longer I've been working on a project called Prometheus, which you almost certainly have heard of. If not, what are you doing here? This is PromCon, right? If you know me, you know that I have an infamous favorite topic in Prometheus land, which is histograms, and once more I'll talk about histograms. "A new kid in histogram town", as you might know, is a reference to this very first (or almost very first) commit in the Prometheus repo, from November 2012, ancient times. Matt Proud himself committed this, and the first sentence in it is a mysterious reference. I mean, in the days of web search, nothing is mysterious: you just paste it into Google and then you know what it is. But let's keep the suspense; I'll talk about the first part of this mysterious reference, in a mysterious language, as the last thing in the talk, so you have something to look forward to. "New Kid in Town": that's where I took my title from. Back then, Prometheus was a new kid in town. Nobody, or at least not very many people, had an idea how to do monitoring in that way; then Prometheus became very famous, and it's not so new anymore, which is probably why this sentence is no longer in the README. The interesting part is that back then Prometheus was new, but it was not unheard of. There were a few people, mostly working at Google I guess, who knew this kind of monitoring, and there was theory about how this kind of monitoring works. So Prometheus essentially made all of this available to everyone in the open-source world, but it wasn't fundamentally new in the sense that nobody had heard of those ideas before. And something similar, a kind of fractal, self-referential repetition of history, is happening with histograms.
The new histograms in Prometheus will be very, very new, but they are not really new: they are based on ideas that have been around for longer than Prometheus itself, and the basic idea is quite simple. The challenge is more how to make them fit into the Prometheus way of doing things, and that's what a lot of my recent work is about. I've also been talking about histograms forever, so in that sense this is not new either; in particular, you could watch this trilogy of talks. Interestingly, the first part of the trilogy is newer than the second, a kind of prequel. It's about why the new-style histograms haven't materialized yet: the history of how we knew the existing histograms were a kind of prototype, but got stuck with them for some reason, and then certain events happened that made it even harder to follow up on the old ideas. That first part is a bit funny and recreational. The second part is the most important one, as is often the case with trilogies. It was supposed to also cover the past, which didn't happen (that's why there's a prequel now), so the second talk is about where we were as of PromCon 2019, two years ago, and about the future: what options we had and where to go. That's very important for understanding where we are. The third part of the trilogy is from KubeCon EU last year (it was online anyway, so you can watch it): after some research, it proposes quite concretely what to do, and in it I also, famously, announced a design doc that was supposed to come out just after the talk. In reality it took a bit longer, so it got published in February this year, not so long ago. It's epic, 26 pages or something, so totally read it if you are that type of person; but if you think that's too much effort, just watch the talks and that will also be okay. The design doc went through various rounds of review, we discussed it at the Prometheus dev summit, and we agreed that this is the way to go. So this is the baseline we are working on now. I mean, mostly it's me, but I hope I get help; I need help from various experts. Contributors have also been asking what they can do, and I told them I first have to come up with the master plan before we can do anything. So this is the master plan; here it is.

What is this talk about? It's a showcase of what already works and of the very imminent next steps. There's a slide from the third talk of the original trilogy, but as I said, I will not repeat those talks. If you have no clue what I'm talking about right now, it will be hard to follow the rest of this talk, but everything is online: you can stop the stream now and watch those other talks or read the design doc, or you can just keep watching and read it all up later. In a nutshell, this is what the new histograms look like. They embrace sparseness: empty buckets might happen all the time, but they don't cost anything, and in turn the histograms have a potentially infinite number of buckets. The buckets all look the same width here, but only because the x-axis is logarithmic: the bucketing is exponential, from zero to infinity (and actually from zero to negative infinity too, but I haven't drawn that). There's a resolution parameter that tells you how many buckets each power of 10 has, so at every power of 10 you get a guaranteed bucket boundary. The mathematically inclined might have noticed that around zero you would get an infinite density of buckets, so there's a zero bucket of a certain, very small width, and all the observations in that very small interval fall into this so-called zero bucket. That's what the histograms look like, and this is how we encode them in the exposition format: the actually populated buckets are marked down as spans, and for the bucket counts we just record the deltas from one bucket to the next, not the absolute value in each bucket. The absolute counts can grow quite a bit, but the deltas in between are usually smaller: going from one bucket to the next is usually a small step (in a very chaotic distribution this might not work out, but real distributions tend to be smooth). Smaller numbers, with a suitable variable-length encoding, take fewer bytes, and that's how it works out on the wire. This is, by the way, built into protobuf; it's one of the core ideas of protobuf, which is also a reason why we are bringing back the protobuf format, at least for experimenting with the new histograms. We will certainly not just bring back the old protobuf format in the final released code; note that OpenMetrics has a protobuf option, so that is probably more the direction to go in the end. For now we just experiment with the old protobuf format, extended for the new histograms.

All right, that's enough about old slides; now I want to show what is already there. As you could have guessed, the development is in flux, so we don't do this in main. It is also quite invasive, which is another reason why we can't just hide it behind a feature flag. So for now, development happens in a separate branch, called sparsehistogram, in various repos. At some point this will be ready to go into main; perhaps that is the point where we cut Prometheus 3, but we don't have to decide that right now, which is exactly why we experiment in a branch. prometheus/prometheus is where the meat is: TSDB, PromQL, ingestion.
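Before diving in, here is a tiny Go sketch of the two core ideas just described: exponential bucketing with a fixed number of buckets per power of 10, and delta encoding of bucket counts. This is my own illustration, not code from any Prometheus repo; in particular, the exact boundary convention (upper bound at 10^(i/resolution), upper-inclusive buckets) is an assumption for the sake of the example.

```go
package main

import (
	"fmt"
	"math"
)

// bucketIndex returns the index of the sparse bucket an observation v
// falls into, assuming "resolution" bucket boundaries per power of 10,
// i.e. bucket i has the upper boundary 10^(i/resolution). The boundary
// convention here is an illustrative assumption.
func bucketIndex(v float64, resolution int) int {
	return int(math.Ceil(float64(resolution) * math.Log10(v)))
}

// toDeltas turns absolute bucket counts into the deltas that go on the
// wire. In a smooth distribution, the deltas are much smaller numbers
// than the absolute counts, so they varint-encode into fewer bytes.
func toDeltas(counts []int64) []int64 {
	deltas := make([]int64, len(counts))
	prev := int64(0)
	for i, c := range counts {
		deltas[i] = c - prev
		prev = c
	}
	return deltas
}

func main() {
	// An observation of exactly 1.0 sits at a power of 10, index 0.
	fmt.Println(bucketIndex(1.0, 32))
	// A 200ms latency at resolution 32 lands in a negative bucket index.
	fmt.Println(bucketIndex(0.2, 32))
	// Absolute counts of neighboring buckets vs. the smaller deltas.
	fmt.Println(toDeltas([]int64{2, 5, 12, 30, 28, 10}))
}
```

Note how the deltas rise toward the peak of the distribution and turn negative past it, exactly the pattern we will see in the protobuf output later.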
But of course, we have to start with exposition, and we have to pick an exposition library for that. As I already hinted, we are using the old protobuf format, and the only official exposition library still backed by the original protobuf data model and format is client_golang, so we just use that one. Also, we all like Go, right? A nice coincidence. The original protobuf format is described in the client_model repository, which will also have a sparsehistogram branch; that's the idea here.

All right, let's look at the changes in client_model. There's really very little; these are all the changes, and you can see how easy it is: you have the sparse bucket resolution, the width of the zero bucket, the count of observations in the zero bucket, and then negative and positive sparse buckets, each of which is just a series of spans (with offset and length) plus the deltas from one bucket to the next. As I said, if you pick the right data types in protobuf, you get varint encoding for free on the wire, so this is a very efficient way of encoding those sparse histograms without any special, super-fancy bitwise encoding. It's super easy to get done, it's already quite efficient, and it might actually be in the sweet spot between encoding effort and size on the wire; this is all discussed in detail in the design doc. These are all the changes we need in client_model. Pretty cool. The changes in client_golang are fairly involved, so I can't show them on a single slide, but I can demo them.

So here we are: this is a simplistic, minimalistic little program that exposes just a single histogram. To do that, I'm not using the default registry; I use a custom registry here so that I don't get any of the weird default metrics, just the histogram. So far this is completely standard, nothing new yet. For the buckets, I use the ExponentialBuckets helper to define 13 buckets, and I calculate the factor so that it matches the math behind the width of the buckets in the sparse histogram. In other words, I use the old histograms, but define bucket boundaries that will coincide with the bucket boundaries of the sparse histogram. You don't have to do this; I'm just doing it to demonstrate something I'll talk about in a second. I register this, then simulate observations: one observation every 10 milliseconds, as if this pseudo-service binary were serving 100 requests per second, using random numbers that are normally distributed around 200 milliseconds with a 50 millisecond standard deviation, so that we get a bit of a distribution. We run that in a goroutine, serve the metrics, and that's it. Okay, let's see how this works out: go run demo.go, and it already runs. Let's curl the metrics endpoint; there it is. So far nothing new: this is the classical histogram. Ideally, as I said, this should have a round bucket boundary at every power of 10; floating-point precision means it doesn't quite work out, but you get the idea. You can already see how we accumulate observations. The values are cumulative, they go up and up, but the delta between one bucket and the next is what is actually in each bucket, and you can see the resolution is so low that most of the observations just end up in this one bucket here. We are going to change that. Back in my code, I've prepared this line so that I don't mistype anything; it's the only line you have to add, and it says I want a default resolution of 32 for the sparse buckets. In the code, the default behavior is to not use sparse buckets at all, and this line just adds sparse buckets to the normal histogram, in addition to the normal buckets, so it tracks everything in parallel and we can look at both in parallel.
This is why I used that interesting bucket layout: if the buckets of the conventional histogram coincide with boundaries of the new sparse buckets, the library could track only the sparse buckets and render the conventional buckets from them. It can't render an infinite number of buckets, but it could render those 13 buckets, if you want them, based on the data in the sparse histogram, so it would only have to track the sparse histogram. Right now the code isn't doing this optimization; you just get both in parallel, so you can still do all the old stuff. Okay, let's run this version: stop it, run it, and of course the existing released client_golang library doesn't know anything about a sparse bucket resolution. So what we have to do is use go get to tell Go modules to fetch client_golang from the sparsehistogram branch. Remember this if you want to play with it on your own: you just do this, and Go modules creates a pseudo-version. It's not 1.10.0 anymore but 1.10.1, pinned to a certain commit hash; you can look at go.mod to see how this works. And notice that, thanks to what's in client_golang, it has also automatically pinned a version of client_model that is not in the main branch, so that nobody who is using client_model normally gets confused. Now it should just work: go run demo, and it seems to work, so let's curl this again. As you can see, nothing has changed, which is kind of anticlimactic, right? The reason is that the text format is not suited at all to representing those sparse buckets efficiently; it's all done in the protobuf format, which, as I told you, we are bringing back in Prometheus just for this experiment, and client_golang never left that path of virtue. So to the current day, any Go binary instrumented with Prometheus client_golang can be asked to serve protobuf, and you do that by adding this little header. Hit enter, and curl is very friendly and tells us this is horrible binary stuff: if it dumped this onto your terminal, everything would beep and go haywire, so it doesn't, and that's cool. But then how do we look at the new stuff? There's a little-known secret. There's the text format of Prometheus, and there's the protobuf format of Prometheus, but you can also represent protobuf in a text form, so now it's getting quite deep: this is the text format of the protobuf format, which you get if you just say encoding=text, and that also works with every client_golang-instrumented binary, so you can do this at home. Now you see the protobuf in a very verbose form (on the wire it's usually the binary form, which is very dense). Here, even more verbose than the normal text format, is the conventional histogram, and then you get the new stuff: the sparse bucket resolution of 32, the zero bucket threshold, and, as you can see, two spans. One is just a single bucket, one very low observed value; then, 10 empty buckets later, we have 34 consecutive buckets, and here are all the deltas. As promised, they are all quite small; they only get larger around the peak, and the minus signs are where the buckets become less populated again. This is how the histograms look in protobuf. You don't even see explicit bucket boundaries, because those are implicitly encoded in the convention of what the resolution number means, but they more or less coincide with those conventional buckets. As I said, you could make the library render the conventional histogram at essentially no additional cost, so scrapers relying on the old histograms could still get them; but then you have to pick buckets, because you cannot expose an infinite number of buckets. Okay, that's the demo; let's go back to the slides.
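As an aside, the spans-plus-deltas output we just looked at can be decoded back into absolute bucket counts. Here is a hedged Go sketch of that decoding; the type and field names are my own illustration, not the exact client_model definitions, and the offset semantics assumed here (first offset relative to index zero, later offsets counting the gap from the end of the previous span) are my reading of the format described above.

```go
package main

import "fmt"

// span describes a run of consecutive populated buckets: skip Offset
// bucket indexes, then Length populated buckets follow. (Illustrative
// names, not the exact client_model field names.)
type span struct {
	Offset int32
	Length uint32
}

// decodeBuckets expands spans plus per-bucket deltas into a map from
// absolute bucket index to absolute count. The count is a running sum:
// each delta is added to the previous bucket's count, across spans.
func decodeBuckets(spans []span, deltas []int64) map[int32]int64 {
	buckets := map[int32]int64{}
	var idx int32
	var count int64
	i := 0
	for _, s := range spans {
		idx += s.Offset
		for j := uint32(0); j < s.Length; j++ {
			count += deltas[i]
			buckets[idx] = count
			i++
			idx++
		}
	}
	return buckets
}

func main() {
	// One isolated low bucket, then a gap of 10 empty buckets, then a
	// short run whose deltas rise toward the peak and fall again.
	spans := []span{{Offset: -5, Length: 1}, {Offset: 10, Length: 4}}
	deltas := []int64{1, 2, 7, -3, -4}
	fmt.Println(decodeBuckets(spans, deltas))
}
```

The empty buckets in the gap never appear in the output, which is the whole point: unpopulated buckets cost nothing on the wire.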
prometheus/prometheus: that's the repo we are all really interested in, and it's actually the hard part. I'm currently working on ingesting the protobuf format, not in the super-optimized form that has been implemented for the text format; that could be done by hand-coding a protobuf parser, but I'm not doing that anytime soon, since this is just for experiments, and as I said, we will find out later what the final choice is for representing the new histograms on the wire and ingesting them. So that part is easy to work on; by the time you watch this, there might actually be something working in the sparsehistogram branch. The hard but also very rewarding part is teaching the TSDB to store the new histograms in an efficient way, and this is where I hope for TSDB experts to help. And then: here be dragons. To make full use of the new histograms, PromQL needs new concepts, where you actually act on a histogram with a potentially infinite number of buckets; from the point of view of PromQL, you cannot just represent them all as individual time series. To stay compatible, you can again create a view of a new sparse histogram and present it as a conventional histogram with a selected number of conventional buckets; that's a migration path. But in the shiny new world, we have a PromQL that can act directly on sparse histograms, and recording rules that can spit out new sparse histograms, and all those things. That's really tough to get right, so don't expect it to be done next month or anything.

All right, we are done here, and as promised, the resolution of the mystery: "Bedecke deinen Himmel, Zeus" ("Cover your heaven, Zeus"), which is of course German. I think Matt was really into German back then, and who is the greatest literary figure of modern German? A person called Goethe. I like him; I sometimes put Goethe easter eggs into my talks, which you might not have noticed, so now it's very explicit. But it's not even my easter egg; it was Matt who put this one in. It's an awesome poem, an awesome mega-rant against the gods, mostly Zeus. You might wonder what this has to do with Prometheus; there's one obvious reference, of course: the title of the poem is "Prometheus". So perhaps Matt just wanted to tell us: I'm now also writing a Prometheus, like Goethe did; everyone should write a Prometheus once in their life, or something. Anyway, that was it. You can see all my talks and how to reach me here, and we still have Q&A, so see you in the Q&A. Bye bye!