Okay, then let's start. Hey everybody, welcome to the first public Prometheus developers meeting. Today we have a couple of topics, and we'll kick off with Brian, who will talk to us about the bugfix moratorium. Brian, can you take this? Yep, you can hear me okay? Yeah, great. Fancy mics help.

So since 2.0 came out, we haven't really had a good release of Prometheus, one that doesn't have some nasty bug or other. We've been in a situation where there appeared to be multiple bugs down in the TSDB and a few other places, and these weren't being fixed, these weren't being investigated. And when users report bugs, they often report a different issue on top of an existing bug, which you can't keep track of. So us trying to keep track of which bugs do or don't exist was close to impossible. I think the record was one issue, which we resolved in November before 2.0 even came out, that had seven different bugs reported on it, at least seven different behaviours. And when we've got 15, 20 such things in the system, it's impossible to figure out what's going on.

Now, a few months ago, back for 2.2, we said informally that we were just not taking any features for a week, because normally bugs show up within two to three days. So we said, let's be safe, let's wait a week, get everything going, fix the bugs. That helped a bit, but it didn't get us a fully stable Prometheus yet. So then for 2.3, we discussed this among the Prometheus team and said, let's wait two weeks and actually fix these things. In the end it was about ten-ish bugs that were identified, and we fixed about eight of them. These varied from, hey, that's basically data corruption, down to things like shutdown being unclean, which isn't a major problem in and of itself. However, it causes lots of confusion among users, because they see, oh, it's this error, therefore they think they've found a bug, and they report it on the wrong issue, an issue for something else, which causes confusion. In practice, you know, a group like us can handle maybe two to three simultaneous bugs that we can keep in our heads: oh, that's the issue you're reporting, would you mind going over here? And that's fine, but when we're up at 10, 15 issues in the code base, that just breaks down. It just takes too much time.

So we're now in a place where it looks like 2.3.2 is good. We had one false alarm this morning, and in fact the bug had already been fixed in 2.3.1. So all going well, probably Wednesday we'll start taking in pull requests again. The reason it's Wednesday rather than Thursday is that from Thursday through to Monday I'll be extremely busy, so at least on Wednesday I can take care of all the pull requests I have to merge, unless a bug comes up, in which case we'll fix it. And the hope would be that sometime next week we can get 2.4 out, because, you know, it's basically been a month, and it's not nice to make users wait overly long to get their changes in. So hopefully 2.3.2 will be a good release of Prometheus that people can use going forward. Does anyone have questions or comments?

I don't see any questions. So maybe I will continue. Go for it. Yeah, I don't see any questions. Hi everyone as well. For the people that don't know me, I work at Red Hat as part of the Prometheus team. I work mainly upstream, trying to keep all the Prometheus users happy.
And I will talk about the decision that was taken about adding new service discovery mechanisms, how and why it was decided this way. In theory, we want to cover as many of the most popular use cases as possible, but when it comes to service discovery, there are way too many. Usually I'm quite keen on adding new stuff as long as there are enough use cases, but with this decision I can honestly say that I'm fully on board, especially after trying to troubleshoot a few of those different service discovery providers. I mean, even now we have a pending bug with the Kubernetes provider, I have added a link in the doc, which basically happens only at a high load of over 500 targets. I tried to replicate it, but there is no way I could do this on my own laptop, so it has been pending for quite a while now. I posted another link, which points to a possible solution that is upstream in Kubernetes, but I'm not entirely sure what's happening there. And by the way, we are very close to getting some free credit with Google, so soon we should be able to try and replicate this one at the load that we need. Let me just have a look what else.

But anyway, my point was that as good as it may sound to have these built in, it's not fun at all to troubleshoot and know the clients for all of these different providers. And as Brian mentioned, we are a small group of people, so we can't possibly handle more than what we already have. So we decided that instead of adding and maintaining new discovery providers, we will move in a direction that enables users to add their own mechanisms without breaking Prometheus. The solution we have so far, and I also added a link to this, is that we added an example to the official documentation in the official Prometheus repository. The example shows in code how to implement a sidecar that you can use to implement your own way of getting targets, just spitting out a JSON file. So you create the JSON file, and then we use Prometheus's file service discovery to import all of these targets. This so far seems to be the safest way to implement it without breaking anything in Prometheus, and probably also the easiest way for users to implement their own service discovery mechanisms (a concrete example of this setup follows below).
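[For concreteness, here is roughly what the setup just described looks like; the file paths, target addresses, and job name are hypothetical. The sidecar periodically writes a JSON file in the file_sd format:

```json
[
  {
    "targets": ["10.0.0.1:9100", "10.0.0.2:9100"],
    "labels": {"env": "prod", "job": "node"}
  }
]
```

and Prometheus picks it up via a file_sd_configs section in prometheus.yml, re-reading the file whenever it changes:

```yaml
scrape_configs:
  - job_name: custom-sd
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.json
```
]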
As I mentioned, I already added all of the links for the things I'm discussing here in the document. There was another discussion that I personally started, because I'm still not 100% convinced that file service discovery is the way to go, and I started a discussion on the developers mailing list. But to my surprise, so far we haven't had too many complaints about file service discovery, and it seems that it solves, if not all, then most of the current use cases. So I think this is the best solution we can offer so far for adding and maintaining your own service discovery provider. All the links are in the document from the meetup. And maybe this is the place to ask for questions. Let's hope we have at least one.

I guess it's easiest if people put questions in the chat. There's a question about how to get access to the document. Yeah, I should have mentioned this in the meeting invite. You just need to be part of the Prometheus developers mailing list and then you will get access to it. But also, the first link that I put into the invite was incorrect, and I replied with another link which everyone has at least read access to.

Okay, there's another one. Will the existing service discoveries migrate to the new method? No, there are no plans for this right now. Basically we will wait for feedback, but we haven't even discussed migrating anything else. I mean, it's too early. I don't think we even have any users using the new way, so we don't have any feedback whatsoever so far.

So to be clear, this isn't a new method. It's just an easier way to write a file service discovery, because the extra code is basically just a framework that happens to use the exact same APIs we use internally inside Prometheus. Because it's pretty easy to do a file service discovery. What's interesting about using the same APIs is that if someone writes against these APIs, we can then move service discoveries into and out of Prometheus. As it stands, there are no plans to remove any service discovery mechanisms from Prometheus. It might happen in future, but it's not on the table right now. And it's also possible, when we're in a better state in the future, that service discoveries implemented by third parties via this mechanism can be moved into Prometheus. Neither of those seems likely to happen anytime soon, but it means we have the capability in the future. Yeah, so basically we're not blocking users from adding their own mechanisms, and we also enable an easy integration later if it happens that we have enough manpower to do so. But yeah, read the docs, see the examples, and just give us feedback and post issues, or just give us links to mechanisms you have implemented. And I think there was an idea to add these somewhere in the official docs, maybe you can...

Yeah, under the operating section there's an integrations page, and that's where it lists all the integrations that aren't client libraries or exporters. So all the file SDs go in there, all the Alertmanager webhook receivers. There's an "other" section for weird stuff like PushProx. So you can just send the PRs; we'll probably use the same rules to try and encourage people to work together on one integration, via pull requests. So basically the first problem we're solving is that instead of blocking users, we're just enabling them to implement their own, or basically encouraging them in the way that is easier for us as well.

Krasi, the concern that you had about file discovery was a performance concern, or? No, the main thing was that not all environments allow access to the same file system as Prometheus. I think that was the main blocker there. Did I explain it properly? Yeah, there were some users who have very odd setups, like in Kubernetes. I've got a customer, for example, who bans any use of shared volumes on Kubernetes, which is not a sane setup, and hopefully they will change their mind at some point, but there are always going to be weird setups. The more common case, because I think that was only one user, is that users just don't want to run a sidecar. So a lot of it comes down to preference rather than technology. Out of curiosity, they're not allowing shared volumes among containers in the same pod? Yes. Interesting, okay. Okay, anyway, thanks Krasi for the thorough explanation.
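[To make the "exact same APIs" point concrete, here is a minimal sketch of what such a sidecar can look like. This is an illustration, not the official example: it assumes the discovery/targetgroup package path from the Prometheus code base; the discoverer, target address, and output path are made up; and the upstream custom-sd example additionally provides an adapter that handles atomic file writes and deduplication, which this sketch skips.

```go
// Minimal custom-SD sidecar sketch: implement the same Run(ctx, ch)
// contract Prometheus uses internally for service discovery, then
// flatten the target groups into the file_sd JSON format.
package main

import (
	"context"
	"encoding/json"
	"log"
	"os"
	"time"

	"github.com/prometheus/common/model"
	"github.com/prometheus/prometheus/discovery/targetgroup"
)

// exampleDiscoverer is a stand-in for a real backend client. Run pushes
// target-group updates onto the channel until the context is cancelled.
type exampleDiscoverer struct{}

func (d *exampleDiscoverer) Run(ctx context.Context, up chan<- []*targetgroup.Group) {
	for {
		// A real implementation would query its SD backend here; this
		// target address is a hypothetical placeholder.
		tgs := []*targetgroup.Group{{
			Source:  "example",
			Targets: []model.LabelSet{{model.AddressLabel: "10.0.0.1:9100"}},
			Labels:  model.LabelSet{"env": "prod"},
		}}
		select {
		case up <- tgs:
		case <-ctx.Done():
			return
		}
		time.Sleep(30 * time.Second)
	}
}

// fileSDEntry mirrors one element of the file_sd JSON format.
type fileSDEntry struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

func main() {
	ch := make(chan []*targetgroup.Group)
	go (&exampleDiscoverer{}).Run(context.Background(), ch)

	// On every update, rewrite the file that Prometheus's
	// file_sd_configs is watching (the path is hypothetical).
	for tgs := range ch {
		out := make([]fileSDEntry, 0, len(tgs))
		for _, tg := range tgs {
			e := fileSDEntry{Labels: map[string]string{}}
			for _, t := range tg.Targets {
				e.Targets = append(e.Targets, string(t[model.AddressLabel]))
			}
			for k, v := range tg.Labels {
				e.Labels[string(k)] = string(v)
			}
			out = append(out, e)
		}
		b, err := json.Marshal(out)
		if err != nil {
			log.Fatal(err)
		}
		if err := os.WriteFile("/etc/prometheus/targets/example_sd.json", b, 0o644); err != nil {
			log.Fatal(err)
		}
	}
}
```
]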
Next up is Richie with a status update for PromCon. Yes, that is correct. And now the document doesn't want to open. Anyway, PromCon is going well. We released the schedule; I put the link in the meeting notes, so you can just click through and see it. We have way too little space. We have a total of 220 seats and we have 200-plus people on the waiting list, which is kind of bad. So next time we will definitely have a larger venue; we will be aiming for 300, maybe 400 people. What will be nice is that this time we will have a much larger reception area. If you have been to either of the two earlier PromCons, the reception area and the food situation were always really, really tight. This should hopefully be a lot better. I didn't get final confirmation yet, but it's looking good. This time around we were able to spend around 15K on diversity, which is awesome, and we will probably spend even more on diversity next time. So that's also very good. Roughly 10% of our attendees are female or identify as female, and not all of those are coming via diversity sponsorship; several of those are coming by themselves. "By themselves" is the wrong phrase, but you know what I mean. We didn't do any deeper stats on other types of minorities, but that was the easiest and quickest one. Otherwise, everything is looking good. Oh, by the way, we will at least try to have a live stream. I can't promise anything, because the video people are not set up to do a live stream, but we'll try to basically steal a stream from them and put it live, probably on YouTube, but no promises made. Yeah, we should talk about that, because I think both of us have experience there. Yes, we must, but yes.

There was one question about what I mean by diversity. Well, basically money which we spent on getting people to Munich: paying for travel, paying for the hotel, paying for the entrance fee, though the entrance fee is not a lot, of course. Yeah, that's what I mean by that. Okay. Any other questions? Yeah, any questions for PromCon? Maybe we should have started with this, but in case anyone's not aware, PromCon is happening on the 9th and 10th of August, in about a month. But we're sold out anyway, so, if you don't have a ticket yet, unfortunately...

Okay then. Our next topic is Prometheus-related talks and meetups. This is kind of a section where, for future meetings, people can insert talks that they're giving, meetups that they're hosting, their local meetups, or just link to slides or anything like that, just so we can give a spotlight to any Prometheus-related work that you're doing. And just one that we were aware of is happening at OSCON. I believe that this specific talk is happening tomorrow, if anyone is attending OSCON. Wednesday. It's on Wednesday? Okay. There, Priyanka is talking on Prometheus, OpenTracing, and Envoy: the observability movement in open source. Yeah, so if you're at OSCON, do check that out. I believe someone is trying to add an additional one. Does whoever is writing this want to speak up? It was Julien. Yes, I will give a workshop on Prometheus at the Open Source Monitoring Conference in November. Awesome. I will put the link in now. Cool. Okay. If we are looking this far ahead, there will also be something Prometheus-related at the KubeCons in Seattle and Shanghai. Yeah, I think we can mention it. That we can probably mention in each meeting that's coming up, but yeah, for sure there will be something at both of those. Or has anyone, are Prometheus submission acceptances already out for Shanghai? KubeCon, I mean. No reply yet. Okay, then we'll see when that happens. Okay, then the next thing we have on our agenda is just...
We'll just put something in: Cloud Native London 2018, two Prometheus talks. Tom, do you have sound? Yeah, hi guys. I'm talking about Prometheus, and Bartek from Improbable is talking about Thanos. That's in September, the 28th of September. So do make sure you attend that if you are in London at that time. Great topic.

Okay, then the next thing on our agenda is just a general Q&A session. So if there's anyone in the community, or really anyone on this call, who has any kind of question about Prometheus development, how anything works, or just wants to start a discussion, this is your platform. Or maybe even usage questions, because on GitHub we always try to redirect usage questions to the mailing list, but maybe we can take a few here, no? Sure. Okay, any questions? Anyone want to speak up? If there is no one, then I guess we can move on. We can keep this kind of slot open for the next meetings, and people can insert questions ahead of time.

And then we have basically already come to our last point of the agenda, which is, since this is the very first time that we've done this kind of meeting, it would be great if people could give their input. Even before the meeting started, I already added two points. One is to get the link to the document right the first time; I did get a lot of requests to share permissions on the wrong document, which actually didn't exist anymore because I moved it to the public space. So yeah, I'll try to make that happen next time. And we were discussing earlier on IRC that it would have been helpful if there was a global invite sent to everyone, instead of everyone having to go to the mailing list and accepting it. We did this consciously this time; we said that for a one-off meeting where we just want to try this kind of format, we don't necessarily want to invite a thousand people that are on the mailing list at once. But unless anyone thinks it's not a good idea, I think next time we can do that. I see there was a new point: the time zone is good, you said. It's good for most Prometheus developers, I think, because most of us are located in Europe or somewhere close to Central European Time, unlike most of Kubernetes, I think. So yeah, I chose this time. I'm happy about it too. Once we hit 1,000 participants, we can make it 24 hours. Like a big, big long session. Sure. Session has a different meaning in Ireland. Okay, okay, okay.

Well, by the way, something else just came to mind in relation to the bugfix moratorium. I'm not sure if everyone here is aware, but, as I already mentioned, we're getting some free credit from Google, and myself and somebody else are working on automated benchmarking, which should help make Prometheus even more stable, but it's still a work in progress. I just wanted to mention it because we're working on other things as well. Basically, what prombench, the automated benchmarking, will do is let us run automated tests for sensitive PRs, like significant code changes, or before every release, which should help catch bugs even before a release, or before merging a PR that would break Prometheus. The current unit testing doesn't really cover this, because it requires a fairly high load, or basically deploying Prometheus in an actual working environment for end-to-end testing.
Okay, I guess one more piece of feedback could be that this time we scheduled this meeting for an hour, and I guess we're about to end it early. So for next time, we could probably do 30 minutes. Yeah, anything else? I would say let's leave it at an hour and just see how this goes over a few iterations; it's always okay to close the meeting early. Sure, yeah, I'm happy with that as well. I'm sure everyone's happy to end meetings early and keep them productive. Okay then, if there is nothing else, I would say have a great local time and see you next time. Bye-bye. Are we going to, Fredrik, are you still there? Yes. Are we going to post the recording somewhere? Yeah, that's what I'm doing right now.