Hello everyone. So, how not to use Prometheus? I'm Shivangi, and I'm a software developer at ZetaSuit. I've been using Prometheus for around a year now, and in this talk I'm going to share my experience. It's mostly about how to avoid beginner mistakes and save ourselves some time in the long run. We'll go through each stage of using Prometheus, look at the kinds of mistakes we can make there, and see how to avoid them: setting our expectations right, instrumenting our code, querying, and setting up alerts.

So, setting our expectations right. In my experience, one of the biggest mistakes we can make with Prometheus is to expect it to work as a logging or tracing solution, or to expect it to give us 100% accurate results. Prometheus is good enough for making operational decisions, but using it for cases where we care about each and every increment is asking too much of it. We should avoid that, and when we have these expectations set right, we avoid a lot of mistakes in the long run. Along with that, we should be careful about treating Prometheus as a standalone solution for long-term retention. We can look to integrate Prometheus with solutions like Thanos, Cortex, or M3.

Once we have our expectations right, most likely our next step is to instrument our code with Prometheus. One of the mistakes we made when starting off was not choosing our metric names carefully. It's a very simple mistake, but it can cost a lot. We register a metric once, but we use it far more often, and it's not just you who uses it; the entire team does. So an unclear metric name can create a lot of issues.
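To make the naming point a bit more concrete, here is a minimal sketch of a metric-name check that could run in code review. It is not something from the talk. It assumes the classic Prometheus character set for metric names and the common convention that counters end in `_total`. The function name `lint_metric_name` and the example metric names are hypothetical.

```python
import re

# Classic character set Prometheus accepts for metric names.
VALID_NAME = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")


def lint_metric_name(name, metric_type):
    """Return a list of naming problems; an empty list means the name looks fine.

    metric_type is a plain string such as "counter" or "gauge".
    """
    problems = []
    if not VALID_NAME.match(name):
        problems.append("contains characters outside the classic metric-name set")
    if name != name.lower():
        problems.append("use lowercase snake_case")
    if metric_type == "counter" and not name.endswith("_total"):
        problems.append("counters conventionally end in _total")
    if metric_type != "counter" and name.endswith("_total"):
        problems.append("_total is conventionally reserved for counters")
    return problems


# Hypothetical metric names, for illustration only.
for candidate, kind in [("http_requests_total", "counter"), ("reqCount", "counter")]:
    print(candidate, lint_metric_name(candidate, kind))
```

A check like this will not tell you whether a name is meaningful to your team, but it catches the cheap mistakes before the metric is registered and everyone starts depending on it.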
Apart from that, most of us run into cardinality issues a lot of the time. I think the best way to deal with it initially is to have a rough idea of how much cardinality you are introducing with your metric, and to write it down: when you create a PR, or in your code comments, just mention the cardinality you are expecting. That saves a lot of time.

By this time you most likely have Prometheus set up and can see your metrics on some endpoint, but you still have to configure the targets so that Prometheus is able to scrape your metrics. You can play around with the scrape interval and retention time as you like, but try to keep them within sensible limits. Otherwise you can end up with broken graphs, which are not very useful, and you have to go back, refactor, and spend a lot of unnecessary time there. Along with that, relabeling comes in very handy. You don't need to redeploy your code again and again if that takes time. You can use relabeling in your configuration and you are good to go for a while, until you deploy your code changes.

Once you have configured it, this is the best part, because now you are ready to see what your application is doing. You can query your data. But when we query Prometheus, there are many ways we can overload it by the way we query our metrics. So try to avoid querying high-cardinality metrics with a lot of variables. That can put a lot of load on your Prometheus, and your queries can time out, making your entire monitoring effort useless. So just be careful about it. Along with that, try to split your dashboards as much as possible based on your use cases, instead of having everything in a single dashboard. It's better in both ways: when you look at your dashboards, it is easier to understand what's going on there.
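Going back to the cardinality estimate suggested for PRs, a rough worst-case number is just the product of the distinct values each label can take. The sketch below assumes every label combination can actually occur, which usually overestimates. The function name `estimate_series` and the label names and counts are made up for illustration.

```python
from math import prod


def estimate_series(label_value_counts, series_per_combination=1):
    """Worst-case number of time series for one metric.

    label_value_counts maps each label name to how many distinct values
    you expect it to take. series_per_combination covers metric types that
    expose several series per label combination, such as histograms.
    """
    return series_per_combination * prod(label_value_counts.values())


# Hypothetical labels and value counts.
labels = {"method": 5, "status_code": 8, "endpoint": 40}

# A counter exposes one series per label combination.
print(estimate_series(labels))  # 1600

# A histogram with 10 `le` buckets (counting +Inf) also exposes _sum and
# _count, so 12 series per label combination.
print(estimate_series(labels, series_per_combination=12))  # 19200
```

Pasting the resulting number into the PR description is enough. The point is that someone looked at it before the metric shipped.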
Smaller dashboards also load pretty quickly. So try to do that.

When we were setting up alerts, we had way too many noisy alerts. One thing we learned was that every now and then we would figure out that either the interval we had set between two alerts was far too short, or we were not setting a proper dependency between them. For example, I get an alert that the success rate of my application is pretty low, but I'm also getting an alert that my application is not working at all. Having a dependency between the two helps you not get overwhelmed with alerts. And as important as it is to have less noise in alerts, it's equally important not to miss alerts. So if you know how to avoid missing metrics, try to do that. But at least know beforehand which metrics can be missing for a given use case, and don't build alerts on top of them or expect alerts to fire on top of them.

I think these are some of the things we can take care of at the initial stage, and if we avoid these mistakes, we can save a lot of time going back and refactoring our code. These are a few of the resources that I have found pretty useful in my journey of learning Prometheus, and if you would like to dive deeper into these topics, they are pretty interesting to go through. Thanks, and feel free to use the slides if you like. Thank you.
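As a note on the alert dependency idea from the alerting part: Alertmanager supports this through `inhibit_rules`, where a firing source alert mutes a target alert that shares the same values for a set of labels. The sketch below only imitates that logic in plain Python to show the effect. It is not Alertmanager's implementation, and the alert names `AppDown` and `LowSuccessRate` and the `service` label are hypothetical.

```python
def apply_inhibition(firing, rules):
    """Return the alerts that remain visible after applying dependency rules.

    firing: list of dicts, each with an "alertname" key plus label keys.
    rules: list of dicts with "source", "target" and "equal" (label names
           that must match between the source and the target alert).
    """
    visible = []
    for alert in firing:
        suppressed = any(
            alert["alertname"] == rule["target"]
            and any(
                other["alertname"] == rule["source"]
                and all(other.get(k) == alert.get(k) for k in rule["equal"])
                for other in firing
            )
            for rule in rules
        )
        if not suppressed:
            visible.append(alert)
    return visible


# Hypothetical alerts: checkout is down entirely, search only has a low success rate.
firing = [
    {"alertname": "AppDown", "service": "checkout"},
    {"alertname": "LowSuccessRate", "service": "checkout"},
    {"alertname": "LowSuccessRate", "service": "search"},
]
rules = [{"source": "AppDown", "target": "LowSuccessRate", "equal": ["service"]}]

for alert in apply_inhibition(firing, rules):
    print(alert["alertname"], alert["service"])
```

Here the low success rate alert for checkout is hidden because the app-down alert already explains it, while the one for search still comes through.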