All right, let's get started. Welcome, everyone, to our talk about Thanos: Absorbing Thanos' Infinite Powers for Multicluster Telemetry. I'm Frederic, the founder and CEO of Polar Signals. And today I have with me Kemal and Bartek.

Hello, everyone. My name is Kemal. I'm a software engineer at Red Hat, working on the platform observability team, and I'm also a Thanos maintainer.

My name is Bartek, and I work with Kemal, as a software engineer as well, on the monitoring team. I am a Prometheus maintainer and co-founder of Thanos, and I am also a tech lead of the CNCF SIG Observability.

All right, super cool. So let's get right to it. Some of you may be entirely new to Thanos, so we want to give you a quick overview of what Thanos is. This is kind of a recurring slot at KubeCons, where we talk about an introduction to Thanos for those who are new to it. But then the second half is what has happened in the recent past in the Thanos project, as an update, so that everybody doesn't have to follow the GitHub repos and everything that's happening there, but can just join some of these sessions every now and then and see what's new. So that's what we're going to be doing today. I'll kick us off with the introduction, then Kemal and Bartek will take over with what's new.

When I talk about Thanos, I often like to refer to it as a kind of distributed Prometheus++. What I mean by that is that Prometheus is intentionally a monolithic application, right? The storage is in the same binary, the scraping is, the querying is. Everything is intentionally monolithic to increase the reliability that we can have in this process. That's really great for Prometheus, but it also limits Prometheus a little bit in terms of horizontal scaling, because obviously you can only scale as much as a single machine allows. And that isn't to diminish the need for Prometheus; you'll see later how Prometheus and Thanos harmonize really nicely together. But Thanos is the additional bits and pieces to turn Prometheus into a global-scale monitoring system.

A couple of things that Thanos provides on top of Prometheus: a global view, so that you can query data across all of your Prometheus instances; long-term storage, so that you don't have to rely only on the local disk that a Prometheus has available; and a couple of other features, probably the most interesting one being down-sampling, which is related to long-term storage. When we talk about storing long-term metrics data, we tend to also query that data long-term. But when we scrape or collect data at 15- or 30-second intervals and query it over a year, we're not actually interested in that super high resolution data. We don't even have enough pixels on our screen to show all of it. So down-sampling comes in really handy in those situations, because if we query data over a year, it's plenty to query it at a one-hour resolution, for example, and still get an accurate picture. As a matter of fact, you would still get essentially the same picture, but down-sampling drastically speeds up these kinds of queries. So this is just a very high-level overview of what Thanos provides.
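To make that down-sampling point concrete, here is a quick back-of-the-envelope calculation (my numbers, not from the talk):

```latex
% One series queried over a year at raw 15s resolution:
\frac{365 \times 24 \times 3600\,\mathrm{s}}{15\,\mathrm{s}\ \text{per sample}} \approx 2{,}102{,}400\ \text{samples}
% The same series down-sampled to 1h resolution:
365 \times 24 = 8{,}760\ \text{points}
```

A dashboard panel is at most a few thousand pixels wide, so the one-hour resolution still carries more points than the screen can draw; the picture stays the same while the query gets drastically cheaper.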
In terms of the analogy that I mentioned earlier, what I also like to say is that Thanos essentially pulls Prometheus apart into its individual modules: the query layer, the rule evaluation, the storage, et cetera, and puts these into individual components so that we can horizontally scale them. And as a matter of fact, Thanos tries to implement as little as possible and reuse as much as possible from Prometheus. Thanos tries to just fill the gap and not reinvent things, but to build on the shoulders of giants, the giant here being Prometheus.

What that results in, and I like to describe it as a toolkit, is that Thanos provides a number of components, and you can really pick and choose these components to build the monitoring stack your organization really needs. Thanos gives you all the components, and then you choose what suits your organization. We'll see in a little bit what that can mean practically. The components that are available are the querier, the store component (we'll clarify that a bit more in a moment), the rule component, the compactor, the sidecar, and the receiver. This is just a list, so you can refer back to it later once we go through examples of uses of all of these components.

So let's dive right in and get to the first example usage. This is probably the most common thing I see people starting out with in Thanos, and if you're just starting out, this is what I would recommend you try. You may just be running Prometheus already, and you want to run Prometheus in a highly available manner, obviously, because this is your monitoring system. You are relying on your monitoring system to monitor the rest of your infrastructure, so you want it to be available. The typical thing, when we're not using Thanos, just with Prometheus, is that we put a load balancer in front of Prometheus and query our two Prometheus servers through that load balancer.

Now, this is a little bit problematic. Why? Because Prometheus is a pull-based model, the Prometheus servers that we're seeing here are scraping the same targets, but at slightly offset intervals. That means that while for alerting purposes they're close enough, when you're querying data and graphing it over time, this can lead to inconsistencies, and that tends to be problematic, or at least confusing, to users. It gets more problematic when we talk about rollouts, for example. There may be slight gaps in one Prometheus or the other, or maybe one Prometheus has downtime and the other one doesn't, and then there are gaps in the data.

And this is exactly where the first Thanos component comes in super handy: the Thanos querier. The Thanos querier is essentially a layer that you can put on top of these Prometheus servers to have a global view. What the Thanos querier does here is query both Prometheuses for all the data they have available for a particular query, then merge that data using a deduplication algorithm and present you with one consistent result. The way this integration works is through the Thanos sidecar. The sidecar is really just a shim between Prometheus and Thanos: it converts Thanos API calls, gRPC calls, into Prometheus-native API calls. This is already a really, really useful example of how people can make use of Thanos, and it's a very common thing that people do. You may just stop here, and this is already extremely useful.

And I want to take a moment to talk about something that I personally find very exciting about the Thanos project, which is the Store API.
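Before we get to the Store API, a quick aside on that deduplication step. Here is a toy sketch of the idea in Go; it is not Thanos' actual penalty-based algorithm, just the core intuition: merge the two replicas' samples by timestamp and drop near-duplicates. (Thanos tells replicas apart via an external replica label, which the querier strips before merging.)

```go
package main

import "fmt"

// Sample is one scraped data point of a time series.
type Sample struct {
	TimestampMs int64
	Value       float64
}

// dedup merges two sorted replica streams of the same series into one,
// dropping samples that fall within toleranceMs of an already-emitted
// sample. Thanos' real algorithm is smarter (it penalizes replicas that
// recently had gaps), but the core idea is this timestamp-based merge.
func dedup(a, b []Sample, toleranceMs int64) []Sample {
	out := make([]Sample, 0, len(a))
	i, j := 0, 0
	lastTs := int64(-1 << 62) // effectively minus infinity
	for i < len(a) || j < len(b) {
		var next Sample
		switch {
		case j >= len(b), i < len(a) && a[i].TimestampMs <= b[j].TimestampMs:
			next = a[i]
			i++
		default:
			next = b[j]
			j++
		}
		if next.TimestampMs-lastTs >= toleranceMs {
			out = append(out, next)
			lastTs = next.TimestampMs
		}
	}
	return out
}

func main() {
	// Two HA replicas scraping the same target at offset intervals.
	r1 := []Sample{{0, 1}, {30000, 2}, {60000, 3}}
	r2 := []Sample{{7000, 1}, {37000, 2}, {67000, 3}}
	fmt.Println(dedup(r1, r2, 15000)) // one consistent stream
}
```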
Pretty much everything in Thanos that can serve data, serve time series data, serves what we call in Thanos the gRPC Store API. This is essentially a data exchange format that anybody can implement and then serve data through. And this is exciting because every component in Thanos implements this API. Something that I personally found really ingenious about Thanos when I first started working with it is that the querier itself implements the Store API as well. What that allows us to do, if we want to, is layer Thanos in a distributed way, essentially. We can have regions of Thanos clusters, let's say one per data center, or, because we're at KubeCon, one per Kubernetes cluster, and then have a global query layer that we put on top of that. And that works because the Thanos querier implements the Store API as well.

This is why I often refer to Thanos as a toolkit: we can pick and choose and architect our monitoring system in exactly the way that our organization needs, and only to the extent that our organization needs. We could build an arbitrarily complex and arbitrarily featureful monitoring system, but it's so much better if people can have just the complexity that they truly need. I think that's what's really powerful about the Thanos project.

But these are just two examples of what Thanos can do. I talked about long-term storage being one of the primary things that Thanos provides, so let's look at what that can look like. Long-term storage in Thanos always revolves around object storage. The sidecars that I talked about earlier, converting Thanos Store API calls to Prometheus-native API calls, actually have a second responsibility: whenever Prometheus produces a block of data on disk, the sidecar takes that data and uploads it to object storage. Pretty much every object storage offering from any well-known cloud provider is supported by Thanos, and many of the ones that you can run yourself as well; basically anything that's Amazon S3-compatible works.

The way that you can then query this long-term data is by using the Thanos store component. This component, again, implements the Thanos Store API, but instead of interfacing with a Prometheus server, it reads the data from object storage and provides it to the querier whenever you run a PromQL query.

In this kind of scenario, the last component that you would make use of is the compactor. If you're already familiar with database technologies, compaction is the process of post-optimizing data in a database. What that means here is that the compactor looks at the data in object storage and sees where there are possibilities for merging data to make it more efficient, to improve compression, or to deduplicate things. There are various things the compactor can do, and I won't go into too much detail, but you can think of it as a data optimization post-processing component. The compactor itself doesn't actually serve any data; whenever it replaces some data in object storage, the store component just loads that and from then on serves the optimized data. So that's how we do long-term storage. And one thing that I forgot to mention is that the compactor also takes care of down-sampling. Remember when I said that down-sampling is really useful for querying long-term data?
Well, that down-sampling actually needs to be computed somewhere, right? And the compactor is the component that does that.

So this is probably the next step that you would take after the previously mentioned architecture when introducing Thanos into your organization. And here we can also see the nice iterative process we can take when introducing Thanos. You can start with just the querier and the sidecars, right? But if you want long-term storage, then you add object storage, the store component, and the compactor. Just like that, you can upgrade your monitoring system based on your organization's actual needs, and not just throw a bunch of processes into your infrastructure that you may not even need. So I think it's always important to evaluate what your organization is really looking for, and then architect the monitoring system to those needs.

And this is where we come to the next architecture. This is a very different type of architecture from what I've been talking about so far, and one that I was, or still am, very heavily involved with in the Thanos project: when you have a kind of service relationship with your Prometheus servers. You may have totally remote Prometheus servers, maybe on edge infrastructure or something like that, and you want to push that data, as opposed to having the Thanos components pull data whenever a query needs it. In this kind of scenario, you may want to push the data so that it's available at low latency, essentially.

What we have for that is what we call the receive component. This component implements the remote write protocol from Prometheus. Remote write is essentially a generic, almost database-replication-type protocol that Prometheus implements, which you can use to send all the data that Prometheus writes to disk off to a remote storage. The receive component implements exactly that. I won't go into too many details of how it works, but it's essentially a Dynamo-style replicated hash ring. Once the data has been received and stored by the receive components, you can configure the Thanos querier to query all of these receive nodes, merge all the data at query time, and, again, present a deduplicated result to you. So this is a very different type of architecture, but also one that has become increasingly popular, because you can very nicely separate the responsibility of running this Thanos cluster from the people just running Prometheus servers and pushing all their data in for long-term storage and analysis, in this service-type relationship.

With that, this concludes the architectures that I wanted to present today as an introduction to Thanos. Just to reiterate, and I hope this has become clear to everyone: Thanos is really a toolkit that you can use to build exactly the monitoring system your organization needs, and you can pick and choose from all these wonderful components to build exactly that. At heart, the thing that really powers all of this is the Store API, this generic API that we have for reading data in Thanos. And again, this is why we can have this toolkit approach: because we can even swap out implementations, right? I think that's really powerful, and it's one of the things that makes me really excited about the Thanos project.
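To make the Store API idea a bit more tangible, here is a heavily simplified, hypothetical sketch of its shape in Go. The real definition is a gRPC protobuf in the Thanos repository (storepb) with streaming responses; the names below are illustrative only.

```go
// Package store sketches the conceptual surface of Thanos' Store API.
package store

// Matcher selects series by label, e.g. {job="node", tenant="team-fruit"}.
type Matcher struct {
	Name, Value string
}

// Series is a labeled stream of samples, carried as encoded chunks.
type Series struct {
	Labels map[string]string
	Chunks [][]byte // compressed raw or down-sampled samples
}

// Store is a simplified stand-in for Thanos' gRPC Store API (the real one
// streams SeriesResponse messages over gRPC). Sidecars, store gateways,
// receivers -- and the querier itself -- all expose this surface.
type Store interface {
	// Series returns all series matching the matchers in [minTimeMs, maxTimeMs].
	Series(minTimeMs, maxTimeMs int64, matchers []Matcher) ([]Series, error)
	// LabelNames and LabelValues power metadata queries and autocompletion.
	LabelNames() ([]string, error)
	LabelValues(name string) ([]string, error)
}
```

Because the querier both consumes and implements this same surface, queriers can be stacked: a per-cluster querier looks like just another store to the global one.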
But with that, that concludes the introduction, and now I'll hand it off to Kemal to tell us what has been happening lately in the Thanos project.

All right, I hope you can see my screen. So let's get to the news. We've been busy for the past months: we've implemented a couple of new features, and we have a lot of optimizations across all the components. Let's start with the querier. The querier now has the ability to concurrently execute select queries. This is really helpful for complex queries that have multiple select statements. Moreover, we now have multiple layers of caches, especially using Memcached. Using this functionality with the store gateway component, we can cache the metadata files and the chunks themselves, and this helps us reduce the traffic and latency between the object store and your data cluster.

We also introduced a new component called Thanos Query Frontend. This component splits your queries into certain intervals and then caches the responses of your queries, and we hope this will improve your query performance significantly. I'm going to demo it in a bit.

Moreover, we have a lot of UI enhancements. We now have a new view, what we call the bucket view, for all the components that serve blocks, and using this UI you can introspect your blocks. And as the last update, we now have a new UI based on React components. We are reusing components from Prometheus itself, and we also plan to publish a component library in the long run, so that we can reuse these components in other projects as well, thanks to our mentees.

This is how the querier looks right now. One of the cool features that we recently implemented is store filters: with this, you can select which stores you want to query, to debug a particular Store API component. And this is how the new bucket viewer looks. You can see your blocks, with their different intervals and different sizes, and you can see their metadata information. You can even download the meta JSON file itself if you want to debug things further.

With that, let's get to the demo. In this part, we will talk about a simple architecture in which we use sidecars in front of the Prometheuses. For each cluster, we will deploy Prometheus, and then, to have a global overview, we will deploy a querier, and users are going to read through that. What makes this demo special is the new query frontend component, and we will try to show you how we can actually reduce latencies. For our demo purposes, we will use Katacoda, so that viewers can later visit the same tutorials and interact with these things themselves.

So let's start. In our demo, we will first deploy the Prometheuses. For each Prometheus, you can see, we configure Prometheus to scrape itself. For each cluster, we will have one Prometheus, and next to each Prometheus we will deploy a Thanos sidecar. This could take a minute. Okay, this was fast, nice. Let's see if everything works. Yeah. For this demo, we are just using Docker to make things simple, and you can see all the processes are running.
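Since the query frontend is the star of this demo, it may help to picture what its split interval does before we deploy it. A toy sketch, my own hypothetical helper rather than actual Thanos code: cut a range query into interval-aligned sub-ranges whose results can be cached independently.

```go
package main

import "fmt"

type timeRange struct{ startMs, endMs int64 }

// splitByInterval cuts one range query into interval-aligned sub-ranges,
// the way a query frontend does before caching each piece independently.
// Aligning boundaries to the interval keeps cache keys stable across requests.
func splitByInterval(startMs, endMs, intervalMs int64) []timeRange {
	var parts []timeRange
	for s := startMs; s < endMs; {
		e := (s/intervalMs + 1) * intervalMs
		if e > endMs {
			e = endMs
		}
		parts = append(parts, timeRange{s, e})
		s = e
	}
	return parts
}

func main() {
	// A 5-minute range split by a 1-minute interval -> 5 sub-queries,
	// mirroring the "split by a minute" configuration in the demo.
	for _, p := range splitByInterval(0, 5*60*1000, 60*1000) {
		fmt.Println(p)
	}
}
```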
From now on, we will deploy the Thanos querier for the global overview. For the sake of the demo, we are also deploying a small proxy to inject some latency into queries, because we don't have enough data to create real load in this environment. And now we can actually access our querier.

For the last part, let's deploy the Thanos query frontend. For that, we have a configuration for the cache, which tells it to cache everything in memory. We deploy it now, and let's go to... yes. Okay, I hope you can see that. This is the Thanos query frontend UI; it's actually the same as the Thanos querier's. So when we execute, let's make it a little bigger. When we actually execute a query, a range query to be specific, it's supposed to take more than five seconds, because we're also injecting some latency. Yeah, for this query it takes over five seconds, but now it should be cached. Then we execute this again, and it should be a lot faster. Yeah, as you can see, now it took only one second.

One of the other things that we specified over here is the split interval. When we are executing this query, we actually split it by a minute, so behind the scenes it actually executes as five different queries, and all of their results are cached. We also specify another thing called max freshness. Max freshness says that the frontend splits the query, but doesn't cache the most recent part. To actually demonstrate that, we run the same query, it's fast enough, but now we can shift the range a bit, and we can see it's still fast. So yeah, this is a relatively new component for us, so it would be great if you could just use it and give us feedback, so that we can improve and work more on it.

With that said, I'm going to pass the microphone to Bartek.

Thank you, Kemal. Thank you, Frederic. So Kemal mentioned a few things we created over the last months, but that's not everything, right? We actually did much more. From the high-level things, it's worth mentioning the stuff around APIs and the user experience. First of all, thanks to the fact that we reuse Prometheus code, it was as easy as upgrading a few dependencies to get the shiny new TSDB isolation mechanism, which allows appends and queries to be isolated, and that's pretty sweet.

Furthermore, we are active on the analytics API side. Together with SIG Observability, a special interest group on the CNCF side, we are collaborating on exploring the use cases, APIs, and integrations that would allow us to better leverage metric data for analytics use cases. For example, there is our POC, the Obslytics project, which allows you to convert Prometheus and Thanos metrics into Parquet files, which is pretty convenient. We are planning to add more APIs and formats, like Apache Arrow, and integrations with things like Pandas. If you have any ideas or feedback, or want to help, please join us and visit this repo; your help is welcome.

One thing that I want to focus on as the last part of this talk is the multi-tenancy aspect, because especially when building a centralized monitoring system like Thanos, with long-term storage and retention, you really have to think about the long-term use cases, when more teams will use the same system, and how they can collaborate.
Do I need to create another Thanos cluster just for a separate team? What if you are a SaaS provider and you have customers that are not part of the same organization, whose data has to be securely isolated, right? We actually built Thanos with that in mind, even if the multi-tenancy features are the more advanced part of the Thanos story. But well, here we are, after the third year of the Thanos project, and I want to demo and showcase a way of making Thanos a multi-tenant system, with a configuration and deployment model describing that. So let's go for it.

I will describe what we will see in the demo. First, we'll introduce two sets of tenants' Prometheus data. We call the first team Fruit and the second team Veggie. As you can see, they have separate collection paths. They have already set up Thanos with sidecars, so they can query their own data. And because you have separate infrastructure for each team, you kind of expect isolation. But this is no multi-tenant infrastructure; it's rather a separate infrastructure for each tenant. Technically it works, right? But there are problems with that.

Especially with bigger systems, you run into what we call the tomato problem. Tomato, because, if you're aware, it is actually both a fruit and a vegetable at the same time. So if you are from, let's say, the tomato team, you would like to have access to both the fruit and the vegetable data. This particular problem is called the cross-tenant view: it is actually important to have a secure way to allow joins between data sets from different tenants. And this is something you cannot easily achieve with this particular deployment. Additionally, what if you have more teams, right? How do you scale if you need to set up a separate cluster or separate infrastructure for each tenant? That doesn't scale well. So ideally you want something more: a multi-tenant system. And Thanos was definitely designed with that use case in mind for a long time.

So let's do the first step. Obviously, we can join this data within a single global view: put a multi-tenant querier on top and allow accessing the data from both team Fruit and team Veggie. However, in Thanos we believe in the Unix philosophy, where you do one thing, one functionality, and do it well; you don't want to spread your focus. That's why there is no auth or RBAC system built directly into Thanos. Instead, we integrate with other projects and build an ecosystem on top. To achieve this, Red Hat actually built the prom-label-proxy project, which is part of the prometheus-community org. You can add it as a sidecar, and it properly understands the Prometheus APIs, so actually the Thanos APIs as well, the HTTP ones for querying and accessing metadata and all that sort of stuff, and it injects the proper tenant label of your choice into the query system to ensure data isolation.

The critical part here is how we separate tenants from each other, how we identify them. And we use the same mechanism as for everything else, as for series, as for blocks: we use labels. A tenant is just another label for us, and you will see that in the demo. So, thanks to prom-label-proxy and some authorization proxy of your choice, because we don't want to, you know, force you to use OIDC or whatever.
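As an aside, the core trick prom-label-proxy performs can be sketched roughly like this, using the Prometheus PromQL parser packages. This is my simplified illustration, not the project's actual code (the real proxy also guards the other Prometheus APIs and rejects conflicting matchers), and the import paths below are for recent Prometheus versions.

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/promql/parser"
)

// injectTenant parses a PromQL query and appends a tenant matcher to every
// vector selector, which is roughly what prom-label-proxy does to each
// incoming query before passing it to the multi-tenant querier.
func injectTenant(query, tenant string) (string, error) {
	expr, err := parser.ParseExpr(query)
	if err != nil {
		return "", err
	}
	matcher := labels.MustNewMatcher(labels.MatchEqual, "tenant", tenant)
	parser.Inspect(expr, func(node parser.Node, _ []parser.Node) error {
		if vs, ok := node.(*parser.VectorSelector); ok {
			vs.LabelMatchers = append(vs.LabelMatchers, matcher)
		}
		return nil
	})
	return expr.String(), nil
}

func main() {
	q, _ := injectTenant(`sum(rate(http_requests_total[5m]))`, "team-fruit")
	fmt.Println(q) // sum(rate(http_requests_total{tenant="team-fruit"}[5m]))
}
```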
So with that, you can easily set up a multi-tenant read path that isolates queries depending on the path, or maybe the port, that is exposed to a certain team. We can have either dedicated views or cross-tenant views, for any of the use cases. So let's try to actually demo it quickly. For this I also used Katacoda; hopefully we can expose it after KubeCon for public use.

Let's go for it, and let's start by just starting those Prometheuses. We have Prometheuses for team Fruit, let's copy the configuration for that, and two replicas for team Veggie, let's copy those files. Now let's prepare a directory for our Prometheuses and start them. It might take some time. Actually, well, they started. Let's create a sidecar next to that. Let's now create team Veggie's Prometheuses and sidecars: one Prometheus and sidecar, and another Prometheus and sidecar. As you can see, I'm just using Docker containers for a simple showcase.

Let's start with the no-multi-tenancy model. This means that we start a separate querier for the Veggie and Fruit teams, so it should look like what I presented before. Let's try to access our queriers and see how it goes. As you can see on the store page, we can only see the one sidecar from team Fruit, so it's kind of obvious that we will only have data from team Fruit. Yep. Now, I can quickly show you the Veggie one, but it's pretty similar: you only have data for team Veggie, and there are two replicas, so you have two values, because there are two Prometheuses that are scraping themselves and each other.

Okay, but we have the tomato problem, as we described, and we also have the infrastructure reuse problem. So let's try to build a multi-tenant read path for Thanos. Let's stop our queriers, because we want to set up one multi-tenant querier that will be safe to use. Let's start that. The difference here is that we just point it at all the stores, so all the sidecars we have, for both team Veggie and team Fruit, and nothing else, because, as we described, this is a single-purpose thing. As long as it has all the Store APIs, so the two sidecars from Veggie and the one from Fruit, we will have all the data, right? So this is our admin, or tomato, view, whatever you want to call it. Yeah, we see all the data, which is not great if you want to ensure some isolation.

So let's actually go for the prom-label-proxy project, right? It is as simple as a stateless proxy where you point at the upstream URL, which is where your multi-tenant querier is, set the listen port, and, the key part, set the label. Because, as you can see, our Prometheuses have three external labels, and one of them is tenant, describing critically which team the Prometheus is part of, we can isolate tenants based on that. Once we create the proxy, let's put some kind of auth in front of it. I chose the Caddy server, which is like a fancy nginx, and there is some very simple configuration which exposes two ports: 39091 for team Fruit, so on this port it will just append the parameter tenant=team-fruit to the URL, and a second port which will inject tenant=team-veggie. So depending on which port you are using, it should inject the correct parameter, and it points to prom-label-proxy, which understands this parameter and knows how to inject it into the query and into the other critical Prometheus APIs.
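For completeness, the Caddy part of this setup is really just "listen on a port, force a tenant parameter, forward." A minimal Go equivalent of that idea might look like this; the ports and tenant parameter name mirror the demo, while the upstream address is hypothetical.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// tenantProxy returns a reverse proxy that forces a fixed tenant parameter
// onto every request before forwarding it to prom-label-proxy, mirroring
// what the demo's Caddy configuration does per listening port.
func tenantProxy(upstream *url.URL, tenant string) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	director := proxy.Director
	proxy.Director = func(r *http.Request) {
		director(r)
		q := r.URL.Query()
		q.Set("tenant", tenant) // the label prom-label-proxy will enforce
		r.URL.RawQuery = q.Encode()
	}
	return proxy
}

func main() {
	// Hypothetical address: prom-label-proxy listening locally, as in the demo.
	upstream, _ := url.Parse("http://127.0.0.1:8080")
	go http.ListenAndServe(":39091", tenantProxy(upstream, "team-fruit"))
	log.Fatal(http.ListenAndServe(":39092", tenantProxy(upstream, "team-veggie")))
}
```

In a real deployment the auth layer would also authenticate the caller; the point here is only that tenancy is enforced by the port (or path) a team is given.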
So once we start all of that, we should have a querier for the Fruit team, which should give us, yeah, only team Fruit data, even though this querier has access to all of the stores, right? And the same goes for the Veggie team: the querier has all the sidecars, even team Fruit's, however we can see only, let's say, goroutines, or go_memstats_gc_cpu_fraction, for team Veggie. So that's the whole point of this demo. Let's go back to the slides.

I think we don't have much time left to describe all of this, and hopefully at some future session we can demo the further multi-tenancy parts, because we talked about the read path, multi-tenant read isolation, but there is much more. We already solved storage soft tenancy, where you maybe upload all the blocks, all the data, into the same bucket; you can totally do that, and it's still multi-tenant. Or you can have separate buckets for each tenant; that's okay too, and we call it hard tenancy. The same goes for the receiver. We built the receiver with multi-tenancy in mind, so you can have soft tenancy, where the same receiver nodes are used for multiple tenants, still building separate blocks per tenant but sharing the same ingestion path. Or you can have hard tenancy, where you have distinct ingestion nodes, to make sure it's just much safer and you can have a greater SLO on that. So that's it for multi-tenancy. I know many people are really looking forward to that, and we are leveraging it at Red Hat as well, so it was pretty exciting to prepare a demo to showcase this for you.

And last but not least, we are not stopping; we have lots of stuff to do. Quickly mentioning the bucket viewer, which we want to contribute upstream to Prometheus. We want to finally have proper deletions, backfilling, and query-of-death safeguards in Thanos. So we are working hard on making sure your infrastructure is stable, especially in multi-tenant scenarios. Performance improvements too, with really lots of help from the community to make that happen, which is pretty amazing. And yeah, support for some different cache backends.

I also want to mention the mentorship. We spent an amazing couple of months mentoring multiple amazing people, and some of them are becoming Thanos maintainers or, you know, helping us with the community, actually starting some CNCF meetups as well. So if you want to mentor someone, or be mentored, please follow our Twitter and the Prometheus Twitter to get more info. We are happy to help in some way. Thank you. That's it. And yeah, we are happy to take questions.