Okay, let's start. Hi everyone. Today we're going to talk about Gnocchi, which is our new project; we started it a few months ago. So I'm going to first introduce you to our speakers today. First we have Eoghan — say hi — he's a software engineer at Red Hat, and he has been our dear beloved PTL for the last six months, and he's going to be our PTL again for the next six months. Next we have Dina; she works at Mirantis as a software engineer too, and she's one of our more recent Ceilometer contributors — she's been a core developer for the last cycle. And myself, I work as a software engineer on Ceilometer. I was the Ceilometer PTL for a year before Eoghan, and I've been driving the Gnocchi initiative for six months now, trying to work out how to integrate it into Ceilometer.

So before we start talking about Gnocchi, I think it's a good idea to take a step back on Ceilometer itself — how it started, and what the motivation behind it was. When we started Ceilometer, about two years ago — a bit more than two years ago, actually — we had a lot of problems we wanted to solve. The first use case was billing, but obviously we soon discovered that what we wanted to do was metering. So we started to meter OpenStack, to meter everything in OpenStack. We didn't want to change most of OpenStack.
So we started to poll things, receive events and notifications, and build a lot of data from this. And we decided back then not to make any kind of trade-off. That means we were not really sure how people were going to use the data, so we decided not to lose anything, not to do any kind of aggregation, not to do anything fancy — just take the data, put it into a store, and that's it. Then we built a query system around it, providing an API for people to use this data and, well, do whatever they want with it: billing, monitoring, analytics of any kind.

The problem is that this is a lot of data, and it's very hard to use this amount of data when you don't try to do any kind of optimization behind it. So we probably got a few things not totally right in Ceilometer. With the data model we picked back then — nothing aggregated, nothing computed at write time — it was very hard to get good performance. A lot of queries are optimized in Ceilometer for common usage patterns; for example, getting the list of resources is optimized by most of the storage drivers behind it. But if you do some fancy queries, most queries are O(n), meaning the more samples you have, the longer the query will take. When you have a cloud platform with a few hundred nodes and a few thousand samples per minute, you very quickly end up with a few million samples in your database, and doing fancy queries on that is very, very slow.

The data structure we picked, which we named the sample, is freeform in Ceilometer — that means it can have anything in it. It's not entirely our fault: when we started Ceilometer we relied a lot on notifications, and the notification system in OpenStack is freeform too. We just didn't try hard enough to fix that in OpenStack itself; we just used it, and we ended up with an API which has no consistency, meaning most of the data that is returned is freeform, so it's very hard to know in advance which kind of fields you are going to have. Some fields are part of the API — you have a contract with Ceilometer and its API, so you're sure to retrieve this kind of field, with this name and this type — but sometimes you just don't know. So it's hard to query, it's hard to retrieve the data, to manage it, and to build applications around it.

And the last thing is that there are a lot of use cases we saw, especially in billing situations, where you want to retrieve things like: this instance has been paused and resumed, this instance has been resized — these kinds of things, which are not really meters. And it's complicated to solve this kind of issue in the current API; you don't have a magic query that can show you that this instance has been resized, or moved, or paused, etc. So we decided that it was a good time to solve all of this and to start something, maybe from scratch. That's what we did with Gnocchi.

We changed — I think I can say — the paradigm behind how we store things. The first thing we started to do in Gnocchi is to track resources. Like I said, in Ceilometer the only data structure we use is the sample, and a sample is just something like: I get a measure for cpu_util on an instance, and I send everything back to Ceilometer — the value for cpu_util and all the information I have about this instance — and I do that every time I measure anything about this instance. That's a lot of data, and it doesn't help you know when the resource — the instance — was really created. You can guess that from when you see the first sample and when you see the last sample, but you're not sure it's really the last sample, or whether another sample is going to arrive ten minutes later, all these kinds of things. So we have to build the resource list dynamically, but that has a huge cost, and it's not a very good way of tracking it.
So in Gnocchi we track resources from scratch — we have a dedicated part of the API for that.

We have also separated two things. In Ceilometer we recently gained support for storing events, based on notifications; this is something we did not have two years ago — I think we've had it for maybe half a year or so, so it's pretty recent. And now we are able to use that to say: okay, there are some kinds of data, some patterns and use cases, that you're not going to solve with samples; we're going to solve them with events. And in the case of Gnocchi, we're going to say we don't meter this, because this is an event. If an instance has been resized, it's not a metric, it's an event — something happened. If you want to retrieve that, you're going to ask Ceilometer for these events; it's not something we're going to store in samples like we used to do in Ceilometer.

We have metrics in Gnocchi, obviously, so we have resources and metrics. Metrics are time series data, which means it's only a list of (timestamp, value) pairs — nothing else, nothing fancy like we have in Ceilometer with a lot of information you get from the instance: when you meter cpu_util, you get everything about the instance. You don't get that with Gnocchi; you just get the timestamp and the value of the CPU utilization. What we do is link the metrics — the lists of (timestamp, value) pairs — to a resource. So you have a list of (timestamp, value) pairs, which we name an entity — Eoghan will explain it to you just after — and you link that to the CPU usage of this instance, you link that to the network-in usage of this instance, etc.

Another thing we do in Gnocchi is that this time we decided to make some kind of trade-off on the data. We do aggregation, and we do it eagerly — most drivers are doing this at write time.
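To make the shape of that data model concrete, here's a minimal sketch in Python. The class and field names here are illustrative — this is not Gnocchi's actual code — but it shows the idea: a resource linked to named entities, each of which is just a list of (timestamp, value) pairs.

```python
from dataclasses import dataclass, field

# A measure is just a (timestamp, value) pair -- nothing else.
Measure = tuple[float, float]  # (unix_timestamp, value)

@dataclass
class Entity:
    """A named time series attached to a resource, e.g. 'cpu_util'."""
    name: str
    measures: list[Measure] = field(default_factory=list)

@dataclass
class Resource:
    """A tracked thing in the cloud (instance, volume, ...)."""
    uuid: str
    entities: dict[str, Entity] = field(default_factory=dict)

    def record(self, entity_name: str, timestamp: float, value: float):
        # Link a lightweight measure to this resource via a named entity.
        ent = self.entities.setdefault(entity_name, Entity(entity_name))
        ent.measures.append((timestamp, value))

# Example: two lightweight measures for one instance's CPU utilization.
vm = Resource(uuid="instance-1")
vm.record("cpu_util", 1000.0, 40.0)
vm.record("cpu_util", 1060.0, 39.0)
print(len(vm.entities["cpu_util"].measures))  # 2
```

Contrast this with a Ceilometer sample, where every one of those data points would also carry the full resource metadata snapshot.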
We talked a year ago, I think in Hong Kong, about doing aggregation in Ceilometer. It never happened, for various reasons — mainly because we didn't have enough people to work on it — but I think we all noticed that having millions of samples for a year in the Ceilometer database is not going to scale very far. And most people — I mean, I don't know many people who want fine-grained data from a year ago; most people are happy enough with aggregation. So we do aggregation as a first feature in Gnocchi. I'm going to let Eoghan talk to you about Gnocchi in more depth.

Cool, thank you, Julien. So I'm going to continue this discussion by doing a little bit of compare and contrast — basically identifying some key aspects of Ceilometer which might be interpreted as shortcomings, and explaining, at a high level initially and then drilling down into a bit more detail, how we're intending to address those shortcomings in the case of Gnocchi.

All right, so: a very simple table here, the before and after, shall we say. One axis on which you can really see the difference between the approach taken by classic Ceilometer and the new approach that Gnocchi is adopting — and Julien alluded to this earlier — is this notion of how heavy the actual samples are. A sample, in Ceilometer terminology, is our basic stock in trade; it's the basic piece of data that we store, that we manage, and that we make available so that insight can be surfaced into what's going on in your cloud. And the thing about Ceilometer samples is that they are actually quite heavyweight things.

Now, at a high level, really all you're interested in here, generally, is a number. Say it's CPU utilization: at one moment it's 40 percent, the next moment it's 39 percent, the moment after that it's 41 percent. So the key piece of data there is a number, which is a relatively lightweight datum; it's not going to take up too much space.
You can store a lot of those. Now, in addition to that actual base number, we store a bunch of identifying information, such as: what's the UUID of the resource that it's associated with? What's the UUID of the tenant and the user? And so on. So already this thing has kind of grown, and is growing, right? But at least it's standard for each sample, regardless of what the source of the sample is: you know you're always going to have an associated resource, an associated project ID representing the tenant, an associated user, and so on.

But in addition to that, we actually store a snapshot of the resource metadata as it existed at the time the sample was taken. So if you think about an instance, it would include things like the instance state: is it active, was it suspended, has it been resumed, that kind of thing. But it would also include things that change very, very rarely, if at all, such as the identity of the image that the instance was booted from, or the user metadata that was associated with that instance. So these things are either static or very infrequently changing, but the way Ceilometer stores them, via this kind of snapshotting approach, means that effectively the same data is duplicated over and over and over. Now, that actually gives us a lot of flexibility as to how these data are interpreted and used going forward, but it also creates a very, very large storage footprint, which has been shown to be a problem in reality.

Now contrast that with what Gnocchi does, in which case it really strips it down to the bare bones. It says: really, all we're talking about here is a number, and that number took effect at some particular time. So that's all we're going to store: every time we take a new measurement, we store the actual value and the timestamp at which we took that measurement. And then all of this other data that we previously snapshotted on a per-sample basis — we're going to magic that data up in a different way, and we'll talk about how we do that later. So there are certain use cases in Ceilometer currently that depend on these sample data being repeated over and over, and we'll see that the strategies we have in mind for enabling those use cases going forward will allow us to do it with Gnocchi in a much more lightweight way.

So that's the first axis of comparison. Another point that Julien alluded to as well is: how long are you really interested in keeping this data around for? Forever? That might be the case for some data, but for a lot of stuff — like, say, CPU utilization — its currency is really key. The data that's fresh, that's related to, say, the last hour or the last day, that's kind of actionable data: you could drive, for example, alarming off it, and use your alarms to trigger autoscaling actions. But the data from last week? Not so much. Last month, last year — it's becoming much less useful again. That would be the case for cpu_util, but you could look at some other types of data that are used to drive billing, such as the existence of a certain number of instances, and that's the type of stuff that you probably want to keep around for a much longer amount of time: if somebody comes and queries their bill from six months ago, you need to have the data there so that you can back it up, right?

So one of the problems with Ceilometer is that our initial stab at expiry was completely global. Basically, data exists in full resolution until it falls off the cliff, and then it no longer exists. And that's done in a way that's not selective on the basis of what type of data it is. So clearly you want to be able to store different types of data for different amounts of time, but also, do you really want that kind of falling-off-the-cliff behavior? Do you want it to be full res, full res, full res, and then nothing?
Or would it be much more convenient to keep a small amount of full-resolution data, then roll that up at a certain granularity and store that for a longer period, and then maybe roll it up to an even more coarse-grained granularity and keep that for an even longer period again? So what you get is this gradual aging out of the system of these data, as opposed to a sudden expiry. And that's exactly what Gnocchi allows us to do: we can set an archive policy on each individual time series, selectively, and then that archive policy allows the data to be gradually rolled up into ever coarser grains until finally we discard it — or we keep it around forever, if necessary. But if we are keeping it around forever, the assumption would be that we're doing it in a fairly coarse-grained form.

Okay, so let's look at another couple of axes of comparison here between classic Ceilometer and Gnocchi. Again, Julien alluded to this idea of aggregation — when do we actually do the aggregation? In classic Ceilometer it's all done on demand, and by on demand I mean in order to satisfy an individual query. So if you query the Ceilometer API and you say the period I'm interested in is hourly, what we will do in the back end — say in Mongo — is a big old map-reduce: we'll take these data, we'll stick them into buckets based on the granularity you chose, and we'll say, ah, the average for that hour was whatever, and the average for the next hour was some other value. And we'll do all of that work. Now, if you emit exactly the same query five minutes down the road, we'll do that work again.
Yeah, because it's all done on demand. Whereas what Gnocchi does is: as the data are being ingested, or soon thereafter, it eagerly does the roll-up. So you identify a number of aggregation functions — averages, minima, maxima, even more exotic things like standard deviations — and as the data are being ingested, basically, the aggregation is happening continually. Now, different storage drivers do it in different ways, and I'll talk about the pluggable storage drivers later on: some of them do it absolutely eagerly, as in, as the data point is literally received, it's aggregated; others do it in a slightly laggy fashion. But the effect is the same: for queries that don't extend over the very recent time period, you don't repeat this work over and over, because the aggregation is done on demand — or sorry, done eagerly — as the data is received.

Okay, so let's talk a little about the basic lingua franca of Gnocchi. We have a certain kind of terminology we use with Ceilometer — we talk about things like meters and samples and so on — so to distinguish our new way of imagining this, we've used different terminology. First off, we have of course our natural concept of a resource. This is common: we talk about resources in classic Ceilometer, and we talk about them in Gnocchi also. A resource basically is just a thing. Usually it's a thing in the cloud — a user-visible thing such as an instance, a volume, an image, a load-balancer VIP, something like that, something that a user can reason over — but sometimes it's also something in your infrastructure, like a host or even an IPMI sensor. It's just a thing. So we store some representation of these things, and, separated off but linked, we store a representation of data about some aspect of those things. Those aspects we refer to as entities — we wanted to keep the naming ultra, ultra generic. So what would be a typical entity?
Well, it would be something like this: if the resource was an instance, the entity could be CPU utilization. If the resource was a VIP, the entity could be the number of open connections. If the resource was an image, the entity could be the number of downloads. So: some aspect of that thing that you want to store or gather data about. And generally the mapping would be one-to-many: you've got one resource, and you want to look at lots of different aspects of that resource and gather data on each of them.

Now, the entities, in the Gnocchi realization of this, are identified by UUID, in the standard OpenStack way, but we also have a way of identifying them by name, for convenience. So if you know that the resource is X, you can find out which entities are associated with that resource; or you can just say: for the resource identified by this UUID, give me the entity named cpu_util. So we've got multiple different ways of identifying them.

And then the last key concept is this idea of a measure. A measure is an individual data point associated with an entity, which is itself associated with a resource. And measures in Gnocchi — and this is the key idea — are ultra, ultra lightweight; they're feather-light. All you're talking about here is the number — the value — and the time at which the measure was taken. So effectively it's just a couple, a pair: a timestamp and a value. And that's one value within a time series. The fact that we store them in this ultra-lightweight way means the additional storage required for each individual measurement that's taken is very, very small. And also, these values are very convenient for manipulating using libraries like, say, the pandas analytics library, or for storing in specialized data stores such as OpenTSDB or InfluxDB. They fit very closely with that model.
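As an illustration of how naturally these (timestamp, value) pairs map onto an analytics library, here's a small, hypothetical pandas sketch — not Gnocchi code, just the same shape of data — that takes raw measures and rolls them up into per-minute means:

```python
import pandas as pd

# Raw measures: just (timestamp, value) pairs for one entity, e.g. cpu_util.
measures = [
    ("2014-11-05 10:00:10", 40.0),
    ("2014-11-05 10:00:40", 39.0),
    ("2014-11-05 10:01:15", 41.0),
    ("2014-11-05 10:01:45", 43.0),
]

series = pd.Series(
    data=[v for _, v in measures],
    index=pd.to_datetime([t for t, _ in measures]),
)

# Eager-style roll-up: aggregate the raw points into per-minute buckets.
per_minute_mean = series.resample("1min").mean()
print(per_minute_mean)
# 10:00 -> 39.5, 10:01 -> 42.0
```

Because a measure carries nothing but the pair itself, this kind of roll-up needs no unpacking of resource metadata — which is exactly why the specialized time series stores are a good fit.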
They fit very very closely with with that model Okay, so going a little deeper and some more slightly more advanced concepts in in in no key And what is this idea of an archive policy? So recall earlier? I talked about this kind of falling off the cliff Versus the gradual aging out of the system and it's the archive policy that drives that notion of gradually aging out So an archive policy basically is a set of pairs and each pair defines a granularity Yeah, how fine-grained or coarse-grained these data are and also a retention time span Do you want to keep it around for a week or a year? so typically at a high level you could think of early data being kept for a month and Daily data being kept for a year per second data being kept for a day that kind of idea Okay, so basically the archive policy drives two aspects of of what no key does and At what grain the values are rolled up and how long they're kept around for in effect Okay, so here's just a quick kind of visual representation of the kind of aggregation approach that that that no key takes So if you look at this picture if you look at the kind of the timeline, right? The actual far right-hand side of the slide. That's the here and now, right? That's where we're at Okay, and time is kind of progressing in that direction So all of these bars are looking back over some historical time frame. Yeah So basically here. We have a case where an entity has an archive policy associated with it that has three different granularities per second per minute and per hour and Usually the way you set this up is the more coarse-grained the Archive policy is and the longer the retention will be yeah So you keep coarse-grained data around for longer you keep fine-grained data around for a shorter period in general You don't have to do that, but that's a typical case The green bar at the top is kind of our live window. That's our kind of buffer. 
Yeah, so as we receive data, it's going into this green window, and that extends over some period — the period is actually configurable. We keep it at least as long as one period of the most coarse-grained granularity, but generally you want to have a look-back window that's a bit further than that, because you might receive laggy data — slightly lagging, backfill, and so on. So basically what we do is we maintain full-resolution data, and then, as these data are received, they're aggregated into a number of buckets — per second, per minute, and per hour in this case, but you can create archive policies, as an administrative function, with whatever retention or granularity you like. And depending on the storage driver, either that aggregation is done in an ultra-eager fashion — as you receive each data point, bang, it's re-aggregated — or it's slightly laggy. For example, with InfluxDB, the way we've approached that driver — or the way we will approach it as we complete it — will be to use continuous queries, where the aggregation lags by a period equal to the most coarse-grained granularity, in this case one hour.

And then the actual aggregation itself is non-cascaded — that's another point we're trying to make with this visual. So we don't roll up our full-resolution data into per-second buckets, and then roll up from per-second into per-minute, and then roll up per-minute into per-hour. Instead we use non-cascading aggregation, so we don't get the distortions that you get when you take the mean of means, for example. The full-resolution data is aggregated independently into each of the levels of granularity that you've configured.

Okay, so at this point I'll hand over to my colleague Dina, and she's going to talk a bit more about performance and so on.

The next question I'd like to cover is: what will we do with all of these data points we've collected? Because actually, after a cloud has run for a while, there are thousands and actually millions of these data points stored. So, measuring different entities is the main concept of Gnocchi, but all these data points have no actual meaning — well, they're not really useful — without an association with entities and resources. So actually the Gnocchi indexer is the thing responsible for connecting all these data points to resources and entities, and linking all the stuff together. And the key to how Gnocchi can be performant and quick and behave as we wish it to behave is the fact that the resources are well typed: they have well-defined attributes, and all of that is indexed. If the resource type is unknown — well, okay, we can use a generic type — but anyway, if we'd like to store some information, we will definitely define some kind of resource in Gnocchi.

And actually, by separating the samples idea — well, measurements, here in Gnocchi — and mapping resources to these measurements, we get rid of one really interesting problem we are having now in Ceilometer. Let's imagine we want to store information about cpu_util for some VM, well, forever, and then we'd like to grab some information from Monday to Wednesday. In Ceilometer's case we need, as my colleagues said, to extract all this information stored in the database, then perform the needed request, and only then return the aggregated result. But in Gnocchi, with this concept of an index, with the concept of separated data storage for the time series data, and just an index that links all the things together, we get rid of this useful — useless, I mean — business of extracting all this data from the database.
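As a side note on the non-cascaded aggregation Eoghan described: the reason Gnocchi aggregates the full-resolution data independently into each granularity, rather than cascading roll-ups, is the distortion you can get from taking a mean of means when buckets hold different numbers of points. A tiny illustrative example:

```python
# Full-resolution values in two per-minute buckets of unequal size.
minute_1 = [10.0, 10.0, 10.0, 10.0]  # 4 samples
minute_2 = [70.0, 70.0]              # 2 samples

all_samples = minute_1 + minute_2

# Cascaded: the hourly mean computed as the mean of the per-minute means.
per_minute_means = [sum(minute_1) / len(minute_1),
                    sum(minute_2) / len(minute_2)]
cascaded_mean = sum(per_minute_means) / len(per_minute_means)

# Non-cascaded: the hourly mean computed directly from full-resolution data.
true_mean = sum(all_samples) / len(all_samples)

print(cascaded_mean)  # 40.0  (distorted: weights each minute equally)
print(true_mean)      # 30.0  (weights each sample equally)
```

Aggregating each granularity straight from the full-resolution window avoids this bias, at the cost of keeping that window around long enough to cover the coarsest grain.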
I mean thing of extracting all this data from that the base so When we're speaking about nookie, of course we're speaking about providing some kind of time series API for the different kind of storages and For now we have implemented Swift and pandas library based Driver in nookie and Owen and I are working on in Flux DB and open this DB drivers respectively when a fighting through the reviewing process to make it work as it's supposed to and Actually, I'd love to cover one also really interesting question. What's about performance because currently CELOMETER after years of development is not so bad actually and It can process not so well not so small amount of data per minute per hour well, whatever and Let's imagine. We just want to write some data to the nookie plus CELOMETER Here is the result of just writing samples are collected in 100 batches of these Measurements for different resources. So actually that's kind of natural situation We have different resources all these metrics are not connected with each other because all let's say the ends are different and We are writing these information to the open t-stdb driver actually using dispatcher I have just Well, I'll go through all these things like dispatchers we're writing now a little bit later and Okay, so that's kind of Six hundred to seven hundred measurements per second being returned to the just open t-stdb But actually that's exactly the same average number where I have enough or just one collector processing Samples in CELOMETER with MongoDB or HBase back-end. 
They just do the same number and Here is something we'd like to be close to well in really nearest future because this blue line is actually the Result of the same test the same database open t-stdb But we're writing directly for from CELOMETER dispatcher not nookie dispatcher but CELOMETER dispatcher to the open t-stdb via just well quick socket pushing all this data and That makes about two thousand of writes per second and just before this presentation was met with Owen and Juliana Discussed how should we make this green line closer to blue one and we found lots of law-hanging fruits to be fixed Just well the next days and that will make this green chart really close to the blue one and This result is much Better that we could ever achieve with CELOMETER with the current architecture of CELOMETER because Well, we could do nothing with this Flexible but heavy structure of samples were stored at CELOMETER, but this lightweight samples were well measures We are posted to the nookie is something we can operate with really high speed and Okay, so let's go through current CELOMETER infrastructure to understand How should we hack into this workflow to Integrate nookie and CELOMETER together. So actually this kind of usual workflow We're having amount of open-stack services running in some cloud. If something interesting happens there, okay They are pushing Notifications of about the events that happen to the notification bus. 
Also, we have polling agents that are polling these services — once a minute, once an hour, well, whatever you'd like — to collect information about the metrics you'd like to collect, and all this stuff is also pushed to the notification bus, let's say to a RabbitMQ queue or some other queue. Then all these samples and notifications pushed to the queue are processed by notification agents, and they transform them into something that can be nicely consumed by collectors, and the collectors write all this data to the database. Currently we have separated storages for the events, meters, and alarms, actually, and this was the first step towards separating these things out from the one huge storage driver we had — and it was a step towards integrating Ceilometer and Gnocchi, because actually it's the metrics we're interested in for the Gnocchi case.

Also, we have an API that can perform different queries, well, in the way we just described. For some of the use cases there is the alarm evaluator, which, let us say, periodically calls the API, trying to understand what's going on — whether some alarm condition has occurred, well, whatever — and if it has, okay, it calls the alarm notifier to fire and notify the services interested in this alarm — for example, Heat, about some threshold that has been, well, crossed. So, the question is: where should we hook into this infrastructure, this quite complex workflow, to make Gnocchi work inside of this scheme — not only, well, effectively, but also performantly, and in a quite simple way? And here's the answer.
We are planning to expose the Gnocchi API as the Ceilometer v3 API for all the time-series-related requests, to operate effectively with the time series information we are storing in OpenTSDB or InfluxDB or whatever, and to move all this time series database interaction to the Ceilometer collector, which is what actually writes things to the database. The step we're taking now is integrating Gnocchi — a separate piece of code, a separate project for now; its code is actually stored on StackForge — inside the Ceilometer workflow, using the Ceilometer database dispatcher mechanism; that's actually what collectors use to, well, just write data to the data store. So that looks like a nice place to hook into, just as a proof of concept of how all this might work. So, actually, I guess that's all from my side. Eoghan?

Thank you, Dina. Yeah, so I'll pick it up again and just continue with a couple of concluding points to talk about. Basically, I mentioned earlier that one of the key things that we wanted to do here was to make the actual unit of data that we store much smaller and much more lightweight. But as it happens, we've got a number of use cases in Ceilometer that do depend on the more heavyweight, more highly decorated data being stored, and one of those is the alarming use case.

Now, the reason why we have alarming within Ceilometer — the motivating use case — was the requirement to drive autoscaling in Heat. Heat basically, generally, wants to scale out the number of instances that are hosting some particular service — like, you know, a web server, a database, or something like that — and it's generally driven on the basis of the observed trend in, say, CPU utilization, that kind of thing. Now, we've got a group of instances there that are somehow all collected together; from a Heat point of view they're all associated, they're all part of the same autoscaling group. But that notion doesn't actually extend outside of Heat much, right?
So there's not really a good way of identifying those instances except for Some of this metadata, right? So basically the way we work is when heat spins up a new instance as part of an autoscaling group It decorates that instance using user metadata and that user metadata includes an identifier for this autoscaling group And then we can query over the CELOMETER data, right and Aggregate over all of the instance data over a certain time period that matches The user metadata that was set on the instance to identify it as being a member of this autoscaling group Yeah, so that's one case where we've actually kind of embedded in our design of another feature a Assumption about how the underlying storage are how the underlying data is stored So how are we going to get around that? Well the approach that we're looking at basically is to support cross entity aggregation in no key All right, and then to identify the set of entities So this would be the set of say CPU utilization time series that we have associated with a bunch of instances All of whom are members of this autoscaling group that we will identify those using strongly typed fairly selectively chosen Metadata or attributes of those resource that we store as first-class citizens in the resource representation Not as freeform Unpredictable data that just sits in a dict and that dictionary may or may not contain those values Yeah, so for example in the case of this particular and usage of the Metadata what we're doing is we're adding a strongly typed server group Attribute to the resource representation and each of the resources in no key is going to have a relatively small number of strongly typed attributes and then we'll be able to kind of Agriate over all of the matching entities that are associated with resources with the same attribute value using this cross entity aggregation mechanism Okay, so second and common use case in salamander that also depends on these heavyweight samples is this notion of 
reconstructing a resource's state timeline. So, taking a particular instance and looking at the span of time for which the instance was active, then another span of time for which it was suspended, then you resumed it and it became active again. Today you can reconstruct that directly from the sample data. The actual major lifecycle events give us a clue as to how we're going to address this over the lifetime of the resource. Now, as it happens, these events are generally quite infrequent. If you take an instance in your cloud, how often do you actually resize it? Not often. How often do you actually suspend it? It's not something that happens on a daily basis, or even a weekly basis, and you have many instances that are never resized in their entire lifetime. So it's much cheaper to store the events that represent the state transitions we're interested in than to continually snapshot data that is either static or very infrequently changing.

Another question that is often levelled at Gnocchi is basically: "Wait a minute, lads, what are you doing here? You seem to be just reinventing the wheel in terms of time series storage. There are a lot of dedicated, specialized metrics-oriented databases out there, so why are you doing Gnocchi?" Well, the thing is, we're not really reinventing the wheel. The idea, as with much of OpenStack, is for Gnocchi to be highly pluggable, to use a back-end driver model and to support a variety of different storage drivers. Now, we have a canonical driver based on the pandas analytics library and Swift, and that's intended to be something that can be run without anything external, right?
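The event-based timeline reconstruction described above can be sketched in a few lines. This is a minimal illustration of the idea, not any actual Gnocchi or Ceilometer API: we store only the sorted state-transition events and replay them to answer "what state was the resource in at time t?", instead of snapshotting the state with every sample.

```python
# Sketch of the event-based approach: store only the infrequent state
# transition events and binary-search them to recover the resource's
# state at an arbitrary point in time. Structures are illustrative.
import bisect

def state_at(events, t):
    """events: sorted list of (timestamp, new_state) transitions.

    Returns the state in effect at time t, or None if t precedes the
    first recorded event.
    """
    timestamps = [ts for ts, _ in events]
    i = bisect.bisect_right(timestamps, t)
    return events[i - 1][1] if i else None
```

Since lifecycle events like resize or suspend happen rarely, the event list stays tiny even over a long resource lifetime, which is exactly the storage win over periodic snapshots.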
So you can just use services that are likely to exist in your OpenStack cloud, such as Swift for storage, and that brings certain advantages. It also brings a lot of advantages to us in terms of testing these things in the continuous integration gate. But going forward we've got, under active development, as Dina mentioned earlier, two other storage drivers, both of which are based on specialized metrics-oriented databases: one of them is OpenTSDB and the other is InfluxDB. Yeah, and in this case you may well ask the question: well, what does Gnocchi actually bring to the table? What value does it add above what InfluxDB can already do? Well, what Gnocchi manages in that case is the key entity-to-resource mapping, which drives the cross-entity aggregation, and it also manages an abstract notion of archive policy. So you can define an archive policy and then have the storage driver map it onto, say, continuous queries with shard-based retention in the InfluxDB case, and map it onto something entirely different in the OpenTSDB case. So Gnocchi presents this very abstract, very standardized notion of archive policy.

Okay, so very briefly, I'm almost out of time as per usual, but here are some forward-looking questions, things that we have under active consideration. These are the hot questions we're trying to address this week in a lot of our design sessions. So, as was mentioned previously, Gnocchi has been developed somewhat at arm's length: it's been a StackForge project that Julien spun up, and we've kept it separated, at arm's length, from the Ceilometer core, so as not to be too disruptive to our continuing maintainership of the Ceilometer core. So obviously we're going to have to bring those two things together somehow.
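The abstract archive policy mentioned above can be illustrated with a toy in-memory driver. This sketch is modelled on the idea described in the talk, not on any back end's actual API: a policy is assumed to be a list of (granularity, points) pairs, and a naive driver applies it as mean roll-ups with a retention cap, where a real driver would instead map the same policy onto, say, InfluxDB continuous queries or OpenTSDB rollups.

```python
# Sketch of an abstract archive policy applied by a naive in-memory
# driver: each (granularity_in_seconds, points) pair produces mean
# roll-ups, keeping at most `points` most recent buckets. The policy
# shape is illustrative; real back ends would map it to their own
# retention mechanisms.
from collections import defaultdict

def apply_policy(measures, policy):
    """measures: [(timestamp, value)]; policy: [(granularity_s, points)].

    Returns {granularity: [(bucket_start, mean), ...]} sorted by time,
    truncated to the `points` most recent buckets per granularity.
    """
    out = {}
    for granularity, points in policy:
        buckets = defaultdict(list)
        for ts, v in measures:
            buckets[ts - ts % granularity].append(v)
        rolled = sorted((b, sum(vs) / len(vs)) for b, vs in buckets.items())
        out[granularity] = rolled[-points:]  # enforce retention cap
    return out
```

The point of keeping the policy abstract is that the operator declares "5-minute means for a day, hourly means for a month" once, and each driver translates that declaration into whatever its native retention machinery looks like.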
Yeah, we're going to have to merge the two core teams, and we're going to have to either bring the Gnocchi code into the Ceilometer repo or have a model where we're consuming from multiple repos. So we have to decide how we're going to do that. We've also got some gnarly migration questions. Clearly there are Ceilometer users out there in the wild who've built up non-trivial data stores based on the old model, so how are we going to manage that migration? Are we going to have some mechanism whereby we can mine this more lightweight data from the previously built up, more heavyweight data? That's a non-trivial question we've got to answer. And the other thing is that in OpenStack in general we've got a fairly well-defined way of deprecating things, right? You put something on the deprecation path, it stays there for a number of cycles, and then it goes away. That's how we basically go from legacy features to things that are no longer supported. In the case of Ceilometer, because we've got such a discontinuous step here, we may want to make that deprecation path unusually long; we may want to keep the v2 API around for longer than two cycles, depending on what the users who actually have deployments today would like us to do. So that, again, is an open question.

Okay, so we're kind of out of time. Usually we've got this link slide as something to look at while questions are being asked, and I think I burned most of the potential Q&A time by going on at length about what we're going to do, so apologies for that. We may have time for one or two questions, if there's anything brief on anybody's mind. Shoot. Yeah. That's a fair point.
I think we're concentrating more on the pre-aggregation route, because we previously had the case where we did no roll-up whatsoever, and we found that queries in general tended to apply to fairly coarse-grained time periods, and the fact that we were doing this on demand meant there was a lot of computational cost in repeating very similar map-reduce jobs in MongoDB and so on. If you want to follow a model whereby you do that aggregation on demand, that's something we could potentially accommodate, but it hasn't been our focus, to be honest. So you have read it correctly: the focus has been mainly on increasing write efficiency, decreasing the storage footprint, and using eager pre-aggregation to reduce the computational complexity of satisfying queries. More indexing? Yeah, cool. So, one last question.

Yes, exactly. Yes. Yeah, so that upfront declaration is exactly what the archive policies are about. You go, at an administrative level, create different archive policies and then associate them with individual entities, and that's the upfront decision-making you just talked about. Cool. Okay, I think we probably have to move on because there's another session starting. So if you have any other questions, we're around for the Ceilometer design track and you can approach us individually. Thank you for your attention.
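The eager pre-aggregation trade-off discussed in that answer can be sketched as follows. This is purely illustrative (the class and its methods are hypothetical): the rolled-up aggregate is maintained incrementally at write time, so answering a coarse-grained query becomes a dictionary lookup instead of a scan over raw samples, which is exactly the repeated map-reduce cost the speakers describe avoiding.

```python
# Sketch contrasting eager pre-aggregation with on-demand computation:
# each write updates a running (count, total) per time bucket, so the
# mean for any bucket is available in O(1) at query time. Illustrative
# only; no raw samples are retained here.

class EagerRollup:
    def __init__(self, granularity):
        self.granularity = granularity
        self.buckets = {}  # bucket_start -> (count, total)

    def write(self, ts, value):
        """Fold the new measure into its bucket's running aggregate."""
        b = ts - ts % self.granularity
        count, total = self.buckets.get(b, (0, 0.0))
        self.buckets[b] = (count + 1, total + value)

    def mean(self, bucket_start):
        """O(1) query: no scan over raw samples is needed."""
        count, total = self.buckets.get(bucket_start, (0, 0.0))
        return total / count if count else None
```

The cost is paid on the write path and in the loss of the raw samples, which is the trade-off the answer above acknowledges: on-demand aggregation keeps everything but repeats the computation on every query.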