First of all, welcome to day one of KubeCon, and welcome to my talk, Journey Through Time: Understanding etcd Revisions and Resource Versions in Kubernetes. I'll start with an introduction. I'm Priyanka Saggu. I work as a Kubernetes integration engineer at SUSE, I'm the release team lead for this 1.29 release cycle, and I'm also a technical lead for SIG Contributor Experience.

So, what are we going to explore today? This talk discusses how resource versions in Kubernetes objects map to something called revisions in etcd. That is our discussion for this talk: what etcd revisions are, and how we can trace the path from those revisions all the way to Kubernetes resource versions.

Starting with this: this is what a Kubernetes cluster looks like. The diagram comes from an earlier talk. What I want to point out here is that every time we interact with a Kubernetes cluster, the only component we interact with is the Kubernetes API server. Whether we do it through the kubectl client, a web UI for a managed cluster, or a client library in some programming language, what we are doing is sending REST API requests, and those requests go to the API server. Every time there is a change in the Kubernetes cluster, that change is persisted in the persistent storage layer, in this particular case etcd; it could also be backed by another datastore as the persistent storage layer.

So, for this talk, we are going to talk about etcd. What is etcd? etcd is a distributed, reliable key-value store for the most critical data of a distributed system. In this case, the distributed system is Kubernetes. So, for this talk, we will start from a Kubernetes object and work our way backwards.
Here I have a screenshot from my terminal. I don't know if you can see it, but it shows a deployment, an nginx deployment, and I'm getting its output in YAML format. We'll work our way towards understanding where this resource version comes from, going back to exactly where it originates, which is etcd. That is going to be the discussion.

Let's start with what etcd revisions are. Revisions act as snapshots of the objects in the key-value store. For example, in Kubernetes, every time we create resource objects, modify them, or delete them, some kind of change is happening. All those snapshots for a Kubernetes cluster are stored in a key-value store, and in the case of etcd they are stored as revisions. A revision is a 64-bit counter maintained by etcd, and it tracks every operation that changes the state of etcd. So when we work with etcd as a key-value store, the key is the revision, the global counter, which is always incrementing. One thing to understand is that revisions only ever increase. With that in mind, in etcd's key-value store the revision is the key, and anything that changes the state of etcd becomes the value of that key.

I want to show how that data is stored inside etcd with an example. This is my setup. I have a container running from the bitnami etcd image, and I am exec-ing into that container. I am installing a couple of tools here, including jq: I find the latest release of jq, pull it down, and make it available inside the container. We also have the command-line client, etcdctl, which is what we use to work with our key-value store. Here I am setting an environment variable, ETCDCTL_API, so that my client uses API version 3.
Since we just created a new container, our key-value store is empty, and we can check its status here. I want you to look at the header section; that is the most important part for us. In that header, in the information highlighted here, we can currently see that our revision is one. That means we don't have anything in the store yet; all we have done is create and initialize etcd.

Let's try to add something to this key-value store. Here I am adding five key-value pairs, from foo1 with value bar1 all the way to foo5 with value bar5, and then trying to retrieve some information. If I just run etcdctl get with a dot as the key, I don't get anything, but if I ask for the output in JSON format, I see this output. Here we can see our revision has incremented, and it has incremented by five. Every time we added a new key-value pair, such as foo1 and bar1, the revision incremented by one, and since we added five key-value pairs, our counter was incremented by five. So at this point, after adding five key-value pairs, our etcd revision counter is at six.

If you recall, we added values of the form foo1 and bar1, all the way to foo5 and bar5. Here I am asking etcd to give me all the keys that are prefixed with foo. In this example we'll try to understand how adding one key-value pair, for example foo1 and bar1, changed our etcd key-value store. The output shows all the key-value pairs that match. That's the kvs array; sorry for the spacing here. We can see the revision is set to six, which is what we already discussed. And this is the first key-value pair that was added. When we started, our revision was set to one.
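As a rough illustration of the counter behaviour described above, here is a minimal Python sketch. It is a toy model, not etcd's implementation or client API: the ToyStore class and its put method are hypothetical names invented for this example. It only shows that a fresh store reports revision 1 and that each of the five writes bumps the global revision by one.

```python
class ToyStore:
    """Toy model of a revisioned key-value store (not etcd's real code)."""

    def __init__(self):
        self.revision = 1   # a freshly initialized store reports revision 1
        self.log = {}       # revision -> (key, value): the revision is the key

    def put(self, key, value):
        # Every state-changing operation bumps the single global counter.
        self.revision += 1
        self.log[self.revision] = (key, value)
        return self.revision


store = ToyStore()
for i in range(1, 6):
    rev = store.put(f"foo{i}", f"bar{i}")
    print(f"put foo{i}=bar{i} -> revision {rev}")

print("revision after five puts:", store.revision)
```

In this model the first write lands at revision 2 and the fifth at revision 6, which matches the numbers seen in the demo.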
When the foo1 and bar1 pair was added, our revision became two, because we made a new change: we changed the state of our key-value store. We also got these fields attached to our key-value pair: create revision, mod revision, and version. The create revision is the revision of etcd at which this key-value pair was created. When we added foo1 and bar1, the revision was incremented by one, so the pair got a create revision of two. It has not been modified so far, so the mod revision is also two right now; if there had been any modification after that, the mod revision would be set to the revision at which this key-value pair was last modified. The version is set to one, which means this is the first version of the key. If there were modifications, the version counter would have incremented. If there is a deletion, it is reset to zero. Above and below these fields we see the key and the value. So this was the first pair we added.

The same goes for foo2 and bar2: it got a create revision of three, because that was the revision of etcd at the point when the foo2 and bar2 pair was created, and so on all the way to the last pair, foo5 and bar5. The output also gives us the count of key-value pairs stored in our etcd key-value store at this point, which is five.

Let's try to delete one key now. Let's delete the first pair we added, foo1 and bar1. When I ran the etcdctl del foo1 command, the revision was bumped up by one. So even though we removed something, our revision does not decrement; it goes up. Again, the revision counter in etcd is always incrementing. We also see a new field here, deleted, set to one. What the delete does is add a tombstone to our key-value pair. What that tombstone does is tell our etcdctl client to ignore this key.
With the tombstone in place, whenever somebody asks etcd for all the values in the key-value store, this key is ignored because it is deleted. Some snapshot of that key-value pair is still there in our key-value store, but since we have marked it with a tombstone, we will not get that value back. So if I try to get all the data prefixed with foo now, I will not get foo1, because it has a tombstone marker attached to it.

So now we have deleted this key, but we know the key was present in our key-value store at some point, and that is the interesting part of etcd. etcd keeps all the changes that happen across the different revisions, and we can use that to time travel back and check past states. This is helpful in scenarios where, for example, we had a crash or a failure and we don't know the last known state of a particular object. What we can do is find the revision of the last known state, pass that revision to etcd, and get the state of that object from there.

Let's try that here. We know revision 7 is the revision at which we deleted our foo1 pair, and we do not see anything at this point; there is no kvs array here. But if I go back one revision and ask for the value of the foo1 and bar1 pair at revision 6, I am able to get my data. It gives me a key-value pair with a create revision of 2 and a mod revision of 2, which is exactly the revision at which we created the foo1 and bar1 pair for the very first time.

So we know that this key should not be present now; it has a tombstone marker. But if we recreate that key, what will happen?
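The tombstone and time-travel behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how etcd stores data internally: ToyStore, its history list, and the rev parameter of get are hypothetical names chosen for illustration. A delete appends a tombstone (value None) at a new revision, a normal read sees the tombstone, and a read pinned to an older revision still finds the earlier value.

```python
class ToyStore:
    """Toy model of revision history with tombstones (not etcd's real code)."""

    def __init__(self):
        self.revision = 1
        self.history = []  # (revision, key, value); value None marks a tombstone

    def put(self, key, value):
        self.revision += 1
        self.history.append((self.revision, key, value))

    def delete(self, key):
        # A delete does not erase history; it appends a tombstone at a new revision.
        self.revision += 1
        self.history.append((self.revision, key, None))

    def get(self, key, rev=None):
        """Return the value of key as of revision rev (default: latest)."""
        rev = self.revision if rev is None else rev
        result = None
        for r, k, v in self.history:   # history is ordered by revision
            if r > rev:
                break
            if k == key:
                result = v
        return result


store = ToyStore()
for i in range(1, 6):
    store.put(f"foo{i}", f"bar{i}")   # revisions 2..6
store.delete("foo1")                   # revision 7

print(store.revision)                  # the counter went up, not down
print(store.get("foo1"))               # latest view: hidden by the tombstone
print(store.get("foo1", rev=6))        # one revision back: still readable
```

Reading foo1 at the latest revision returns nothing, while reading it at revision 6 returns bar1, which mirrors the two queries run in the demo.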
So let me go ahead and run this command. I give a new value to our old key foo1, setting it to bar1-new. Again, we know the revision counter is ever incrementing, so the revision is now 8.

I will use a tool that comes from the etcd repo itself to check the raw state of our database, the key-value registry. What I'm doing here is cloning the etcd repo and going inside the tools folder. There is a handy tool called etcd-dump-db; I'm building that, and I'm also grabbing the database file from the container I created. With this tool I'm going to decode the database we retrieved from our container and get a cleaner, more readable version of it.

This is what I get out of our container at this point. Reading from bottom to top, on the left side we have the revision, shown as rev equals something, and on the right side we have the value. So the revision is the key in etcd, and everything after value is the value assigned to that revision. We can see our revisions increment from the bottom. We started at two and went all the way to eight with all the operations we have done so far. Let's start with revision two: that was the first time we created the pair foo1 and bar1. We can see it was created at revision two, which is the value we get in create revision, and we also get the mod revision and version values.
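To make the shape of that dump concrete, here is a small Python sketch of a revision-keyed log. It is a toy approximation and does not reproduce the real on-disk format or the exact output of the dump tool: ToyStore, the record fields create, mod and ver, and the way the tombstone record is represented are all assumptions made for this example. It replays the operations from the demo (five puts, a delete of foo1, then a put of foo1 again) and prints one record per revision.

```python
class ToyStore:
    """Toy revision-keyed log with per-key metadata (not etcd's real format)."""

    def __init__(self):
        self.revision = 1
        self.log = {}    # revision -> record dict
        self.live = {}   # key -> (create_revision, version) for keys that exist now

    def put(self, key, value):
        self.revision += 1
        # A key with no live entry is treated as brand new, even if it existed before.
        create, version = self.live.get(key, (self.revision, 0))
        version += 1
        self.live[key] = (create, version)
        self.log[self.revision] = {
            "key": key, "value": value,
            "create": create, "mod": self.revision, "ver": version,
        }

    def delete(self, key):
        if key not in self.live:
            return
        self.revision += 1
        del self.live[key]
        # Tombstone record: no value, version reset to zero.
        self.log[self.revision] = {"key": key, "value": None, "ver": 0}


store = ToyStore()
for i in range(1, 6):
    store.put(f"foo{i}", f"bar{i}")    # revisions 2..6
store.delete("foo1")                    # revision 7: tombstone
store.put("foo1", "bar1-new")           # revision 8: recreated as a new key

for rev in sorted(store.log):
    print(f"rev={rev} value={store.log[rev]}")
```

The record at revision 8 carries create revision 8 and version 1, so in this model the recreated foo1 has no link to the original foo1 created at revision 2.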
Now if we jump to revision main seven, this is when we deleted our foo1 and bar1 pair. We do not see a value here, and the version is also zero. That is how the tombstone is applied: every time somebody asks for the foo1 pair at revision seven, they will not get it. But we created the pair again at revision eight, and the interesting part is that the create revision is eight. etcd did not consider that foo1 had already been present at some point in the key-value store; it registered it as a whole new pair. Since it was created at revision eight, it got both create revision and mod revision set to eight, and the version is one, since it is treated as a whole new pair.

There is another concept here, compaction. Our data is stored inside etcd in a sequential manner, and in any distributed system, for example Kubernetes, we are talking about thousands of state-changing operations happening in, let's say, a minute, and maybe millions in an hour. We do not want to fill our key-value store with the history of all those regular day-to-day operations; we want to keep only the data that is required. So there is a concept called compaction in etcd. We can ask etcd to remove everything that has a tombstone. For example, going back here, we attached the tombstone at revision seven, and I can ask etcd: look before revision eight, find all the key-value pairs that have a tombstone attached and all the occurrences related to those keys, and remove them. That is what compaction is here. So here I am doing exactly that: I am asking etcd to compact everything before revision eight. As a result, when I took a fresh copy of my database and dumped it again, what I am getting here is that revision
two and revision seven are removed from our output, because they are no longer relevant: the key had a tombstone attached to it and can safely be ignored at this point.

etcdctl also has a watch. A watch means we can start watching all the changes that happen to a key-value pair. Let's try to understand this. I have two terminals here. On the right-hand side I am starting a watch, and here I am putting a new value for foo1, bar1-update. At revision eight we created our foo1 pair again, and now I am updating it. What I see on the right-hand side is that our revision is bumped by one, so it is now nine. The create revision is still eight, because that was when our key was last created. The mod revision is now nine, because the key just got updated, and the version is two, because this is the second version of the key. If I do it again and set another value for foo1, I see the same kind of change: the mod revision is ten and the version is bumped to three. But if I go ahead and delete it, there is no create revision anymore. The only thing we get is a mod revision, which is the revision at which this key was deleted, and the value is meaningless; there is no value at this point. So this is the point at which the foo1 pair has a tombstone attached to it.

Tying it all together: we just learned how data is stored inside etcd, but the title of this talk is about understanding Kubernetes resource versions and etcd revisions, so let's tie it all to Kubernetes now. Here I am creating a Kubernetes cluster with kind. Doing the same thing as before, I have a clone of etcd, I am building etcdctl, and I am copying the binary I just built on my host machine into the control plane of the kind cluster we created above. One more thing I am doing here: I am using the nginx deployment that is available from the Kubernetes
project, and I am just running apply on it. What we get is a deployment with three replicas in our newly created cluster.

Let's now inspect our newly created Kubernetes deployment object using kubectl first. This is a redacted version of the entire output, but what we need is on the right-hand side: a field called resource version, currently set to 1626. We want to understand where this value comes from.

So here I am exec-ing into my Kubernetes cluster's control plane. Again I am adding jq, which is a handy tool to decode our key-value pair values, and since this is a Kubernetes cluster and etcd is set up with certificates, I am putting all the certificate options into an alias. Let's try to grab from etcd the same nginx deployment we just deployed into our new cluster. The way etcd stores Kubernetes objects is on a path like /registry/deployments: the registry, the kind of object, the namespace, and so on. So we ask it to find any deployment in the default namespace and give us the keys. What I get out of that, going back to the kvs array again, is a key with the string /registry/deployments/default/nginx-deployment, which is the deployment we just created. It has a create revision set to some value, 1533. But we know that when we apply a Kubernetes deployment YAML with multiple replicas, a lot of changes must have happened to it: maybe a new replica was created, and a whole series of operations followed. So there is a last revision at which the last change happened and was stored in etcd, and that is what we get here as the mod revision. And if you recall the
object we just saw on our last slide, the resource version there has the same value as the mod revision we see here. So what we can conclude is that etcd mod revisions are what we get back as Kubernetes resource versions.

We can also check this with another watch. We can run a watch on one side using kubectl and another from the etcdctl client. One thing to understand is that whenever etcd runs a watch, we need to tell it which revision to start watching from. So on the right-hand side I am grabbing the current revision of the deployment stored under the key /registry/deployments/default, adding one to it, and asking etcd to start watching all the changes from that next revision.

This is what I get from the kubectl watch. When we applied our YAML, this was the first event that happened: a resource object was added, and the resource version set on it was 2966. The same is true for etcd: when the first add operation happened, this was the first event recorded in our etcd key-value store. We can see that both the create revision and the mod revision are 2966, which is exactly equal to the resource version we got from kubectl, because this is the first event. A series of events happened after that. We see that something was modified, maybe another replica was created, and the watch got that signal back. That is reflected in our etcd watch as well: the create revision is still 2966, but the mod revision is now bumped to the latest revision at which the state in etcd changed, and the version is bumped too.

So let's try to grab this entire Kubernetes database that we have been talking about, with all the values we are getting from etcd. Let's try to see what does it
look like. What I'm doing here is again copying the database of our etcd key-value registry from our kind cluster's control-plane container, and I'm using the same tool again, etcd-dump-db, to check what it looks like.

This is how it looks. There are millions of things happening, and most of them are routine. Kubernetes cleans up most of these, maybe every 10 minutes, and as an admin you can configure etcd to run compactions even more often. I will not go into the concept of leases, but there are leases being taken and leases being released, and most of these entries are those modifications. So there is a lot of housekeeping happening here. But there are objects here as well, and I have tried to grab one of them. It looks somewhat like this, and let me trim it to a more readable version.

This is how our nginx deployment looks, at least its first entry. When we added our object to the key-value store, this was the first entry that was added. We got a revision key here, set to 2966, and we got a value. That value is redacted; it is a huge value, mostly in protocol buffer format, and I redacted it for the sake of readability. We see that created and mod are both 2966, equal to the very first resource version we got, and the version as well: this was the very first change that happened on our nginx deployment resource object, so it is the first version.

Continuing from this, the thing I learned is what blew my mind. This was the first time I learned that my entire Kubernetes cluster comes down to this etcd dump. All my Kubernetes resource objects, namespaces, deployments, pods, CRDs, everything is stored inside this persistent data storage layer, in this case etcd. That is what I am seeing in this dump. All I am seeing is key-value pairs, and that is exactly what my Kubernetes cluster is made up of. So it's okay
to conclude that my Kubernetes cluster is nothing but a collection of YAML files, with all this data stored as multiple revisions in etcd.

I want to thank Michael Gasch. I got the inspiration for this talk, and learned about this concept, from one of his talks. There is a great blog article attached to it that has more details. So thank you to Michael Gasch, and with that, thank you so much for coming to the talk. Before I forget, this is the QR code; please scan it for feedback. Thank you, and I am open to questions. The mic is somewhere in the middle if somebody wants to take it.

Check, one. So I've noticed, using a number of operators that appear to use watches, that they'll often come back failing, stating that the resource version is too old or is otherwise out of date. As an administrator, I'm not sure, and Google has not been helpful, whether I should be worried about this, what I should be doing about it, or if it's just noise.

Sorry, I did not hear the question properly. Can you repeat it?

So while using a number of operators, picking on Rook specifically, they appear to use watches internally, and they are constantly complaining that the resource version is too old on their watches. I assume that's because it's getting cleaned up.

Yeah, so etcd has an implementation of, let's say, skipping the revisions that have happened in between. At least from what I have known so far, this is intended behavior: we are intentionally skipping everything and going all the way back to the revision where we know the object was in its best stable state. I think that's a design decision here, and it is intended.

Thank you.

I need to check that; I'm not sure about it.

First thing: are we confining the etcd database to a specific size? What I see is that 8 GB is the max. Why is there a limitation of 8 GB for etcd?

Well, I am not the etcd expert here, so I honestly don't know why, or if there is a limit at all. But if there is a specific use
case you are talking about, can you clarify more? Maybe there are people here who can add to this.

Is there an auto-defragmentation that always happens?

Auto-defragmentation, as in what?

There is defragmentation, and we generally run a defragmentation, right? Any time etcd fills up, we generally do a defragmentation.

So you can configure etcd, at least as far as I know, for automatic, regular compaction. I am not aware of a concept called defragmentation or fragmentation, but I am assuming there is a configuration available for it. You can, as an admin of etcd, configure how often you want to do the compaction.

Big question, so if you can't answer, that's okay. When we look in Kubernetes, we see a resource version, which is the etcd revision, which is really interesting; thank you for explaining that. It looks like in etcd, though, each key also has a version which is specific to that key. So I guess my question is: can you give some context for why, when we look at it in Kubernetes, we want to see the global etcd revision number instead of the version number of the individual key?

You mean why we are getting the resource version as the mod revision value, and not the one on the left-hand side?

If we look at foo5 here, when we look at that in Kubernetes we get a resource version of 6, because that's the etcd revision number, the mod revision number for when it was created. I could see another world where we would get a resource version of 1, because that is the version associated with foo5 here in the etcd database. So I'm just trying to understand why, at a conceptual level, we want to receive 6 there, the etcd revision number, and not the version number of the particular key in etcd, if that makes sense.

So I'll try to repeat what I understood. You are asking why we are getting the mod revision. I did not understand which line you took as an example.

For example, if I have an object, I'm talking about the one which is rev main 6, for
example. There on the right-hand side the mod is also set to 6.

So whatever we are getting as the resource version in Kubernetes is basically telling us the last revision of etcd at which that object was changed. Whatever changed that object's state in etcd, the revision of etcd at that point is what we get out as the resource version.

With that, I think I'm done with my time. Thank you so much for coming today.
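The distinction raised in that last question can be sketched in Python. This is an illustrative model only: the KeyValue dataclass and the resource_version helper are hypothetical names, not Kubernetes or etcd APIs. The sketch assumes the mapping stated in the talk, namely that the resource version reported by Kubernetes is the etcd mod revision, and it represents the result as a string because Kubernetes exposes the field as a string.

```python
from dataclasses import dataclass


@dataclass
class KeyValue:
    """Per-key metadata as described in the talk (toy model)."""
    key: str
    create_revision: int   # store-wide revision at which the key was created
    mod_revision: int      # store-wide revision at which the key last changed
    version: int           # per-key counter: 1 on create, +1 on each update


def resource_version(kv: KeyValue) -> str:
    # Assumption from the talk: the resource version is the mod revision.
    return str(kv.mod_revision)


# State after the five puts from the demo: the store started at revision 1,
# so foo1..foo5 were written at revisions 2..6.
kvs = {f"foo{i}": KeyValue(f"foo{i}", i + 1, i + 1, 1) for i in range(1, 6)}

# Updating foo1 afterwards happens at store revision 7.
kvs["foo1"].mod_revision = 7
kvs["foo1"].version += 1

for kv in kvs.values():
    print(kv.key, "version:", kv.version, "resource version:", resource_version(kv))
```

In this model foo5 has a per-key version of 1 but a resource version of "6", and after the update foo1 has version 2 and resource version "7", so the resource version always points at a position in the store-wide history rather than at the key's own update count.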