Welcome. This is the 29th of July. It's Google Summer of Code, the Git Cache Automatic Maintenance project. Rishikesh, do you have any questions? I guess I should ask, how were your exams? Good. Before we get started, I wanted to let you know that next week I won't be able to attend the meeting. Okay. And then the week after, I checked, is the 10th of August. Can we reschedule that to a Friday? Oh, good. Yeah. By then my exams would be done. Okay, so let me make that correction. That way it'll be very clear to all of us. So Wednesday, August 3rd, we'll cancel, right, because that's your examinations. Okay, so deleting that one. Exam week, and all the best on your exams. That's great. And then in the week of August 10, it would be better if we made it August 12. Is that correct? Yeah. Okay, so I'm going to move that to August 12, so that it is after exams are complete. All right, very good. Okay, so the schedule has been adjusted. Then the following week, the week of August 17, is it okay to go back to Wednesday the 17th? Yeah. Okay, great. All right. Excellent. Okay, so Rishabh, welcome. We just adjusted schedules. We won't meet next week, because Rishikesh is in exams next week. And the following week we will meet on the 12th of August instead of the 10th of August. All right, that seems fine to me. Great. Thank you. Hi, Rishabh. Hi, Rishikesh. Oh, sorry, I should be polite too. Hi, Rishabh. Hi, congrats on your presentation, Rishikesh. It was really great. Thank you, thank you. Yeah. It was absolutely awesome. Yeah, in my presentation the demonstration took up a lot of time. Before starting the demonstration, everything worked fine. I don't know what happened during the presentation. If you have not failed in at least one live demonstration, you've clearly not done enough demonstrations.
That's simply the nature of demonstrations, right? You need to do more demonstrations if you've never failed. I don't know anybody who will tell you, oh yes, my demonstrations always work, unless their technique for demonstrations is, I always record them on video beforehand and I make sure they pass when I record the video. That's not a demonstration, that's totally cheating. All right, so do you have any topics? I know that you're in preparation for exams. Are there topics you wanted to discuss, what would you like to review, etc.? Oh, I wanted to discuss the agenda, based on what you saw: what are we going to do, and how would we proceed. Okay. Yeah, there are a few doubts regarding the UI: how would we need to refine the UI to make it user friendly. Okay. And then there are a few points which I have. The first thing is regarding the caches. We decided to display data regarding how frequently the maintenance has been run, and on which caches it has been done, in a table format, right? So how do we proceed with that? That was one thing. And the other thing was, we are using cache entries. We don't have the name of the Git repository which we are running on. So how would we show the administrator which repository is taking how much space and how much execution time? So the administrator only knows the repository, if they know about it at all, by its cache directory name, right? The administrator doesn't know which repository it is caching. And so it seems like we need to show the administrator the cache directory name, but if we could determine, using a Git call or a JGit call, the repository that it is caching, that will help them. It will help me, because when I see a two-gigabyte repository,
I won't panic if I see right next to it that it's stable-linux.git. Right, I say, oh yeah, I know that's a monster, and it's always going to be a monster; that is unavoidably large. When I see git-plugin and it's 400 megabytes, it's, oh, something's wrong there, because that repository should not be 400 megabytes. So if it's possible for you, and I think it is with a JGit call or with a call to Git, ask what is the remote that's being cached in this repository. So each remote has a different URL, so the syntax would be different; one remote would have a different syntax, and then GitLab would have a different syntax. So how would we differentiate between all of them? For me, if the table said, here's the cache directory name, and that is a unique name, right, every directory is uniquely named, and then on the same row it says, and here's the repository it is caching. And even better if we then allow them to do things like sort the list of cached repository names. They may say, well, that's funny, why am I caching a copy of this with the git protocol and a copy of it with the HTTPS protocol? That seems wasteful. So we may then get some bug reports saying, why don't you get smarter, if we're inadvertently caching two copies of it. Does that answer your question? Yeah, that does. And I have another doubt regarding the data: where are we going to store it? Are we storing it in a file in Jenkins? Do I have to create a file, or is there something already in Jenkins which helps me write all the data which I need into that file and then read it again in the maintenance UI so that I can display it? Also, how frequently do I remove data from that file? Because it keeps growing, right? So, from what I've seen in Jenkins:
It's not uncommon for Jenkins things to completely rewrite a data file every time they generate it. So now, how frequently to refresh it? If it were refreshed at the end of every cache maintenance task, meaning, okay, we did GC, and at the end of GC we updated that data, that seems good enough. If we did a commit graph and updated the data, that's good enough to me. Now, during the run of that GC operation, while it's iterating through each of the caches, the data will be out of date, but for me, I think that's okay. If we said, oh no, we want to be even better, you might say that at each subtask in the GC we will rewrite the data, but my worry is that many of these caches will be quite small. We'd spend a bunch of time rewriting data, whereas if you just wait till the end of the whole task, that's good enough. Rishabh, any thoughts from you? I mean, doing it after the whole maintenance process is over seems much more sensible. If you do it after each particular repository, the benefit would be that the user would be able to see the data at that time, but that doesn't seem like much of a benefit compared to seeing it at the end of the whole execution. Particularly, GC is probably an exception here, but the commit graph operation seems to be relatively quick, Rishikesh, right, if I remember correctly? It's not hours to generate a commit graph. Waiting until the end of all the commit graph processing is probably still not dramatically different in terms of time than doing it on every one. So if we are going to do it after the task is completed, then we have to store it, right? We have to remember the state of each cache: did it execute or not, what was the execution time for each maintenance task, and then write it. Yes.
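The approach discussed here (remember one record per cache in memory while the task runs, then rewrite the data file once when the whole task is done) could be sketched roughly as follows. This is only an illustration using the Java standard library: the class name `MaintenanceReport`, the `Row` fields, and the tab-separated file format are all assumptions, not anything from the Git plugin or from Jenkins' own persistence facilities.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these class or field names come from the Git plugin.
class MaintenanceReport {

    /** One table row: which cache, which remote it mirrors, which task, and how it went. */
    static final class Row {
        final String cacheDir;   // unique cache directory name
        final String remoteUrl;  // repository being cached
        final String task;       // e.g. "gc", "commit-graph", "prefetch"
        final boolean executed;  // false if the task did not run on this cache
        final long millis;       // execution time in milliseconds

        Row(String cacheDir, String remoteUrl, String task, boolean executed, long millis) {
            this.cacheDir = cacheDir;
            this.remoteUrl = remoteUrl;
            this.task = task;
            this.executed = executed;
            this.millis = millis;
        }

        String toLine() {
            return String.join("\t", cacheDir, remoteUrl, task,
                    Boolean.toString(executed), Long.toString(millis));
        }
    }

    private final List<Row> rows = new ArrayList<>();

    /** Called after each cache is processed: memory only, no I/O. */
    synchronized void record(Row row) {
        rows.add(row);
    }

    /** Current rows rendered as tab-separated lines. */
    synchronized List<String> toLines() {
        List<String> lines = new ArrayList<>();
        for (Row row : rows) {
            lines.add(row.toLine());
        }
        return lines;
    }

    /** Called once when the whole task has finished: rewrites the file from scratch. */
    synchronized void flush(Path file) throws IOException {
        Files.write(file, toLines(), StandardCharsets.UTF_8);
        rows.clear();
    }

    public static void main(String[] args) throws IOException {
        MaintenanceReport report = new MaintenanceReport();
        report.record(new Row("git-1a2b", "https://example.com/big.git", "gc", true, 500_000L));
        report.record(new Row("git-3c4d", "https://example.com/small.git", "gc", true, 50L));
        Path file = Files.createTempFile("maintenance-report", ".tsv");
        report.flush(file);
        Files.readAllLines(file, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Because the file is rewritten completely on each flush, it holds only the latest run and does not keep growing; whether older runs should be kept is a separate design decision.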
But at least that's what I was assuming. Yes, I would think you want some data structure that says, here's a cache directory name, here is the matching repository for it, and then the runtime information for the various things. Now, I guess it's a valid question: what about the Jenkins controller that has 1,000 or 10,000 caches on it? But since these caches come from multibranch pipelines, that kind of controller is probably already catastrophically loaded. So I guess I'm just on the assumption that an in-memory copy of that data is probably not heavy enough to cause us concern. It looks like Rishabh is wondering; that's good. I was thinking, there's a trade-off, right? Either you have an increasing size, let's say, because you're storing it in memory, versus an I/O operation each time. So the frequency of I/O operations would, I would say, have more effect on performance, as far as I can understand, than having growing in-memory data. Right. Yeah, I mean, any I/O is 10x or 100x slower than memory access. Now, there are plenty of things in Jenkins that write small files. There are many, many things that do that, so we would not be alone if we wrote every time, or if we said, hey, we're not going to keep it in memory, we'll always read it from disk. But for me, I don't see a lot to be gained by saying, oh, we'll not keep it in memory. Do we have asynchronous processes in our plugin? Can I launch an asynchronous thread and let it do this, while the main execution thread is not affected by the I/O operation? I mean, Rishikesh has already created threads that are doing the tasks.
I think what you're asking is, could there be another thread which does the data gathering, right? And I think there could be, if that helps. I mean, he is creating threads, but let's say after the execution task has been done he would make an I/O call. So whichever thread has done the execution would go on to make that call. So essentially my point is that if we could separate out certain threads from the pool to do this job while the majority of them are performing the main execution of the program, then that I/O call should not affect the user's performance; it should not take resources from the main task. That is why we would have an asynchronous process, right? Launch a thread that is not time bound, in the sense that it's not necessary for us to refresh that file or the status at the exact time the task is done. It loosely takes some time, does it, and gives us the result. If that is possible; I mean, I haven't seen asynchronous threads. Well, do we even need a thread in that case? Or is it that we take the concept of a queue, and at the end of processing a task, GC, commit graph, prefetch, whatever, it drops its data onto a queue to be processed whenever something processes that queue? I mean, Jenkins certainly has lots of queues, right? There's a job queue that it runs. So if we were really concerned, oh, this may be just too much data, we could instead say, hey, let's drop it onto a queue. We won't write it to the final data structure; we'll just drop it onto a queue and let, as you suggested, something process it later, if processing is expensive.
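The queue idea raised here can be illustrated with a small sketch: each maintenance task drops its result onto a queue and moves on, and something else drains the queue later in one batch. The class name `ResultQueue` and the use of plain strings as results are assumptions for illustration only; this uses `java.util.concurrent` directly and is not tied to any Jenkins queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: the class name and the String payload are illustrative only.
class ResultQueue {

    // Unbounded queue, so publishing never blocks the maintenance thread.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    /** Producer side: called at the end of a task (gc, commit-graph, prefetch, ...). */
    void publish(String result) {
        pending.add(result);
    }

    /** Consumer side: takes everything queued so far, in arrival order. */
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        ResultQueue queue = new ResultQueue();
        queue.publish("gc git-1a2b 50ms");
        queue.publish("gc git-3c4d 70ms");
        // Later, whoever processes the queue handles the whole batch at once.
        for (String result : queue.drain()) {
            System.out.println(result);
        }
    }
}
```

As the discussion goes on to note, this extra mechanism only pays off if processing the results is expensive; for a few small rows per cache, writing once at the end of the task is simpler.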
I'm just not seeing a case where I would expect this result data to be expensive to process, right? Because, Rishikesh, the things I thought you were describing were the repository directory name, the URL of the repository, and probably the time to execute each of the tasks in the list. So if it took 50 milliseconds to execute this task, or it took 500 seconds to execute this task, that will be visible to the user, so that they know, oh, here's where you're spending time on these cache maintenance efforts. I mean, it's not the amount of whatever you want to store in the file and the associated processing time; let's say even that is minor. I think the whole point we were discussing was that an I/O execution takes more time. It's heavier in terms of time, right? So to separate that out, since it's a back-end process and not something that the user has to be concerned about, is why I was saying we could separate it out. But again, the option that we could let all the tasks run and then just do it in a single go makes sense. So we don't need to have complex mechanisms there. So do I have to create my own file, or is there something already in Jenkins which helps me store all this data and then read it into the UI? I think there are facilities in Jenkins that will store data like that. The example, I think you've actually already used it, is that there's an XML file that's created that tracks the configuration of a task, right, when does it execute. And what that's doing is storing data for you.
Now I'm thinking, where would we point you to find an example of something that's storing data? Maybe we should have you look at the Metrics plugin, because it certainly stores data about things like, well, let's see, I'm going to bring it up to show, because this is a hint about where maybe we should have you look. Is it okay if I share my screen? So here is share screen; I'm going to share this screen. Okay. So on this screen what you see is the CI server that we've used for past experiments. It's got quite a collection of agents, some online, some offline, etc. One of the things that I can do is look at one of the Windows agents. And if I then look at this label, AMD64 Windows, and click this Load Statistics, it's going to show me load statistics for all the nodes that have this label. And so what you see here is a sign that I did something to my CI server that I shouldn't have done. What you see is there were three agents available here for quite a period, and then we grew to have 11 executors available. And this shows when they're busy; busy is the red line. And this is over time, so something is keeping this data, right? Someplace this data is being stored. And so this metrics thing might be a place for you to look to see how we should do data storage. Now, that's not a terribly fulfilling answer, I apologize. I've not done the kind of data storage that you're looking for. If that's not enough, though, I'm confident we can find people that can help us get the answer you need. Because I never researched this area. Exactly. Now in phase two we're saying, oh, guess what, Rishikesh, it's time to do some research. This won't be code; this will be to find the technique that Jenkins uses to do this and use that technique.
Unfortunately, it's the same pattern that we applied for Rishabh a few years ago, where halfway into the project we realized there were things we just hadn't known. We simply were ignorant of them, and it meant, oh, we've got to go do this. Rishabh had to go do this research; it's not "we" in this case, Mark certainly didn't do the research. So, same thing I think for you: this is a topic that I don't have an answer for. We see examples that there are things that do give answers to this, but I don't exactly know how it does it. I know people who know, so if it turns out that you don't find it in a reasonable time, let me know and we'll find others to give us coaching. Is this the Metrics plugin? I think so, yeah. Let's see if I can find hints as to where that's from. Overview. Well, let me show you the other place where you can see that same kind of data, that way you know. Don't look at the number of plugins I haven't updated yet, those are just my pending plugin updates; I haven't restarted this in several days. Okay, we need the thing that is monitoring Jenkins. Oh, Load Statistics; here, this is the top-level load statistics for the entire system. So what we see here is there was a period where I had 100 available executors. Or 100, yeah. Oh no, no, this is queue length. I had a queue length of 100, and it finally tapered down because my executors were here to do the work. And there's the picture. Yeah. So if nothing else, you could search through the code base looking for load statistics. Is that comfortable enough? Sorry, it's not a great answer; it's not the "oh yes, this is how we do it." I'll have a look at it. And there was another thing we discussed initially, regarding repository exclusion. Currently we are running maintenance tasks on all the caches. So we had planned on adding a way to exclude repositories from maintenance.
Okay, and then we thought we'd do it based on regular expressions or based on the size of the cache. The thing about regular expressions is, as usual, we don't have the repository name. And I wanted to know, do we want other ways of excluding repositories from maintenance? So, for me, repository URL already covers more cases than I would initially expect. I'm not even sure we have to allow people to exclude caches from maintenance. Right, this is for the health of their system, and we trust that command-line Git knows what it's doing when it optimizes a repository. Now, with Git 1.8 on CentOS 7, maybe not. And maybe we just tell those people they have to upgrade to an operating system that doesn't use such an ancient version of Git. So you're suggesting, in other words, that exclusions not be implemented? I would not object if we didn't do exclusions, but if you can find a way to do exclusions, I'm sure there will be users that will be pleased, users who say, look, I know that I want to garbage collect everything except this one monster repository that I maintain myself. It's 32 gigabytes of repository. I know it's bad. And I know enough about it that I periodically flush it completely and we refetch the whole thing. I had one of those kinds of repositories at a job five or six years ago, and the pain of carrying around that kind of baggage is awe-inspiring. I think it's an enhancement. I mean, once you're functional, once whatever goals you have are completed, it's probably something that you could look at, not before that. It doesn't seem like something that we should spend focus on. Right. Because I want to get to my goals here: what are the main things I need to focus on? I thought this was part of it, but right now I think the main thing would be to display the execution data.
And I think I don't have anything else, other than redesigning the UI. I have a concern; I was looking at the PR. Not a concern, rather there are questions. I'm sorry, I didn't want to interrupt. Rishikesh, if you don't have anything else, I would then like to speak. Do you want to finish? So I was looking through the code, and this is particularly related to the task executor. I think there's a piece there where you're initiating a lock, and the bulk of the operation that is performed by Git happens between the lock and the unlock. So the first thing that I was thinking, and I don't know if it's an exercise worth having or not, is that we should do an analysis of the state of the threads by taking a thread dump on the JVM, to see what the threads that we're using and the threads that we're not using are doing while the bulk of the operations runs. Let's say five tasks are configured for a good number of repositories: what is happening internally? Because my concern is that when we're applying a lock to the whole operation, how do we define the behavior of the rest of the threads that come to that operation point? What happens to them? Are they waiting until the lock is released, or should they move on if the lock is already acquired by someone? Please correct me if my understanding here isn't correct, but what I understood from the code was that if I launch multiple tasks, there will be threads at that point trying to acquire the lock while it is held by someone who is doing the Git operation. So there is a direct relationship between the amount of time that an execution of Git takes and the amount of time you're going to lock a resource. And if a resource is locked, the other threads won't be able to proceed.
I mean, we haven't defined a behavior that will let them wait for a certain time, or not wait at all and move forward. So this was something that I wanted to understand: how do we want to do it? My concern was, is there a growing number of waiting or blocked threads because of what we're doing? That is the first question. If you know the answer to it, great; if you don't, then just a simple thread dump of the JVM while it is working through all of this would be enough to answer these questions. And if we don't have blocked or waiting threads increasing over time during the Git operation, then I think we have nothing to worry about. But if that is the case, then we should define the behavior, because that, I see, is the critical piece of the whole operation that we're trying to do. So, yeah, this is more related to performance, I would say, but it's something that we could attempt. Do you think that is something that we should keep within the goals, or is it more of a good-to-have, where if we have time we should approach this activity? I think you've got a good point to consider. If it's at all possible for you to fit it in, let's do it. I think it's a safety check, particularly, Rishikesh, because I am not a skilled thread programmer, and therefore there is every risk that we'll make some mistake and I will fail to detect it. So techniques like what Rishabh is suggesting are very healthy for us, because it's admitting Mark Waite thinks single-threaded, and he has a very simple way of thinking about things. And the danger is, when we're wrong about those simple ways of thinking about things, we could risk taking down the controller, right? We could risk bringing it down just from thread overload. So I think it's a good question to ask, Rishabh. I can share the steps to do it.
I mean, as far as I remember, there is a Java CLI tool to do this. It is preinstalled in most environments, but let me confirm that and share those steps, if you wish to do that exercise. Wouldn't I need some other threads accessing the Git caches at the same time as I am running the maintenance task, to know whether any threads are waiting or not? So this is the part where I think I need to look at the code more, but Rishikesh, my question is about when you have multiple tasks. When I say tasks, I mean maintenance tasks. When a user is going to assign multiple maintenance tasks, is a single thread going to do all of this, or am I going to launch multiple threads, one with each new task? A single thread is going to run all of these tasks. So I'm adding all these tasks into a maintenance queue, and then I dequeue each task. One thread is created, it runs the maintenance task on all the caches, and it gets terminated; then a new thread is created which runs the next maintenance task, if one is present in the queue, and so on until all the maintenance tasks are executed. So then I don't see a case where other threads would be waiting to take over the execution, since, as I understand, each user's request to perform maintenance would be performed by a single thread, and it is not possible within that session that we launch multiple threads. Yes. And the only case where threads would be blocked is, assume any other plugin which wants to access this cache, which wants to do some operation; those threads would be waiting. They would be blocked, right? So I'm not sure how that would impact the performance.
So assume a GC running for 15 minutes, and then there is some Jenkins job configured to run on that cache, but it couldn't run because there is a lock while the Git maintenance task is being run. So would that be postponed, or how would that be affected? So now you're talking about, so I was talking about the lock that you have created for the task execution, but I think the lock that you are talking about is for the cache directory, which you acquire when you want to get the cache, right? Yes, yes. Isn't the lock you're talking about actually a lock that is acquired by command-line Git while it's running, for instance, the GC? It applies a lock on that repository that locks all command-line Git operations for the duration of the GC. I don't know that that happens for anything except GC, but I thought that if you've got a GC running, you can't do other operations in that repository concurrently. Actually, there are two locks, if you think about it technically. One lock is taken automatically by the Git tool, by the Git software, and we have another lock internally in Jenkins. Okay. So the lock taken by the Git software is added to prevent other maintenance tasks from running on the repository. I'm not sure if it prevents other Git commands, but when I read about it, it was to prevent other maintenance tasks. So that is one lock. I don't think that lock should be of any concern, because that is handled internally by Git. The other lock is the one we have in Jenkins. I'm worried about that lock, because if we are running maintenance tasks and any other plugin tries to access that directory, what would happen to them? Would they be in a waiting state? So, Rishikesh, this is the lock that is implemented by AbstractGitSCMSource, right? Yes.
So I think what we should understand is the nature of the lock. I believe it is a reentrant lock. I'm not 100% sure here, but there are ways for us to have locks with read access, where even if a lock is acquired on a resource, threads can read it and they don't have to wait to read the resource, but if they want to perform a write, then they have to wait to do anything on the resource. So I think we should look into what exactly is happening there. And Rishikesh, you are performing purely a read operation on the caches? No, okay, when you take the cache lock and run the maintenance, that is a write operation. There is a write call, so we can't. Yes. And then, if another plugin tries to access this cache, how do we differentiate whether it's a read operation or a write operation? Because, assume, a status command could be a read operation, but then when I do a commit operation, it could be a write operation. So you're worried about the fact that if the Git plugin acquires a lock on a particular cache, and it's a GC operation, it would have that lock for a certain period of time. What happens if any other project or any other component of Jenkins is trying to acquire that lock as well? Yeah. Yes. How do you ensure that a single thread does not take more time, which would create a situation where you have, let's say, a huge number of blocked threads? The only way that I have seen is that you define a waiting behavior for the threads that have to wait on a lock. But I believe then we would have to change the very nature of how locks are acquired for each individual cache at the AbstractGitSCMSource level. And that is something that we should be very careful about, since AbstractGitSCMSource is a contract that is inherited by a lot of plugins, I believe.
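"Defining a waiting behavior" for threads that contend on a lock can be illustrated with the standard `Lock.tryLock(timeout, unit)` call: wait a bounded time, and give up if the lock is still held. This is a minimal sketch using `java.util.concurrent.locks` only. The class `BoundedWait` and the method `runIfLockAvailable` are hypothetical names, and nothing here reflects how the Git plugin actually acquires its cache locks.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: names are illustrative, not taken from the Git plugin.
class BoundedWait {

    /**
     * Runs the task only if the lock becomes free within waitMillis.
     * A wait of 0 means "do not wait at all": skip if the lock is busy.
     * Returns true if the task ran, false if it was skipped.
     */
    static boolean runIfLockAvailable(Lock lock, long waitMillis, Runnable task) {
        boolean acquired;
        try {
            acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        if (!acquired) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock free = new ReentrantLock();
        System.out.println("free lock, ran: "
                + runIfLockAvailable(free, 100, () -> System.out.println("maintenance")));

        // Simulate another component holding the cache lock for a long time.
        ReentrantLock busy = new ReentrantLock();
        new Thread(busy::lock).start();
        while (!busy.isLocked()) {
            Thread.yield();
        }
        System.out.println("busy lock, ran: " + runIfLockAvailable(busy, 100, () -> { }));
    }
}
```

With a bounded wait, the number of threads parked on a single cache lock cannot grow without limit, at the cost of occasionally skipping work on a busy cache.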
So going into that territory would mean that we would want to be sure about what we think before we change the behavior. But I guess a thread dump, in that sense, Rishikesh, could help you to analyze what exactly is going on in the JVM that is running Jenkins. For example, if you're on Mark's machine, I guess this scenario could be replicated, where your maintenance task is running and, let's say, a multibranch project is also running simultaneously. Then you could analyze the thread dump over time and see what is happening, and whether there is a concern. Well, I promise you that machine has it. If you want to tell me when you're interested in doing that analysis, I can exercise it, because there is a repository on that machine that's used for multibranch. It is 160 or more megabytes, with 50 or 60 branches, and commits that are arriving on many of those branches simultaneously, and jobs that are defined to use that cache through five or six multibranch pipelines. So it's horrible and embarrassing what I've done there, but it's a good stress test. It's called the jenkins-bugs repository, in case you're interested in which one it is. If you ever see that one in any of my diagnostic stuff, that's an enormous repository that's just filled with data that's used to help me validate that Jenkins bugs are still bugs or are no longer bugs. So there's this tool called jstack, which comes with the JDK, so you can take a look at this. I even have the command. You need to know the process ID; if you know that, you just need to run jstack on it, and it would basically take the dump, and you can store it in some text file that you can then analyze. I'll go through that. Yes. I'll try configuring some tasks on my side, on my system, and check what's happening when they are colliding.
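The question a thread dump is meant to answer here is whether the number of blocked or waiting threads grows over time. Besides running jstack against the process from outside, the same information is available inside a JVM through the standard `java.lang.management` API. The sketch below counts live threads by state; the class name `ThreadStateSummary` is an assumption for illustration, and sampling it periodically while maintenance runs would be one way to watch the trend.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: an in-process summary, not a replacement for a full jstack dump.
class ThreadStateSummary {

    /** Counts the live threads of this JVM grouped by Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and locked-synchronizer details.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A steadily rising BLOCKED or WAITING count across samples would be the warning sign.
        countByState().forEach((state, count) -> System.out.println(state + ": " + count));
    }
}
```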
But I mean, in terms of your priority, this would not be the first one, right? You already have some tasks that you need to do first. Also, there was one thing regarding security. I am not sure if it's a security issue. We have a hash set, right? We have a hash set where we are storing all the cache entries. Assume any class is inheriting from AbstractGitSCMSource. If they iterate through that hash set, they can get the lock of that Git repository, and then think of a case where they don't release that lock, or they use that lock in an inappropriate way. I think that would cause some kind of chaos on our system. I'm not sure about it, because we have a concurrent hash map which takes a cache entry as its key and a lock as its value. And we have a hash set of all the cache entries. So if anyone iterates over it and passes each entry to the concurrent hash map, they can get the lock and they can lock that repository. I'm not sure if this is a security issue, but I wanted to discuss it. If someone is able to get to that point, then even if they don't have the hash map, they would be able to use the API that is used to directly get all the caches and acquire a lock at that level, right? And that is already exposed. So the assumption is that they are able to somehow access the Git plugin APIs internally. And if that is the case, then how would we be able to stop them? If they can access our hash map, they probably can access the caches. Once you're inside the Jenkins controller's Java process, all bets are off, right? Yes, there are some things we can do to defend, but I can do ACL.as and become an administrator, right? So, yeah, I don't think that defense is one we need to worry about.
I was interpreting Rishikesh's concern as an API-level threat: someone who inherits from AbstractGitSCMSource gets access to data that we would prefer they didn't have available. And I don't know how to solve that. You might want to look at Joshua Bloch's book, Effective Java. He's got some material on things like that and on inheritance design patterns. I am not nearly well enough versed to be a good coach on that. I also had another concern regarding the maintenance task. Assume some plugin is running some command on a Git cache, and at the same time the Git maintenance task wants to run on that cache. Do we skip the maintenance task if the cache is being used by some other plugin, or do we wait for that plugin to release the lock and then execute the maintenance task? Okay, good algorithmic question. If we skip, there's a risk that we will never get a chance to do any maintenance on that repository. But if we don't skip, there's a risk we won't do the maintenance on any repository, because we're blocked for a very long time on that one, right? Now, I'm not entirely confident which of those two is healthier, but I guess sacrificing one repository for the good of the many is probably the better choice. Think of Mr. Spock; maybe what we need is to admit that the good of the many outweighs the good of the few in this case. But that may mean some large repository never gets garbage collected, because it's large and also very active. The Linux kernel is an excellent example of that, right? For 15 or more years it's had an awe-inspiring rate of commits arriving in that repository. They are small, well vetted, thoroughly thought through commits, but there are so many people committing that it's just got a lot of history.
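The skip-or-wait question above maps onto the difference between `tryLock()` and `lock()` on a `java.util.concurrent` lock. The following is only a sketch of the "skip" option, not the Git plugin's actual code: the class `CacheLockSketch`, the method `runIfIdle`, and the use of a cache directory name as the map key are all assumptions made for illustration. Keeping the map private, so that callers never hold a lock object directly, is one possible way to narrow what a subclass can reach.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class CacheLockSketch {
    // Hypothetical registry: cache directory name -> lock. It is private, so
    // callers can only go through runIfIdle and never see the lock itself.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the task only if the cache lock is free; returns false if it was skipped. */
    boolean runIfIdle(String cacheName, Runnable task) {
        ReentrantLock lock = locks.computeIfAbsent(cacheName, k -> new ReentrantLock());
        if (!lock.tryLock()) {
            return false; // cache is busy: skip now and rely on the next scheduled run
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Simulates another plugin holding the cache while maintenance is attempted. */
    static boolean[] demo() {
        CacheLockSketch registry = new CacheLockSketch();
        CountDownLatch holding = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread other = new Thread(() -> registry.runIfIdle("cache-a", () -> {
            holding.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        other.start();
        try {
            holding.await(); // the other thread now owns the lock
            boolean whileBusy = registry.runIfIdle("cache-a", () -> { });
            release.countDown();
            other.join(); // the lock has been released
            boolean afterRelease = registry.runIfIdle("cache-a", () -> { });
            return new boolean[] {whileBusy, afterRelease};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("ran while busy: " + result[0] + ", ran after release: " + result[1]);
    }
}
```

Replacing `tryLock()` with `lock()` would give the "wait" option, at the cost described above: one busy repository could hold up maintenance for all the others.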
And so, conceptually, if somebody's trying to garbage collect the Linux kernel on a very busy multibranch pipeline machine, they may never get the chance. But I think that's still probably better than, oh, we'll just block and nobody else will get anything done because we're waiting. Also, if you think about it, GC would be scheduled once a week or once in two weeks or something. It would have been scheduled and run the previous week, right, so the repository wouldn't be as bad as it was before. And if we skip a week and then do it the next week, I think it would be fine. Well, and I think if the UI can help people see those kinds of risks, they may thank us very much. Right, if you show them a table of when we last ran this task, and they sort it by date and say, whoa, the last time I garbage collected this repository was eight weeks ago, why is that? That may be a very helpful thing for the administrator who asks, why is that maintenance task not being run when I expected it to be run? I think there has to be consistency. If right now a multibranch pipeline is running and I go to the machine and run the git gc command, it won't stop that multibranch job. Right? And let's say I started the GC before the multibranch operation, and then the multibranch job started executing; it won't stop that from happening. The job and the pipeline, as far as I know, however, get access operations. I don't know about that, right, because if it's attempting to access the cache, and it honors the cache lock that AbstractGitSCMSource has, it may block waiting for that lock. So what I'm trying to arrive at is this: if I manually run these operations without using Jenkins, this is just a manual Git operation that I'm trying to run.
And when I do that, I don't have to acquire cache locks myself, right? I'm just asking Git to do whatever it has to do. Yes, yes. No, I was just saying that with that behavior, my Jenkins operations are not affected, at least in the sense that when I look at my job, the execution time doesn't increase much. I think that is the consistency we should aim for when we are doing this via the Git plugin. Whatever locks we acquire or however we do it doesn't matter, but for the user it should appear the same. It shouldn't happen that, because we have additional locks or, let's say, we block the cache to run the maintenance tasks, I can't even run my project. That is something that would be new for a user, as I understand this. And that's one I think we'll need to check, particularly with one of these very large repositories, right? We may want to do an explicit test: start the garbage collection on the Linux kernel, then launch a multibranch pipeline job indexing that. Does it complete, and does it complete promptly, or is it in fact blocked waiting for the lock on the cache? So that would be a good exercise. Any other topics, Rishikesh? So let me get this clear. Regarding execution of maintenance tasks, say the lock is held by some other plugin. Do I skip it, or... I think you should skip it. Rishabh, do you... Okay, yes, I think skip is a reasonable choice, and hope that we'll get back to it later. There was one last one, regarding getting the Git version in the Git plugin. When I was reading the code, when I implemented it, we are calling the native Git CLI on the computer. But then Jenkins has various ways of configuring different versions of the Git software, the Git tool.
So how do I know that the Git version which I am getting is the same as the one configured in Jenkins? Is it always the same? Normally there is one particular version installed on the computer, right, so am I getting only that? So I think you want to ask for the tool that's the default. You want to ask for the tool that's at the top of the list of Git tools and take that one as your belief that this is the one that's available on the controller. Rishabh, this is where we've got to go back to your project. I think that was the kind of assumption we were making there as well, right? There is a concept of the default Git implementation. It's the top of the list, and in some cases, like on ci.jenkins.io, oddly enough, the top of the list is actually JGit. So you probably have to ask the question: what's the top of the list? Is it an implementation of CliGitAPIImpl? And if not, walk the list down. Yes, that is what we did for the GitToolChooser class that we have. What we did was take the default installation, and then we checked what kind of instance it was, a JGit or a CLI implementation, and then we would check the compatibility per node. Is this compatible with this node or not? If it's not, then we would iterate through the available Git options that we have. And once we got something that works, we would take that and move forward. So what exactly is "compatible" here? Because from what I've seen in the client plugin, there is no way of getting the version of the underlying Git tool which has been used. So what you're saying is that from the Git tool instance itself you can't get the version, right? For me that was not a question that I needed to answer. What I needed to answer was, let's say I have a CLI Git or a JGit, on whatever node I'm working on.
Is it compatible, or is it installed there? I was answering those questions: whether whatever Git tool I use is the correct one for that particular node. Yes, but I don't... Yeah, I also don't believe that you can get the version from there. So for you, using the default makes sense, right? What is default is what you use. So is the default the one which is already installed on the computer, like on my computer, or is it something which is configured in Jenkins with some other path to the Git tool, so that when I run a git command on my computer, it is not the same as the default present in Jenkins? We need to see if the default installation is what we have on the system. Would it be something that the user... Yeah, if the default one is the same as the one on the system, then we have nothing to worry about. But if it isn't, then there would be some variations. Assume the system one is Git version 2.18, and the same system also has 2.30, but that one is configured in Jenkins. Then we would be getting the 2.18 version and we would be running legacy maintenance, even though the default one in Jenkins is compatible. I mean, Mark, that is possible, right, for me to set a particular Git installation as my default option for the operations that I wish to perform within Jenkins? It is, as far as I understand it. Now, I thought, and I haven't proven it, that the controller would consistently select the first Git tool that was listed. In the global tool configuration page, there's a Git section, and that includes an ordered list of tools. And as far as I understood it, it takes the top of that list for the controller. And I don't think it...
Now, I could be wrong, but I didn't think it even applied any label selection logic or anything like that. I thought it just takes the top, but again I could be wrong; it's worth a safety check just to be sure. It's true as far as I remember as well: it simply takes the first one, and there is no sort of intelligence there that would, you know... Right. Yeah, if the administrator wants to use a different Git implementation, they must put it at the top of the list for the controller. That's what I've observed on ci.jenkins.io, where we intentionally put JGit at the top of the list, and we see that it is used. Now that's a different... I was just saying, Mark, that I think what Rishikesh is concerned about is right. If we don't know the version of that Git, and within the list of options there is one that we could use which is not legacy, so that we don't just run legacy Git maintenance tasks, then that could be something that we are missing out on. Well, and that may be okay. At minimum, if we detect that condition, can we log it and give the administrator a warning: you have this suboptimal configuration? It's not even particularly invalid, right? What you've described, Rishikesh, is that they've somehow chosen to run 1.8, but on that same controller here's 2.37 that we could have used, and it would have been much better than that ancient 1.8 version that they have. Yeah. We don't have a way to understand that, right? How would we know that from the Git tool itself? That is information that is not stored.
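The "detect that condition and warn" idea can be sketched as follows. This is an illustration under several assumptions, not Git plugin code: the class `GitVersionSketch` and its methods are hypothetical, the input is assumed to be the text that each configured tool prints for `git --version`, listed in the same order as the tool list (first entry is the one the controller would pick), and the 2.30 threshold in `main` is just the number used in the discussion.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GitVersionSketch {
    private static final Pattern VERSION = Pattern.compile("(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    /** Parses text such as "git version 2.30.1" into {major, minor, patch}; null if none found. */
    static int[] parse(String text) {
        Matcher m = VERSION.matcher(text);
        if (!m.find()) {
            return null;
        }
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), patch};
    }

    /** True when version v is at least major.minor. */
    static boolean atLeast(int[] v, int major, int minor) {
        return v[0] > major || (v[0] == major && v[1] >= minor);
    }

    /**
     * Builds a warning when the first (default) tool is older than major.minor.
     * If a later tool in the list is new enough, the warning points at it.
     */
    static Optional<String> warning(List<String> versionOutputs, int major, int minor) {
        if (versionOutputs.isEmpty()) {
            return Optional.empty();
        }
        int[] selected = parse(versionOutputs.get(0));
        if (selected == null || atLeast(selected, major, minor)) {
            return Optional.empty();
        }
        String msg = "Default Git is " + selected[0] + "." + selected[1] + "." + selected[2]
                + ", older than " + major + "." + minor;
        for (int i = 1; i < versionOutputs.size(); i++) {
            int[] other = parse(versionOutputs.get(i));
            if (other != null && atLeast(other, major, minor)) {
                return Optional.of(msg + "; tool #" + (i + 1) + " in the list ("
                        + other[0] + "." + other[1] + "." + other[2] + ") is new enough");
            }
        }
        return Optional.of(msg);
    }

    public static void main(String[] args) {
        // Sample strings only; a real check would run each configured tool.
        List<String> outputs = List.of("git version 1.8.3.1", "git version 2.37.0");
        warning(outputs, 2, 30).ifPresent(System.out::println);
    }
}
```

A message like this could be written to the log at startup or shown in the maintenance UI, so that an administrator with an old default tool and a newer one further down the list can see why legacy maintenance is being used.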
No, that is not really getting the version, although maybe we could do them a favor by telling them how old their Git version is, and that's all we do. We always tell them at startup that their Git version lacks the following, because, Rishabh, your capabilities checks were exactly that, right? It was: I don't want to do this if I know the command line Git lacks these capabilities. And Rishikesh, you may, just as a matter of course on startup, put a nice warning in the log file that says your controller has such and such a version of Git, and it will not be able to do the following things, or its performance will be less than if you were to upgrade to Git 2.something better. I actually kind of added that check already in the UI, but I show a red version of... I've just been through a terrible experience with a number of Jenkins users on CentOS 7 telling me that we broke them with a security fix we just released, because CentOS 7 runs an ancient version of SSH that we had missed testing. So this is sort of hot on my list at the moment; I apologize for banging on about something like this. Anything else, Rishikesh? So, finally, the main objectives: going back to the UI, making it a bit better and easier to understand. One is to display the execution data. And one more is strengthening the fundamentals of how we are executing the maintenance tasks. I think these are the project goals. I like those goals. Yes, I believe that is the order of priority as well. Yeah, yeah. Great. Anything else? Then good luck on your exams. Focus on those exams, do well, and we will talk to you again when we next meet. All the best.