has started. Go ahead, Rishabh. I think you enabled screen sharing. Once again, thank you very much. Okay, you can now share your screen, Rishabh.

Yes. The first thing I want to discuss is the agenda. So the first thing is, we have the fix for the Jenkinsfile now, so we're not running the benchmarks everywhere for every PR. One question I had related to it: should we create a GSoC dev branch, where I push all of my changes to the git client plugin rather than to master directly, and work accordingly? Is that something we want to consider?

Tell me what you would hope to gain by that. Is your concern that you might be too disruptive on the master branch?

Yes, that's the only concern.

It's an interesting insight. I would lobby (let's check with the others) but as a maintainer, I like that the changes are on the master branch, because if you break something, I learn very quickly. If it's on a separate branch, the danger is that I won't learn as quickly, and I won't be as useful to you. So my personal preference for right now is: let's keep using the master branch and admit that it's not free, but we'll get better progress overall if we stay on master for now. It's a valid point, though. If we need to, we could certainly allow a separate branch that you push to, or that you submit pull requests to, on the central repo. Justin or Omkar, any opinion there?

Are you proposing that you would do it on master and on another branch?

No. I just want to push my changes not directly to master but to a separate branch, and that branch runs the benchmarks so that they're running fairly regularly. It's just a suggestion; it's not something I have considered too much.

Well, I mean, sorry.
Oh, I was going to say: the only advantage I could see of having it also in another branch is that if you have changes you need to make in that other branch, you could see them run in CI. But I agree with Mark, though; it'd be nice to see it on master as well if we were to do that, especially to help Mark.

We could take a variant of that. We could say, hey, we're not even going to run the benchmarks on the master branch; create a GSoC dev branch and only run them on the dev branch. But for me it feels like: I like knowing, and I like watching the evolution. If you break something on master, you get attention much more rapidly than if you break something on a separate branch. I'm very attentive to the master branch, particularly now as we're approaching the release of git client 3.3 and git plugin 4.3. So this is a great time for you to be there, so that you're getting lots and lots of attention.

Okay, I understand that. So the next thing, something I was thinking about while considering this branch issue, is that we currently have my benchmarks in the git client plugin. The scope of our benchmarking project is basically to understand whether we have a faster implementation for certain scenarios and then make those enhancements in the plugin. So will we also maintain the benchmarks on the Jenkins CI infrastructure after we've used them for this project? Do we want to keep those benchmarks on the Jenkins infrastructure? If we are doing that for the git client plugin, my concern is that I don't see rapid changes in that repository, where benchmarking could help that much, as compared to the git plugin, where changes are much more frequent and where the developers might benefit from a performance benchmark.
Another concern is that we are doing micro benchmarking here, and when we shift to the git plugin, if I want to benchmark the checkout step for example, that includes a lot of functions and a lot of API calls. I think the definition of micro benchmarking doesn't apply to that whole process. But having a benchmark would create a reference for developers: when they change the checkout process for some reason, they'd have a reference, and that would help them a lot. Conceptually, though, I'm not sure whether the micro benchmarking principles are violated when we try to record and measure such a big step. Can we do that? Should we do that? Or should we break the checkout step into stages we can benchmark? We are benchmarking the operations; we are doing that in the git client plugin. But I'm not sure the git fetch step is going to be updated that much, so we'd have the results, but I don't understand how the benchmarks help that much in that case.

Well, there was a recent webinar, about a week ago, presented by someone (I forget where they were from) in which they showed using performance data historically to help detect flaws in something. So I think your idea is valid. However, part of me says this is probably the wrong time to scope-creep your particular project. The benchmarking inside git client still has several different areas that probably justify investigation: after we get fetch understood, after we get it to the point where it's running reliably, and after we've compared its behavior on a PC, on s390, on Windows, on Linux, and gathered some historical data so that you could present: hey, look, here's the behavior, and all these different platforms have curves shaped like this.
That for me is already so valuable that I would ignore the git plugin for now, just because I think we've got so much to learn in git client.

Okay, that seems fair enough. I understand that.

It's not that it's not a great idea. I think it's a brilliant idea. But my suspicion is that an awful lot of things still have to happen before we can say: yes, we got this all the way to getting the value out of the benchmarking that we wanted.

Okay. Yeah, I understand that.

Maybe that would be a good follow-up if we end up getting far enough along: you'd be able to use the principles you learned in this one and then apply them to the git plugin. I agree that it's nice to have metrics, especially long term, so you can tell what kinds of small things changed. I think this is what you're getting at: when you see a change that's made a big spike, that's more obvious than noticing over a longer term, oh, I think at some point in this last year we had a performance spike. Yeah, that's an example of a place where what you're suggesting might be particularly useful.
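A coarse way to get the kind of checkout-level number being discussed, without claiming it as a micro benchmark, is plain wall-clock timing with System.nanoTime(), which is also the approach Rishabh describes later for the retrieve-changes stage. A minimal sketch, with a hypothetical class and method name (not from the plugin):

```java
public class CoarseTimer {
    /**
     * Run a step once and return its elapsed wall-clock time in
     * milliseconds. Good enough for spotting differences of whole
     * seconds (like a removed redundant fetch), but far too noisy
     * for micro-benchmarking sub-millisecond operations.
     */
    static long elapsedMillis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = elapsedMillis(() -> {
            try {
                Thread.sleep(100); // stand-in for a fetch/checkout step
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("step took ~" + ms + " ms");
    }
}
```

For second-scale differences this is adequate; for anything finer, a harness such as JMH that handles warmup and JIT effects is the better tool.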
One of the problems we have in the project is that it supports a wide range of git versions, all the way from the git 1.8 shipped with CentOS 7, which is now six or seven years old, to the most recent git 2.27, and it allegedly supports all of the versions in between. But you can be confident that 1.8 has very different behavior (performance-wise, feature-wise, capability-wise) compared to 2.27, and there are options in the brand new versions that we don't yet use because we're allowing ourselves to support the old versions. Eventually I'd love to get to a mode where we say we're going to drop support for git 1.8, or we're going to drop support for git versions up to 2.7, for instance. But in order to do that we'd have to have a compelling reason that says: because we gain these things. Your benchmarks could tell us some of the things we'd gain. So again, I think your idea is good, but it's the wrong time for us to approach it. Let's get the git client benchmarking understood and implemented.

Okay, I get it. So the next thing is progress on the issue. First, should we look at the code right now, or can I raise the PR and you review it there? Should we spend time on that?

Actually, I want you to brag first about the Windows fix that you did, before you show any code. Even the test case failures on Windows you've resolved, right?

You're talking about the assertion errors, right? That was resolved by the solution you gave me this morning, Mark; I just applied it, so there was nothing novel in fixing those test failures. But I realized that maybe we can find a better way to assert that we don't have the double fetch now. One of the simplest ways I could think of was to not check whether git fetch is there or not, but to check that before git fetch, the plugin logs an info message that it is
fetching the upstream repository. I can count that, and that is agnostic to the platform; it would work on Windows or Linux or any other platform. That's the simplest thing I could think of: we just count it. I would have just one count after my fix, and before my fix it would be two. So I was thinking of implementing that. Right now I have done what you gave me, basically matching the pattern and using a regex. We still have one test case failure on Windows, which Mark and I are going to explore and debug, and that's it for the test case failures right now.

So the one remaining test case failure is actually not Windows-specific; I see it fail on my Linux environment as well. That's a relief, because the Windows-specific one had me completely perplexed: why Windows, what's so magical about Windows? This one remaining failure is visible both on Windows and on Linux, and I'm confident we'll see it anywhere, because there has got to be some fundamental thing that we're missing, just like two days ago when you pointed out to us that the extensions were just not being added to the clone command.

But it was successful on my local machine, a macOS system. No tests are failing there.

Try merging the master branch, just to be sure. It may be your clean before checkout change; I don't yet know what it is, but I definitely see it on my Linux box.

But it is also succeeding on the Jenkins instance, actually both Linux instances, if you check.

Are you sure? That's interesting. I'll show you. Oh, and on master? Yeah, this is the branch.

Maybe Mac is special. No, just kidding. It's especially a Berkeley Linux, that's right; that's about how special it is. A special Berkeley Linux for sure. Interesting. Oh, you're right. Wow. Okay.
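The platform-agnostic assertion sketched above, counting occurrences of the plugin's fetch log message rather than matching raw git commands, could look roughly like this. The message text, class name, and helper name are illustrative assumptions, not copied from the plugin's actual output:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FetchLogAssert {
    // Illustrative pattern; the real plugin's log text may differ.
    private static final Pattern FETCH_LINE =
            Pattern.compile("Fetching upstream changes from .+");

    /** Count how many times the fetch log message appears in a build log. */
    static int countFetches(CharSequence buildLog) {
        Matcher m = FETCH_LINE.matcher(buildLog);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String beforeFix = "Fetching upstream changes from origin\n"
                + "...\n"
                + "Fetching upstream changes from origin\n";
        String afterFix = "Fetching upstream changes from origin\n";
        System.out.println(countFetches(beforeFix)); // 2
        System.out.println(countFetches(afterFix));  // 1
    }
}
```

In a GitSCMTest-style test the check would then be a single `assertEquals(1, countFetches(buildLog))`, which works the same whether the build ran on Windows or Linux, and regardless of which git executable produced the underlying output.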
Why do I see it consistently on my Linux computer? I just ran it minutes before this meeting and saw the failure.

No failure. Interesting. It's not failing on my machine or on those Linux instances. Initially I thought it was something with Windows that I could not understand.

Is this one using the system git or JGit?

Let's see. The failing test is in GitSCMTest, so it's using CLI git, right?

Don't all the tests use JGit?

I don't know how it could, since if it were using JGit we wouldn't be able to assert on the double fetch: you were asserting specifically on the command "git fetch", and JGit does not echo "git fetch". So I'm pretty sure it's choosing CLI git. But it's a valid question, Justin, and a really good one for me to double-check, to be sure that it's using... let's see, is GitSCMTest the one that is parameterized?

One thought I have is that if you rely on the text of an output, the thing that produces that output may change over time, and then you get fun little test surprises. I'm especially concerned by what Mark said about the spread of git client versions. It totally makes sense that you would support that wide a spread, but perhaps they've changed the text in the git fetch output.

It's a good point, a very good point. In this case, though, the text he's asserting on is actually generated by the git client plugin itself, not by command-line git. It's diagnostic output provided by the plugin that happens to be a copy of the arguments it passes to git fetch.

Cool, much safer.

Well, sort of. Yeah, you hit it exactly right. So, Rishabh, I definitely see the same failure; let's see what my status is. I did do a merge from the master branch, but all that changes is
that it brings in some documentation changes compared to what you had, so I don't think it should alter the behavior at all. This will be a fun investigation. Thank you for giving us an opportunity to look at something together; that's going to be interesting. I don't think we should keep you awake significantly longer. It's getting late in your day, right?

It's 8:20 p.m. right now.

Okay. So, Justin, Rishabh and I were talking 10 hours ago, and he hasn't slept since then.

I had a good night's sleep.

Me too.

Okay, so that's the test case failures. The next thing is that I have to create test cases for all of the extension additions I've done. I saw that we don't have any tests for pruning stale branches; there is no test for that, so I would add before-and-after tests for that feature. For pruning stale tags I'm also going to add automated test cases. And for clean before checkout, do we need to add explicit tests? There was an existing test case that pointed out to me that I was missing something. Do we need one more test, or maybe two, which explicitly tell us that this fix is doing this?

Not for me. I'm embarrassed that we got lucky, but I'm glad that we got lucky. Now you're going to reduce our odds of needing to be lucky in the future by adding more tests, reducing the reliance on luck and instead relying on the skill of tests.

Okay.

I'd also say I wouldn't worry about retrofitting tests, unless it's something that's going to change on you.

Yeah, I hang my head in shame that there were no tests on prune branches, but I think that's older functionality. I'm really embarrassed that I didn't catch it on prune stale tags, because that was recently added; it hasn't even been released yet and will first come out in 4.3. So that's brand new functionality that
I missed getting a test into, so thank you for catching it.

Okay. After that I need to create a benchmark to show the performance difference between the double and the single fetch. I actually have a benchmark, but it's not working right now; it worked earlier, while I was creating the proposal, so I'm going to fix that. After the fix, I added System.nanoTime() calls in the retrieve-changes stage of the checkout. Let me show you in the code where I've done it. This is the stage where we basically call git fetch twice. I just did this to check, once I put in my fix, what the time difference is. The result was a difference of about 20 seconds. I ran it against the jenkins.io repository, which is around 40 MB, right, Mark? For that repository the difference is around 20 seconds after fixing it. Which is okay, right?

That's substantial. I expected it to be near zero, because I'd think a fetch that gets nothing should be pretty cheap. So that's quite impressive; to be able to save 20 seconds, lots of people should be very happy.

But is it possible that this is related to my machine? Could the time difference be corrupted, because I am using System.nanoTime() and I am measuring on my local machine?

The answer to "could a benchmark be skewed by something" is always yes. But in this case the difference seems large enough that it's unlikely any skewing inside your system is really impacting that number. So yes, could the benchmark be skewed? Absolutely. I suspect if we run these same kinds of tests multiple times, or in multiple locations, we'll get repeated evidence that what you're observing is exactly correct: taking out that second fetch has a very distinct, detectable performance improvement.

Okay. And I
also tried profiling Jenkins by adding a custom build of my plugin inside the plugins directory, but I'm not able to see any difference in the thread dump there; it's the same. I saw the logs when the Jenkins instance was starting up, so it was using the git plugin I provided to it, but I didn't change the git client plugin. Could it be because of that? My fix also uses some functionality of the git client plugin which I have updated, and that isn't available in the plugins this Jenkins downloads.

Well, how did you launch Jenkins? You can use Maven hpi:run or some other technique to launch the war file.

I ran the Jenkins war with Java directly.

Oh, okay, good. Having done that, if you look in the system information for your Jenkins while it's running, it will show you what its Jenkins home directory is. Find that home directory, copy the plugin's .hpi file into that directory's plugins directory as a .jpi, and restart Jenkins, and it will then use your new git client. You then have the convenience of being able to put both a new git client and a new git plugin into that same location, and it will use them both.

Okay, I did that for the git plugin, but I did not do it for git client at the time, so I'll do that for git client.

I would expect that if you didn't do it for git client, you wouldn't get the removal of the redundant fetches, would you? Because the removal of redundant fetches is only implemented in the git client plugin. Or the git plugin?

Oh, okay. Great, my mistake; that's great. The implementation of that is actually in the git plugin, not git client.

Yes, great. Okay, thanks; excuse my wrong guess. So here you can see that we have two git fetches. Actually we have multiple git fetches; I don't understand. I think I need to understand the thread dumps, because I'm seeing two fetches there, and then I'm also
seeing git fetch after this. Just one minute. So I have two recordings; this one is before the fix. Here you can see two more fetches, and then if we go up to where it is actually using fetch, yeah, two more fetches. This is something I don't understand. Is it because it is parallelizing, is it because of multi-threading, or why would this happen that I am seeing three, four, five fetches? Are there some modules?

No. Is the job you're testing a pipeline job or just a freestyle job?

A freestyle job.

Okay, all right. If it had been a pipeline job, it would potentially have been loading pipeline shared libraries, so that might have explained multiple fetches. In this case I don't have a good explanation for why it would do four fetches instead of one or two, depending on whether you had redundant fetches removed or not. So I don't know; that's a good question.

Okay, so maybe the differences are in the commands. Well, I assume that you do not have the wipe workspace extension added, and you don't have calculate changelog, and you don't have any of the others; you've kept the number of extensions very simple?

Yes, it's no extensions, just the checkout.

Great. So it's certainly worth investigating; I don't directly know, oh yeah, this is why you would see that. I'll have to look. It's a good question; we should investigate more.

So I think the tasks I now have are: first, to raise the PR with the automated tests; second, to create the benchmark, so that we have a good visual report of how this fix has improved the performance; and third, the same one I had on Wednesday, to find a way to implement the opt-in feature we want to include for the performance enhancement, as an extension, which we were talking about. So I'm going to do that during this weekend, and hopefully I'll raise the PRs.

Yeah. I wonder... okay, this is where Justin can chime in. Part of me
is tempted to say, hey, should we implement it as just a single global switch: go back to the old way of doing things, or keep doing things the new way? There are global switches you can set on the plugin that are available through the Manage Jenkins page, and there are relatively few of them right now. I'll come back to the name; there's whether or not it should create new users, and one other.

But is it CLI git or JGit?

No, this is at the... here, maybe what we can do, if you don't mind, is I'll just share with you so that you can see it, so you know where to look. And maybe this is so valuable that we should make it the new default and only allow people to go back to the old way of doing things if they find some catastrophic problem, because making people turn it on will slow adoption of it. But if we don't allow them to turn it off, we may break them catastrophically, and they will be upset.

Right, exactly. So if you want to stop sharing briefly, I can show you where they are.

I saw that those had stderr and stdout. I wonder if those are binding to two different streams in Java; maybe that's why that's showing up twice.

Oh yes, yes, you're right. Look at what it says there: standard out copy, or standard error copy. They are in fact two different file descriptors, which is very good. Justin, you win bonus points; I think you're right. That's great. You might want to double-check that, because that's me at about 8 a.m. But it has to read two different file descriptors, right? Because command-line git really writes some things to standard out and some things to standard error, and we read both of them, so we get everything.

Very good. Nice. And with the other fetch, yes, I would have four fetch statements there.

Okay, yeah. Okay, so here's my shared screen. Do you see my Jenkins instance?
Yes.

Okay, good. So if we do Manage Jenkins, Configure System... this particular system has several thousand jobs on it and a bunch of global configuration, so it'll take a while to load. While it's loading: if we search for the string "git plugin" on this page, you'll see this section here, this block of global configuration settings. So I can set what the user name should be for anything that is committed. And this one, show the entire commit summary in changes, is one example of a place where a new behavior was desired, and we allowed people, if they didn't like it, to switch it off. So if you look for that string in the source code, you'll see a pattern that's used to implement that kind of thing. And it may be, at least my initial take, having seen the work you're doing and the results you're getting, that removing the redundant fetch should be completely transparent to the user. If it truly is transparent to the user, then there's no reason not to just have a switch which says "enable redundant fetch", and it defaults to false. It'd be the "slow me down, please" mode, for compatibility.

Okay, yeah, we could do that. Although I don't understand why people would choose it if we're not breaking any use case.

Well, the notion is: we broke their use case and they don't have time to wait for a fix. It would give them an option to escape from it, rather than having to fall back some other way.

Yeah. I guess one thing, I don't know, it might be nice to have it both in admin and in user space, just in case it's a particular repository that's breaking and they want the performance improvement for all the other ones. But then we're starting to get into gold plating and stuff like that, right?

Justin, that's... Omkar, you are welcome to drop; you don't need to remain on. I think we've finished. Rishabh, anything else we need to do?

No, I think
I've discussed what I wanted to for today.

Excellent. We will meet again Wednesday; I hope we'll have solutions by then, and we'll talk about the next steps. Okay, thanks, everybody. Bye, guys. Okay.