Let's have a good recording. This is the git plugin performance improvement project. Rishabh, you're on the recording now, go ahead.

Okay, I'm going to share my screen. Mark, please enable sharing. Oops, security says I can't share yet... now I can. Okay, so the first agenda item in our meeting today is a discussion of running the JMH benchmarks on the Jenkins infrastructure. There's a new PR; I was actually hoping the build would be complete so I could show you, but something happened on the infrastructure side.

Yeah, we had a very serious outage yesterday, a massive outage, and it affected many different places. For instance, it was breaking my builds, because the builds depend on javadoc.jenkins.io for Javadoc generation, so every build of any plugin I maintain that generates Javadoc was broken. It's mostly resolved now, and the critical services are back up and running.

Okay, that's good to hear. So I ran the same thing on the local Linux instance I have, and I'll show you first the change I made. A benchmark test can be divided into two parts. One part is the benchmark method itself, which is what matters to us: whatever operation we want to test goes inside that function. Everything before it is the overhead cost of setting that operation up. In the previous GSoC project, the Role Strategy Plugin people created a customizable static state class, which I can take from the test harness; it's available there. What that state does is provide me a Jenkins instance, which I could use if one of my operations needed a Jenkins instance for its git operations. I believe we don't need a Jenkins instance right now, since we're comparing CLI git and JGit. A state is a static class with two methods inside it, which we use to create and then destroy whatever resources we've made. The scope I had set for this was iteration, but it turned out I needed a combination. Let me rephrase everything: I now have two states where I previously had one. What I needed was a new git client for each operation, and that git client should have a fresh local repository for it to fetch or clone into. That repository should be fresh for each invocation of the benchmark; invocations should not share the local repository used to make the git client. That means I need to invoke the setup method at each iteration. But with just one state, I was also cloning the upstream repository in that setup before providing it to the client, and that was making the benchmark duration far too long. So I read the documentation, came to understand that I can create multiple states with different scopes, and, for example, created a new state called CloneRepoState.
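As a rough sketch of the two-state arrangement described above: the class, field, and method names here are illustrative rather than the project's actual code, and the clone and client construction are left as comments. Splitting the expensive clone into a trial-scoped state is what keeps the per-iteration setup cheap.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

// Runs once per JVM fork (Level.Trial), so the expensive network clone
// is not repeated for every iteration. (Each state class would live in
// its own file in a real project.)
@State(Scope.Benchmark)
public class CloneRepoState {
    public Path upstreamClone;

    @Setup(Level.Trial)
    public void cloneUpstream() throws Exception {
        upstreamClone = Files.createTempDirectory("upstream");
        // ...clone the upstream repository under test into upstreamClone...
    }

    @TearDown(Level.Trial)
    public void deleteUpstream() throws Exception {
        // ...recursively delete upstreamClone...
    }
}

// Rebuilt before every iteration (Level.Iteration): a fresh workspace and
// a fresh git client, so no repository state leaks between invocations.
@State(Scope.Benchmark)
public class GitClientState {
    public Path workspace;

    @Setup(Level.Iteration)
    public void createClient() throws Exception {
        workspace = Files.createTempDirectory("workspace");
        // ...construct a git client (CLI git or JGit) rooted at workspace...
    }

    @TearDown(Level.Iteration)
    public void destroyClient() throws Exception {
        // ...recursively delete workspace...
    }
}
```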
This state is created specifically to provide the client with a local clone of the upstream repositories, the four repositories we need to benchmark the git fetch operation. Its scope is trial; a trial covers all the iterations the benchmark runs within one fork of the JVM.

Now I'm going to show you the new way we're generating the results. This is how they look if you check the JSON file in the artifacts. The first thing it tells you is the parameters we've set: we're testing git against a particular repository, the first URL we have. I have set two forks, which means the whole set of iterations is going to run twice. For the first fork you can see the default five warm-up iterations, and then we have five iterations where we're measuring the git operation. It also reports the execution time for each iteration, so it's really easy to follow along.

Now, a side note here. One thing I noticed yesterday is that these results, the ones on screen right now, are from my local machine, my MacBook, and they seem pretty different from the results we've been seeing: about 50 milliseconds per operation for a less-than-1-MB repository. If I compare that with the results from my Linux instance, it's clearly very different; there the iterations are on the order of 98 milliseconds. So I came to understand the importance of not relying on results from my local machine. The reason I think this happened is that I was running the JMH benchmarks while also profiling a Jenkins instance using Java Flight Recorder, and my machine was also running Google Chrome, which takes a lot of resources; I think that's why we have a skewed result here. Do you have a question?

I'm not even sure I'd call it skewed. I think your experience on macOS is every bit as valid as the experience on Linux, but it hints, and this is a good thing you're already showing, that there are many different things that affect the performance of an operation on your Mac. I assume it's affected by the Mac's choice of file system and of how it operates against files. They're probably closer to the FreeBSD type of people, who say preserving your data is the most important thing we must do, and they're willing to slow down file operations a little to do it. My FreeBSD machines have a similar characteristic: I get amazing reliability, but at a cost in file system operations. So the fact that we're sampling across multiple systems is, I think, very useful, and for me it's another reason why ci.jenkins.io was a good choice: there we've got access to at least four different types of machines (Linux on amd64, Linux on PowerPC, Linux on System 390, and Windows), and I expect radically different results from all of those, in interesting ways. This work you're doing is leading us to learn those things. Great.
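A hedged sketch of how those knobs map onto JMH annotations, reusing the two state classes sketched above. The benchmark class name and the placeholder repository URLs are assumptions, not the project's real values:

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(2)                     // the whole set of iterations runs twice
@Warmup(iterations = 5)      // five warm-up iterations per fork
@Measurement(iterations = 5) // five measured iterations per fork
public class GitFetchBenchmark {

    // JMH benchmarks the outer product of the parameters, so each
    // implementation is paired with each repository URL.
    @Param({"git", "jgit"})
    public String implementation;

    @Param({"https://example.com/tiny.git",    // placeholders for the
            "https://example.com/small.git",   // four repositories of
            "https://example.com/medium.git",  // roughly <1 MB, 5 MB,
            "https://example.com/large.git"})  // 90 MB, and 300 MB
    public String repoUrl;

    @Benchmark
    public void fetch(GitClientState client, CloneRepoState upstream) throws Exception {
        // ...run the git fetch under test using the injected states...
    }
}
```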
Yeah. One more thing I saw with the local machine tests: at one point, one of the tests, while cloning the 300 MB repository, gave me an exception when it was fetching the repository. It was enumerating objects and somewhere in between it was interrupted. I tried to search for what the problem was, and as far as I can understand, it's because of the network, or probably because of the network. So this is a concern we could have: our benchmark is going to rely on the network a lot, and its stability will too. With the Linux instance and on ci.jenkins.io I've never seen it fail, but on my local machine I've seen it fail once or twice. Maybe I need to run it multiple times to see whether this happens once in ten runs; I don't know, because I've only started it once or twice. So this is something I'll have to watch.

I've definitely seen failures like this, and what you're seeing looks similar to things I'd seen. For me it's another hint that running in multiple locations and collecting the results will increase our chances of detecting these kinds of failures. I can't predict why they're failing, because... oh, it's the upstream clone. Okay, if this is cloning the upstream locally, then that really is a network operation, right? It's truly copying from the remote, and sometimes things are just going to fail at that. That's okay; we know to rerun it then.

So this is one of the things I saw. And these are the results from my Linux instance. As you can see, we have two forks, it calculates the results for each repository, and, remember, Mark, we were talking about confidence intervals: it calculates all of that for us, assuming a normal distribution.

Good thing you include the environment data there, because the JVM version you're running on that Linux machine is quite a bit out of date. JDK 1.8.0_131 is about two years old, if I remember correctly; there were important changes to the Java 8 environment around that point ("major" may be the wrong word, but they were important), and we're now at 1.8.0_252. So this is another good indicator. The data you're putting out is exactly the right thing to do, so that we see those details and say: oh, we need to bring the JVM inside that machine up to a more recent revision.

Okay. And Mark, would different versions of the JDK also be a parameter for our benchmarking strategy?

I was not concerned about different versions of the JDK, just because if someone's not running the current version of the JDK, one of the first things we'll tell them is: run the current JDK. Telling someone we're going to make a special effort to support an outdated JDK, I'm not willing to make that special effort personally. I'm barely willing to make the effort to support all the different versions of command-line git that we support.

Okay.
So, as I was saying, you can see the progress bit by bit, each step with each repo URL. It takes the outer product of the parameters: we have git and JGit, and then another parameter, the four repo URLs.

Well, on that one, it looks like the variability during the warm-up iterations and during the measured iterations is pretty consistently not huge, as I look at those numbers. The warm-up iterations in this case stay easily within a millisecond of each other; in fact much less than that, they're within about two tenths of a millisecond of each other, and likewise for the actual iterations. So good, nice to see that the data feels right.

Yes, this is a good implementation, so this is expected, right? Then we have all four repositories here, and then we get to JGit. With JGit we should see a difference, and we are seeing a difference.

Oh, so on this one the preheat really matters.

Yes, it does.

Okay, all right, that's okay. You had told me earlier that you had seen this, and this really reinforces it: the preheat phase, the warm-up iterations, is absolutely crucial for getting believable and repeatable JGit results. And it may indicate that we need to forewarn people who are using a one-shot executor, a Jenkins agent that comes up, does one task, and dies. They will never hit the warmed-up state; they will never be preheated.

Yeah, this could be something. And then, as we progress further, the repository size increases, and with a larger repository, as you can see, the preheating of the JVM is not that noticeable; with maybe one iteration the difference shows, but the others are almost the same. Also, I think it's pretty obvious that CLI git performs way better when it comes to large repositories, and JGit does not. At the end we have the average results across the two forks. When I was reading about performance benchmarking in general, the JMH developers recommended having as many forks as possible, and someone said five forks is a good number, so that we have more observations and our data is probably more reliable. But with our test configured the way it currently is, that's going to add a lot of time; it would basically double the time, because we're cloning the repositories. So it's something we have to think about. I think the number of forks is a hyper-parameter we need to select: it isn't correlated with the git operations themselves, but it's something we need to test experimentally to find a good number. So it's a parameter where we need to experiment, maybe with one fork, then two, three, four, and five, and see where we get the best results. I should differentiate parameters from hyper-parameters: hyper-parameters are the ones we're only sure about once we experiment with them, and I think a lot of the JMH settings are like that, such as the number of iterations we're performing.
We have five warm-up iterations and five execution iterations; we can change those numbers, say to ten warm-up and ten normal iterations. These are things I think I should explore.

Well, it seems like you have data here that might even guide your decision there. Just as I look at what's on your screen now, we've got warm-up iterations one through five, and notice how, already after iteration two, so from three through five, the variation becomes minimal. I see warm-up iteration one at 239, iteration two at 236, and then we're at 231.8, 230.9, 231.2: we are now within one second of each other, and this is on something that's taking over 200 seconds, so the variability has dropped below one in a hundred; it's one-in-two-hundred variability. So it might be that you say: look, two warm-up iterations, or at most three, are by this evidence enough, so that we don't have to waste the time running the fourth and fifth warm-up iterations.

Am I allowed to go ahead? I was just going to say: wouldn't that vary with the size of the repository we're testing? I don't know this for sure, but with a large repository, wouldn't each warm-up iteration warm the JVM differently? Is that valid?

Let's go back on your screen a little bit, to one of the tiny repositories. Let's go to the first one. The first one, the variability... you're right, you're right, look at that. Yes, it is there. Although even there, on this one the difference between warm-up iterations two and five is significantly less than the variation between one and two.

Between one and two, yes. So this means that as we warm up, the JIT compiler is getting confident and it performs better, right?

Right. Now, when I look at that data: take warm-up iterations one through five, then look at the variation inside the actual measured iterations; it's about the same as the variation between warm-up iterations three, four, and five.

Yes, in terms of variability. Interesting.

So I think you have the right question; I don't think we have the answer yet. The question is: how many warm-up iterations do we need, and what are the things that control the warm-up? Is it the size of the repository, is it something else? If so, that may govern whether we can reduce, or need to increase, the warm-up iteration count.

I think a good indicator would be the variance between two consecutive results; that's how I'd come to understand how many iterations we need. Once I'm seeing a constant value, no variance, then we understand that that's the required number of iterations. Okay, so this is how each of our benchmarks is going to run now. I think this is a good display of information for understanding how our JMH benchmarking process works. As far as visualization is concerned, I'd have to test the plugin, but do we need the plugin when we have this kind of result, Mark?

I would love having a visual representation, because I learned many things when you showed the visual representation. But for yours and my and the other mentors' evaluation purposes, this data seems sufficient.
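If the experiments do suggest fewer warm-up iterations, those counts can be varied from a runner without touching the benchmark annotations. A minimal sketch, assuming the hypothetical GitFetchBenchmark class from the earlier sketch; the chosen numbers are placeholders for the experiments discussed above:

```java
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {
    public static void main(String[] args) throws RunnerException {
        Options opts = new OptionsBuilder()
                .include(GitFetchBenchmark.class.getSimpleName())
                .forks(2)                 // experiment with 1..5 forks
                .warmupIterations(3)      // the data suggests 2-3 may be enough
                .measurementIterations(5)
                .resultFormat(ResultFormatType.JSON) // the JSON artifact shown above
                .result("jmh-report.json")
                .build();
        new Runner(opts).run();
    }
}
```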
Right. There are times when the visual presentation can make it even clearer than the numbers: we see bars, and we see their relative sizes, without being distracted by the digits in the numbers.

I agree, that's great. So I'm going to test it locally.

Yes. And my thought was: you can install the JMH plugin on your Mac, I can install it in my environment, and we can do parallel experiments to see whether we're getting what we expected out of this, whether we're learning what we wanted. Then, if we reach the point where we say, yes, this is valuable, we can deploy the plugin to ci.jenkins.io.

Okay. So the first coding task I had was to run a benchmark on Jenkins infrastructure, on ci.jenkins.io, and with the report generating we're done with that process. The second step of this task would be to explore integrating it with the JMH visualization plugin, right?

That seems like a good step to me, yes.

Okay, so the next sub-task before this was the JFR profiling step. I discussed this before because I was pretty excited about what happened there, so now I'm going to discuss my experience. This was the second agenda item I had for the meeting.

Great. I don't have a lot of time, but go ahead.

So I ran JFR on JDK 11 against the Jenkins WAR; let me show you my observations. How did I profile the Java application? We provide an additional JVM argument: we start a flight recording and give it some options, the file name for the recording, the maximum size of the recording, and what we're doing with the profiling, and then we specify the jar we want to profile. Then, with Jenkins running, I profiled the SCM checkout step: I did that by checking out the jenkins.io repository, both with CLI git and then, after switching the plugin to the JGit implementation, with JGit. Once I had the recordings, I used Java Mission Control to understand them. Here I have two recordings; this one is for JGit. One of the first things I understood from the thread stacks is that I have to look for the executor threads, because those are the threads that would be working on the build. That's where I was looking, and as I went through the stack trace I could understand where JGit is working and what JGit is doing. I could see it's performing retrieveChanges, where we're actually fetching the git objects and pulling them, so I could see the method being used to do the fetch operation. Right now I'm actually not very well versed in how to interpret the data I have here. I can see the threads that are taking a lot of I/O time, but I'm still confused about some of what I can learn from profiling the SCM checkout step. It does reinforce the fact that git fetch is something I need to test, because it involves network and I/O; and then I have git checkout, and I have git ls-remote.
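The JVM argument described here is the -XX:StartFlightRecording flag. To keep the code samples in Java: the same recording can also be driven programmatically through the jdk.jfr API (available since JDK 9). A hedged sketch, with the file name and size cap as assumed values rather than the ones used in the project:

```java
import java.nio.file.Path;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

public class FlightRecorderSketch {
    public static void main(String[] args) throws Exception {
        // Roughly equivalent to launching with
        //   java -XX:StartFlightRecording=filename=rec.jfr,maxsize=250m,settings=profile -jar jenkins.war
        Configuration profile = Configuration.getConfiguration("profile");
        try (Recording recording = new Recording(profile)) {
            recording.setMaxSize(250 * 1024 * 1024); // assumed 250 MB cap
            recording.start();

            // ...exercise the code under test here, e.g. an SCM checkout...

            recording.stop();
            recording.dump(Path.of("rec.jfr")); // open in Java Mission Control
        }
    }
}
```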
One of the concerns I have is that I could run it here with the git SCM checkout, but I need to expand the profiling, maybe to where I'm scanning multiple branches and doing things that may show me different places in the git plugin, because right now I'm focusing only on the checkout step, and I think I already know the operations I can get from checkout. If the objective of profiling is to find out which operations we need to benchmark, then with checkout I think I've understood that. But if the objective is also to profile and understand the hot code paths, where our code is taking more time and things like that, then I need to study this data more and understand profiling better. With my current knowledge, I could understand which methods are being run, and I know that git fetch is going to be called twice and checkout once. I also compared it with the CLI git profiling result, and there I can clearly see git fetch being called for 14 minutes 55 seconds; of course, it's a large repository we're fetching. So I know this is an operation that takes a long duration of time. So, Mark, I'd like your input on this: how should I proceed with profiling?

Okay, this is amazing. What you've done is: you've now found a tool that will let you highlight which exact git operations are expensive. That's really quite impressive, Rishabh. What I see from this is that for the CLI git implementation, you can now see exactly which command-line git invocations are costing time, and then we can reverse it: go back and look at who's calling command-line git that way. Yes, we know about git fetch, and we know it's doing 2x the number of fetches. We know about that one, and we're glad to know about it. So this reinforces that taking out one of those two git fetches is a big win. But it also gives us a sample that lets us ask: after taking out the second fetch, did the total I/O time improve? Did we get something better? In terms of the profiling itself, to me it feels like it may yet be too early to do much more with it than you've already done, because you've confirmed that the hotspots are exactly where we're working: fetch is the hotspot, there's no question, the profiling says so. So at least my sense would be: note this in your report. Look, here's the evidence from Java Flight Recorder that git fetch is clearly the dominant operation, and it's dominant by, it looks like, one or two orders of magnitude. There's nothing else that is remotely close to 14 minutes; everything else is maybe seconds.

We have another git fetch that's up at around 2 minutes 39 seconds.

Okay, which reinforces it, which makes it even worse. Now the story is even more dramatic: we thought git fetch would be the dominant thing on a larger repository, and we were right.
Here it is, here's the evidence, and you captured it. I think your focus on the JMH benchmarking is the high-value focus for now, because you're going to use that to tune which options we pass to git fetch, and how many times we call git fetch, to improve it.

Yes, Mark. And I think the follow-up question to this is: I'll move forward with the current tasks I have in hand, but now that I understand git fetch is an issue, that we have an existing performance issue we're going to solve, the redundant double git fetch, suppose I also try git checkout. What should I do? Should I first try operations out with JMH, note my observations, and try to understand why things behave the way they do? Or, once we know the places where we want to switch implementations, should we work on the implementation part, the actual performance enhancement? Do we select one operation and work toward implementing its performance enhancement, or do we first make sure we've covered all the operations we thought were blockers for git plugin performance, and then move forward with implementing, maybe the idea I had of keeping it as an opt-in performance improvement, and work on that? Or maybe a good strategy would be to do both in parallel, because finding operations and understanding why an operation takes more time, maybe more than CLI git for a certain particular scenario, is a research kind of thing we'd do along the way, while also looking at the implementations to consolidate the performance enhancement. What would you suggest, Mark?

My personal preference: you've done an initial survey across several things and shown that one thing is the problem, git fetch, and it is clearly the most interesting. I think we would benefit the community, and benefit you and your project, most if we found a way to measure and deliver that improvement all the way. So: get the performance benchmarks in, be sure that we understand them and that we believe them across multiple platforms and multiple environments, and put into production the code that lets us do the switching, so that we can get it all the way to the users for that one capability. For me, the benefit of doing that is that it means you'll have to do things at many different levels: you've got to figure out how to do the switch in the code, how to get it shipped to production, how to get it released as a new release of the plugin. All those things on the vertical are intensely valuable and will give immediate benefit to people.

Yes, and that seems like a logical step to me as well. Okay, so we're going to do that.

I'm much less interested in attempting to do things in parallel. If you've identified something that is 80 or 90% of where the performance focus should be, then let's put everything we can behind getting that one thing all the way to users.

Okay, that sounds great, so we're going to do that.
I think the next thing, after the redundant git fetch issue: is that something we're doing in parallel?

Well, I would describe it as: your work on JMH benchmarking is proceeding in parallel while the code review of the fix for the redundant fetch is happening. Your JMH work is not blocked by my needing to review the redundant fetch removal, and not blocked by Frank needing to review the redundant fetch removal. So with those, there is some parallel work happening, but it's not you attempting to work in parallel; it's you allowing the rest of us to work in parallel with you.

Yeah, that would be great. So the next step I'll take is to think about how I'm going to implement the opt-in feature, to dive into the ways I can implement it. I guess one of the major things is to figure out how to calculate the size of the repository and then switch between JGit and git, because that is the most important parameter we have here. We have to decide that.

Yes, the measurements truly indicate that repository size is compelling. Now, let's see, the sizing we did: you had a very tiny repository of less than 1 MB, the next one is 5 MB, then 90 MB, and the final one is 300 MB. I had forgotten about the 90 MB repository, so that's really good: we've got several points along the way to help you identify the tipping point where, instead of silently using JGit, we should switch over to command-line git.

Okay. I guess we're at 8:40 now, so the last item was status. I'm looking at the test failure on the tests we have for the fix I created. Frank gave me some suggestions on the tests, and I have added those. Now I'm going to look at the tests, and we talked about the clone options, so I'm going to do that as well. That's what I'm doing with that PR.

Excellent. One of the expectations that's been set for me is that I should be spending six to eight hours a week, at least, helping you as a student on the project, and it feels like right now the most crucial thing I can help with is reviewing and analyzing the double-fetch performance change you've prepared, the redundant fetch change. So that's my objective for this week. I hope to have good results to report to you, to say: hey, here's what I learned from my assessment, and we'll discuss it either Friday in our next meeting or next Wednesday.

Sure. Okay.

And if I learn something that I can describe in text, I'll put that in feedback on the pull request as well.

Sure. Okay. All right. Thank you, Rishabh. Thanks very much. Excellent results. We'll see you.

Okay.
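For illustration only, a minimal sketch of what the size-based switch discussed here could look like. The class name, the threshold, and the string return values are all hypothetical; the tipping point is exactly what the benchmarks are meant to determine:

```java
// Hypothetical sketch of the opt-in, size-based implementation switch.
public class GitImplementationChooser {

    // Placeholder tipping point in MiB, to be replaced by whatever
    // threshold the multi-platform benchmark results justify.
    private static final long SIZE_THRESHOLD_MIB = 5;

    /**
     * JGit warms up well and performs comparably on small repositories,
     * while command-line git clearly wins on large ones, so choose by size.
     */
    public static String chooseImplementation(long repoSizeMiB) {
        return repoSizeMiB <= SIZE_THRESHOLD_MIB ? "jgit" : "git";
    }
}
```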