So hello everyone. I think everybody is already a little bit tired from the long day, so I will not make it too long, and I'll keep it a little bit light-hearted; some of the earlier slides were a little bit depressing, so let's cheer up a little now. Okay, let's first do a poll, because I really want to know; I'm thinking somebody was maybe not exactly honest during the EasyBuild survey. So: if you think the right logo is better than the left logo, put up your right hand. If you think the left logo is better, put up your left hand. Do we have that on camera? No? Let's get down to business.

So the motivation for this talk: as Kenneth already explained during his talk yesterday, we cannot keep up with the number of pull requests. As you can see, the number is going up; we are now at about 800 pull requests that are currently open, and that is way too much. And although the number of EasyBuild maintainers has gone up a lot, we are still not keeping up, so I think we need to do something about that. One of the things that can be done is educating the contributors and letting them know how they can help us, so that pull requests can be merged faster. We are of course very happy with all the extra new contributors, but that also brings much more work. And if you have any ideas on how we can improve the process to make it faster, those suggestions are of course very welcome.

So who am I? I'm Sam. You can find me as smoors; that's my handle on GitHub and Slack. I'm also an EasyBuild maintainer. I work at the HPC facility of the VUB, together with Alex. Louder? Okay, I'll try to talk a little bit louder. So I'm working at the HPC facility of the VUB in Brussels, together with Alex and Ward. Alex and Ward are also EasyBuild maintainers, so maybe you already know them. And there on the right is the VUB campus; it's a very nice place to study. So, why should you contribute back?
Well, probably all of you already know why you should contribute back, but let's reiterate. One of the nice things is that EasyBuild supports lots and lots of software titles, and that number keeps going up; as the graph shows, it has even been increasing a bit faster over the last few months. And you become part of the open science movement. That's really nice, because everybody is talking about open science, and this is a way you can contribute to that and make science reproducible. And of course, the community is very welcoming: you can meet many HPC experts in the field from all over the world, you will get feedback from those experts when you make a pull request so you can make it even better, and that way you can support your own users even better.

So, reporting issues on GitHub. What are the things you should do when you report an issue? The first thing is to write a good issue title, of course. If you state exactly what is going wrong, that helps us a lot, because we don't have to open the issue to see what the problem is. It's also a good idea to include all the relevant info: the operating system you were using when you encountered the issue, the EasyBuild version you used, your configuration, and so on. And it's good to include the steps to reproduce the issue, and to use code blocks for formatting. That is something users often forget, but the report is much more readable with code blocks; many of you already know that, of course. This is also an idea for the future: we should probably create an issue template, so it's easier for users to include all that information when they report an issue.

Contributing easyconfig files. There are a lot of things you can do to make our life easier there. One of them is to use a recent toolchain generation. You can submit a PR for a very old toolchain, and you're welcome to do that, but we have a limit.
But even then, it's always better to use the most recent one, because it's more useful for the entire community, so there is a better chance that your contribution will be looked at and merged. We also have a policy that says there is a single dependency version per toolchain generation. It is a quite annoying rule, but also a very handy one. It's annoying because sometimes you want multiple versions, but if you allow multiple versions, you might end up in dependency hell; that's why we have the rule. There are some exceptions, but try to stick to it as much as possible. If you don't, the GitHub checks will warn you, and then you will see that something is not correct.

Always add sanity check commands if possible. If you're building software that provides an executable, try to include a sanity check command for that executable; it's a minimal test that the software works. Also, when you have a PR that depends on another PR, it's always good to mention that at the top of the PR, so maintainers can see that this one depends on the other one, and that they have to look at the other one first before this one can be merged. And if possible, try to upload a test report for the PR. If you do that, we know that the easyconfig in this PR is known to work, at least in some cases, and that will help us get it merged faster. And of course, try to use the GitHub integration features as much as possible; they are there to help you, and they make contributing easier and more fun, right?

Also, if you want to make a PR for something that is not quite ready yet, you can do that too: you can already test it and see if it passes the checks, and then you can convert your PR to a draft PR on GitHub. So for example, this PR from Nikhet is already a draft one, but I can look at another one.
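As an aside, to make the sanity check guideline concrete, here is a sketch of how the tail end of an easyconfig can look. The package name `mytool` and its installed paths are made up purely for illustration; `sanity_check_paths` and `sanity_check_commands` are the actual easyconfig parameters.

```python
# Hypothetical tail end of an easyconfig file; 'mytool' is a made-up
# package used only to illustrate the parameters.

# minimal check that the expected files and directories were installed
sanity_check_paths = {
    'files': ['bin/mytool'],
    'dirs': ['lib'],
}

# minimal check that the installed executable actually runs
sanity_check_commands = ["mytool --version"]

moduleclass = 'tools'
```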
Here on the right, you can say "convert to draft", and then we know we don't have to look at this one yet, because it's not ready, so it's easier for us.

How do maintainers review and test PRs? One of the things we do is compare the easyconfig with other versions using `eb --review-pr`; with that tool it's very easy for us to spot differences between versions. We also make test reports, with the command shown there. And then we use the bots; there are currently two bots that we can launch to run test reports. They are available to maintainers, but they can also be made available to regular contributors: if we know somebody is a trusted contributor and they ask, we can give them access too. The biggest blocker for a PR to be merged is that it's not yet fully tested; usually we require that it's tested at least with the two bots that are available.

Okay, let's try a little demo. I'm counting here on some of the reviewers and maintainers that are in the room. The suggestion was to take PyTorch 2.0, but that's going to take a little bit too long. So what do we have here? Can I have silence in the room, please? This is an example that builds eb-tutorial. You can build it if you want; it's just a small example, so we're not going to include it in EasyBuild, but it shows the workflow. This one looks very nice, it seems to have everything we need, but I'm going to break it a little bit: I will delete the checksums and see what happens. Okay, this is annoying; let me switch to the mouse. One of the things we can do is check whether it's ready to go, and that's with `eb --check-contrib`. I hate glasses. So it shows me that it passes the first test, the style check, but the second test fails, because there are no checksums.
How can we inject the checksums? That's also easy: `eb --inject-checksums`. And now it has created a backup of the original file, and we can check the difference. Yes, it has inserted the checksums for us, all automatically.

Okay, now we want to upload it to GitHub and make it a PR. For that we need the GitHub integration, so let's first check that, with `eb --check-github`. My GitHub user is okay, I'm online, I have my GitHub token installed, I can use git commands, I have the GitPython module installed, I have push access to the repo, I can create gists, and I also have a location for the working directories. All checks pass; that's good, so I can now push my contribution. One issue you may run into if you are running EasyBuild 4.7.1: there is a bug that will tell you that gists are not working. There is already a fix in develop, so don't worry if that check fails.

Okay. Yeah. That's not going to work if I don't specify my... okay. So now we should be able to see the pull request. Here it is, and we see that it's immediately running all the tests, all the checks. If I want to leave a comment for the maintainers, I can put it here. And I already got a comment from an annoying maintainer. Okay, yes, I like it. Anyway, let's make a change like this.

The reason why we are quite strict about formatting is that it makes it easier for us to check the differences. I will show you. For instance... yeah, this is not a good example. Maybe I should first update this one again. Yeah, I have a wrapper for that; that's why I don't remember the exact command. And then the comments... without it... yeah, of course. That's the one. Okay. So what the maintainers do to check, for example, this RStudio PR, is to run this command. Here we are; it's this one from me.
And as you can see, it compares with four other versions of RStudio, so we can easily see what the differences are. If every easyconfig is formatted in the same way, that is much easier for us. Of course, we can also limit the search and say I only want to compare with this one, for example, and then it shows me only the differences with that one. You can run that yourself as well, and I can also run it for my own easyconfig with `--preview-pr`; then it's all local. For this example there is of course no other easyconfig found, so it cannot compare with anything. That was the demo. Any questions? Of course, of course. Yeah, okay. Okay, but you get the idea.

Contributing easyblocks. One important thing here is that we maintain backwards compatibility, so you can put your code inside a version condition, like the example code shown there, so that it only affects the new versions and not the old ones. You can open PRs for easyblocks with the same `--new-pr` option. Kenneth told me this morning that the `--pr-target-repo` option is actually not necessary, because it automatically detects that the PR is for an easyblock, but you can override it; if you want to be sure, include the option and then there should be no problem. You should also try to upload a test report for the easyblock, and you can do that like this: you add the `--include-easyblocks-from-pr` option with the number of the easyblock PR, together with an easyconfig, and then it will send the test report to the easyblock PR, so we can see whether it works or not. With the bots we can do the same, via the EB_ARGS option. And keep in mind that we don't control all the easyconfig files out there; we don't know what other users might be doing, they may have their own repos, and we don't want to break them.
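The version-condition pattern mentioned above can be sketched like this. This is a simplified, self-contained illustration: real easyblocks typically compare `self.version` using `LooseVersion` from `easybuild.tools`, and the `--enable-new-feature` flag is a made-up example of a flag that only exists in newer releases.

```python
def version_tuple(version):
    """Crude numeric version parser, good enough for this sketch."""
    return tuple(int(part) for part in version.split('.'))

def configure_opts(version):
    """Build the configure options, guarding new behaviour behind a
    version check so that older easyconfigs keep working unchanged."""
    opts = ['--prefix=%(installdir)s']  # applies to all versions
    if version_tuple(version) >= (2, 0):
        # hypothetical option that only exists since version 2.0
        opts.append('--enable-new-feature')
    return opts
```

Calling `configure_opts('1.9.3')` returns only the common option, while `configure_opts('2.1.0')` also includes the new flag, so the behaviour for existing versions is untouched.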
So this is how an easyblock test report can look. This one is for TensorFlow; TensorFlow of course we want to test very rigorously, for all the previous versions as well, to make sure that all changes made to the easyblock are still valid for the old versions.

Then, contributing to the EasyBuild framework. This is something Alex will talk about tomorrow in his talk, but here are just a few points to take care of. Maintain backwards compatibility, of course, but also make sure that all changes are covered by the test suite. I know the test suite is not always easy to contribute to, so if you need help with that, make sure to ask, for example in the Slack channel; someone will certainly help you, at least Kenneth if nobody else can. Bart is saying that it's quite challenging to make a framework PR without getting errors from the Hound; yeah, it's a style checker, and it's quite strict.

One thing that users can do to be able to contribute more is to use hooks. This was already mentioned several times during the meeting, but it's a really handy feature. For example, suppose you have a PyTorch version installed that is different from the one in the EasyBuild repo, because of the single version dependency policy. Say we have PyTorch 1.12.0 in the EasyBuild repo, but you really need 1.12.1 for a specific use case on your cluster, because it contains an important bug fix or something. This can be easily handled in a hook, for example like this. Then you don't have to make any changes to the easyconfig file, and you can still contribute everything as if it were 1.12.0, while actually using 1.12.1. If you want more examples of hooks, please take a look at the examples Alex gave last year; there's a link there, and there's also a link to the hooks documentation.
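The PyTorch example above might look roughly like this as a hooks file. This is a sketch based on EasyBuild's documented hooks API: `parse_hook` is called with the parsed easyconfig, which supports dictionary-style access, and the file is activated with `eb --hooks=<path-to-this-file>`.

```python
def parse_hook(ec, *args, **kwargs):
    """Transparently swap PyTorch 1.12.0 for 1.12.1 on this site.

    The easyconfig file itself stays untouched, so it can still be
    contributed upstream as the 1.12.0 version.
    """
    if ec['name'] == 'PyTorch' and ec['version'] == '1.12.0':
        ec['version'] = '1.12.1'
        # the checksums in the easyconfig are for the 1.12.0 sources,
        # so drop them to avoid a checksum mismatch on the new tarball
        ec['checksums'] = []
```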
Yes, in the framework directory there are also more examples.

And then this is my last slide. We are trying to make the documentation much better, and we hope that many people will also contribute to the documentation. Writing documentation is very different from writing easyconfigs; you need different skills, so if you have good writing skills, please take up the challenge and try to help improve the documentation. That should be a lot easier now, thanks to the migration to MkDocs. What you can help us with is finding gaps: missing info, or info that is outdated. And, I forgot to mention this earlier, but when you make a change in the framework, we are going to require that you also reflect the changes in the documentation.

So I'll give a small demo for this one as well, a very small one. If you go to the easybuild-docs website... there it is. So this is the website for the documentation, and it's also a GitHub repo. You just clone it; I already did that here. Then you install all the Python packages with pip, using the requirements file. I did that in a virtual environment, so I have to activate it first, and then you're done: you can start rendering the documentation locally with `mkdocs serve`. It shows a bunch of warnings; I don't know if they are important. Okay, the webpage is being served now, so I can just open the link. And here it is. The nice thing about it is that it's reactive: if you make changes, the page is automatically updated. So let's make a quick change to the documentation. It's not there... maybe if I move it here, make it a bit bigger. Yeah. Okay, that was it. If you have any questions, or if you want to discuss... how are we doing on time?

The comment from Alexander is about pre-commit hooks. I'm not sure how you would force people to use those, because pre-commit hooks are always local; you have to configure them yourself.
Okay, so you're forcing your colleagues to actually use that. That's something we could, let's say, encourage; I wouldn't want to force it. But one thing that's important to realize: the whole reason we have the GitHub integration, things like `--new-pr`, is that we had contributors who were not familiar with Git at all. They were almost throwing easyconfigs at us, emailing them to us and saying "I want to get this in", and we wanted to enable them to still contribute back. If we're going to force people to install pre-commit hooks, they will get lost again and we will lose them; I'm not sure that's a good idea. Well, but that's effectively happening already: whenever you open a pull request, the CI triggers and does all those checks as well, so it's just going to give you a red X if you don't have the code style fixed.

Automatically correcting it is an option. I think Spack is doing that for code style issues: they have a bot that basically changes the code behind your back to make sure it's PEP8 compliant. We could do that. Doesn't the contribution check do that? Well, we have an option in EasyBuild itself that does the check, but does it actually change the file? No, it just flags it. But there are tools, autopep8 or black for example, that reformat your code automatically. Yeah, tools like that could be interesting; it could save you a back and forth.

What we also do, which is not standard in GitHub, is that when the CI fails, our bot will pick up on it and give you a comment saying: hey, the CI failed, you should look into it. Because when the CI fails, you don't get notifications in GitHub at all; you don't get an email, you don't get a message. That's why our bot starts adding comments saying, hey, look at this.
So we're basically pushing it back to the developer and hoping that they will get around to actually fixing it. Yeah.

A question from Slack: what is meant by "upload test reports"? Yeah, that's `--upload-test-report`. It's not for issues, it's for pull requests. And: is it useful to submit PRs with newer versions? Okay, let's do the upload test report first. You didn't show that; maybe you can show it for eb-tutorial. But you don't have the toolchain on your laptop. Well, you could try it and let it fail; it won't get very far without it. So the upload test report is a way for us to very easily... you do it with `--from-pr 17799`, just like this. It's going to fail because I don't have the toolchain installed, but it won't get very far. What this does is try the build locally, and then basically push the result of doing that to the PR. That last comment is a test report made by the boegelbot; that one was successful, and mine was not, because I don't have the toolchain installed. That's the uploading of a test report.

There was another question: is it useful to submit PRs with newer versions of software, but built with an old toolchain? I'll leave that one to you. Yes, it's useful, as long as the toolchain is not too old. You're welcome to contribute those, but they're less useful than the ones with the latest toolchain. The problem I sometimes have is that the software is no longer properly maintained: it's four years old, I try one of the latest toolchains, and it fails because it has old-style, crappy (pardon my French) code in it, but the user is jumping up and down in front of me, telling me "I really need it for my thesis, needed by yesterday" and all of that.
So then I go back and say, okay, I can compile it with, for the sake of argument, GCC 9. But then the question is: is it useful to the community? And anybody who is currently listening probably also has users, and maybe line managers as well, jumping in front of them. So what I do then is open a pull request and leave it up to the maintainers, so that at least, if somebody has the same problem, they can fall back on my pull request. Is that something which is useful? I think it's still useful, especially if the software is new. Then it's a valuable contribution, because it's new software that we support from now on, and even if it's not perfect, even if it's with an older toolchain, we can still update it later, and it saves us time. You can have multiple versions per toolchain, but only one of those versions can be a dependency of others in the same toolchain. And for Python we are quite strict, because a lot of software depends on Python; if we allowed multiple versions of Python, it would really become a hell.

Yeah, the reason we have that policy is that whenever we add something to what we call a generation of easyconfigs, meaning everything using a particular toolchain, we want to make sure it's as compatible as possible with everything else using that same toolchain. As soon as you start using a different Python version, you're basically forking off into a different world of its own, where everything needs to use that Python version. That's something we learned the hard way: if we don't have a policy like this, it gets very messy very fast. For Python we've always more or less implicitly done it, with the exception that a while ago we had both Python 2 and Python 3, so that was already a fork in the same generation. But where we really started hitting problems was with Boost.
When Boost was used as a dependency, and we had two versions of Boost, and two packages built with those two versions somehow became a common dependency of something else, it got really hairy and was very difficult to get out of. This policy allows us to avoid that problem as much as possible. Where we make exceptions is when there's really a technical reason why something doesn't work with a newer or an older version of the dependency; then we say, okay, this is an exception, and the implication is that it's not going to be compatible with other stuff in that generation, but fine, that's the best we can do. Well, for build dependencies the policy doesn't apply, so there we don't have to make exceptions at all; it's only for runtime dependencies. So we have cases with multiple CMake versions in the same generation, but that's okay: you can never get a conflict on that when actually using the modules once they're installed.

There was a follow-up question as well: what about software with extra options, like an extra dependency that is only needed for some use cases of the software? Well, it's actually not a problem to add an extra dependency, even if you don't need it. But if you need something that is incompatible with the other dependencies, for example, you can still do that, but then you will have to add a versionsuffix, so it's clear it's another version of that, not a variant.

Okay. Do we have any other questions in the room before we wrap up? Yeah. Okay, a question about the robot. So `--update-pr` and `--new-pr` are smart when you have the robot enabled: if you do `eb --robot --new-pr`, it's going to take your easyconfig file, look at which dependencies it needs, check which ones are already in develop and which ones are not, and the ones that are not, it's going to include in the pull request.
So you may have the robot enabled in your default EasyBuild configuration, but that also affects what `--new-pr` and `--update-pr` do. Okay, but that's a bug? Yeah. So if `--update-pr` is complaining about not finding an easyconfig file for a dependency... In separate PRs? No, it shouldn't, and if it does, that's a bug. So if `--update-pr` is complaining about not finding easyconfigs for dependencies, it shouldn't. Now, if you do `--upload-test-report`, that's different: there the dependencies have to be installed already, so when you want to upload a test report for cuDNN, the CUDA module has to be there. Well, we can take a closer look if you have a specific example and see what's going on. I suspect that maybe it's because you have the robot enabled in your EasyBuild configuration, so you don't have to worry about dependency resolution, but that has an effect on what `--new-pr` and `--update-pr` are doing. And that can be disabled: there are two separate configuration options. There's the `robot` option, which tells EasyBuild where to find easyconfigs and also enables dependency resolution, and there's the `robot-paths` option, which only tells it where to find stuff and does not enable dependency resolution. So maybe you just have to switch to the other one, and you should be fine; or we can take a closer look at it.

Is it a comment or a question? Yeah. So the CI tests will fail if the easyconfigs for your dependencies are not found, and that's actually a good thing, because it blocks the PR from being merged until the other stuff is merged. So that's very deliberate as well; that's definitely not a bug. Okay, good. So we went over time a little bit, but not too much.