Okay, so welcome everybody again. We're going to continue from where we left off last time, and the plan today is to finish this tutorial and also get into dependencies. So, to pick it up from where we were: we were discussing how you could inject scheduler options into the generated job script, and that is one way. The problem with that approach is that it requires you to know the backend scheduler, Slurm in this case. For this reason there is also another way that is a bit more portable and can be adapted to different backends, and that is what you can achieve with this feature called extra resources. These are resources that you define in your configuration and associate with any kind of scheduler options: it could be the number of GPUs you request, like with GRES, or processor types and things like that. Let's see how this is done. We have to modify the configuration and add one section called resources inside the partition. I'll explain in a moment exactly how it works and what it means. Essentially this is a list of resource objects, each of which has a name. That name is what you use to access the resource from within your test, and then there is a set of options that will be passed to the scheduler. Those options can contain placeholders which you fill in from your test, and that's quite flexible, because you can name your resources as you want. For example, in our real tests we have a resource named switches, I think, which essentially asks Slurm to pass the --switches option, which restricts the job allocation to a set of nodes under the same switch. The good thing is that, although you specify those resources in your test, like we do here,
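To make this concrete, here is a minimal sketch of what such a resources section in a partition of a ReFrame configuration file can look like. The partition layout, resource names, and option strings are illustrative assumptions, not taken verbatim from the session; the placeholder substitution shown at the end is ordinary Python string formatting, which is how the values supplied by the test end up in the scheduler options.

```python
# Sketch of a partition with a 'resources' section (illustrative values).
partition = {
    'name': 'cn',
    'scheduler': 'slurm',
    'launcher': 'srun',
    'resources': [
        {
            # Name used from the test via the `extra_resources` attribute
            'name': 'memory',
            # Scheduler options; `{size}` is a placeholder the test fills in
            'options': ['--mem={size}'],
        },
        {
            'name': 'switches',
            'options': ['--switches={num_switches}'],
        },
    ],
}

# In the test you would request the resource roughly like this:
#   self.extra_resources = {'memory': {'size': '100m'}}
# ReFrame then fills the placeholders with those values:
opts = [o.format(size='100m') for o in partition['resources'][0]['options']]
print(opts)
```

The emitted options are then injected into the generated job script for the scheduler of that partition.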
if I'm running on a system where this resource is not defined, it is simply ignored, whereas with the first approach you would have to guard it yourself: "if I am on that system, then use that option". So it's more flexible, let's say. Now let's see it in practice; I think the test is in here. One other thing: if you pass both the -l and the -r options, the -l option, which is about listing, always takes precedence. Okay, let's see if we got it. Apparently the test passes, so it seems it has set the memory limit, because we're checking for the actual out-of-memory messages there, but we can have a look. Let me know if the font is too small. I think we were in the environment directory; this is the memory-limit-with-resources test, and there you have it: without having to know what the underlying scheduler option is. I find this quite useful, especially if you have, for example, a mixed cluster where you don't want to define separate partitions for some tests. It's a more test-specific resource specification, that's it. So that's about it. One other thing is how you can modify the parallel launcher command. Sometimes you might need to pass options to the parallel launcher, like mpirun or srun, and by options I mean options that are not the standard ones like -n 4, which ReFrame generates automatically if needed, based on the launcher you have specified in your configuration file. For example, in this case there is an affinity test; it doesn't actually check the affinity, it just runs an affinity program. There we want to pass --cpu-bind=cores to the launcher. Now, how you can do this, let's open the test; it should be here.
So again, very similarly to what we did with --mem: with a hook, you can access the launcher through the job, so self.job.launcher, and add options to it; here we add --cpu-bind=cores. Now, if you ask me whether you can do that more portably, I would say you can't, for the moment. So if you want to guard it against different systems, you have to do it manually: "if I am on that system or that partition". This is going to add those options to the launcher command, and I'm going to run it right now. We call that affinity. Oh yes, I forgot to add our cluster. Let me also put this here. Okay, let's run it. A hint here: when you have affinity tests, always test all the different programming environments, because different compilers and different OpenMP runtimes make different assumptions and can do different placements; we have seen differences, for example, in how the Intel runtime interacts with the srun affinity options. So whenever you have an affinity test, keep this in mind. Now let's look at what ReFrame generated. Depending on the backend, if you had, for example, mpirun, ReFrame would generate mpirun -np 1 and so on, and then it would add the options the user passed. Now, another thing: what if you want to test a debugger? With a debugger you usually have to wrap the parallel launcher command. Here is how you can do that: you basically replace the launcher with a kind of pseudo-launcher that takes the actual one and wraps it with your own command.
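Here is a minimal sketch of such a hook, assuming a ReFrame 3.x test written in the init style; the source file name and sanity pattern are illustrative assumptions, and the snippet requires the reframe package to be installed.

```python
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class AffinityTest(rfm.RegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['*']
        self.sourcepath = 'affinity.c'   # hypothetical affinity program
        self.num_tasks = 4
        # Hypothetical output of the affinity program
        self.sanity_patterns = sn.assert_found(r'CPU affinity', self.stdout)

    @rfm.run_before('run')
    def set_launcher_options(self):
        # Extra, non-standard launcher options; ReFrame itself only
        # generates the standard ones (e.g. `srun -n <num_tasks>`).
        self.job.launcher.options += ['--cpu-bind=cores']
```

The hook runs just before the run phase, when self.job already exists, which is why the launcher cannot be touched in __init__.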
And then it will generate this. I don't have a live example of this, because it would require having DDT on the cluster, but this works. Closely related is the case where you want to test something that has its own launcher, for example GREASY, or some visualization software; these come with their own managed parallel launchers that behind the scenes leverage mpirun or srun, but that is not the user-facing command. In those cases the trick is this: you set the executable to the custom launcher, for example greasy, pass any options there, and then you replace the launcher that ReFrame would normally use with the local launcher. The local launcher emits practically nothing; it's the one used when you just run your executable locally, its launch command is practically the empty string. This getlauncher() function takes the registered name of a launcher; you can find the registered names in the documentation, with more details of what each one does. It gives you back a launcher class, which you can then also tweak if you want, but that's the way to go in those cases. Okay, how about if I want to have multiple srun commands, multiple parallel launches? In this example we have a test that runs multiple srun commands: we get an allocation of a certain number of nodes, and we have four srun commands, each one running on a number of nodes. You could theoretically need that; there are cases where you might have to pack multiple srun invocations in a single script.
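A sketch of the custom-launcher trick, assuming a ReFrame 3.x layout where getlauncher lives in reframe.core.launchers.registry; the GREASY invocation and its task file are hypothetical, and the snippet needs reframe installed.

```python
import reframe as rfm
import reframe.utility.sanity as sn
from reframe.core.launchers.registry import getlauncher


@rfm.simple_test
class CustomLauncherTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['*']
        # The tool with its own managed launcher becomes the executable;
        # 'tasks.txt' is a made-up option for the example.
        self.executable = 'greasy'
        self.executable_opts = ['tasks.txt']
        self.sanity_patterns = sn.assert_found(r'finished', self.stdout)

    @rfm.run_before('run')
    def replace_launcher(self):
        # The 'local' launcher emits no wrapper command at all, so the
        # executable line appears in the job script exactly as written.
        self.job.launcher = getlauncher('local')()
```

Note that getlauncher('local') returns the launcher class, which is then instantiated before being assigned to the job.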
Or when, for example, you want to do a scaling study, although then you could also use parameterized tests; but let's see this example for the sake of it. So here, again, you have to fiddle a bit with self.job.launcher, and again you prepare everything before the run. There is a small thing here: the launcher has a method, and all of this is documented, if you go into the reference you can see the exact documentation for these functions. Here you just say: give me the command that you would run, based on this job. It always takes the job, because the job holds information such as the number of tasks, and some launchers need that information; with mpirun, for example, you need to pass -np. What this gives you back is the actual launch command, which you can then use to add additional options, like we do in this case: as you will see, it is essentially srun, and then we just add more options and our executable. Here we generate everything in the pre-run commands; you could also put things in the post-run commands. Remember that the pre-run and post-run commands go before and after the parallel launch of the executable. So again, I would need, ah yes, here, we always forget. Okay, we have an error, but let's examine it. It says there is a sanity error, in an assert_eq: instead of getting 10 we got 1. Let's go back to the test. Okay, the pattern is different, because on Daint the node names start with nid, but here they are named differently. So let's make the test a bit more generic: if I am on this system, my node pattern is this; otherwise it is the other one; we could even use a generic pattern that matches anything. And then let's also put this part here.
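A sketch of the multiple-srun pattern, assuming the ReFrame 3.x JobLauncher.run_command() API; the node counts and the hostname payload are illustrative, and the snippet needs reframe installed.

```python
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class MultiLaunchTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['builtin']
        self.executable = 'hostname'
        self.num_tasks = 4            # size of the overall allocation
        self.num_tasks_per_node = 1
        self.sanity_patterns = sn.assert_eq(
            sn.count(sn.findall(r'^\S+', self.stdout)), 10
        )

    @rfm.run_before('run')
    def pre_launch(self):
        # run_command() returns the plain launch command for this job
        # (e.g. 'srun'), which we can decorate with extra options.
        cmd = self.job.launcher.run_command(self.job)
        self.prerun_cmds = [
            f'{cmd} -n {n} {self.executable}' for n in range(1, self.num_tasks)
        ]
```

The pre-run commands land in the job script before the regular launch line, so together with the final launch you get self.num_tasks launches in one allocation.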
And here we simply say that it should be 10; that should do the job. Okay, it works; it seems our sanity check passes. But let's also examine the output a bit. You see, everything here is in the pre-run commands, and you get the output as you would expect. We could even be more strict here and count the instances of how many times we've seen the same node, things like that, but let's keep it simple. Okay, change of topic: flexible regression tests. I need to rush a bit. A flexible regression test is a test that sets num_tasks either to zero, like in this case, or to a negative value, which means "I need at least that number of tasks". This one is a hostname check that I'm going to spin up, and it will essentially run on all the nodes. Why am I using grep here? Yeah, it should have been using the sanity built-ins; I just forgot to update that, it's a bug in the documentation. This is an interesting feature for diagnostics, because by default it will span all the nodes that are idle. For example, let's examine the output here: you see we got the whole cluster. Now, if multiple of you run the same test at the same time and ReFrame can't find a free node, the test will fail. The other cool thing about it is that it doesn't take all the nodes of the whole cluster; it takes the nodes that correspond to the partition you are accessing. So if you have constraints, if you have -p partition and so on, it will all be taken into account. And you will see that this is only valid for the Slurm backend, and of course the local backend, where there is always a single node.
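A sketch of a flexible hostname check using the sanity built-ins instead of grep, as the speaker says the documentation should; the deferred sn.getattr is what lets the sanity expression see the task count that ReFrame fills in only after the allocation is known. Requires reframe installed; the node-name pattern is kept generic on purpose.

```python
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HostnameCheck(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['builtin']
        self.executable = 'hostname'
        # Flexible task allocation: 0 means "use every node available in
        # the current partition"; -N would mean "at least N tasks".
        self.num_tasks = 0
        self.num_tasks_per_node = 1
        # sn.getattr defers the lookup of num_tasks until sanity checking,
        # by which time ReFrame has set it to the real task count.
        self.sanity_patterns = sn.assert_eq(
            sn.getattr(self, 'num_tasks'),
            sn.count(sn.findall(r'^\S+', self.stdout))
        )
```

The check thus verifies that every node in the allocation actually printed its hostname.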
Another interesting thing: if I now pass an option directly to Slurm, say -J partition=..., I'm going to get only half of the nodes, because it will take that option into account. Ah, but that partition is down; sorry. Okay, I'm going to do another thing: if I put a nodelist here, with just three nodes, and I have to quote it so that the shell is happy, now it's going to take just these three nodes. Let's see. Yeah, you see; that's very useful, for example, when you have a reservation of nodes, or you want to run a benchmark on one specific node, things like that. I find it quite useful. And you can do several tricks here. One thing I didn't mention: the num_tasks attribute of the test will be set by the framework as soon as it knows the actual number of nodes. So you can use it with this deferred expression here to check that you actually got the number of nodes you asked for, or do something else with the number of nodes you got; for example, here we check that we got a printout from every node. I have 10 minutes only, so moving on to how you can test containerized applications: you can launch containers with ReFrame. The first thing you need to do is adapt your configuration. There is this object here, container_platforms, which is a list of the actual container runtimes that you have. In this case we just have Singularity; on your laptop you could also have Docker. It's another thing that goes per partition, and what you put here is the type of the container platform. In case there is a module that you need to load, or some environment variables to set, you can specify those as well; in our case it's already installed, so we don't need to load anything.
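A sketch of the per-partition container_platforms entry just described; the partition layout and module name are illustrative assumptions, not taken from the session.

```python
# Sketch of a partition with a 'container_platforms' section.
partition = {
    'name': 'cn',
    'scheduler': 'slurm',
    'launcher': 'srun',
    'environs': ['builtin'],
    'container_platforms': [
        {
            'type': 'Singularity',
            # Optional: modules to load (or env vars to set) so that the
            # runtime is available; omit if it is already installed.
            'modules': ['singularity'],
        },
    ],
}

types = [p['type'] for p in partition['container_platforms']]
print(types)
```

A partition can list several platforms (e.g. Singularity and Docker), and the test then selects one by name.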
So that's how it is from the configuration side. From the test side, it's this one; it's very similar to the build system in the way you write it. Let's also add our cluster here with the valid programming environments; practically, this is why you need a built-in environment that loads nothing: it's for this type of test, where the compiler doesn't matter. Then you say the container platform is Singularity, and there are several things you can set there: the image and the commands you want to run, which are common to all platforms, and some others that are more specific to each platform. The self.executable is not going to be taken into account. You can also specify a working directory, which, honestly, I don't remember exactly; I think this is where it's going to bind-mount. What ReFrame does is bind-mount the stage directory inside the container, and I think the workdir is where it does that, but I'd need to check. And the nice thing is that you can then still use the machinery ReFrame gives you, with the sanity patterns, to check what has run inside the container, because essentially the stage directory has been bind-mounted inside the container. Let's run the test right now. There was a question; yeah, just give me a second to run the example, Kenneth. Okay, let's first see what it generates. This is what we generate: a singularity exec, and there you see how we bind the stage directory inside the container as the work directory, and then we just run what has to be run. And let's also see the output: you got the output as expected, which is what ReFrame then looks at in your sanity checking.
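A sketch of the test side, following the ReFrame 3.x container API: assigning a platform name to self.container_platform converts it into a platform object whose image and commands can then be set. The image and sanity pattern here are illustrative assumptions; the snippet needs reframe installed.

```python
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class ContainerTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['builtin']
        self.container_platform = 'Singularity'
        # Image and commands are examples; self.executable is ignored
        # once a container platform is set.
        self.container_platform.image = 'docker://ubuntu:18.04'
        self.container_platform.commands = ['pwd', 'cat /etc/os-release']
        # The stage directory is bind-mounted inside the container, so
        # sanity checking on stdout works exactly as for native tests.
        self.sanity_patterns = sn.assert_found(r'Ubuntu', self.stdout)
```

Because the stage directory is shared with the container, files the commands write there are also visible to the sanity and performance checks.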
So yeah, Kenneth asks: can you easily parameterize over different container images as well? Say you want to run the same test but in different images. Yes, you can have a parameterized test here and parameterize over the image. "That's something we're interested in for EasyBuild, running EasyBuild in different versions and configurations." Yeah, you have the whole machinery of ReFrame; the only thing it ignores is self.executable, because then the semantics wouldn't be so straightforward, so we just put the commands in here. Okay, moving on to another chapter. It's five minutes, we can't make it; can I have like another five? "Yes, you can, but we will have to end the stream definitely 10 before the hour, and ideally that includes some questions at the end as well." Okay, that's fine, I thought we had like five minutes for some reason. So, dependencies; that's a nice chapter. Before going into dependencies, I think it's best to first explain something, so that you understand why dependencies between tests in ReFrame are not as straightforward as you would imagine them for a build system like EasyBuild or Spack. I'm going to point you to some things in the documentation instead; it has some nice figures. The first thing is how ReFrame executes a test; that's one thing you need to keep in mind. Here is your idea of a test: you say this test is valid for those systems and for those programming environments. Now, when ReFrame loads the test from disk, it instantiates it.
So it will call its init method. That's one thing to keep in mind: no matter what, it will instantiate the test, whether it is meant for the system you're on or not; the filtering happens just afterwards. It will then filter the tests based on the options you pass, on which programming environments need to be tested, and so on. And from that point on, it generates combinations by cloning the test for the different programming environments and the different partitions. So it creates, out of your test, a proliferation of what we call test cases, and those are what is eventually sent down to the runner, which is responsible for the whole run. The runner knows nothing about the tests except an API: all it takes is the test cases, which are specific instantiations, or rather clones, of your test, and it schedules them for running. Now, the dependencies work at that level: you write your test at the higher level, but the dependencies essentially work at the test-case level, not at the test level. Before we come to that, let me conceptually first open a test with dependencies; this one is enough, we'll go to the examples later. What creates a dependency is this: depends_on, and you put the name of the target, here T0. Conceptually that means T1 depends on T0. However, in reality T0 is not just a single test, it has multiple test cases, and the same for T1.
So the question, when we were designing this, was: do we need to give control over how individual test cases are connected? And we came to the decision that yes, it's better if you can do that, because, for example, you might not necessarily want to depend on a test case for an irrelevant programming environment, or one on an irrelevant partition. So you define your dependencies conceptually, but then they are expanded into a more detailed graph, and there are different ways, shortcuts, to specify those connections. There is by_case, which is the default: as soon as you say that T1 depends on T0, it practically means the graph is like this: I only depend on T0's test case for the same partition and the same programming environment; there is no cross-cutting here at all. Of course there are other modes. fully would be the obvious implementation that gives you no flexibility: no test case of T1 will start running until all the test cases of T0 have finished. So depends_on takes a how argument, which describes how you depend on the other test. fully means that the test cases of T1 can proceed only once all the test cases of T0 have finished, and by finished we mean having completed the pipeline: sanity, performance, and cleanup, everything. And then there are also by_part and by_env; I'm not going to go into the details here. That's the basic thing to understand about dependencies.
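To make the expansion concrete, here is a small plain-Python sketch (not ReFrame code) of how the conceptual edge T1 → T0 fans out into test-case edges under the by_case and fully modes; the partition and environment names are made up for the example.

```python
from itertools import product

partitions = ['login', 'cn']
environs = ['gnu', 'intel']

# ReFrame clones each test into one test case per (partition, environ) pair
cases = {t: list(product(partitions, environs)) for t in ('T0', 'T1')}

def expand(how):
    """Expand the conceptual edge T1 -> T0 into test-case edges."""
    edges = []
    for p1, e1 in cases['T1']:
        for p0, e0 in cases['T0']:
            if how == 'by_case' and (p1, e1) != (p0, e0):
                continue  # only same partition AND same environment
            # 'fully': no restriction -> fully connected subgraph
            edges.append((('T1', p1, e1), ('T0', p0, e0)))
    return edges

print(len(expand('by_case')))  # one edge per matching case
print(len(expand('fully')))    # every T1 case depends on every T0 case
```

With two partitions and two environments each test has four cases, so by_case yields 4 edges and fully yields 16.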
So it's not just the conceptual relation, which is what you usually have in a build system like EasyBuild or Spack, because for us T0 and T1 are in reality multiple things that we're going to schedule and run, so we need to know what the relation is. Let's go to the examples. We picked the OSU benchmarks for the dependencies tutorial and we split the work into, let's say, three broad categories: one test that downloads the code, just for demonstration purposes, one that builds it, and then a set of tests that run the benchmarks. If we didn't have dependencies, each of the benchmarks, for example the osu_latency or osu_bw benchmark, would have to download and compile the code every time, which is not efficient. In this example I'm going to show you how we split that up with dependencies, and let's start gradually. One thing here: we have mixed up the tests; the order in which the tests are defined doesn't matter, and just to prove it, they are a bit mixed here, so the download test, I think, is at the end. The download test just executes wget in its post-run commands and then untars the archive, and apparently it only runs on the login node with the builtin environment. Now, let's go to the build test. For the build test, note that we need foss here, we say that it depends on the download test fully, and we couldn't avoid that, because they run on different partitions. So we depend on the OSU download test fully: we cannot proceed until all of its test cases have executed. Then, as another example of build systems, we're using Autotools.
Yes, use make -j 8, and check for errors. Now, how do we depend on another test, and how do we use the result of that other test? Again, you can do this with the pipeline hooks, with another type of decorator. You define your own function, here we call it set_sourcesdir, because we're going to set the sourcesdir of our test, and you put the name of your target dependency as an argument. The decorator does some magic so that it essentially binds that argument to a special function that you call to retrieve the actual dependency; let's see it in practice, because it's easier. Here we say: from the OSU download test, I want the test case that has run on the login partition with the builtin environment, and from that I want the stage directory, which I then use to make up the path and set my own sourcesdir. The nice thing about this is that this call really gives you access to the actual test case that has run, to the other test object, so you have all the information of the target test, even its num_tasks, all these things, at the state it was in after it finished its execution. So you can really stretch your imagination with the things you can do; it's really, really flexible. But here we just take the stage directory of the target, combine the path, which is essentially the download path, and set our sourcesdir there; that's what the build test does. Now let's see the dependent tests. We have the latency and the bandwidth tests, and since they are very similar, we created a base test; that's also a practice you can use, we use it a lot in our tests, where we define everything that's common: the valid systems, and let's add our cluster here.
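A sketch of that retrieval mechanism, assuming the ReFrame 3.x udeps and require_deps APIs as used in the OSU tutorial; the class names, the archive directory, and the partition/environment pair are illustrative assumptions, and the snippet needs reframe installed.

```python
import os
import reframe as rfm
import reframe.utility.udeps as udeps


@rfm.simple_test
class OSUBuildTest(rfm.CompileOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['*']
        self.valid_prog_environs = ['gnu']
        self.build_system = 'Autotools'
        self.build_system.max_concurrency = 8
        # Full dependency: wait for every test case of the download test,
        # which runs on a different partition (the login node).
        self.depends_on('OSUDownloadTest', udeps.fully)

    @rfm.require_deps
    def set_sourcesdir(self, OSUDownloadTest):
        # The decorator binds the argument name to the target dependency;
        # calling it with (part, environ) retrieves that finished test case.
        target = OSUDownloadTest(part='login', environ='builtin')
        self.sourcesdir = os.path.join(
            target.stagedir,
            'osu-micro-benchmarks-5.6.2'   # assumed unpacked directory name
        )
```

The retrieved object is the target test in its post-run state, so any of its attributes (stagedir, num_tasks, and so on) can feed into the dependent test.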
This base class doesn't set the sourcesdir, because... no, these are run-only tests, so it doesn't matter; I don't have any sources. We set the default number of tasks and tasks per node, and then we set the sanity patterns and our dependency, so whoever inherits from this base will depend on the build test. Here is the latency test; we just put in the description, we inherit from our base class, we have a reference in this generic form, just to have the units, as I showed you before, and then we set the executable. Now, the executable we should take from the build test, since it is generated by it. So again we use the same trick. One thing I didn't mention when I was talking about dependencies: here we say we depend on the build test by environment, by_env. That means the test cases in different partitions are independent, but if I'm running with foss, I want the build test that has run with foss to finish; I don't care whether the build test with Intel or PGI has finished, I will proceed as soon as the foss build finishes. If we didn't have this granularity, you would simply have to wait, for example, for the Intel build to finish, and imagine you have a license server with few licenses: that could block your whole sequence of tests. So, again the same trick here, but I'm not passing anything to the retrieval call, because if you don't pass anything, ReFrame will use the current partition and the current environment, so there's no need to pass anything here. Then I just construct the path of the osu_latency benchmark executable, and I pass also the executable options. The bandwidth test is exactly analogous.
And then the osu_allreduce test, which is exactly analogous, except that in this case we said, okay, let's be a bit more imaginative and make it a scaling test: we scale it over a different number of nodes, so we parameterize the test. Again, each of those tests will depend on the build test; what I really like about this is that it's really, really composable. Okay, let's first list the tests, because there are some interesting things here too. Okay, we've seen all of this stuff; let's list the tests. That's the whole thing. "The download test doesn't have a valid system." Which one, the download test? Oh, yeah, you're right; let's see if we have put it everywhere. Yes. Let's see what we get. Okay, that's a nice sign: it has listed everything. The list option is also a bit smart, especially when you work with dependencies. I'm going to pick just, let's say, the osu_bw test, and list it. So here, instead of listing just the bandwidth test, it listed also all its dependencies, because if you want to run the bandwidth test, you have to run its dependencies, right? So it automatically picks them. "Does it list them in dependency order?" No, it just lists them, in arbitrary order. Now, with an uppercase -L you get more details in the listing. So, again, you get your three tests; let's take them one by one, starting with the basic one, and again they're not in topological order here. The download test: nothing, no dependencies. The build test: okay, now dependencies, so you have the conceptual dependency, which is on the download test, and then you have the actual dependencies.
The actual dependencies may change depending on the system you're running on and the programming environments you're running with, but here you get the edges of the exact dependency graph. It says that the build test on the compute partition with GNU depends on the download test on the login partition with builtin, and similarly here. And the reason you see this cross dependency, rather than the classic same-partition, same-environment one, is that we used fully in the how argument: I depend on whatever the download test does, so I have a fully connected subgraph between those two tests. So here you get the actual dependencies, and for the bandwidth test you see I depend on the build test, but differently: we used by_env, so the GNU case depends on the GNU build and the foss case on the foss build. And actually, I believe we didn't even have to pass the how argument to this one. But let's run first. I remember we had a small problem with the cluster. Okay, let me just run everything. Let's watch the messages: when a test is blocked on its dependencies, ReFrame prints a dependency message and just moves on to the next one; that's how the asynchronous policy works. If you are on the serial execution policy, it will just block until the dependency has finished. It will take some time now to build. "I need to wrap up the recording and the stream soon, so if there are questions, please go ahead." Yeah, please do, because I don't have anything more on dependencies; I'll just run the bandwidth test here, and I can take any questions. "Yeah, we have one: with a simple HPC system where you have built your modules using EasyBuild..."
"...for example, how would this work for the dependent tests? You don't need to download stuff; you already have, let's say, the OSU benchmarks installed." Yeah, I mean, there can be different scenarios. You might want to structure your run-only tests so that, I don't know, you don't run something unless something else has passed, right? "If you allow me, I can give an example. It makes no sense for me to run the GROMACS tests if the version of GROMACS is wrong, if the minimal checks on GROMACS are not passing. So what I do is basically use the module load of GROMACS and check that the versions are correct and that the inputs are correctly constructed, and only after that do I run GROMACS. So the actual run depends on the input-creation step and the input-validation step." Got it, okay. Meanwhile, let me explain what I'm doing here: I'm removing the GNU environment if I am on this system, because it doesn't make sense; there's no MPI compiler there. Any other question while I fix the test? "The next step, if I may add, is that even though you can have things built with EasyBuild, you can still have compile-only tests, and also complete compile-and-run ReFrame tests: imagine that you have built HDF5 with EasyBuild and you want to test that HDF5 is working. What you do is basically first download the test code, compile it, and then run several different HDF5 tests based on that initial one." "I think that brings up an interesting point: what if I actually want to build using EasyBuild inside ReFrame? Victor, do you want to take that?" "I'm actually responsible for the GROMACS tests at CSCS."
"So what I do is basically generate EasyBuild recipes on the fly using ReFrame, then I compile all the different versions of GROMACS, with PLUMED, with different configurations, and after that I run all the other tests. So I have experimented with driving EasyBuild installations from ReFrame, and it works really well. But currently, in this sprint, a colleague of ours, Raphael, is actually working on a proper EasyBuild build system, so that you can just do self.build_system = 'EasyBuild' and, voilà, it will do everything for you." "I would love to see that; that's exactly my use-case scenario, and I'd love to try it when you have it available." I can point you to it, because we have a draft PR; he really just pushed it, so I don't know if it's working yet, but if you want to try this... Okay, let me just make this bigger. Yeah, the idea is, as soon as it becomes a bit more mature, and I think that's soon, it would be very nice if you tried it out or made some comments. The idea is that you say self.build_system = 'EasyBuild', and then ReFrame will use EasyBuild to build something, but not to install it globally; the way we have it in mind, it's going to use EasyBuild to install into the stage directory of ReFrame, and then you use that for testing. We still have to figure out some machinery there so that the test loads the generated modules automatically, sets up the module path, and so on, but all this is going to be handled automatically by the test; that's the idea. And we plan to do the same for Spack, so you can have build_system = 'Spack' and have ReFrame do the same. "Awesome."
"Yeah, and as we decided, we envision it to install into the stage directory, which will be the default, but you can also set variables and change the prefix, change the path where it installs, right? So you can still use this build system to have ReFrame drive your installation into a global central repository or whatever." "I have to say something about this, though. The reason my tests are not online is that my test suite spawns, in practice, more than 3000 ReFrame tests of GROMACS, if you really take into account all the combinations of versions, and we have a very limited system; with PLUMED and without PLUMED, it just goes crazy. So you have to be careful with the parameterization." Regarding test parameterization, there is a very cool feature that we just put into 3.4; we have only documented it, we don't have a tutorial about it yet, but it is documented, and it's the direction we're also going for this library of tests. There is a section here in the docs called directives; it's new in version 3.4. So, instead of doing parameterized_test... "I'll quickly interrupt: I'll end the stream here and the recording, but you can keep going in the Zoom session."