Right, so the purpose of this session: now that everyone at this hackathon has given us your awesome contributions (and we'll also go over some of this in the wrap-up session at five o'clock Central European Time, to see exactly what we've achieved and how), I thought it would be useful to look ahead. You've now had a feel for what we're doing on nf-core in terms of using modules; some of you are quite experienced with using them and building pipelines around the current syntax. But there was always, at the back of my mind and possibly others' as well, the feeling that we're not exploiting Nextflow as natively as we could with the current syntax. There were some workarounds I came up with last year to make things work, and they do work, but they're not as native as they could be.

To show you what I mean, I've got a prototype pull request. You've already contributed your own modules, so you know what the syntax looks like, how we're passing stuff around, and you've got the tests working. Can you guys see my screen? Yeah, okay, cool.

So I created this prototype pull request for FastQC, which will probably be refined with all of these new modules that we have here. With the new syntax, essentially, we'll be getting rid of all of these custom pieces. We had this functions.nf, which you will have seen by now; it just contains some basic functions that let us publish files in a custom way and pass optional arguments around for each specific tool. If a tool needed arguments, we'd initialise an options map and pass that to the module. What we can do now is get rid of all of this, all of this saveFiles stuff, and start using more Nextflow-native options.

By Nextflow-native options I mean that instead of saving files with custom options like this (this is essentially a custom function to save files, coming from here), we can pass all of these options directly from a single config file. Let me show you an implementation of what that would look like. A lot of this was actually done by Mahesh, so kudos to him, and we had quite an in-depth conversation today with a bunch of other people, trying to figure out how to finesse this so we can push it into mainstream adoption.

What I mean is that we can use process selectors, where we say that for any process with a particular name, instead of having options.args, we use the Nextflow task ext extension to pass arguments directly to the module. If you look at that prototype pull request, we aren't using any options map or suffix; this is a known, native way of passing options directly to modules using this task extension. And the name of the key can be whatever you want it to be: it can be suffix, it can be supercalifragilisticexpialidocious, whatever you want it to be.
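To make that concrete, here is a minimal sketch of the two halves of the mechanism. The tool, flag, and file names are placeholders, and the exact nf-core config layout was still being finalised at this point.

```nextflow
// conf/modules.config -- sketch of the config side: a withName process selector
process {
    withName: FASTQC {
        // replaces the old options.args map; the key name under ext is
        // free-form (args, suffix, whatever the module agrees to read)
        ext.args = '--quiet'
    }
}
```

```nextflow
// the module side -- the script block picks up whatever the config set
script:
def args = task.ext.args ?: ''   // fall back to an empty string if unset
"""
fastqc $args $reads
"""
```

Because ext is an open namespace, the pipeline config and the module only have to agree on the key names they use.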
And that offers a lot more flexibility in how we pass these around. We're not restricted to three argument slots like args, args2, args3; we can do whatever we want. In most cases we should only need one set of arguments, though some tools that involve piping need a second set, but I hope you get what I mean. This is built-in Nextflow functionality that we can exploit within modules.

Another thing is the publishing. As you can see, we've deleted a whole block of code that handled how we publish by default. Again, we're just using process selectors to say how we'd like to publish these files, and there are a number of ways to customise this: you can have patterns, you can change the mode, and you can even enable publishing based on custom parameters, which is really neat. It means you don't need an if statement; you can just say, only publish this if this parameter is set to true. For example: only publish this index if the save-reference parameter is true.

So hopefully you're getting an idea of where we're moving, and Nextflow is amazing in terms of the functionality it offers. I'm not just saying that because Paolo is in the audience; I'd tell him straight up if it weren't right. The flexibility we need for this sort of implementation is already there, and it was mostly brought out, like I said, by Mahesh.

What this also means is that we can get rid of a lot of the logic we had in the main script; look at the diff of the main script. With the old syntax I was defining all of these parameters there in order to pass them around: options went from the modules.config file into the workflow, then on to sub-workflows, with lots of options being passed around, and we had this include ... addParams syntax for changing those options. Now all of that happens directly via the modules config, so we can literally do a massive delete. We don't need addParams anymore, because all of these options come straight from that config file; you can see this massive block of red where all of these options are now just changed in a single config file.

So you can understand why I'm really excited about this. Not only does it help beginners figure out how options are being passed around, because they're all in a single place, it also makes things easier to maintain, update, and develop. The obvious first step was to try this in a real-world implementation, which is why we chose the rnaseq pipeline: it tends to be the cutting-edge implementation for anything DSL2 we've had. So the first thing we tried was an end-to-end implementation in rnaseq, to find and iron out any issues, and it seems like pretty much everything is covered. Here at the workflow level we've deleted so much stuff.
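On the publishing point, a sketch of what a config-driven publishDir with a parameter-gated enable might look like (BWA_INDEX, the pattern, and params.save_reference are illustrative names):

```nextflow
// conf/modules.config -- publishing driven entirely from the config
process {
    withName: BWA_INDEX {
        publishDir = [
            path:    { "${params.outdir}/genome/index" },
            mode:    'copy',
            pattern: '*.{amb,ann,bwt,pac,sa}',   // only publish matching files
            enabled: params.save_reference       // no if-statement in the module
        ]
    }
}
```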
And we don't need to pass these options around everywhere and initialise them; we can go directly from the config file to the module itself.

[Audience member] Because you keep the options within the config, how do you manage to replace the parameters mechanism?

Because we're using the task ext extension to pass all of the arguments that we were using before. They're already there, exactly. And linked to that (I think we need to talk to you about this in a second, Paolo) is the caching mechanism for this, but I'll come back to that.

Another thing that was quite important to maintain: if in a given workflow you call, say, samtools index more than once, we need to be able to independently change the options for each of those instances. In this case, for example, if you use BEDTOOLS_GENOMECOV as the process selector and you call that module twice in the pipeline, it would overwrite the options for both instances. However, Mahesh found a way, while we were discussing and trying to iron out these issues, to use some quite clever process selectors to hone in on exactly the right incarnation of the module whose options you want to change. That means that if you run, say, FastQC pre-trimming and post-trimming, you can have completely separate options for the two instances, and even separate publishing options. That was quite important, because at the simplest level you obviously need to segregate where you publish these files.

[Audience member] In this context it would also be interesting to mention: what's your timeline to implement this?

We were thinking before Christmas. We've been talking about this stuff today; there was a long discussion.

[Audience member] In this context, instead of having this huge config: I love the fact that you're moving all these things from the script to the config, that's definitely a great idea. But maybe it's still not optimal. The optimum would be a config for each module, where you could put the publishing settings, rather than this huge configuration with everything in one file. You know what I mean?

We actually discussed this literally just now, so I'll skip ahead to that since you've brought it up. nf-core/modules at the moment is specifically for single units, like FastQC or BWA index or minimap: single-tool modules. What we also want to do is extend it to work with sub-workflows, and a sub-workflow is a chain of modules, more than one module essentially. We've put a nextflow.config next to it, in anticipation that we could have some default options. What's in here isn't representative yet, but you can imagine there will be withName process selectors in that nextflow.config as well, to set some default publishing options, ext arguments, and so on. But I think we identified the problem with that: the process selectors we set in this config file will be specific to the sub-workflow context, not the workflow context. Do you see what I mean?
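Going back to the "same module, two instances" trick: the selector strings depend on how the includes are nested and aliased in the pipeline, so treat these names as placeholders, but the shape is roughly this:

```nextflow
// Selector strings are placeholders; the real ones depend on how the
// includes are nested and aliased in the pipeline
process {
    withName: '.*:FASTQC_RAW' {
        ext.args   = '--quiet'
        publishDir = [ path: { "${params.outdir}/fastqc/raw" }, mode: 'copy' ]
    }
    withName: '.*:FASTQC_TRIMMED' {
        ext.args   = '--quiet --nogroup'
        publishDir = [ path: { "${params.outdir}/fastqc/trimmed" }, mode: 'copy' ]
    }
}
```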
So if you have this Bowtie2 align module here, say you set a process selector for bowtie2 align: you do something like a withName selector with ext argument and publishing blocks. That process selector will be based on the name of the module as used in the context of the sub-workflow. But when you actually use it in the workflow context it could be aliased, right? So you wouldn't be using exactly that name in the workflow: you can include BOWTIE2_ALIGN as BOWTIE2_ALIGN_ONCE, and then again as BOWTIE2_ALIGN_TWICE.

[Audience member] If you want to control both, yeah, exactly. We'd get a warning saying that this process selector isn't recognised, something you fixed just now with the process selector name warnings, if we try to use the vanilla version.

Yes, we had a chat about this earlier. It would be nice; maybe in the next iteration we can figure out how to handle that. But for the simplest purposes, I think we'd keep all sub-workflow options in a config file like this for unit-testing purposes.

[Audience member] Okay, makes sense to me. I think it was just the different contexts you're running in that caused my confusion.

It's definitely something we can revisit in the future. Right, so that's basically the functionality. Have I missed anything, Mahesh, Gregor, Jose, anyone else that was in on those chats, in terms of what we discussed this morning and what this can offer?

So that's basically how the new syntax will look. Like I briefly alluded to earlier, we're hoping to get this rnaseq pull request sorted out as soon as possible, with the end-to-end tests passing, which is always nice, and to iron out all of these issues. Then we will update all the modules on nf-core/modules, which means that things will almost certainly break when pipelines try to update them. So there will be breaking changes at some point, just a heads-up. But I think things will be much simpler with this whole syntax, and we're still playing around with a few things; the discussions this morning really helped, just sitting down and talking about this.

So the process is: update and release all our nf-core modules in one go. Because everything is standardised we can essentially do a find-and-replace; one of the reasons we insist on standardisation is that it makes these sorts of mass changes quite easy to do. After that we will release rnaseq as a proof of concept, because obviously we need to update the modules first to install them in the pipeline, and then push those changes to the nf-core/tools package to get a stable release out. It's been a bit annoying that you have to use the dev version of tools to get the latest functionality, and that's basically because we're not releasing often enough. We're missing a trick there in terms of having more contributors to nf-core/tools. So if you like programming in Python, you can help with that; all of this functionality for sub-workflows and modules will require some back-end tooling as well, and if you're up for that, that would be awesome.
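Going back to the aliasing point above, in DSL2 that looks like this (the module path is illustrative):

```nextflow
// workflow side: the same module included twice under different names,
// so each instance can be targeted by its own process selector
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_ONCE  } from './modules/nf-core/bowtie2/align/main'
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_TWICE } from './modules/nf-core/bowtie2/align/main'
```

Because the fully qualified process name then ends in BOWTIE2_ALIGN_ONCE or BOWTIE2_ALIGN_TWICE rather than BOWTIE2_ALIGN, a selector written inside the sub-workflow's own config won't match, which is exactly the warning being discussed.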
So yeah, the idea is: update the modules, install them in rnaseq, and then have a mass release of tools. I don't think we'll have a release of tools now until we've implemented these changes; there's no point in an intermediate release when we know these changes are coming. So that's the process, the mind map we've been following in thinking about and implementing this.

Then, looking further ahead, we've also just had a session about this sub-workflow stuff and how we can have standard sub-workflows on nf-core/modules. I think this will be really cool, because it means that you're not just installing individual units or modules into your pipeline; you can install a whole chain of modules with sensible defaults and plug it directly into your pipeline. You don't need to worry about the individual units; you've got a chain of modules that you can plug in and use directly. We've had some more discussions about that today which have been really useful.

The key there is that we want to be able to test these sub-workflows in the context of modules. Imagine you update a FastQC module, and that FastQC module is used by a sub-workflow that runs FastQC and Trim Galore, for example. We want the CI to work so that if you change the module, it runs the sub-workflow tests as well, and you can fix both at the same time. That allows the repository to be a bit more self-sustaining: people coming in to contribute can fix all of this stuff, and everything keeps working for anyone who wants to install these sub-workflows. So that's the plan. Edmund's done an awesome job getting the CI hooked up for that, and like I said, we still need to build a lot of tooling for it, with an nf-core subworkflows command in the nf-core/tools package. Anyone who'd like to contribute and help with that would be awesome to have on board.

In terms of potential things we discussed, Paolo, the main reason for having you on this call is that for the adoption of this new DSL2 syntax there are a couple of things that would be nice to have in Nextflow. Most of the stuff you fix ridiculously quickly anyway, and in our discussions this morning we tested things and ruled out a bunch that we don't need now, and closed issues as a result. So I think this is the main one left; otherwise there are just a couple of things here to go through, which we can discuss amongst ourselves.

This issue is about changing options on resume. Say we've got a config like this, Paolo, with ext args. If I run the pipeline once and realise that for bedtools I want different tool-specific arguments, and I change them via a custom config and use resume, what happens is that the task is still cached. Changing the value doesn't break the caching mechanism, whereas we would like it to rerun those tasks.

[Paolo] I don't know, but I can try it right now.

Yeah, okay. So that's basically one of them. Again, it's not incredibly urgent, because the simple answer for now is: run the pipeline twice until we've found a solution.
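Going back to the sub-workflow idea above, here is an illustrative sketch of a standard sub-workflow chaining two modules; the paths and emit names are assumptions, not the final nf-core layout:

```nextflow
// subworkflows/nf-core/fastqc_trimgalore/main.nf -- illustrative only
include { FASTQC     } from '../../../modules/nf-core/fastqc/main'
include { TRIMGALORE } from '../../../modules/nf-core/trimgalore/main'

workflow FASTQC_TRIMGALORE {
    take:
    reads                // channel: [ val(meta), [ fastq files ] ]

    main:
    FASTQC ( reads )     // QC on the raw reads
    TRIMGALORE ( reads ) // adapter and quality trimming

    emit:
    trimmed_reads = TRIMGALORE.out.reads // trimmed reads for downstream steps
    fastqc_html   = FASTQC.out.html      // QC reports, e.g. for MultiQC
}
```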
But it would be nice, even if it's not via this ext.args route, because from what I understood from my chat, anything that is task.ext, or anything under task.something, doesn't contribute to the caching mechanism. Is that right?

[Paolo] Yeah, that's right, that's the point.

So, changing this via a custom config: say this is the modules config that comes with the pipeline, and you as a user want to update the command-line arguments for a tool because it's not working. Say you're working on a small genome, or a large genome where the tool isn't behaving well, and you need to change this argument, but you don't want to remap all the samples or whatever else; you just want to resume from where it left off. The idea would then be to have a custom process declaration for it. And again, that's another cool thing about this new syntax: it will make it much easier for users to update the options as well.

[Paolo] I can confirm that any directive, if you change it across runs, is not taken into consideration, because the ID of the Nextflow task is computed only from the inputs and the command script that you're going to run. I think the exception is global variables, at the point where you are using them as a variable.

In theory the script section should be changing, but it comes down to variable replacement. If we look at a module: this args is coming in via task.ext.args, which we now just specify in our config file, here for example, and that args then gets passed to the script section. So the script section is changing, but that depends on whether the variable is evaluated. It would be awesome to have that sort of thing, because it would deal with so many things in terms of passing options and arguments around. If we can get changing this to break the resumability, or even have something else that doesn't necessarily fit this mechanism, that would be awesome. So that's that issue; that's the main one.

The rest are niceties, not necessarily required. There's only one really required, which is that task ext one. This next one is more of a nicety, and I think we've brought it up with you before: we have to have this params.enable_conda wrapped in a ternary evaluation. For those who don't know what I'm talking about, just ignore the comments around this. To make these modules as flexible as possible, it would be nice to declare all of these things in one place and make them self-contained. But if we just wrote conda 'fastqc' directly, then, because there isn't an enable mechanism for Conda in Nextflow at the moment, it would automatically activate and run Conda even when we don't want it to, right? Which is why we need this params.enable_conda to actually enable it, and otherwise set it to null. And all that's really doing in the pipeline files is that a profile in the config enables Conda, like this. Maybe having, I don't know, something like this?

[Paolo] Yeah, I remember this part. The point here is, if we change this now, we create a big mess for all of the existing pipelines, because we've said that just specifying the packages enables the execution.

I don't think it will. All it would mean is that we change these three lines.
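This is the pattern in question: the ternary that guards Conda in every module, and the profile that flips the parameter (the package version is illustrative):

```nextflow
// module side: the ternary the discussion would like to retire
conda (params.enable_conda ? 'bioconda::fastqc=0.11.9' : null)
```

```nextflow
// pipeline side: all the conda profile really does is flip that parameter
profiles {
    conda {
        params.enable_conda = true
    }
}
```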
So if, for example, you had a conda.enabled option in Nextflow, similar to what we have for Docker and Singularity, like this, or Podman or any other sort of packaging system, then all we would have to do is comment out and replace this line. The line could be as simple as that; we wouldn't need the ternary evaluation, because it would only be activated if Conda is actually enabled. Do you see what I mean?

[Paolo] Yeah, yeah, something like that. I'm thinking how we could do this without breaking existing pipelines. If it is enabled by default, then it would probably mimic the current behaviour.

So we could just set the enable flag to false in our nf-core config file. Good point, yes. At the moment, if we do this in the module, it will use Conda by default; so maybe it's set to true by default in Nextflow and we disable it in the pipeline config, which is another option. But again, that's more of a nice-to-have feature to get rid of the ternary syntax. It's not a big deal having a parameter do the evaluation for now, but as we're on the topic of where we'd like to get eventually, I think it's worth mentioning. So that's that.

Yeah, and this one isn't really a blocker: we found a potential bug in Nextflow, which I think Gregor has submitted an issue for somewhere.

[Gregor] Not yet. I created a note on our board and I'll make a reproducible example when I've got a bit more time.

Okay, cool. And lastly, there's the custom script stuff that we've talked about before. In modules/local you have a folder for each module, like this, and this module essentially just calls this gtf2bed executable. It would be nice to have self-enclosed, isolated module-type scopes, I guess if you put it that way, where paths and so on are exported, whether on shared filesystems or on AWS Batch or anywhere else, so that this bin directory is automatically appended to the PATH wherever the module is run. That would allow us to have custom modules with bundled scripts on nf-core/modules as well. At the moment, and I think we've talked about it to death, it needs some sort of Nextflow-level implementation to make it work: wherever the module is run, the bin directory with that module's scripts also has to be exported onto the PATH, or whatever else, to then run that module. We talked about the fact that it would be different on a shared filesystem compared to AWS Batch and so on. So this is a nicety as well that we'd hopefully like to have in the future; some other people have requested it too, but it's not urgent.

So I guess to summarise: we don't need much from you, Paolo, but there are a few things that would be great to have. I don't know if I've missed anything that anyone would like to mention, or ask Paolo about for that matter, whilst he's here.

[crosstalk] I think we can't hear you. Can you hear me? Yes. Can you hear Paolo or not? I think you weren't hearing him; he's too far from the stage, I guess. I think you need to be in a spotlight so the whole group can hear you. Maybe come on stage now, Paolo, come and join me.

[Paolo] I'll try. Yeah, I'll jump in.
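On the module-local scripts point, a sketch of the kind of module being described. The automatic export of the module's own bin/ directory onto $PATH is the requested behaviour, not something Nextflow did at the time, and the names here are illustrative:

```nextflow
// modules/local/gtf2bed/main.nf -- sketch; gtf2bed lives in this module's
// own bin/ directory, and the automatic export of that bin/ onto $PATH is
// the requested behaviour, not something Nextflow supported at the time
process GTF2BED {
    input:
    path gtf

    output:
    path '*.bed'

    script:
    """
    gtf2bed $gtf > ${gtf.baseName}.bed
    """
}
```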
[crosstalk as Paolo comes on stage]

[Paolo] Hello. Okay. So the most important one is this point about resume and the ext configuration settings. Yeah, it likely makes sense to allow this; I don't know if we should do it only for the ext scope or for everything.

So the thing is, at the moment, because we're prototyping stuff, we can be quite driven by whatever you'd add to Nextflow. If you don't want to break existing behaviour around how ext works, and there's another directive or another option you'd like to add that is cached, we can quite easily adopt that, because we haven't really used this in anger yet. So we're quite flexible.

[Paolo] I was also thinking about what's consistent logic for including or not including things. For example, if a task executed successfully and you increase the memory, in theory there is no reason to re-execute that task, so changing the memory should not impact the caching. That was the thinking, and the same for CPUs.

[Gregor] On the other hand, with CPUs I'm not so sure. I recently had an example with single-cell data where, if you run it in parallel, you get a different UMAP than if you run it on a single CPU. So for reproducibility it might even make sense to include the CPUs in the caching.

[Paolo] Why? Okay, because you want it to perform better. That's an issue with the tool, personally; the tool itself should be fixed so that it reproduces correctly.

[Gregor] It should. But if I change my nextflow.config file and rerun the pipeline and everything is cached, that's all right, and I publish the result. Then someone else runs the pipeline on another system, where the result is not cached, it uses the updated CPU value and gets a different result. Then the pipeline is not reproducible.

[Paolo] Agreed, but I would still record that as an issue for the tool.

[Gregor] I did.

[Paolo] But the other thing really makes sense: if there is an ext value that is mentioned in the command script, and that value can change, surely we should invalidate the cache for that task, because people are expecting that. You change the value, so you change the script, so it should be re-executed. I think it could be done; the logic behind it should be possible.

Yes. And I guess we can be driven by whatever you think is best on the Nextflow end; we're quite flexible at the moment.

[Paolo] The only thing that would make sense is if you could open an issue on Nextflow for this. Okay, I think Mahesh already did, sorry.

[Mahesh] I think I did, but I don't remember now.

Okay, excellent. Yeah, cool. Excellent, thank you. Anything else? Any questions about the new syntax? Any objections?

[Paolo] No, it's very beautiful; it cleans up a lot of the logic. And above all, the plan has always been to have a better way to manage the task outputs in the future, so it's a good step towards better things. I think we should aim to manage the outputs not using publishDir in the context of the task, but with a greater mechanism, more configuration-driven. So you're really on the good path.

From the man himself! So, any other questions from the audience about the new syntax, adoption, anything else that you'd like to know or help with, or just anything? Just unmute yourself and speak.
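To pin down the resume semantics being discussed, a sketch under the assumptions Paolo describes: the task hash is built from the inputs and the resolved command script, not from resource directives. The process and tool names are hypothetical.

```nextflow
// Sketch of the resume semantics described above: the task hash is built
// from the inputs and the resolved command script, not from directives
process EXAMPLE_TOOL {       // hypothetical process and tool
    memory '8 GB'            // not hashed: raising this on resume does not
                             // re-run tasks that already completed
    cpus 4                   // likewise not hashed, even though thread count
                             // can change some tools' output (the UMAP case)

    input:
    path sample

    script:
    def args = task.ext.args ?: ''
    // args is interpolated into the script below, yet changing it via a
    // config did not invalidate the cache at the time: the issue raised above
    """
    example_tool $args --threads $task.cpus $sample
    """
}
```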
[Audience member] One thing I'm wondering about, since we're kind of simultaneously discussing it: there's a pull request for Flye at the moment, and is there currently a method to pull in different types of data? The initial tuple is the meta and then the reads, but the Flye assembler, for example, takes in PacBio reads and Nanopore reads, and it also requires a label to differentiate between the two types of input. When you pass the meta in, which would likely carry the label as well, PacBio for example, and then the reads, it currently limits you to one input set. Do we have a mechanism to pass both Nanopore reads with the tag nanopore and PacBio reads with the tag pacbio to the same process, within the current way of specifying inputs? I know we can manipulate the channel inputs with the branch operators and so on, but that moves away from how the modules specify their input.

Yeah, I think what we have works in 99.9% of cases, but you always get these outliers that want to do things slightly differently. So the simple question first: this module can take PacBio and Nanopore at the same time, and what sort of label is it? Is it a value? What is that label?

[Audience member] For each type of read input you specify a flag, so you do something like --pacbio-raw followed by the PacBio reads, then --nano-raw or --nano-corr, and that type would come in with the meta, I guess.

There's probably a way to do it; I can't think of a simple one. What you might be able to do is pass the reads in as a nested list, where, I don't know, the first element is the mode, like pacbio or nanopore, and then the reads for that platform. So you've got a nested list of platform and reads, which can be as big or as small as you want, and then you somehow read that in the module context and evaluate it to build the command.

[Audience member] That's what I proposed already, but yeah.

I think that would be the simplest option for now: separating the reads in the channel by platform, and allowing the module to figure out how to use that information.

Any more questions or comments? So we've got an hour left now until the wrap-up, and I won't expect anything less than solving world peace in the next 55 minutes, so yeah, the last sprint. Thank you for joining, I hope it's been useful, and I'll see you guys back here in about an hour. And thank you, Paolo, for coming and joining us. Ciao, thank you.
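On that Flye question, a minimal sketch of the platform-plus-reads idea inside a hypothetical module. The --pacbio-raw and --nano-raw flags are real Flye options; everything else here is illustrative, not the merged module:

```nextflow
// Hypothetical sketch of the platform-tagged input idea; --pacbio-raw and
// --nano-raw are real Flye flags, everything else here is illustrative
process FLYE {
    input:
    tuple val(meta), val(platform), path(reads)   // platform e.g. 'pacbio-raw' or 'nano-raw'

    output:
    tuple val(meta), path('flye_out')

    script:
    def args = task.ext.args ?: ''   // e.g. '--genome-size 5m' via the config
    """
    flye --${platform} ${reads} --out-dir flye_out ${args}
    """
}
```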