Good afternoon everyone. I'm giving another talk for the benefit of the hackathon, and again as something that we can maybe point to in the future if required. This particular talk is going to be about the DSL2 pipeline structure. This has come up quite a bit on Slack, and I assume it will probably come up during the hackathon for people that want to convert their own pipelines, and also for others that may not know exactly why we have certain files in the latest DSL2 pipeline template.

So, the pipeline template: essentially all it is, it's hosted on nf-core/tools, and we have a pipeline template directory. In that we have all the boilerplate files that are created when you run nf-core create. You create a vanilla version of a pipeline, where the pipeline name and other information like the authors gets put in exactly the right places it needs to be in order to create a pipeline for you. As you can see, there are these sort of Jinja variables here that need to be replaced when you run nf-core create. This is what the structure of the pipeline looks like, and this is what we continually update in order to evolve the way that we're dealing with DSL2: fixing bugs, adding documentation, changing the way that we're generically doing things. So this template is always evolving, and the great thing is that when we do a release of nf-core/tools, every pipeline will get a sync PR asking it to update to the latest template, and that's the way that the pipelines keep up to date.

The latest iteration of our DSL2 implementation would have been created using a very similar template, and is most likely to be found in the rnaseq pipeline. That has now turned into probably the most up-to-date DSL2 implementation that we've got in terms of real pipelines. So if I quickly take you through what all of these files are, hopefully it makes sense. Worth noting is that a lot of these files are actually the same as in the DSL1 version of the pipeline template, but there's a little bit of restructuring that we've put in, and we've also siloed some of the boilerplate code into this lib directory, which is automatically imported by the Nextflow runtime. These functions are found by Nextflow at runtime, and they're written in Groovy, which is obviously what Nextflow is built on top of.

So if we look at this in VS Code and open up the Explorer, that will give us essentially the same view that we've got on GitHub, because it's a local clone of the repository. I've just removed all of the additional files that are used for testing. What we have here are your standard files. You've got the README file, which is just bog-standard README stuff; again, a lot of this is created from the template, with the logo and badges and so on. You've got a Nextflow config; these params and test ones are actually something that I've added here, so ignore those for now, and the custom config one too. So you've got your nextflow.config, which is a standard Nextflow config file where you initialise all the parameters and so on, and where you include any other configs. This base config, for example, contains default resource requirements and lives in this conf folder here, and we're just including that in nextflow.config, as well as this modules config, which has become quite an important configuration file for anything DSL2.
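Just to make that layering concrete, here's a rough sketch of how those includes sit in nextflow.config; the parameter names are only illustrative and not the template's exact contents:

```groovy
// nextflow.config -- rough sketch, not the exact template file
params {
    // Illustrative parameters only; the real template initialises many more
    input  = null
    outdir = './results'
}

// Default resource requirements for every process
includeConfig 'conf/base.config'

// Per-module options: tool arguments, publishing behaviour, etc.
includeConfig 'conf/modules.config'
```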
At the moment the modules config just contains a bunch of options that you would use for each process: for, say, changing the arguments for a particular tool, or changing the publishing options, and so on. This allows you to customise the way that you run these modules, and it's been around for about a year now. It's tended to work quite well, but there are some downsides with it. At the moment it's quite a simple implementation, in that modules is just an entry in params, so it's just a map that you can use to hold these options and pass them around from your main workflow to your sub-workflows to your modules in order to change those options. We will be implementing a new DSL2-native Nextflow syntax soon, which will hopefully simplify a lot of this passing of things around. For the most part you may need to add options here to change the behaviour of the way a module is run, because the default modules will be quite vanilla in terms of options in order to make them as flexible as possible. So you may need to change certain command-line arguments for your particular use case, like STAR align adding all of these options, for example, or publishing only these particular files. This is completely developer-orientated, so you can customise it as much as you like. So that's quite a key file for DSL2 which wasn't there for DSL1.

We also have the nextflow_schema.json, which a lot of you know, and which again isn't specifically DSL2-related; it's just a way of having machine-readable parameters specified within the pipeline, so you only have to specify them once, in one place. This can then be rendered in numerous places: in the usage documentation on the website, for launching pipelines, and on Nextflow Tower for launching pipelines. It can be used in a multitude of places, which makes it very useful. If you have any parameters you want to add, you can use the nf-core schema build command, which will essentially just edit this file for you and allow you to change icons and descriptions and all sorts of other stuff.

We have this modules.json, which is essentially the way that we eventually decided to do version control with modules, and this is done via the git SHA. The nf-core modules suite that we've got built into the nf-core/tools package has a bunch of commands that allow you to install, remove and update modules, and we use this modules.json file as a standard way of tracking that information. So, for example, if a new version of BBMap BBSplit has been added to nf-core/modules and you want to update, you run nf-core modules update, it will change this git hash, and that's the way that we're version-controlling these module files.

Similarly, there are options for you to stick to a particular commit of a module. Say, for example, the module has been updated in nf-core/modules but you want to use the older version of that module, because it's using an older version of a tool that is compatible with your pipeline. Then you can have options in the .nf-core.yml file that basically mean you won't be able to update that module beyond a particular commit hash, so you can only have that particular commit of the tool installed in the pipeline. This is exactly why I added it for the RSeQC tools here: the latest version of RSeQC is broken for some of those tools, so I don't want to update it, and if someone goes onto nf-core/modules and updates it there, I still want to use the older version. You can restrict that, and the nf-core modules commands will find this first, see that it can't be updated, and won't update it. So that's modules.json: it's got a git hash for all of the modules you've installed with nf-core modules.
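Going back to the modules config for a second, here is roughly what one of those options entries looks like in the params-map style described above. The module name, arguments and publishing keys here are made up for illustration, so check the real rnaseq conf/modules.config for the exact values:

```groovy
// conf/modules.config -- sketch of the params-based options map discussed above
params {
    modules {
        'star_align' {
            // Extra command-line arguments passed through to the tool
            args          = '--quantMode TranscriptomeSAM --twopassMode Basic'
            // Subdirectory of params.outdir to publish results into
            publish_dir   = 'star'
            // Only publish files matching these patterns
            publish_files = ['out':'log', 'bam':'']
        }
    }
}
```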
The main script in DSL2 is ridiculously tiny now. It was massive with DSL1, because everything was lumped into it; it was monolithic, everything was done in one script, and you had all of your processes and boilerplate and everything in the initial template. Now, with DSL2 and the fact that we're using this lib approach over here, we can split all of that out into separate files. So I've got this WorkflowMain file that you'll see here, which is essentially for the main workflow, and it allows you to silo away code specific to particular workflows as well, just to organise things a little bit better. So anything WorkflowMain-related that's called from this workflow I put in that file, for example.

Then all you're doing in the main script is calling the individual workflows that you have, and you can have more than one workflow. For example, the viralrecon pipeline has a workflow for Illumina and one for Nanopore, and in that case you would have some logic here that just calls either of those based on a user-specified parameter. So in the main script you're saying: if you've said via a param that you want to use Illumina, then you execute the Illumina workflow, otherwise use the Nanopore one, and so on. So the main script is much simpler now, and a lot of the boilerplate code lives in this lib directory.

We've got a LICENSE, which is MIT: very permissive, do as you like pretty much, which is awesome, and I'd kind of expect that for most of the stuff that we do. A code of conduct. Citations, which aren't automated; it would be nice to hopefully automate something in the future, but for now you have to manually update this and add the citations as you go along. It's always nice to cite people for their work, so it's great to keep on top of that. A changelog: again, standard stuff.

The .nf-core.yml file I mentioned earlier also allows you to bypass some of the linting errors, for example. With the rnaseq pipeline, because it's quite cutting edge, we push a lot of changes into that pipeline that aren't necessarily in a stable release of nf-core/tools yet, and as a result we need to put exclusions in here so they don't fail the linting. When nf-core/tools is eventually released and those new updates are pushed out in the new release, then we can remove these lines from here and it'll be fine. So this just allows you to bypass the linting failures by specifying which files you'd like to ignore, or particular lint tests you'd like to ignore.

Then there's a standard markdownlint YAML file for markdown linting with some configuration options, .gitignore and .gitattributes files, again quite standard, and an editor config. A lot of these dotfiles are just config files for the way that we want to lint and keep a particular code style within this repository.

And then you have these modules, subworkflows and workflows directories. The workflows directory, like I said, is just your main workflow; for rnaseq we just have one. This contains the main implementation of the pipeline, and you include modules from nf-core from this modules directory.
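As a rough sketch of the kind of main.nf branching described above for viralrecon, here's roughly how that switching might look; the parameter name and workflow names are from memory, so treat them as illustrative rather than exact:

```groovy
// main.nf -- sketch of switching between workflows based on a user parameter
nextflow.enable.dsl = 2

include { ILLUMINA } from './workflows/illumina'
include { NANOPORE } from './workflows/nanopore'

workflow {
    if (params.platform == 'illumina') {
        ILLUMINA ()
    } else if (params.platform == 'nanopore') {
        NANOPORE ()
    } else {
        exit 1, "Please specify a valid --platform: 'illumina' or 'nanopore'"
    }
}
```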
Within those workflows you can also include things like sub-workflows, which are a chain of modules: not quite a full workflow, just a chain of modules that offers some sort of functionality. So again, the workflows directory is the main implementation of the pipeline.

And like I said, you've got subworkflows/local and subworkflows/nf-core, and this is a quite commonly asked question: why we have local and nf-core. I just went with this convention because I wanted to separate out any sub-workflows that are built entirely from nf-core modules, or from other nf-core sub-workflows; you can have both. An example of that would be here: it's a standard sub-workflow that just takes a BAM as input and then sorts it, indexes it, and runs some basic stats on it. What I've done is use nf-core modules to do this, samtools sort and samtools index, standard modules, but I'm also using a separate smaller sub-workflow to generate the stats, and that's also an nf-core sub-workflow, hence everything can be put in this nf-core folder. Hopefully in the near future we will be pushing these directly to nf-core/modules, or maybe a separate nf-core sub-workflows repository, but for now you have to manually copy them across and install the relevant modules in order to use them, so it's a bit painful at the moment. We're having discussions at the hackathon as to how to tackle this, now that the modules functionality is relatively stable.

Any sub-workflows where you're using custom or local modules, you would put in this local folder, because they're quite specific to the pipeline. An example of that would be here, where I'm using a local version of the STAR align module; we can't really push that module, for example, so I've moved it to the local folder.

Similarly, you've got local and nf-core modules. nf-core modules are, of course, modules you install directly using the nf-core modules install command from nf-core/tools, and it's just ridiculously easy: you provide the name of a module and it will install it in this directory for you. Local modules, on the other hand, are those which are again specific to the pipeline, or that you can't push to nf-core/modules because they're not generic enough, or there's some other customisation that you need in order to use them.

Then we have these lib files. You generally have one per workflow, so we've got one for the main workflow and one for rnaseq; for viralrecon we've got separate ones, for example, for Illumina and Nanopore. So again, it modularises things a little bit: it allows you to strip out a workflow, rename it, update it or remove it, and do whatever you want with it. That's what these lib files are. Some of these are standard: this one is to do with parsing the parameter schema and other schema-related stuff; there's an NfcoreTemplate one, which just contains the logo and a bunch of colour codes and stuff like that; and Utils, which has generic utilities for checking Conda-type stuff and joining module arguments. And then you can customise this however you want: this one is used for the log information, the citation, the help generation via the schema, and so on. This is all fairly standard stuff.
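To give a feel for the BAM sorting, indexing and stats sub-workflow mentioned above, here's an approximate sketch; the module paths, channel shapes and emit names are written from memory rather than copied from the real nf-core sub-workflow:

```groovy
// subworkflows/nf-core/bam_sort_samtools.nf -- approximate sketch only
include { SAMTOOLS_SORT      } from '../../modules/nf-core/software/samtools/sort/main'
include { SAMTOOLS_INDEX     } from '../../modules/nf-core/software/samtools/index/main'
include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools'

workflow BAM_SORT_SAMTOOLS {
    take:
    ch_bam // channel: [ val(meta), path(bam) ]

    main:
    SAMTOOLS_SORT  ( ch_bam )                 // coordinate-sort the BAM
    SAMTOOLS_INDEX ( SAMTOOLS_SORT.out.bam )  // index the sorted BAM

    // Pair each sorted BAM with its index and run the nf-core stats sub-workflow on it
    BAM_STATS_SAMTOOLS ( SAMTOOLS_SORT.out.bam.join(SAMTOOLS_INDEX.out.bai) )

    emit:
    bam   = SAMTOOLS_SORT.out.bam
    bai   = SAMTOOLS_INDEX.out.bai
    stats = BAM_STATS_SAMTOOLS.out.stats
}
```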
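And as a rough sketch of the kind of per-workflow lib class just described, with an initialise function doing up-front checks, something like the following; the class and method names are approximations rather than verbatim copies of the template:

```groovy
// lib/WorkflowRnaseq.groovy -- approximate sketch of a per-workflow lib class
class WorkflowRnaseq {

    // Called from the workflow script before anything runs, to sanity-check parameters
    public static void initialise(params, log) {
        if (!params.fasta) {
            log.error "Genome FASTA file not specified with e.g. '--fasta genome.fa'"
            System.exit(1)
        }
        if (!new File(params.fasta.toString()).exists()) {
            log.error "Genome FASTA file not found: ${params.fasta}"
            System.exit(1)
        }
    }
}
```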
You can start adding your own checks and your own validation in this initialise function if you like, depending on how you're running the pipeline and what the context is: whether you have a FASTA as input and you want to check that it exists, and so on. You can then have other custom functions in there that you'd like to call from the main script.

docs is standard documentation. conf is a bunch of configs; like I said, the only really different one is this modules config, where you have module-specific parameters, and there's some documentation at the top of how you would use it. This will be changing quite a lot, hopefully in the near future, where we'll be using the withName directive to specify all of these options rather than having a custom solution, which will be very nice.

Then there's the bin directory, which contains executable files, Python scripts or whatever you want. An assets directory that contains standard assets that you'd require for the pipeline: the logo, MultiQC configs, and the schema for your input file. In this case that just defines what your input sample sheet should look like, so it provides some validation on top of the parameter validation, where you're now validating the schema of the input file itself to see whether it's got certain columns and whether they match a particular pattern, for example.

And your GitHub files, finally, where you have a bunch of workflows for running GitHub Actions, plus some templates that GitHub uses for rendering issue and pull request templates and so on. I guess the most important one is this CI one, which is no different from the DSL1 version: it's just you specifying which tests you'd like to run, and with which parameters, for example. And similarly you've got ones in order to be able to run the pipeline on AWS and so on as well.

I think that covers most of it, so overall most things are the same. These three directories are the biggest difference: the modules, sub-workflows and workflows, where you are now separating out components of your pipeline into modules, sub-workflows and workflows. And then you have this local versus nf-core split, depending on how customised and, I guess, local these modules and sub-workflows are to a pipeline. But other than that, I think it's fairly standard.

Maxime, Rike, anything I've forgotten that I should mention? No, I think you're good with everything. Cool. Okay, great.

So I think that's it. Hopefully that helps and clears up a few of the common misconceptions and confusions with the new DSL2 pipeline template. It will hopefully change, but we've got a relatively rigid-ish structure for the way that we'll be organising things in the future. The content and the way we go about things will definitely change, but hopefully we'll be able to keep you up to date with that in the future. So, thanks for tuning in.