Hi everyone, and welcome back to this next Nextflow and nf-core online community training event. My name is Chris, I'm a developer advocate for Seqera Labs, and I'll be the one taking you through the training material again today.

First of all, let's start with a recap of what we did yesterday; then we'll talk about what we will do today, and then what will be covered in sessions three and four. In session one we started with a welcome and an introduction to Nextflow. Then we started to understand how Nextflow scripts are written by exploring the hello.nf script, and finally we began developing our own proof-of-concept RNA-seq pipeline. Today, in session two, we will start with an introduction to nf-core and explore the nf-core website. Then we'll look at nf-core for users and for developers, and finally we'll look at nf-core modules and subworkflows and how these can be used and shared between different pipelines. In session three we will continue to expand on the ideas introduced as part of session one, and introduced today as part of session two, such as managing dependencies and containers, channels, processes, and operators.
You'll also get an introduction to Groovy, as well as some more information about modularization. Similarly, in session four we'll continue to expand on some of the concepts that have already been introduced but not properly explained, such as configuration profiles, deployment scenarios, caching and resume, and a little bit about troubleshooting, and then we'll finish off session four with some information about how you can get started with Nextflow Tower.

It's worth reinforcing here that a lot of what we have covered and will cover today, particularly configuration and modularization using modules and subworkflows, will be explained quite narrowly, and we will come back to these topics in greater detail in sessions three and four. However, if you do have any questions as we're going through the material, please direct them to the dedicated event channels on Slack. We have a number of channels there for the different languages, and a number of community volunteers will be there to help you if you encounter any problems.

Okay, so let's get started. What I would like everyone to do is follow this link, which will take us over to the nf-core website. When you hit this web page, what you'll see is a welcome screen with a little information about nf-core. What I want to highlight is some of the features of nf-core pipelines, starting with the documentation. All nf-core pipelines have extensive documentation: what you need to install a pipeline, what you can expect to do to use the pipeline, and what you can expect as the outputs of the pipeline. That's really to ensure that you understand what you're doing and why. nf-core pipelines also have extensive CI testing.
Every time a pipeline is modified on the repository, a lot of testing is run to make sure that everything still works and that the pipeline still follows best practices. Pipelines are all published as stable releases, so you can always go back to a previous release using a revision if you need to re-run a pipeline with an older version. An example might be that you have run a handful of samples or analyses, and then a new sample arrives and you want to go back and redo the analysis or add something to it. Having really solid version control is really powerful.

As well as that, all the software in nf-core pipelines is packaged, so you can use Docker, Singularity, Conda, or others to automatically bring in the tools and software required to run a pipeline, with appropriate version pinning for that software as well. You never need to worry about installing anything locally. Pipelines are also portable and reproducible, which builds on those other features, packaged software and stable releases: collectively, a pipeline should be portable, so you can run it on separate systems without worrying that something happening behind the scenes might affect the results, and with stable releases and pinned package versions it should always be reproducible as well.

Something else worth knowing is that all pipelines undergo full testing using full-size test data on AWS, and this is also expanding into Microsoft Azure for full-size tests on the cloud. That should give you some confidence that if you do scale these pipelines, they work. As mentioned in yesterday's introduction, nf-core is not just another registry.
A core idea is working with the community: you can find collaborators and work together to make one pipeline really great, rather than working separately in silos and missing the feedback and input that's achievable as part of the community. Pipelines start from a template, which is a really fantastic way of starting any pipeline, and it's all integrated into the nf-core tooling, so you can integrate things like modules and subworkflows really quickly and easily. There are also regular template updates, which can be synced in so that your pipeline is kept up to date, allowing it to benefit from all the nf-core tooling as well as maintained best practices.

Finally, there's the "collaborate, don't duplicate" ethos, which largely means that nf-core won't accept pipelines that are effectively a duplication of another pipeline. What nf-core strives towards is a real community where people work together on the one pipeline that serves one particular function. That's really so we can collaborate and foster that community, rather than duplicating and creating more work for ourselves by redoing something somebody has already done.

Down below, we have some additional training videos. These are a really great resource if you want to understand more, or didn't understand something, or if I describe something poorly. There's some more information at the bottom about getting started, installing and running Nextflow and nf-core, as well as some more links for people who are already using nf-core pipelines and some quick links to different parts of the website. What I'll do quickly is work along this top menu and show you some of the things that already exist on
the website. The website really is a great resource for anyone who wants to explore nf-core, see what pipelines are available, and potentially how to run them. Over here under Pipelines, we have a list of all the pipelines that are available. You can see that at the time of recording there are 75 pipelines currently available as part of nf-core, and they're all listed here.

I'm just going to pick one at random, so let's go with eager. This takes you to a page for that particular pipeline, and it gives you an introduction to what the pipeline does: a summary, a quick start for how to use it, the summary steps, and what you might expect to see and be able to take as outputs from this pipeline. Like I said, this is really just an outline, an introduction to the pipeline. Down below you can see the authors, the people who have contributed to this pipeline; this is a community initiative, so there are lots of contributors who have made this pipeline possible, as well as some references at the bottom. Next along, in this second tier of pages beyond the introduction, is the results.
Like I said, this pipeline has been run on a full-size test data set, and for this particular pipeline you can see all the results here, under the nf-core AWS megatests eager folder hosted on AWS. You can also see the usage docs. Again, this is quick-start information you can use to run this pipeline. There's a lot of extra information here which I don't have time to go into, but if you do want to use this pipeline, this is a really good place to start.

As well as this, we have a list of all the parameters that are included as part of this pipeline. In session one of this training event we talked about parameters very briefly; you can see that a full pipeline can require lots of different parameters. But thanks to the extensive documentation on nf-core, these are well described, and you should be able to find everything you need to understand what these parameters are doing and how you might want to tune them. Here as well we have the output docs, which describe what you can expect as the output from this particular pipeline. Finally, there are some release statistics, with statistics on each of the different versions. What's worth mentioning here is that if you are trying to run an old version of the pipeline and the documentation has changed, you can switch back to the previous versions here, over to the right, so the documentation is never lost.

Moving along the top: besides pipelines, we have modules. Modules are a little bit like pipelines in that they are hosted and stored on nf-core. Modules are effectively processes that have been written and submitted to nf-core by the community. These are generally pieces of software that can be shared between different pipelines.
So with my mathematics, for example The singular tools might be used to cross multiple different pipelines What we have here is the modules repository Where each of these has been packaged into Nextflow script already as a process or a module Here we already have a list of the inputs and outputs A little bit of information of how to use it as well Something we will be exploring later is how NF core tooling can quickly Install these update and eventually remove them if you need to Remove these modules from your pipelines As a part of this as well, we've also got sub workflows sub workflows are Effectively the chaining of multiple modules, which can also be installed and edited and modified Using the NF core tooling These are larger blocks Because they do comprise multiple modules and they largely string together Multiple modules that are frequently used together Next across the top of the page we have tooling So NF core is a separate tool to nextflow and has its own Set of tools Here we have a table of concepts lifting all the tooling that is available Like I said, we will work through a lot of this today as a part of this this session And we will largely sort of work through this and then mostly this order We'll start by sort of talking about NF core making sure it's installed We'll work through some of the execution Sort of commands and how that works and then we'll sort of talk about how to create pipelines using some of these NF core create for example how things like minting and schema can work And then finally we'll sort of conclude on how the NF core modules and sub workflows commands work And how that might benefit your workflows as well Over here we have documentation So to the left we have usage which is a lot of information about how you can use these pipelines Over here to the right we have contributing rather Which has information about how you can contribute to the NF core and some more information about best practices when you are writing NF core 
pipelines, modules, and subworkflows. Down the bottom we have some extra tutorials. Most of the material used today will come from these, and we'll come back to this very shortly.

nf-core is a community, and like any good community it has a lot of events. Thanks to funding from the Chan Zuckerberg Initiative, we're able to run a bytesize seminar series: weekly, members of the nf-core community present either a pipeline or some of the tooling. These are a really great reference for anyone who is new to the community or trying to understand a pipeline or a part of the tooling better. Each is just an introduction, so they're short, about 15 minutes, followed by questions, and they are a really great resource, like I said.

As well as this, we have other events that occur quite regularly: more training events like this one, where community members present the training material, and the nf-core hackathons. The hackathons are events hosted in Gather Town, where the whole community, or as many as can attend, gathers and works on different issues and improvements on nf-core pipelines, modules, tooling, everything to do with the nf-core world. It's a really great event, a lot of fun, and it's a great way to meet the people you might come across as developers in the community and really get to know them, which is always a lot of fun.

Okay, finally on this page we have About, which has a lot of information about nf-core and its history, a bit more about the community, and some community statistics. There's a list of publications and some information about the mentorship program.
Again thanks to the Chan Zuckerberg Initiative, we have a mentorship program where mentors and mentees are paired up and work on a project together for about three months. We have the code of conduct, as well as some more information on actually joining nf-core on different platforms. Slack, which I hope everyone has already joined, is a really great resource: if you ever have any questions about a pipeline or a piece of tooling, or just want a bit of guidance, Slack is a really great place to go, and the community is really fantastic and proactive in answering questions. Of course, we also have the GitHub repository, Twitter and Mastodon accounts, and a YouTube account. YouTube is where we store most of the videos like this one, as well as all the bytesize seminars, so YouTube is a really great resource as well.

Okay, so today's training material is different to session one, and to what we will do in sessions three and four as well. This is just because we want to use the nf-core tooling, and it's nice to see a slightly different environment. So what we will do is go over to the docs: go to Docs at the top of the page, and down here on the right, underneath Contributing, we have Tutorials. We will click on this link here, which is "Creating pipelines with nf-core". This is material that was created for the last set of trainings, in October 2022, and we're going to use it again as a base for the nf-core training content, but I'm also going to diverge from it a little when we talk about actually executing an nf-core pipeline and some of the considerations you might have when you try to configure it for your own environment. So we will click on this "launch Gitpod" button right here.
It'll bring up your Gitpod environment and start to load this, effectively a Gitpod repository. As you can see, it always takes a few moments to load. What I'll do in the meantime: if you are trying to run this locally and you've already got Nextflow installed, you can install nf-core using pip. It's hosted here on PyPI; the nf-core tooling is written in Python, so you can install it from there, and also from Bioconda, rather, using the conda install command.

Okay, so that's all loaded, or is loading now. This is just GitLens; you can close that, and everything else is still initializing. Like Nextflow, nf-core is continuously updated and improved, so occasionally there can be bugs and other things that come through. Please don't be too alarmed if the latest version does have a bug; it'll generally be solved pretty quickly. And just as I say that, there is a bit of a bug here. I'm not quite sure what's going on, but let's just clear that and hopefully it won't cause too many issues.

Okay, so what I'm going to do is make this a little bit bigger. What's that, 150%? Hopefully you can see that; I don't want to make it much bigger, as we always lose quite a bit of screen real estate when viewing some of these commands and outputs. What we will do first is make a new directory. When you open this, we land in the tools folder, so this is /workspace/tools, and we will make a directory called training. We can now see that it has been created over here, and we're going to move into that folder. We can see that it's empty, and we're now actually sitting in this training folder. You can also see over here in the Explorer that we've already got quite a lot of files: a few different configs, changelogs, markdown and YAML files.
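The directory setup just described works the same in any terminal; a minimal sketch (the /workspace/tools starting point is specific to the Gitpod image):

```shell
# Create and enter a fresh working directory so the files already
# present in the workspace don't interfere with the commands we run next
mkdir -p training
cd training
ls -a   # a brand-new directory, so nothing to list yet
```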
We don't actually want to be in that folder when we execute our commands or start using the tooling, just because some of those files might interfere. So, as I've already done, move into this training directory; like I said, you do want to be in a separate folder, otherwise things can compete in some way. Just to make it a little easier to follow what's going on, I'll click up here, these three lines in the top left-hand corner, go to File, and then Open Folder. In this Open Folder dialog we're already in workspace, but we want to click on tools and then training. This is the folder I've created; if you've called it something else, or it's slightly different, that's okay. All I'm really doing is resetting this browser view so we get a slightly nicer Explorer, and you can see it's now empty.

Okay, so that is empty, you should have a nice clean terminal because I've just cleared it, and the Explorer is empty as well. What we will do first is just check that Nextflow is installed: you can see Nextflow version 22.10.1. The exact version doesn't matter too much here. We're also going to check that nf-core is installed; again, this should already be present because this environment has nf-core installed, but nf-core --version will print out the version, which is 2.7.2. If you're watching this at a later time and the version has changed, I don't anticipate it making much difference to what we're doing today, at least in the next six to twelve months; but in saying that, if it does, you can always revert to a previous version.
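The two version checks look like this; the numbers shown in the comments (22.10.1 and 2.7.2) are simply what this training environment happened to have at recording time:

```shell
nextflow -version     # prints the Nextflow version, e.g. 22.10.1 here
nf-core --version     # prints the nf-core tools version, e.g. 2.7.2 here
```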
We're still in the right folder, and Nextflow and nf-core are installed, fantastic. So the first command I want to talk about and show you is the nf-core command; this is different to the nextflow command. What we're going to use first is --help, which lists all of the different commands available as part of nf-core. We will work through a lot of these today, and largely we'll start with the commands for users and then branch into the commands for developers.

The first command we will look at today is nf-core list. What that does is list all of the repositories, all of the different pipelines, that are available on nf-core. Here you can see the pipeline names, how many stars they have, the latest release number and when it was released (you can see one was released two minutes ago, which is exciting), and when it was last pulled. Every time you run or pull a pipeline from nf-core, it records a last-pulled time and stores the pipeline on your computer for you, in the hidden .nextflow folder. It also shows whether you have the latest release, and this is important later on: if you're trying to execute an older version of a pipeline, a different release, you will need to update or change which version you have stored on your system.

Anyway, let's try to pull a workflow. This will pull the workflow down from the repository and store it locally for you. For this you can use the nextflow pull command. We're going to pull from the nf-core repository, and let's just pull a pipeline that comes to mind, which is chipseq.
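Assuming the nf-core tooling and Nextflow are both installed, the two commands just described are simply:

```shell
# List every pipeline in the nf-core registry, along with stars,
# latest release, and whether a copy is cached locally
nf-core list

# Pull a pipeline into the local cache for later runs
nextflow pull nf-core/chipseq
```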
So again, this is nextflow pull nf-core/chipseq. What you can also do with this command is substitute nf-core here for one of your own repositories; of course that isn't an nf-core pipeline, but it works the same way, in that you can pull down a pipeline and then modify, update, and control it using different revisions as well.

What we can do now is check that this has been downloaded, using the nf-core list function again. I've made this a little bit bigger, so hopefully you can see it. As you can see, we now have the chipseq pipeline, which has been downloaded from nf-core/chipseq: the version, when it was last updated, when it was pulled by me, and whether it's the most recent version, which it is.

As an example, you might also want a slightly different version for your own reasons. You could add -r and say you really wanted version 1.0.0; type that in, and Nextflow will go away and pull that down for you. You can see here "done", with a different revision number, namely 1.0.0. So when you go back to nf-core list, you'll see that this has now been pulled, and it is no longer the most recent version: it is version one, while previously it was version two. If you want to change this back, you can just type in something like master. The most recent branch is known as master; or rather, the latest release on the default branch is probably a better way to describe it.
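Pinning and switching revisions with -r, as just described, looks like this:

```shell
nextflow pull nf-core/chipseq -r 1.0.0   # pin an older release locally
nf-core list                             # now flags chipseq as not the latest
nextflow pull nf-core/chipseq -r master  # move back to the default branch
```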
So I pulled master, and if we look at nf-core list again, it's been put back up to 2.0.0, which is cool.

Okay, so what I will do now is show you how to execute an nf-core pipeline. Simply put, you don't actually even need to pull a pipeline first. What you can do is just run your pipeline with nextflow run, typing in the repository name; in this case we're pulling it from nf-core. For simplicity I was going to go for the chipseq pipeline again, but actually, no, let's go for something different that won't be too big. I think rnaseq is quite popular, so let's look at that: nextflow run nf-core/rnaseq. This is a different pipeline; I haven't pulled it, and it wasn't listed when I used nf-core list.

I know that when I run this pipeline, and this is something we will discuss very shortly, we need to use some profiles to decide what package manager is going to be used and, in this case, to bring in some test data. I also know that with this pipeline you need to set an output directory, which is a parameter; in this case I'm just going to call it results. So what I have here: my nextflow run command; nf-core/rnaseq, the nf-core repository for the rnaseq pipeline; and the profiles I am using. With most pipelines you'll probably need to specify one profile for the software manager that you want to use; as a minimum I'd expect you might want something like docker, conda, or singularity, and there are others, of course. As well as that, I also want to use some test data.
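Putting those pieces together, the command is as follows; the test profile supplies a small built-in input, and --outdir is the one parameter this pipeline insists on:

```shell
# test profile = built-in minimal data; docker profile = software manager;
# --outdir is a pipeline parameter (two dashes), the profiles are Nextflow
# options (one dash)
nextflow run nf-core/rnaseq -profile test,docker --outdir results
```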
All nf-core pipelines have a built-in test data set, a test profile, which is a minimal set of test data used for the automated testing when changes are pushed up to GitHub; but you can also run it locally to test that the pipeline works on your system. Like I said, I'm going to use docker, because Docker is installed on this system, and I've set the output directory. You might not need to include that on all pipelines, but I know that for the rnaseq pipeline it is required: if I tried to execute without it, I'd just get an error message saying it isn't working, please specify an output directory.

Okay, so that's pulling the nf-core/rnaseq repository, much like the nextflow pull command would; if you go straight to nextflow run, the pull happens in the background. This warning I wasn't anticipating, but it looks like it's just telling us that the GTF file is being used as a priority because both a GTF and a GFF file parameter were included, so that doesn't look like anything to worry about. What it's done here is list all the processes that are included as part of this pipeline, and what it will do now is go away, download the software from Docker Hub, and then execute those commands with Docker. We can see it's already starting to process some of these, one of one, 100 percent: it's preparing the genome, taking the input samples, and this will just tick away slowly over time.

In the meantime, it's worth showing that if we run nf-core list again, we should see that this pipeline has now been pulled, and it is the most recent version that we have on our system, which is cool.
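A specific release can also be requested at run time rather than pulled beforehand; a sketch:

```shell
# -r pins the pipeline revision for this run only
nextflow run nf-core/rnaseq -r 2.0.0 -profile test,docker --outdir results
```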
So I didn't have to download it separately and then run it. As well as that, something I could have done is request a slightly different revision straight from the command line. While the run is still going (that'll take some time), I could, for example, have done nextflow run nf-core/rnaseq -profile test,docker --outdir results, which was the command I typed before, bar a typo or two I've missed; but say I wanted to run version 2.0.0, I could have just put that in there and it would have run that version for me.

Okay, while that is running, what I'll do is look at some of the documentation online, and then we'll have a quick look at the actual repository on GitHub. If you are just trying to execute a pipeline, some of this isn't necessarily essential to know, but it will help you understand how some of these things fit together, and that's quite valuable if you're ever trying to configure a pipeline. One thing that is worth understanding is that with Nextflow you can actually configure a pipeline in lots of different places.
Here on the Nextflow documentation (this is just the latest version), under "Configuration files", there's an explanation of how Nextflow looks for configuration files and parameters in a number of different places, and how these are treated as a hierarchy: things specified on the command line override things provided in a params file, or a config file, or in a nextflow.config in the current launch directory, your home directory, or values set in the main script itself. All of this might seem quite overwhelming, because it is quite intimidating when you realize parameters can be stored in so many places, but most of these files you will never have to touch if you are simply executing a pipeline, and nf-core has tooling to make executing these pipelines easier, so you never really have to get into the nitty-gritty of writing a parameter file, for example.

One thing I will show you in the repository over here (this is the actual GitHub repository for the rnaseq pipeline): there is a config folder called conf, and in it a number of config files. The first one I want to point out is the base config. This is just a config file with a lot of information about the resources given to the processes being run, and here we have this withLabel syntax. withLabel basically means that any process tagged with this label will be given these resources to run. This is quite a powerful way of allocating resources for different modules, and you'll find that most modules on nf-core have been given a label, running with a high, medium, low, or single-CPU resource allocation. This file is loaded automatically, so you'll never really have to think about it beyond applying labels if you're a developer. We also have an iGenomes config here.
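To make the withLabel idea concrete, here is an illustrative base-config fragment written out from the shell; the label names follow the usual nf-core convention, but the CPU, memory, and time values are invented for the example, not the pipeline's real numbers:

```shell
# Write a sketch of a base config; any process tagged with a label
# picks up the resources declared for that label
cat > base_sketch.config <<'EOF'
process {
    withLabel: process_low {
        cpus   = 2
        memory = 12.GB
        time   = 4.h
    }
    withLabel: process_high {
        cpus   = 12
        memory = 72.GB
        time   = 16.h
    }
}
EOF
```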
One thing I haven't talked about is that with nf-core pipelines and the nf-core template, you can automatically download and include genomes stored on AWS, the Illumina iGenomes reference files. All of these files are available and can be pulled automatically as part of your pipeline.

Here we have a modules config. This is quite a big and potentially quite complicated file, where every named process is given a separate block with some information about how that process is run; in this case, the publish directory for each module is included here. You might find that some of these also have extra arguments, which can be used to help control the running of that module. This is really important for modules coming from nf-core, where running a module in your pipeline might need to differ from running it in someone else's. But largely you should never have to touch this file if you are just executing a pipeline; it's worth knowing about if you're interested in becoming a developer.

Here we have the test config. This is where that minimal test data set comes in: here we just have a little bit of information, some names, it calls itself a test profile, and it points at a minimal test data set. So this is the test profile. Something I haven't spoken about yet is that nf-core also hosts a large number of test data sets.
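A modules-config block of the kind described might look like this sketch; the process name, extra arguments, and publish path here are hypothetical examples, not taken from the real rnaseq file:

```shell
# Write a sketch of a modules config; settings are selected
# per-process by name rather than by label
cat > modules_sketch.config <<'EOF'
process {
    withName: 'FASTQC' {
        ext.args   = '--quiet'                            // extra tool arguments
        publishDir = [ path: 'results/fastqc', mode: 'copy' ]
    }
}
EOF
```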
These are all quite small files, reference files for example, and they can be brought directly into your pipeline just by specifying the web address. What's cool about this is that all of it is hosted elsewhere, and if you are executing locally it can be brought into the pipeline via this test profile. Like I said, the test profile is a really useful and powerful way to test that a pipeline is running, both as part of the automated testing when it's pushed to GitHub and when executing things locally, which is what I've done over here by using the test profile.

Okay, so as well as that there is the full test profile; this is the test profile used when we push the pipeline up to AWS and run it on the cloud. We also have the nextflow.config, which you'll see listed over here. What is in here is effectively all of the default parameters. Most of the time you'll find that parameters are switched off, or the default is for something to be turned off, and then it gets turned on as part of a config or on the command line. A good way to think of this is that things are off unless you turn them on, because you don't want to execute things you don't need to as part of the pipeline. As part of this, down here we also have the profiles for the different software managers, so, like I showed earlier, conda, docker, singularity, but there are a number of others you might consider depending on your system.

Like I said... what was that error there? I'm not sure why that happened; let's kill that for now, sorry, I don't know what it's doing there. I think this might be a Gitpod-related issue, but I'll need to troubleshoot it later. Anyway, that's not overly important; what we can do is jump back over here.
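The off-unless-you-turn-it-on pattern for defaults can be sketched like this; the parameter names below are made up purely for illustration:

```shell
# Write a sketch of the defaults block found in a pipeline's
# nextflow.config: optional behaviour stays off until a profile,
# config, or the command line switches it on
cat > defaults_sketch.config <<'EOF'
params {
    save_merged_reads = false
    extra_qc          = false
    outdir            = null    // must be supplied by the user
}
EOF
```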
We were, of course, talking about the configs. Back over here, this is what we had previously run. Looking at this again, we can see nextflow run nf-core/rnaseq. This is that base command we keep coming back to, and then these have been included as profiles. These are included as part of the conf folder (the config folder) and as part of nextflow.config, so we specify this docker profile as well. This is all to manage our execution.

What we can also do is overwrite different parts of this. So far everything happens on the repository side; as a user, if you're just trying to execute the pipeline, you don't actually need to touch that. There are different ways to do this, and the first is by using the launch command.

What you'll find is that nf-core pipelines are all developed in a way that there is extensive documentation and also a schema for understanding what's happening with the parameters, which also helps you use them with the other nf-core tooling. The tool I want to show you is nf-core launch, which is this one here.

Again, I'm going to go back to this repository, even though we had that error. We can have this appear in the browser, or, excuse me, the terminal, and what we can do here is pick the version that we want to run. I'm going to go for 3.10 this time; maybe the error was in 3.10.1, so let's try something a little bit different here. What this will do is check the parameters against the schema. If we go back here, we can see that there is a schema, nextflow_schema.json, which looks like a big messy file. It's actually a very structured file, but there's a lot of information in here which most of you won't want to touch. Luckily, we have tooling to handle this for us. I'm going to use the web-based interface. Unfortunately, with Gitpod it isn't always overly intuitive to access this in the browser, so I'm just going to exit out of that.
So it does say that it's aborted, but what we can do is follow this link here, and it will take us to a launch window. Here is the launch window. What you'll see is that this is effectively the schema rendered in the browser, and what we can do in here is edit all of the different parameters that we want to use in our execution. This will create a params file that we can then use to execute the pipeline.

Here you can see that I've used this before. I've given my launch a name, the work directory has already been provided, and we're not going to resume. You might remember from the first session that we can use this resume function if we've run the pipeline before, so that it can use the cached results. Here we have an input. You would normally need to provide this, but because we're going to use the test profile, an input path has already been provided as part of that. We do still need to supply an output directory, which I'm going to set to 'results'. We could add an email if we wanted to, and there are things like a MultiQC title, saving merged FastQ files, and lots of other options here.

What we can also do here, and I'm just going to tick a bunch of these, is change some of the parameters that have been set by default as part of that nextflow.config. These are all the defaults, and what we can do is modify them. If we don't modify them, they'll just stay as the default, or empty, depending on what the actual parameter is. What's worth noting here is that all of these have really nice descriptions, and you can get some help text if you need more advice on what to do or how to set them. So instantly, what was quite a lot to comprehend in order to execute this pipeline becomes quite simple. What I've actually clicked doesn't matter; I've just changed some of these from the defaults to demonstrate the kind of output we're going to get.

Then we can scroll up the page here and click launch. Unfortunately, with Gitpod I don't think this will launch in the browser. But what we can still do is copy this launch command with this ID. This entire session has been given this ID, and whenever we reference the ID, it refers to this session. Alternatively, we could copy the parameters and save them as an nf-params.json file.

Anyway, back here what I'm going to do is paste this: nf-core launch with my ID number, then hit enter. You'll see that it's gone away, pulled some information from that browser session, and introduced this nf-params.json. This is a JSON file of all of the parameters that I've changed. Now, when we execute using this command here, the -params-file flag will use this nf-params.json to execute the pipeline, so all those parameters I've changed have been introduced and included. We can just say yes here, and this will start running.

I think that's pretty cool. What's happened is that I've edited, or rather created, this JSON from the browser using the nf-core launch command, which all nf-core pipelines support. We could use all of the information already provided by the schema to understand what was happening, get some extra help if we needed it, and then come straight back from the browser.
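The generated nf-params.json is just a flat map of parameter names to values. A hypothetical example (these particular parameter names and values are illustrative, not the exact ones I clicked):

```json
{
    "input": "samplesheet.csv",
    "outdir": "results",
    "multiqc_title": "my_first_launch",
    "save_merged_fastq": true
}
```

This file is then supplied to the pipeline with `nextflow run nf-core/rnaseq -params-file nf-params.json ...`.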
Okay, so these are all warnings, because I've left most of the settings alone, and we can actually execute this straight away. All of the extra hassle of understanding how a pipeline needs to run has, thanks to the documentation, the schema and the configs, been done for you, and we didn't have to edit anything in the actual nf-core pipeline. I'm just going to kill that again and clear that.

So I mentioned before that there's also a hierarchy for configuring a pipeline. When we run this... this is just telling me that the file already exists... it is just loading again. Okay, this is because I've gone back to this page here, so let's copy that again. Do I run this command now? This is the command that it wants to run, but I'm not going to run it this time.

As I mentioned, there's this hierarchy, and what you will see is that in this hierarchy the params file sits above a config file. So you can introduce some of these parameters in a config file as well. You might also want to introduce some configuration for running your pipeline based on an institutional config, or something else like that. We won't explore that in depth in this session, just because of time, but you can introduce a lot of different things into a config file like this without using the nf-core launch command.

As part of this, there's actually a huge number of different scopes that you can use. If you're trying to modify your Docker execution, or a particular process, there's a scope for that. Here, for example, you can say: use the Sun Grid Engine and a long queue when submitting this process to the HPC. And we also have the params scope.
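As a sketch of that process scope idea (the process name, queue, resource values and extra argument here are all hypothetical):

```groovy
// Hypothetical config: route one process to a long queue on a Sun Grid Engine cluster
process {
    executor = 'sge'
    withName: 'STAR_ALIGN' {
        queue  = 'long'
        cpus   = 8
        memory = '32.GB'
        // Extra tool arguments can also be injected per module via ext.args
        ext.args = '--some-extra-tool-flag'
    }
}
```

The same withName selector mechanism is what the pipeline's own conf/modules.config uses to attach publish directories and arguments to each module.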
The params scope is where the parameters for the actual pipeline live; this is largely what was modified via the nf-params.json file already.

As part of nf-core (sorry, this is a little bit of a diversion) you can also submit configs which can be pulled directly from the web. A config might be something an institution writes to run a particular pipeline, or to deploy pipelines on their system. A good example: I know that there is a profile for QBiC, an institute that has submitted a profile to nf-core/configs, and there is documentation online for this. Say you were working in a large institute and everyone was using the same cluster system. You could submit a profile for your institute, it would be reviewed and accepted by the community, and then everyone can automatically run pipelines on your system using that particular profile. This is something I would recommend if you do have a shared resource that lots of people are using. Here, for example, there's some information about the institutional configs provided. Of course, this run fails because I haven't actually added an input sample sheet here.

Okay, what I was actually working towards was showing you that you can provide some extra information on the command line, because the command line overrides everything else. This is what I was trying to describe when I was talking about this configuration hierarchy. Just for demonstration, I'm going to keep that there, but I am going to change this parameter here, giving it a new value, 'chris'. Don't read too much into it, but we can just execute this again.
I won't let this run to completion. I'm still using the params file, but we've overridden parts of the params file, as well as the profile, because we've added this on the command line: the name of the parameter followed by the actual value. You can see it has been added here. And you can imagine that if you're trying to override the number of CPUs being used, or the amount of time, or a different step like that, you can do it quite quickly and easily.

So that's quite a clumsy example of how you can use configs to control the execution of a pipeline. What we talked about was using the nf-core launch command, which brings up that web browser so you can control the configuration with different parameters. If you want to set things that aren't parameters, then you will need to include them in a config. What you can do is just create your own config file; I won't spend a lot of time on this. You know, my_config.config: you could store it somewhere and then specify it by path. Hypothetically, you could have a params block and then just put in whatever you want there: cpus equals two, or five, or ten, or whatever.

As a good example of this, you can actually look at the test profiles back here for any of the pipelines, just to see what a config might look like. Here, for example, there's max_cpus; it's set to two. So you can just put this into your own config and set it to ten, or two, or four, or whatever, and keep listing parameters as you go down, adding ones you've already seen. You can also do different things with profiles and the management of your container software, and so on. All of this is described in the configuration documentation.
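A hypothetical my_config.config along those lines might be nothing more than:

```groovy
// Hypothetical custom config, supplied with: nextflow run ... -c my_config.config
params {
    max_cpus   = 4
    max_memory = '8.GB'
}
```

In the hierarchy, anything set here sits below command-line flags and a -params-file, but above the pipeline's own defaults in nextflow.config.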
I wish we had more time to dig into this in detail, but what I wanted to demonstrate is, one, that there's a hierarchy; two, that there's the nf-core launch command to help you configure nf-core pipelines; and also that there's a huge amount of information here that can really help you fine-tune any pipeline execution without needing to modify the actual code base.

Okay, so that is kind of it for execution. We can see that with nf-core pipelines you can pull them and run them very quickly and easily from the repository. You could also do something like a git clone to bring one onto your system and then run it directly: say you had it sitting in a folder in this directory, you could just run it straight like that as well.

What I will do next is jump to downloading pipelines. This is for people that are running offline: a lot of people might be at an institute where the high-performance cluster is offline and doesn't have access to the internet. With nf-core you can also download a pipeline, including all of the Singularity images that you might be using for its execution.

In this example, I'm going to go for rnaseq; all the available pipelines are listed here. We might want to download this version here, and we want to include the Singularity images. Do I want to define a cache? This is a really nice feature: if multiple users use the same cache directory on a system, you can specify it now, so that all of the Singularity images are downloaded and stored there. Then you don't need to worry about storing the same images multiple times, with every user having a different cache, so I really recommend this. You can also export this in your system. I'm just going to do this in my workspace, just so we can see it populate down the side of the screen here.
I'm just going to call it 'training'. Do we want to export this so it's available every time in a new terminal? This adds it to your .bashrc file; we're just going to say yes. It's going to ask how we want to download this. This will really be a preference thing, whether you want to download it and then unzip it on your system. For ease, I'm just going to go for a zip file, and then it will start downloading.

This might actually fail on Gitpod because Singularity isn't installed... oh, here we go. No, it does still download, which is cool; actually, I didn't know it could do that. So it's just downloading all the Singularity images, and you can see here that it's populating down here with all of these different images. The idea is that you could download everything and then export it to your offline system using some sort of internal transfer.

What you can see over here is that we already have all the configs, or a bunch of config files, and the Singularity images are being populated there. We also have all the workflow information here, which is what you'd expect to see on GitHub as well. Like I said, all of this has been downloaded; you could transfer it to your system and execute everything there quite easily and quickly.

So again, this is another way that nf-core pipelines are very portable. You don't need to overthink how you are going to get all the software installed offline, and I really do recommend doing this if you are working on a system that is disconnected from the internet, particularly those that might be at a clinic, for example. Okay, that's just going to continue downloading in the background; I don't think I need to do anything more with it at the moment. I really just wanted to show that this is an option for people who don't have internet access on their main HPC system.
Okay, what we're going to do now is jump across to some more information for developers. This is going to be a little bit different, and it will include a bit more nitty-gritty detail about some of these files, some of the things that happen as part of the automated testing, and some of the other tools that are available as part of nf-core that really help maintain best practices.

So I'm going to make a new directory. I've just moved back one directory, and I'm going to call this one training2, and then move into that training2 directory. There's some bad naming practice there, but let's not worry about it. We're going to go to File, Open Folder, training2. This is just because everything is still downloading in that other folder and I don't want to stop it; it can be more trouble than it's worth. But this is a nice way to load a new environment in a new folder without having all of that happening in the background.

Okay, so again, I'm in this new folder, training2, and I've just reset my explorer off to the left there: a nice, new, clean window. Like I said, this part is really for people that are interested in creating and developing a pipeline. nf-core has a lot of tooling available for the creation of a pipeline. As can be seen here, we have the information for users that I've talked about: list, launch, download (I've skipped over licences). Here we're going to start with nf-core create.

nf-core create is basically how we start with the template, so we can start a new nf-core pipeline. We're going to call it 'demo', give it a description, and the author is going to be Chris. So again, all I've done is enter a name, which is 'demo', and a description.
The description is 'This is a description', not very creative of me, and the author, which is Chris. We're given the option to customize which parts of the template we want to use. For the purposes of this, we're just going to say no; we're going to include everything.

There's a little bit of information here, so let me scroll further up the screen. Because nf-core pipelines are heavily integrated with Git and version control, it is recommended that you move into the directory and then push this to Git, so you can start tracking it online using your GitHub account. I'm not going to do that here; I'm just going to keep pushing on. But as you can see, Git is installed, and I can use git status... oh, that's because I'm in a slightly different directory, apologies. Where I was previously, I was actually working in a GitHub repo for the nf-core tooling. I've now moved into my new folder, and then into my new pipeline, which I created with nf-core create, and you can see that we're here on the master branch.

What I actually want to do is show you branches. Here we're on the master branch, but there's also dev and TEMPLATE. The TEMPLATE branch is what I mentioned earlier: template updates are delivered through it, and you can sync these into your dev and master branches. Dev is the development branch for a pipeline. Anything that goes into master has to go through dev first, according to nf-core best practices, and this will force you to go through multiple reviews by people in the community, or locally. All of this is controlled by automated GitHub Actions asking for reviews before anything can be merged into the dev and master branches. You can turn all of this off, but this is just the idea. I'm really just explaining that a lot of this is happening behind the scenes, and that for any pipeline to be maintained or updated on nf-core, it does have to go through regular review processes.

Okay, so I haven't actually changed anything on this pipeline yet. You can see with git status that we're on the master branch and there's nothing to commit, so there's nothing to worry about there. I do want to mention very quickly that you'll never have to touch this TEMPLATE branch. If you start messing with that, you'll be in for a world of hurt, because you'll have a lot more conflicts as you try to keep your pipeline maintained using the nf-core tooling.

Okay. One thing I want to do now is show you that we're sitting in this folder, the pipeline folder, very similar to the rnaseq folder that I showed you previously. I've just moved back up one directory, so we can see that the pipeline is called nf-core/demo. What I want to show you is that this is actually a working workflow. Oh geez, I'm just going to clear that and move it up a little bit so everyone can see it.

I'm going to use nextflow run. Because this isn't on the nf-core repo, we don't need to specify that; Nextflow will automatically look for a pipeline locally before it starts looking at the GitHub repository, or another repository that you've specified. In this case, nf-core/demo is sitting in the directory that we're in. We're going to use a profile: all nf-core template pipelines already come with some profiles pre-installed, the first being test and the second being docker, for example. It's going to ask for an output directory as well, very similar to the rnaseq pipeline; that's 'my_results' this time. So that'll start running. What it's going to do is go away and pull that pipeline... well, it doesn't need to pull the pipeline this time.
It's just going to run the pipeline. What it'll do is go away, pull the tools from Docker, and execute this locally on my system. As you can see in here, the run has been given a name, we're using Docker as the container engine, and I've got some information about the launch directory and work directory. All of this is just happening as the default that is created as part of the template. We can actually go and look in here: in our conf folder, like I showed before, we've got this base config, which is all the information about the resources that we're allocating, so this will be very familiar, as well as this test config, which is a minimal set of test data that can be used to execute this pipeline.

What you'll see down here is that the pipeline is starting to run, and that we've got some processes already installed. These can be seen down here in the modules folder: we've already got local and nf-core modules. I'll explain these in more detail as part of the modules and subworkflows content, but for now, something you haven't been exposed to yet is that you can actually separate your processes out from your main workflow to help readability, and then just include them in that main workflow. This is something we'll cover in greater detail as part of session three, but for now, all you really need to know is that in the main workflow file, this subworkflow, for example, has been included from the subworkflows folder, and down here we're including the FastQC module, for example, which is an nf-core module, from here, which is just a path to its folder, wherever it may be. Here it all is, for example. And this might look familiar, although a little bit more complicated: a process with an input and an output, a when statement, which you wouldn't have seen before, some script sections, as well as a stub, which is something that's relatively new. Anyway, I digress.
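Those include lines look like this in the template (module paths follow the nf-core layout):

```groovy
// In the main workflow file: pull separately-defined processes into scope
include { FASTQC  } from '../modules/nf-core/fastqc/main'
include { MULTIQC } from '../modules/nf-core/multiqc/main'
```

Once included, FASTQC and MULTIQC can be called inside the workflow block just like locally defined processes.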
What I wanted to show you was that this pipeline has run; everything has run to completion. We're getting some locale warnings up there, which I think is more of a Gitpod thing than anything else.

And what we have is the work directory; we saw the work directory yesterday. This holds all the hex-named directories generated as part of this run, one per process execution; for this run here there are four. So 6c is one of them, and if we had run with different settings we would have seen multiple entries for the rest of these, and we could go and look at all of those in detail. I'm actually not sure if tree is installed on this... no, it's not. And then, of course, in my results folder we can see what's over there: some results from FastQC, MultiQC and some pipeline info.

So what I really wanted to show you here was that this initial nf-core create pipeline is a working pipeline straight out of the box. That's a really nice way to start your pipeline, because it already has all the folders and other things you might expect to find, and it's already got some configs, which is a really great place to start for your development. What we'll find as we build on this in the rest of the session and the future sessions is that this is a really powerful and, I think, important way to start a pipeline, because it really sets the foundation for using best practices for the rest of your pipeline.
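As a reminder, the work directory mentioned above is laid out roughly like this (the hash names here are illustrative):

```
work/
└── 6c/
    └── 1f9a3e.../            # one task directory per process execution
        ├── .command.sh       # the exact script Nextflow ran for this task
        ├── .command.log      # captured stdout/stderr
        └── ...               # staged inputs and produced outputs
```

Inspecting .command.sh and .command.log inside a task directory is the usual first step when troubleshooting a failed process.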
Okay. So what I've done already is create a pipeline. As you might expect, we could spend a lot of time looking at different folders and playing with different modules or functions; there are really endless possibilities from this point. But I talk about best practice a lot, and you might think, oh man, he just keeps going on about these practices, it's a lot of hot air. I don't think it is. I think nf-core has some really great practices, and a lot of tooling to help you maintain them.

The first one I wanted to show you is nf-core linting. Linting is a way that we check a pipeline against the nf-core guidelines, which in this situation is what I'm referring to as best practices. So we will now use the linting on the pipeline that we've just created. Again, here we are in the training2 folder; I'm going to move into the nf-core-demo folder. You'll need to be in here to actually execute this, and then you can type nf-core lint.

For those that are unfamiliar, linting is just a way that we can check code to make sure that it conforms with certain standards. For example, we can check that some files haven't been edited or modified in any way. What we have here is the result of the linting: 180 tests have passed, zero tests have been ignored, 22 tests have warnings, and zero tests have failed. Up here we have 21 pipeline test warnings.
What you'll find with the nf-core create function is that it introduces a lot of TODO statements. If you're interested in a great way to resolve these and use them yourself: with these TODO statements, you can use the Todo Tree extension over here, which is already installed as part of this environment, to go through, identify them, and jump to them straight away. They act as checks and reminders to come back and edit certain pieces of code. In this case, as part of the template, these are all TODOs that you'd be recommended to go back and address to make your pipeline better.

Okay, so that's great. We also have a flag here saying that this module is out of date. We will update it as part of the nf-core modules content, which we'll be getting to very soon. But what I wanted to do is defer that for a little bit and continue talking about this linting. There are a number of tests happening behind the scenes when we do this, and some of those tests, for example, can be viewed here.
There are a number of files that you shouldn't change as part of this. Here's some information about the nf-core tooling: there are files that, like I said, shouldn't be changed, or that you'd be discouraged from changing, such as the code of conduct, for example.

Over here in the pipeline that we have just created... oops, sorry, over here... we can go down to the code of conduct, which is written in Markdown. As part of nf-core, we have this code of conduct about how people should interact, communicate and work together on these pipelines. You might think, okay, I want to change this, remove something, or do something else, which is discouraged. So here, for example, I'll make an edit and save that. Because this is a file that is maintained, or checked, as part of linting, one of the linting checks is to verify that this file hasn't been modified. If we run the lint test again, we can see that this test has now failed.

Obviously, we don't want you to turn off these checks if you don't need to, and we'd highly discourage it, but for some pipelines you might need to. What you can do is go and modify the configuration to turn a check off. Over here, in... which file is it? It is the .nf-core.yml. This is where some of these checks are controlled, as well as where some information about the pipeline is stored.
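The lint configuration being added to .nf-core.yml looks like this (as per the training material):

```yaml
# .nf-core.yml: relax two lint checks for this pipeline
lint:
  pipeline_todos: false
  files_unchanged:
    - CODE_OF_CONDUCT.md
```

Checks listed this way are reported as ignored, rather than failed, the next time nf-core lint runs.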
So here you can actually add in some extra configuration. I've taken this from the training material on the nf-core website. All I've done is turn off the pipeline TODOs, which will stop the warnings caused by all the TODO statements, and allow files_unchanged to exclude the code of conduct. So I can save that there... oops, sorry, I've just closed the wrong window. Let me quickly reopen that and fix my code.

Okay, so the last thing I was doing was adding information to this .nf-core.yml file. What I've done is add some configuration for the linting: I've turned off the pipeline TODOs and added the code of conduct to a list of files_unchanged, meaning it will be ignored by the linting test. So now, if we run nf-core lint, as I'm doing at the bottom there (let me move it up the screen so you can see it a bit more easily), you can see that all the tests are passing. We have two tests that are ignored, which are the two I've just specified: the code of conduct, and the list of TODOs. And we still have that one test warning from MultiQC, because I haven't done anything about that yet. That is really where I wanted to leave the actual linting tests.

Before I go any further, I wanted to talk a little bit about Prettier. Prettier is used in nf-core for checking the quality of your Markdown documents. You can use Prettier by typing prettier -c (for check) and then a dot for the directory you're sitting in; if it was a different directory, you could specify that path. What it does is check the formatting of the Markdown code. What I want to show you is that you can use this to check your files and also make the edits required for them to comply with a more readable format. Prettier is automatically run when you submit your code to an nf-core repository: a series of checks runs as the code is uploaded.
This happens through the CI, and it's really just to make sure that everything is of a similar standard and quality, so that if someone else comes along, they will be able to follow your Markdown, and everything is formatted properly.

What I've done here is quickly edit the changelog, which is a Markdown (.md) file, and removed the line between the heading and the actual text. What we will now see, if we run Prettier again, is that there are code style issues, and it will ask: did we forget to run Prettier? In which case, we did. We can go back and run it again, this time with -w; -w means write, and it will automatically fix these issues. What you'll hopefully see now, at the top here, is that it has fixed the issue: it has reintroduced those lines that I initially removed from this Markdown file.

So again, all I've done is check it initially and then rewrite it with Prettier. And as already mentioned, this really happens at the CI level: when you submit a pull request on GitHub to one of the nf-core repositories, this will automatically run to check that the quality of your Markdown is good.

Okay, similarly, we also use Black for checking the quality of the Python scripts. In this case, there are only two Python files.
One of them will probably be up here, and the other is in the bin directory... here it is, for the sample sheet check. Black is just making sure the commas are in the right place and the indentation is correct. So that's another way that nf-core checks that the quality of your code and your documentation is high and of a good standard.

Okay, what I want to talk about next, before we move on to modules and subworkflows, is the Nextflow schema. Down here is the nextflow_schema.json, and as you can see, it's a bit of a monster. This is what's used to render the information for the parameters on the website, but also in nf-core launch, which we've just talked about. Whenever you add a new parameter, you will need to add it into the nextflow.config and then into this JSON file. As you can imagine, this can be incredibly difficult when you have many parameters, and the schema can be a monster to edit and format properly. Luckily, nf-core already has tooling to help you with this. The nf-core schema function has a few subcommands: build, docs, lint and validate. What we're going to look at today is the build function.

What I'll do first, before we try anything else, is introduce some new parameters to this pipeline. I'm just going to add a parameter called bar, my new parameter, and give it a string value; lots of good strings, so let's call it 'dog'. We're going to add another one, foo, which we will just give a number; let's go for 32. And we'll save that.

Say you're developing a pipeline. You think you've done a good job of including new parameters, everything looks good, but you run your lint and you realize that you've forgotten to update your schema. That should be detected by a linting test. Yep, we see here that two tests have failed, because it's picked up two parameters that are missing. What it's saying is that the parameter bar from the nextflow.config
What it's saying is that the parameter bar from the nextflow.config, which is a file that we've edited, has not been found in the Nextflow schema. This is where we can use the nf-core schema build function to help us resolve this. So instead of actually editing the schema file by hand, we can use this tool to get us most of the way there, and it also opens up a nice browser interface where we can edit all of this without having to touch that JSON file. So: add to pipeline schema? Do I want to add params.bar? Yes. Do I want to add params.foo? Yes. So it has written the schema, but even better, it gives us the option to launch a web builder for customizing and editing it. We can go over here and click open, and this will take us to a browser window. This is a rendered version of that nextflow_schema.json file. Everything that was already included is here: we have the inputs, the outputs, descriptions of what they are, the types, and default values where relevant. We've also got the option to say whether a parameter is required or hidden, and we have some other options here to modify other settings if you want to. All of this is organized under different groups, and you can add different groups up here, along with adding other parameters at the same time. What we can do is scroll right down to the bottom, and we can see that new entries have been created for our two new parameters, bar and foo. What I'm going to do is just move those up into this section, to demonstrate that you can quickly and easily drag these around. You can see that we have the IDs already, and we have the default values that were specified as part of the nextflow.config, remembering that nextflow.config is where I've included all of my parameters' default values. So first things first, like I said, I've moved these up into this group, but I want to give them a nice icon; a wee emoji, probably of a dog, but anyway, I've added some icons. Of course, for these you would probably use something that's a bit more
familiar for what the actual parameter is doing, but here I've just added in some random ones. I'm happy with the IDs, but I want to add some descriptions. So here I can just type in: this is bar, it is a string. Down here I've got foo: this is foo, it is a number. Here I can actually add some help text: hovering over these little book icons, I can click and then add in some help text. "This is helpful", exclamation mark, because it is very helpful. Things like this will be rendered on launch, like I said, as well as on the website. Here I can change the type: at the moment it is a string, but we have options for numbers, integers and booleans, and depending on what type you choose, you'll have different options and different defaults. So that's going to give me an error, because "dog" is not a number or an integer. If I change it to boolean, for example, it would automatically change the default to false, but I'm going to change it back to string and keep it as "dog" for now. I have the option to mark it as either required or hidden. For these options, which in this case are generic options, most of them are hidden already, so I'll just do that as well. And over here we have some additional settings. So here I could say it has to be an enumerated value, I could add in some sort of pattern, or I could choose a format. Because this is a string, I could say it has to be a file path or a directory path. I'm just going to close that for now. For foo, which we've got as an integer, we set 32 as the default; I've now decided I want that to be 1, and I can edit that here. I could of course again use an enumerated value.
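The constraints being set in the builder, such as type, enum, minimum and maximum, are standard JSON Schema keywords, and checking a value against them is mechanical. Here is a tiny illustrative checker, not the real validation code; the bar and foo entries mirror the made-up parameters from this demo:

```python
def check_param(value, spec):
    """Return a list of violations of a parameter's schema entry (a sketch,
    covering only type, enum and minimum/maximum)."""
    problems = []
    expected = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    if "type" in spec and not isinstance(value, expected[spec["type"]]):
        problems.append(f"expected {spec['type']}, got {type(value).__name__}")
    if "enum" in spec and value not in spec["enum"]:
        problems.append(f"{value!r} not one of {spec['enum']}")
    if "minimum" in spec and isinstance(value, (int, float)) and value < spec["minimum"]:
        problems.append(f"{value} below minimum {spec['minimum']}")
    if "maximum" in spec and isinstance(value, (int, float)) and value > spec["maximum"]:
        problems.append(f"{value} above maximum {spec['maximum']}")
    return problems

bar_spec = {"type": "string", "description": "This is bar. It is a string.", "default": "dog"}
foo_spec = {"type": "integer", "default": 1, "minimum": 1, "maximum": 100}

print(check_param("dog", bar_spec))  # → []
print(check_param("dog", foo_spec))  # type error: "dog" is not an integer
print(check_param(32, foo_spec))     # → []
```

This is the same reason the builder flagged "dog" the moment the type was switched to integer.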
So if you had set criteria, options that users should choose between, you could include those there, or you can add a minimum and a maximum value, and I can save that as well. When it feels like I'm done, we have the big long schema at the bottom here, which is being rendered automatically; you can see that the options we've set have been included there. Alternatively, you can just click finished. What that'll do is take us to this next page, which says, okay, this is done, and if your nf-core schema build is still running, you can go back to it and everything will have been transferred across. So we can go back over here and actually look at the nextflow_schema.json. We can scroll to the bottom and see that it has already been updated with what I did in the browser, which is fantastic. Again, I just want to reinforce that you should never actually have to touch this nextflow_schema.json file by hand; you can use the nf-core schema build function. You can do this as many times as you want: go back and edit it, and keep editing it as you remember things or decide to change things or restrict things. It's a really fantastic function, and it's always really nice to see people discover this when they are developing a pipeline, because it is one of the less well-known functions, but I think it's one of the more important ones. Okay, anyway, so now that's been added, we can run nf-core lint again and see that everything is passing, because those parameters have now been added to the schema: 179 tests passed, two ignored, one warning, and nothing has failed, which is a success. Okay, so I think that's where I'll leave it for this part of the course; I'm just going to go away and have a wee break.
So I'll see you all very soon. Hi everyone, thanks for your patience there. So what I've done in the background, as you'll notice my screen is a bit different now, is I've just tidied up my window a little bit and removed some of that code I had shown at the top there. I'm now sitting in this training-2 folder, and down here in the terminal I've actually moved into the nf-core-demo pipeline, which is the pipeline I created using the nf-core create function. Okay, so what we will do now is jump to talking about nf-core modules. Like I said earlier, nf-core modules are processes; they're used to execute different tools and software that are quite common in bioinformatics, or shared across different bioinformatics pipelines, using nf-core and Nextflow. So nf-core has tooling for this. If you type in nf-core modules, you'll see a list of commands come up: list, info, install, update, remove and patch. We'll explore all of these now. As well as that, we also have some extras for actually developing new modules. This is something for developers; I'll touch on it very briefly, but I won't dig into it as much, just because it is that next step up and it won't make as much sense until you've done the rest of the training anyway. Okay, so what I will show you first is this modules folder over here. When you create an nf-core pipeline using the template, some folders are automatically created, such as modules and subworkflows. Modules are where we store the processes that we're going to execute as part of our pipeline. You might remember from session one that we had processes included as part of one large Nextflow script; with DSL2 in Nextflow, you can actually have these stored in different folders now.
So here, for example, this statement, which is part of our main demo.nf script, our main workflow, is including a module that is stored in a different folder. Here, for example, we have the FastQC script: over here in the modules/nf-core folder we have the fastqc folder, and inside that we have main.nf, which is the process for executing FastQC, written by the community. As well as that, in the modules folder we also have local. These are other modules, other processes, that have been written, in this case, specifically for this workflow, and haven't been shared with the nf-core community. So you can also write and store local modules in this folder, and the way that you include those is very much the same as here, but instead of including from nf-core, you just include from local, as shown up here with this include statement. Jumping back to the terminal, what I will show you now is some of the functionality of this tooling. The first is nf-core modules list; careful, that's nf-core modules list, not nf-core list, which lists the pipelines. It gives us the option to look at either the local or the remote modules. The first thing we'll look at is local. This is a list of all the modules that are already installed as part of this nf-core pipeline. Here we can see custom/dumpsoftwareversions, as well as fastqc and multiqc, which are all included under this nf-core folder inside the modules folder. We could also look at what is on the remote. The remote modules are all the processes that are already available on the nf-core modules repo. So depending on what tool you're after, there might already be a module available for you to download and install quickly and easily. Going back to this list of tools.
We can also ask for information about these modules, using nf-core modules info. What we will see is we get asked a question: is the module locally installed? First of all, let's say yes. It's now going to ask me what module we want to look at; I'm just going to say multiqc, because we know multiqc is installed over here. And this is the information about that module: what the inputs are and what the outputs are. This might not make a lot of sense here, because we haven't talked about all of this in detail; again, we'll come back to it in session three. For now, I just want to show you that you can access the information for these modules without actually having to go to the website and open up the module's information page. We can also look at information for modules that we might want to install. So we haven't actually installed these modules yet, but we want to investigate whether these tools are available and what they look like. We're going to say no this time, so the module is not installed, and what I can do is look through all the modules that are available by typing the start of the name of the piece of software I might be interested in. bwa mem is quite a common tool used in bioinformatics, and here I can already see some information about it: for example, what the inputs are, what the descriptions of those inputs are, and the file patterns. So much like a pipeline, there is a lot of documentation and information around what is included as part of a module. If you are creating a module for nf-core, you'll need to create all of this information, using a similar schema format to what we have for the pipeline, and of course there is tooling to help with that as well. We'll keep moving along. So nf-core modules also has this install function, and this is really where things get interesting and really, really useful. Whoops, I was typing the wrong command there.
I want to type nf-core modules install. This is where I can use nf-core tooling to find a tool on that nf-core repository and install it locally. So what I'm going to do is just look for a tool; as you can see, there are lots of different tools here. I'm going to go for something quite simple: cat/fastq. And as we can see here, this is now being installed. What has happened is that over here in the explorer, you'll see that this cat folder has been added, and inside it we have a fastq folder; so this is actually a nested tool/subtool module. In here we have the main.nf for cat/fastq, the actual module that has been installed, as well as the meta.yml; this is the meta information that is shown when you use nf-core modules info. What we also have down here in the command window is an include statement. With this, you can quickly and easily copy it over to your workflow: here, for example, you could just drop it in. The formatting here isn't overly important, but to keep it tidy I'll just move that over. So now, if I wanted my main pipeline to execute this, I could include it like this and then call that process somewhere down in my actual workflow block, much like we did as part of session one. Okay, so as a very quick summary, that is nf-core modules install. You can see that there are, you know, 700 to 800 tools available; if you go and look at the repository, they're all listed there, like I said, and it is really quick and easy to install these. Here's another example; it's been added over here automatically as well. What you'll also notice is that if you run nf-core lint, like we were doing earlier, just taking a wee second to run, all of this information has been added. You'll see that all the tests are still passing.
There are now more tests. What is happening in the background here is that all of these tools are automatically added to this file here called modules.json, and this is really used to help track the tools. So if something happens to a tool, if it has been modified or edited in some way and no longer matches what's stored on the nf-core modules repository, or the version has changed either locally or on the remote, this modules.json file helps record all of that. It's a really nice way for us to know when tools have been updated or moved, much like when we were looking at the linting after nf-core create earlier: it flagged that multiqc was out of date, and this is the file that's actually keeping track of that. So nf-core modules, again, is really, really powerful; we've installed a DSL2 module into the pipeline using this install function. Hypothetically, we may then need to update a module, and for that you just use the update command.
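A minimal sketch of the bookkeeping just described, assuming a heavily simplified modules.json layout; the real file records the repository, branch and a git_sha per module, and the commit hashes here are invented:

```python
import json

# A cut-down modules.json, mapping each installed nf-core module to the
# commit (git_sha) it was installed from (structure simplified for illustration)
modules_json = json.loads("""
{
  "repos": {
    "https://github.com/nf-core/modules.git": {
      "fastqc": {"git_sha": "aaa111"},
      "multiqc": {"git_sha": "bbb222"},
      "cat/fastq": {"git_sha": "ccc333"}
    }
  }
}
""")

# Pretend the remote repo has since moved multiqc to a newer commit
remote_shas = {"fastqc": "aaa111", "multiqc": "ddd444", "cat/fastq": "ccc333"}

installed = modules_json["repos"]["https://github.com/nf-core/modules.git"]
outdated = [name for name, entry in installed.items()
            if remote_shas.get(name) != entry["git_sha"]]
print(outdated)  # → ['multiqc']
```

Comparing the recorded commit against the remote is how the tooling knows a module is out of date, modified, or untouched.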
So again, this is nf-core modules update. You can either name a module that you want to update specifically, or you can just say all modules. If you wanted to, you could look at the differences; I don't think there are any differences here, because everything's already up to date, so you can just say no to the preview. And what it's doing here is checking against that modules.json file, and we're seeing that all of these nf-core modules, the three that were originally there from nf-core create and the two that I've now added, are all included there. Okay, so that one's a little bit harder to actually show live, because when you install the modules they will be automatically up to date, but I just want to point out that there is this update function that you can use to keep your processes up to date with the latest version of the tools. So if a tool is updated on bioconda, for example, and that module has been updated on the repo, you can very quickly update it in your pipeline using this update function. What you also might find, excuse me, is that you now want to remove a module: the tool is no longer relevant, or has been superseded, or you just want to remove it from your pipeline for whatever reason. For that you can use nf-core modules remove. Here we can list the modules that we have installed; this is populated based on what's already included in your modules/nf-core folder, up over here in the explorer. I'm going to remove bwa/aln, so I can just type that and hit enter. Oh, whoops, I typed that in the wrong window. Now, automatically, that has been removed, and you'll see here that modules.json has been modified as well. As you'll see down here in the command window, it says it removed the files for bwa/aln and all of its dependencies. So bwa/aln has been removed.
It's no longer a part of my pipeline, and like I said, you can see up here that it has been removed from modules.json as well. So again, just to reinforce: this modules.json is really used to help track all of your modules coming from nf-core. It won't be tracking what you're doing locally, so the modules that you have over here in the local folder inside the modules folder aren't tracked in the same way. That is worth pointing out as well: this is a really nice way of controlling and keeping your modules up to date, and you get help from the community, because anything you want to do locally here you're going to have to do manually yourself, which can be quite time consuming. So it's another way that having an open community is a really awesome resource, because it helps you keep your code up to date. Of course, you don't have to update everything if you don't want to; you can just update modules one by one rather than all at the same time. But having a community to help you out is really fantastic. Okay, so we're getting to the bottom of this list here. So again, in nf-core modules, we've just removed a module; there's also this function here called patch, and this is a really cool function as well, I think, because quite often what you'll find is that you want a slightly different implementation of a tool, and what's written on the nf-core repository doesn't quite fit your purpose. So for some situations you might find that you need to edit a module. This is more for a developer who has found that they need to edit a module for some reason. What you can do, here for example, is modify the outputs: so here, all I'm doing is splitting this tuple.
So now these would be emitted as separate channels, not one channel. I'm just going to change the formatting to help readability a little bit; this may all fall over, because I am making this up as I go along, but what I've done here is split this out into different channels. We can use this emit, and we're just going to call it meta_info. This won't make a lot of sense at the moment, because we haven't talked about how the output is structured, especially with a meta map and an emit, but this is something, again, that we will come back to. What I'm just trying to show you here is that we can modify one of these nf-core modules and save it, so I'll just hit save. So again, this is one of the modules that I've introduced, this cat/fastq which I installed using nf-core modules; I've gone into the main.nf file and I've edited it, so now it will no longer match what is on the nf-core GitHub repository. If I was to run nf-core lint right now, what you'll find is that this test is failing, because the local copy of a module does not match the remote. The nf-core lint tooling is detecting that I have modified this and it no longer matches what exists remotely. Ultimately, I could go back in and uninstall this, reinstall it, work out what's happened and actually fix that, but in this situation I'm actually going to use this patch function. nf-core modules patch is another function that can be used to help manage your modules, your processes. This is it here, and what it says is that you can create a patch file for minor changes in a module. So what I can do is run nf-core modules patch and hit enter. It'll ask me what tool I want to patch; in this case,
it's going to be cat/fastq. It has detected what is different in my module, locally versus the remote: it has noticed that the meta.yml is unchanged, and it has created a patch file with a .diff extension. What you can now see over here, excuse me, in the explorer, is that a patch file for modules/nf-core/cat/fastq has been written into modules/nf-core/cat/fastq as a .diff file, and what we should see is that things have been modified accordingly. First things first: if you actually go over here, you can see that there is a new line added to modules.json, recording that there's now a diff file. And this should appear over here in my modules folder, so we can go modules, nf-core, cat/fastq. Okay, I'm not quite sure why that isn't there; it should have popped up. So, yeah, okay, it does appear to be there, it just took a wee while to render. Oh, here we go. You can see that this has been created over here: this cat/fastq .diff file, and what it's doing is just saying, on these lines of code, this is different; please don't fail my lint test because of this difference. So if we were to run nf-core lint again, you can see that it is now passing: we have this diff file here, and the differences that were introduced for this module have been accounted for, and it's also been tracked.
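The .diff file is an ordinary unified diff, the same format produced by git diff. As a rough sketch of what nf-core modules patch would record for the tuple-splitting edit above; the module lines are abridged and the emit names are my own invention:

```python
import difflib

# The module's output block as it exists on the nf-core remote (abridged)
remote = [
    '    output:\n',
    '    tuple val(meta), path("*.merged.fastq.gz"), emit: reads\n',
]

# Our locally edited copy, with the tuple split into two emitted channels
local = [
    '    output:\n',
    '    val(meta),                  emit: meta_info\n',
    '    path("*.merged.fastq.gz"),  emit: reads\n',
]

patch_text = "".join(difflib.unified_diff(
    remote, local,
    fromfile="modules/nf-core/cat/fastq/main.nf",
    tofile="modules/nf-core/cat/fastq/main.nf",
))
print(patch_text)
```

Lint then only needs to apply the recorded diff to the remote copy and check that the result matches your local file.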
So now, if this module changes again, you might need to update the patch, but ultimately it has been tracked. If this module was to be updated on nf-core, say a new version of the tool comes through, for example, then this patch will still be applied; unless there are differences that affect these lines of code specifically, it would automatically be integrated and used, and it's all being tracked. So again, this really helps with the reusability of pipelines and interoperability, the sharing of a pipeline between yourself and others, because every change you have made has been tracked using this tooling. Okay, so that is everything, really, for developing using existing tools: listing what's there, finding some information about them, installing, updating, removing and patching. There is also some functionality down here for developing modules, and like I said earlier, I won't go into this too much, because it is a little bit more complicated and we haven't talked about everything you might need to really understand what's happening here as part of this workshop. But just quickly, what I can show you is that there is some functionality for creating your own modules. You can use nf-core modules create, and we give a name for this new tool; apologies, this is now down at the bottom of my screen, but we're just going to call it mynewtool. It has looked for dependencies for this on Anaconda, and we can also add in the bioconda package. So what's happening here is that the tool is actually looking, like I said, for a conda or bioconda package that matches this name. Of course, because it's called mynewtool, it doesn't exist, but here, for example, you could add in samtools. Effectively, what you can do here is go away and find the bioconda package that is named like your tool. I thought samtools might have actually popped up here; I'm probably typing it in wrong, or something. Oh,
I'm probably typing it in wrong or leave something else Oh Cool, okay, so it does it has worked yet What it has done it has automatically gone away And found the samtools or makes version of famtools on bioconda And it has downloaded included a couple of co blocks that are used to download these tools From document singularity and have those images available for you already So if you're trying to create a module for a tool that is already on bioconda You can go and do this. I added an old symbol there Here is just asking for my names. This is just asking Who's creating this module? I can change the number of resources here So I just we talked about labels very quickly earlier. So this is the label that I've added to this process Don't want to create a metamap with some information again We haven't talked about this properly which is partly why I don't want to talk about this too much right now But we're just gonna say yes And this is now creating a new module So this is a really great way to start off if you are developing a new module. We need to create You know a new module using a piece of bioconda software for example What this will do is it will automatically give you a lot of taboos that you've got to go back through and add everything You need them to create that module You can see here that it's really created the process with the name my new tool. It's given it a meter id Again, we haven't talked about this yet. This is something that you can do Using metamaps and next flow to help label and include Information about your your samples has been passed around This is the process low. 
so this is the label that I chose for the module earlier. Because I asked to use samtools, from when I typed samtools into the terminal earlier, it has gone away to bioconda and found the tool called samtools at the latest version, and it has included the code blocks required for managing that using Singularity and Docker, as well as bioconda using conda. So when we use those profiles, which we've talked about already, with docker, singularity or conda, these tools will automatically be downloaded and included, and it'll all run as part of your pipeline. So you don't need to worry about installing this locally; Nextflow will download and run all of this for you, without you having to touch any of this on your local system. Here it has just added in some template information, the sort of thing you would normally expect to see as part of a process in Nextflow: we have an input and an output. As part of this we also have a when block, which you won't have encountered before. Down here we have a script block with a couple of definitions, and down the bottom we have the actual code block with the samtools command. Again, it has pulled some of this from the web, so you haven't had to do a lot of this yourself, which is really, really cool. So again, a lot of this won't make a lot of sense right now, because there's a lot of code there, and a lot of TODOs and things that won't make sense yet. Please don't panic about any of that right now. What I really wanted to show you is that there is this create function, and it does get you most of the way there for creating a module which can be included in your pipeline. Again, it's really just a demonstration that there is tooling to help you with a lot of the things that might be challenging, especially if you are starting out. But like I said, please don't panic.
It's okay to not understand all of what this is doing and what all these TODOs mean; what I wanted to show you is that this tooling exists, and it's quite a nice way of doing things. What else do we have down here under nf-core modules? There are a lot of other functions here which I won't go into too much. What we have here is create-test-yml, for creating a test YAML. Each nf-core module, if you are creating it as part of the modules repository rather than a pipeline repository, has test files which are used to automatically test the modules as they are created, integrated and used on nf-core. This is partly why this is all kind of separate to what we've done already: this wouldn't be for creating a local module, which is only going to be installed here in your pipeline; this would be, for example, if you cloned the modules repository from GitHub, the nf-core/modules repository that is, and you wanted to create your own module and submit it for everyone to use. Linting: hopefully you'll see by now that I'm a big fan of linting and the nf-core linting tools. These really help keep your code up to best practices, and anything that sits outside of what is expected creates a warning or an error. We have some functionality here to help you bump versions of the module: if you are increasing a version, it will go through and help you update all of the version numbers. We have a tool here to help you create mulled images of different tools, so if, for example, you have multiple tools as part of a script block, this can help you create and run a mulled image using BioContainers. And finally, we have a test function which will run module tests locally. So like I said, there is a lot of testing that goes on behind the scenes with these modules, helping keep everything in line and running properly. These things trigger the warnings, excuse me, like the warnings that we saw when we started editing that module. All of these
are really useful functions for anyone who is developing. I will move on very quickly, just because we are running out of time in the session. So, finally, we have nf-core subworkflows. This is very much like modules: as I said earlier, sub-workflows are the chaining of multiple tools together to create a block of code, a block of modules, that is frequently shared between different pipelines. This is something that is relatively new; probably in the last six to twelve months, sub-workflows have really gone from an idea to an actual tool that has been included as part of nf-core. As you'll see, there are a lot of the same functions: info, install, list, remove, update, as well as create and create-test-yml. Because it's a little bit newer, there's slightly less functionality, but I think what you'll find is that this will be updated very, very quickly over the coming months. Like I said as well, because this is new, there are significantly fewer sub-workflows available on the repo. So: nf-core subworkflows list remote. Here's a list of all the sub-workflows; there's probably thirty-something there at the moment. You can install these the same way that we did a module. So we can go nf-core subworkflows info, and ask for information about a workflow; let's just go for this one down the bottom, because I can see it: vcf_impute_glimpse. Again, we can see the information, much like for a module:
what's coming in and what's coming out of the sub-workflow. We have information here about how to install it, which I think is what we will do, just so we can actually visualize it in the editor. So again, you can do nf-core subworkflows install; I've just copied the command that was generated as part of the info. We get this block here, which we can include into our main workflow; again, we will revisit how to do this as part of session three. And what's happened over here is that we have a new sub-workflow in this nf-core folder. So much like modules, with its local and nf-core folders, under subworkflows we also have a local and an nf-core folder. Here we can see the sub-workflow block. Again, we haven't talked about this in great detail, so this structure won't make as much sense yet, but you might notice that there are these include statements again; these are bringing in the modules. We have a separate workflow which we can then use in our main workflow, so this can be included using similar statements to what we see up here. It is taking in these channels, and we have this main block where it's chaining all of these together: all of the different modules that are included up here, such as the glimpse chunk and phase steps included here, and the files that are taken as inputs will be processed through each of these modules and then given as outputs out the other side, using the emit down the bottom here. This is a slightly different functionality to modules.
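As a very loose analogy only, in plain Python rather than Nextflow syntax: a module behaves like a function, and a sub-workflow like a function that chains several of them together and exposes named outputs, the way emit: names the output channels. The tool names here are invented:

```python
# Loose analogy: each "module" is a function over its inputs
def align(reads):
    return f"aligned({reads})"

def call_variants(bam):
    return f"vcf({bam})"

# The "sub-workflow": a reusable chain of modules with named outputs,
# standing in for the emit: channels of a real Nextflow sub-workflow
def variant_calling(reads):
    bam = align(reads)
    vcf = call_variants(bam)
    return {"bam": bam, "vcf": vcf}

out = variant_calling("sample1.fastq")
print(out["vcf"])  # → vcf(aligned(sample1.fastq))
```

Any pipeline can then "include" variant_calling as one unit instead of wiring the two steps up by hand, which is exactly the reuse that sub-workflows give you.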
So things are a little bit different: you'll notice here that the wording differs, with things like emit and take rather than input and output. But as a whole, what's happening here is that we have this block of code, much like a module, that has been imported, or that you can import into a pipeline. So, for example, I could just copy this and put it into my main block at some point; this will be in slightly the wrong place, I'm pretty sure, but I'll put it in here, alongside the nf-core modules, where there happens to be space for it. Okay, so none of this will actually work, because I haven't chained these together as part of the pipeline, but what I wanted to show you is that you can import a sub-workflow very quickly and easily using nf-core tooling as well. Okay, so what other functionality do we have for sub-workflows? Okay, yeah, so we still have remove and update functions as well. You can remove a sub-workflow just like we installed it, as well as update them, as these are continuously updated and improved on by the community. And they have some functionality here to help you create your own, as well as create a test YAML. Again, because we're working in a pipeline repo,
this will create a sub-workflow and install it locally; but if we were in a modules repo, which is slightly different, it would create a few more test files, which we won't explore today as part of the session. Okay, so that's everything I wanted to show you as part of the nf-core tooling today. Just to recap very quickly: nf-core is of course a separate set of tooling outside of Nextflow itself, and it really relies on this sort of community to develop best-practice pipelines. As part of that, there's modules and sub-workflows and a lot of functionality in the nf-core tooling, which is really fantastic and will hopefully make your life easier whether you are executing or developing an nf-core pipeline. What you will find is that the nf-core lint function will be your best friend, because it helps you keep your code up to best practice; but there's also lots of functionality there to help you launch, download and, like I said, develop your pipelines using best practices. Okay, I think that's it. To finish off today, I will just say thank you again for coming along and attending the session. I hope that you've all managed to learn something about the nf-core tooling, and potentially some of the best practices, and that you're all interested in joining the nf-core community. Tomorrow in session three we'll go back to the material that we started in session one, so that basic Nextflow training will continue, delving more into operators, channels and processes, as well as a few other things. I'm really looking forward to it. Okay, thank you everyone for coming, and we will see you all again tomorrow.