Hi, my name is Alex Ostrovsky from the Galaxy team at Johns Hopkins University. Today, Cristóbal Gallardo and I will be walking you through how to wrap Galaxy tools, from Conda through deployment. Let's get started. If your tool already exists in Bioconda, you can skip this section and go directly to the tool wrapping section later in this video.

Your first step is to clone a fork of the bioconda-recipes repository. Once that is done, you can create a Conda environment in which to develop your package. You can do so with the command conda create, followed by -y, which answers yes to the installation prompts, then the name of your environment, in this case seqtk-bioconda, and then the packages that you will need within your environment, in this case conda-build. This process can take a moment, so we're going to skip ahead to when the environment has finished building. Now we activate our environment with conda activate and the name of our new environment. You can see that you have entered your environment because its name is shown in your terminal prompt.

Now we're ready to create our actual recipe. So let's cd into the recipes folder and make a new directory for our package. In this case, we're going to call it test-seqtk. The first thing we're going to do is download the tarball for the package that we are building, using wget. For most tools, this link can be found on the tool's GitHub page under the Releases tab, where you copy the download link for that release's tarball. With that downloaded, we can compute the hash of the downloaded tarball, which lets us be confident that our Bioconda package will install the correct version of the tool. You can do this using shasum -a 256 followed by the downloaded file. Remember to keep that hash handy, because we will be using it soon. Now we can create the files that we will need for this recipe.
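As a rough sketch of the hashing step just described: the example below uses a small local stand-in file instead of a real release tarball so that it runs without network access. The file name example.tar.gz is hypothetical, and the fallback to sha256sum is an assumption for systems where shasum is not installed.

```shell
# Stand-in for the downloaded release tarball (hypothetical file name).
# With a real release, this file would come from wget and the copied link.
printf 'placeholder contents\n' > example.tar.gz

# Compute the SHA-256 checksum; the first field of the output is the hash.
if command -v shasum >/dev/null 2>&1; then
    sha256=$(shasum -a 256 example.tar.gz | cut -d ' ' -f 1)
else
    # GNU coreutils equivalent, assumed available when shasum is not.
    sha256=$(sha256sum example.tar.gz | cut -d ' ' -f 1)
fi

# Keep this value handy: it goes into the recipe's meta.yaml.
echo "sha256: ${sha256}"
```

The printed value is what you would paste into the recipe, and the tarball itself can be deleted once the hash is recorded.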
Two files are necessary for every package: the meta.yaml file and the build.sh file. With the prep work out of the way, let's open the meta.yaml file and get started. The meta.yaml file sets the parameters for the environment and the requirements that are necessary to build the package.

First we'll set up some variables, which let us reuse text throughout the recipe and edit it more easily later on. The current seqtk recipe uses two: the version string and the hash. Next we'll put in information about the package itself, in this case the name and the version, which we pull from the variable set above. Now we tell Conda more about the source that it will be pulling: where to pull it from, where you'll notice we use the version variable because it's easier to edit, and the hash, which again we pull from a variable so that it's easy to update in the future. We also tell it whether there are any patches, meaning any changes that we need to make to the tool's own source, which I'll be showing later; this entry names the patch file to apply.

In the build section we put the build number, which is the version of the Conda package itself and is separate from the tool's version. It allows updates to be made to the Conda package when there are no corresponding changes in the tool itself. The requirements section is broken into three parts: build, host, and run. The build part lists the dependencies required to create the package in the first place, such as compilers and build tools. The host part lists what has to be present in the target environment while the package is being built, such as libraries the tool links against, and the run part lists what is necessary for actually running the tool. In this case, the build requires make and the standard Bioconda C compiler. The host requires zlib, as does the end user at run time.

The test section gives a command that is run at the end of the build process to double-check that the build went properly. If that command returns an exit code of zero, it passes.
Otherwise, the build will fail. The about section provides metadata for the user: in this case, the location of the original tool, the license under which it is available, the location of that license file in the tool's tarball, and a quick summary that can be displayed to someone looking to download the tool. Finally, in the extra section we put a bio.tools identifier, so that the package can be linked to its registry entry more easily and metadata can be assembled without manual curation. And that's all that's necessary for the seqtk Conda meta.yaml file.

Now let's take a look at the build.sh file. The build.sh file contains the actual commands that will be run to build the tool. Oftentimes this is just a make command, but in this case it's a little more involved. We start off with a shebang, which tells the system which interpreter to use for the script. Next, we export the variables that tell the build where to find the headers and libraries it needs inside the Conda environment. seqtk then needs to run the make command and place its binaries in the proper location, with PREFIX referring to the location where Bioconda will look for the binaries. And that's all for the build.sh file.

Now let's return to our terminal. Although I did not generate it in this video, there is a patch file required for the seqtk package. We can take a look at it here. The patch file is simply a git diff of what has changed between the original tarball and what is required to build the package properly. Now we can build our package using conda build followed by the current folder. This is a long process, so we will skip ahead. And here you can see that the test that we added has been run and has passed. Therefore, we're ready to create a pull request. Make sure to remove the tarball itself before the pull request, and then we're ready to add the recipe to Bioconda. I won't go into details in this part of the video on how to create a pull request, because Cristóbal will be doing so later.
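To make the recipe walkthrough above more concrete, here is a minimal sketch that writes the two files as described. It is an illustration under stated assumptions, not the actual Bioconda seqtk recipe: the source URL pattern, the patch file name, the test command, the summary text, the license details, and the exported variable names are all assumed, and the hash is left as a placeholder to be filled in with your own value.

```shell
# Hypothetical recipe directory inside a bioconda-recipes checkout.
mkdir -p recipes/test-seqtk

# meta.yaml: the quoted heredoc delimiter keeps the Jinja braces literal.
cat > recipes/test-seqtk/meta.yaml <<'META_EOF'
{% set version = "1.3" %}
{% set sha256 = "REPLACE_WITH_HASH_FROM_SHASUM" %}

package:
  name: seqtk
  version: {{ version }}

source:
  # Assumed URL pattern for a GitHub release tarball.
  url: https://github.com/lh3/seqtk/archive/v{{ version }}.tar.gz
  sha256: {{ sha256 }}
  patches:
    - Makefile.patch   # hypothetical patch file name

build:
  number: 0   # version of the Conda package, not of the tool

requirements:
  build:
    - make
    - {{ compiler('c') }}
  host:
    - zlib
  run:
    - zlib

test:
  commands:
    - seqtk seq   # placeholder; the real test command must exit with 0

about:
  home: https://github.com/lh3/seqtk
  license: MIT
  license_file: LICENSE   # assumed location inside the tarball
  summary: Placeholder summary shown to people looking for the tool

extra:
  identifiers:
    - biotools:seqtk
META_EOF

# build.sh: the commands Conda runs to compile and install the tool.
cat > recipes/test-seqtk/build.sh <<'BUILD_EOF'
#!/bin/bash
# Point the compiler at headers and libraries in the Conda environment.
export C_INCLUDE_PATH="${PREFIX}/include"
export LIBRARY_PATH="${PREFIX}/lib"

# Compile, then copy the binary to where Conda expects executables.
make CC="${CC}"
mkdir -p "${PREFIX}/bin"
cp seqtk "${PREFIX}/bin/"
BUILD_EOF

# Syntax check only; the real build is: conda build recipes/test-seqtk
bash -n recipes/test-seqtk/build.sh
```

Using variables for the version and hash means a future update usually touches only the first two lines of meta.yaml.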
So with that, that is the end of my part. Thank you for joining me, and enjoy learning how to create a Galaxy tool from scratch.

Hi everyone. My name is Cristóbal Gallardo. I'm a member of the European Galaxy group, and in this training I'll show you how to create a Galaxy wrapper from zero. For this tutorial, it's recommended to have basic knowledge of the Linux command line and Git. This training is organized in four main sections. In the first part, we'll set up the Conda environment. In the second part, we'll create the Galaxy tool wrapper using Planemo and the Galaxy Language Server. In the third part, we'll test the Galaxy wrapper using Planemo and Galaxy. And finally, I'll show you how to publish the Galaxy wrapper in a public repository.

So let's start with the first step: set up the Conda environment. Conda is an open-source, cross-platform package manager and environment management system. It automates the process of installing, updating, and removing packages. In addition, Conda lets us create separate environments containing files, packages, and their dependencies that will not interact with other environments. So let's install Conda. In our case we are using Linux, so we copy this command, paste it in the command line, and then execute the bash script. Press Enter, accept the license, accept the install location, and confirm. Now we'll set up Bioconda, a popular Conda channel for bioinformatics software, which distributes a large number of packages for computational biology. So let's copy this code and paste it in the command line. OK, let's check that Conda works. It seems that it's correctly installed.

So let's start with the second step: create the Galaxy tool wrapper. We need to download one of the public Galaxy tool wrapper repositories. In our case, we are going to use the galaxytools repository hosted by bgruening.
I suggest you have a look at the README, because it contains quite a lot of information about the development of Galaxy tools. For example, it explains how to create a Conda package, and it also provides some details about the different Galaxy tool repositories. So we need to fork this repository, copy the address, and clone the repository. This is the repository that we'll use for publishing our wrapper once it's finished. OK, so let's enter the repository and then the tools folder. If we have a look at the content, we can see many different tools, and in that folder we will create our wrapper.

The next step is to install Planemo. Planemo is a command-line tool whose functions include facilitating the development of Galaxy tools. For example, the planemo lint command lets us validate the XML files in order to identify common problems. Planemo can also launch a local Galaxy instance with its graphical interface, which is very useful during the development of tools. OK, so let's install Planemo. We need to run the command pip3 install planemo. OK, perfect. Let's check that it works.

To illustrate this tutorial, we are going to create a wrapper for a useful tool called seqtk. Among the functionalities of this tool is the conversion of FASTQ files into FASTA. First, let's check whether the Conda package for this tool is available. We run the command conda search seqtk and wait a few seconds. Perfect, the Conda package is available. Now we should create a new environment in order to avoid conflicts with other previously installed tools. We confirm, and then we need to activate the environment. The next step is to install the Conda package of the seqtk tool. It seems that it has been installed correctly. Perfect. Now we need to create a new folder for the seqtk tool, and we are going to initialize the Galaxy wrapper by using the command planemo tool_init.
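A sketch of how such a planemo tool_init call might look is given here, using the values mentioned in this tutorial where they are stated. The example command and the input file name 1.fastq are hypothetical stand-ins, and the guard simply skips the step when Planemo or seqtk is not installed.

```shell
# Values taken from the tutorial where stated; the rest are illustrative.
TOOL_ID="seqtk"
TOOL_NAME="Convert to FASTA"

if command -v planemo >/dev/null 2>&1 && command -v seqtk >/dev/null 2>&1; then
    # --macros also creates a macros.xml next to the main tool XML.
    planemo tool_init --macros \
        --id "$TOOL_ID" \
        --name "$TOOL_NAME" \
        --requirement seqtk@1.3 \
        --example_command 'seqtk seq -A 1.fastq > 1.fasta' \
        --example_input 1.fastq \
        --example_output 1.fasta \
        --help_from_command 'seqtk seq'
else
    echo "planemo and seqtk are needed for this step; skipping."
fi
```

Giving an example command with its input and output lets Planemo pre-fill the command, inputs, and outputs sections, which is why these options are worth the extra typing.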
The planemo tool_init command generates the basic content of the wrapper. We use the option --macros, then --id seqtk, --name "Convert to FASTA", and --requirement seqtk@1.3. For --example_command we provide a very simple command. Then we provide an example input and an example output, 1.fasta. As we'll see later, that information is very useful for initializing the tool wrapper. And finally, we provide the command from which to build the help section. OK, so we see that Planemo has created two files and a new folder.

Now we need to set up the Galaxy Language Server. The Galaxy Language Server is an implementation of the Language Server Protocol, written in Python, that assists in the development of Galaxy tool wrappers. We are going to use the Visual Studio Code extension of the Galaxy Language Server. This extension provides XML validation, tag and attribute completion, and other extremely useful features. The README file describes the different features, so I recommend you have a look at it. OK, so let's open the files created by Planemo with Visual Studio Code. Here we have the macros.xml file and the main file. To install the extension, we click on the marketplace, search for Galaxy Tools, and install it. Perfect.

So now we can start writing the wrapper. When writing very simple tools, macro XML files are not necessary at all. However, it's highly recommended to use XML macros when writing complex tools in order to reduce duplicated XML elements. In the main XML file we can distinguish five main sections: the command section, the inputs section, the outputs section, the tests section, and the help section. OK, so the first step is to set the correct format, and then we are going to create the input elements. Here in the help section we see that each parameter includes the type of the variable and a description. We'll use that information for creating each element. With the Galaxy Language Server, we just need to type gx and select the correct type.
We can then move between the different fields by using the Tab key. We need to fill each field with the information provided by the tool. In this case we just need the information from the help section. In other cases you may need to have a look at the tool's repository, but here that's not necessary at all. So we are going to create a few inputs. Now we create one more boolean with gx. We type the argument, leave the default value empty, and set checked to false. The label is "Mask complement region", and in the help attribute we include the note "effective with -M". OK, so you just need to continue and create the rest of the options, but we'll skip that. So let's do some magic. Voilà!

Now we'll use another feature of the Galaxy Language Server to generate the command section. We open the View menu and select the option that generates the command section. Perfect. Now we need to pay attention to the order. We see that the options should come before the inputs and the outputs. We remove the option we don't need and move the input file and the output file to the end of the command section. Perfect.

Now we are going to create the tests section. Tests require input data, and one simple approach is to look for datasets that are already in the repository. We can use this one. We copy this FASTQ file into our test-data folder and rename it dataset01.fastq, and here it is. OK, now we have an input dataset. As the next step, we generate an output from this input dataset with this simple command. This is required for the tests section. OK, perfect. Now we have the input dataset and the output dataset. We can check the number of lines and the size of the output dataset. That information is also useful for creating the test. OK, so now we are going to auto-generate the scaffold of the tests section by using another feature of the Galaxy Language Server.
For that we select the generate tests option, and here it is. The tests have been auto-generated, so they need only minimal modification. We can write a comment: test 01, default parameters. In the input parameter we type the name of the input dataset that we copied previously. We set the -A parameter to true, and we use the information we collected before to fill in the assertions. We remove a few lines which are not required at all in this case. We also provide the expected output size, which is 430 bytes, and delta, which is the allowed variation; we set that to 10. OK, so this test is done.

Now we are going to create a second test in order to show you a different way of writing one. In this case we won't use assertions; instead we provide the output file that we generated previously. So the value is test01.fasta and the expected file type is fasta. OK, this is done.

Before running the tests, we are going to include some additional information. For example, the citation is also necessary, and the easiest way is to use a paper related to our tool. We copy the DOI and paste it into this web page, which generates the BibTeX citation. We just need to create a citation element of type bibtex and paste in the BibTeX reference. OK, so now we are going to create a few additional elements. One of them is the tool version token, which makes it easier to update the tool when necessary. We also create a token called version suffix. So now we replace the version with this token. In the main XML file we also replace the version with the tool version token plus "galaxy" and the version suffix. We also need to provide the profile, which is the minimum required Galaxy version. And the last element that we need to include in our wrapper is the bio.tools element.
bio.tools is a web portal that provides a tool registry and service-monitoring information, and it uses the EDAM ontology for precise scientific tool descriptions. The information stored in the bio.tools registry will be used in Galaxy to classify tools according to their function, so this is also an important task. We just need to look for our tool in the database. Let's see. OK, there is an entry for this tool. That's perfect. So let's go back. We create a new xml macro named bio_tools, use gx for the snippet, and type the name of the tool. And finally we need to expand this macro in the main XML file.

OK, so we can start with the third section, which is testing the Galaxy wrapper. Before running the tests, we can validate the wrapper with the command planemo lint. Now we move to the root folder and download the Galaxy repository. The reason for downloading the Galaxy repository is that Planemo uses Galaxy to run the tests. If we don't have the Galaxy repository on our system, Planemo will download it every time we run the tests. So we copy the SSH address, type git clone, and paste the address. Now the repository is being downloaded. Just be patient; it can take a few minutes. OK, perfect.

Now let's go back to the folder which contains the wrapper. We can run the tests with the command planemo test and the --galaxy_root option, where we specify the path of the Galaxy repository, together with --update_test_data. Now we are running the tests that we created a few minutes ago. First, Planemo needs to download some dependencies. OK, now the tests are running. Perfect, both tests passed. We can have a look at the report generated by Planemo. It includes information such as the command line and the values of the different parameters. These reports are extremely useful for debugging. After running the tests, the next step is to launch a local Galaxy instance.
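The lint and test commands described above could be run roughly as follows. This is a sketch that assumes it is started from the folder containing the wrapper; the default path in GALAXY_ROOT is hypothetical and should point at your own clone of the Galaxy repository.

```shell
# Hypothetical location of the local Galaxy clone; override as needed.
GALAXY_ROOT="${GALAXY_ROOT:-$HOME/galaxy}"

if command -v planemo >/dev/null 2>&1 && [ -d "$GALAXY_ROOT" ]; then
    # Static validation of the tool XML in the current folder.
    planemo lint

    # Run the wrapper's tests against the existing Galaxy checkout,
    # so that Planemo does not download Galaxy again on every run.
    planemo test --galaxy_root "$GALAXY_ROOT" --update_test_data
else
    echo "planemo or the Galaxy clone was not found; skipping."
fi
```

Running lint first is a cheap way to catch XML problems before paying for the slower test run.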
Launching a local instance lets us check that the tool interface looks correct. This is very useful for identifying misspellings or typos. Using the Galaxy interface is also quite useful for debugging and developing the tool. Once again we are going to use the Galaxy repository that we downloaded previously. OK, just a few seconds. So the web interface of the local Galaxy instance is running. We just need to copy this address and paste it in the browser. Here it is. This is the tool wrapper. You can see it looks quite fine. By using the web interface you can easily identify problems in the labels, for example. Let's also try to run the tool through the web interface. We upload the input dataset, which is inside the test-data folder. Here it is, dataset01.fastq. OK, so here is the input dataset; we can modify some of the parameters and run the tool. Now we can check the outputs and compare the inputs with the outputs. Yes, we can see that the FASTQ file has been converted into FASTA format.

OK, then we can close the web interface and move to the fourth step, which is publishing the wrapper. Planemo can be used to publish repositories to the Tool Shed. This requires a special file called .shed.yml, which we create with the command planemo shed_init. The shed file is used to map the files to the Tool Shed. We run planemo shed_init with the name of the tool, then the owner, which in this case, since we are using the galaxytools repository, is bgruening, then a short description of the tool and a long description, for example "tool for converting FASTQ files into FASTA files", and finally the category, which in this case is Fasta Manipulation. OK, so let's have a look at the file. You can see this is a very simple file.

Now we need to add the content with Git. We also need to create a new branch of this repository with git checkout. Then we commit the changes in this new branch with git commit -m and a message about the added files.
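For orientation, a .shed.yml along the lines described above might look like the following sketch. The field values come from this tutorial, except that reusing the short description as the long description is an illustrative choice. The Git commands are shown only as comments, and the branch name in them is hypothetical.

```shell
# Write a minimal .shed.yml, roughly what planemo shed_init produces
# from the name, owner, descriptions, and category given in the tutorial.
cat > .shed.yml <<'SHED_EOF'
name: seqtk
owner: bgruening
description: Tool for converting FASTQ files into FASTA files
long_description: Tool for converting FASTQ files into FASTA files
categories:
  - Fasta Manipulation
SHED_EOF

# Show the result.
cat .shed.yml

# Typical Git steps afterwards (branch name is hypothetical):
#   git checkout -b add-seqtk
#   git add .
#   git commit -m "Add seqtk wrapper"
#   git push origin add-seqtk
```

Keeping the file this small is intentional: it only tells the Tool Shed who owns the repository and where to list it.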
Then we push the wrapper to the repository which we forked previously. After that we just need to click Compare & pull request. You need to provide a meaningful title, for example "Add tool seqtk", and include a description, for example "This PR includes the seqtk tool". So that's all. The last step is just to create the pull request against the galaxytools repository. So this is the end of the training. I hope you enjoyed it. Bye.