Hi, my name is Catherine and I'm one of the administrators of Galaxy Australia. I'm going to talk about Galaxy tool management with Ephemeris. I'll cover how tools are configured on a Galaxy instance, what the Galaxy Tool Shed is and how to install tools from it onto Galaxy, and what Ephemeris is and how it can be used to manage tools on a Galaxy instance.
A Galaxy tool, or wrapper, is an XML file describing a software program. It allows Galaxy to display the tool interface and execute the software. The wrapper XML contains definitions of the tool input form and instructions to translate form entries into the command that executes the tool. The underlying software packages needed to execute the tool command are called requirements or dependencies, and this could be something like SAMtools or Biopython or any number of software packages. A Tool Shed repository is a code archive in the Tool Shed containing Galaxy tools.
The tool panel on the left-hand side of a Galaxy site contains all of the tools available on that Galaxy instance, arranged into sections. In this picture the Text Manipulation section has been expanded and you can see all the tools within it. There are tools that come with the Galaxy code, such as the uploader, but the vast majority of tools on a public Galaxy have been installed from the Tool Shed. The contents and the layout of the tool panel are customizable.
The contents of the tool panel are defined by configuration files, and Galaxy knows about its tools based on what's in these files. The tool_conf.xml file contains Galaxy's built-in tools and manually added tools. The shed_tool_conf.xml file is managed by Galaxy and contains all tools installed from the Tool Shed. The integrated_tool_panel.xml file contains all of the tools from the first two files, and it can be edited to change the layout of the tool panel.
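To make the structure of a tool conf file a little more concrete, here is a minimal sketch in Python that reads a small fragment and lists the wrapper paths in each section. The fragment itself is hypothetical: the section ids, names, and wrapper paths are made up for illustration and are not copied from a real Galaxy release.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool_conf.xml fragment; section ids, names, and wrapper
# paths are illustrative only.
TOOL_CONF = """
<toolbox>
  <section id="getext" name="Get Data">
    <tool file="data_source/upload.xml" />
  </section>
  <section id="textutil" name="Text Manipulation">
    <tool file="filters/catWrapper.xml" />
    <tool file="filters/cutWrapper.xml" />
  </section>
</toolbox>
"""


def tools_by_section(xml_text):
    """Return {section name: [wrapper paths]} in the order found in the file."""
    root = ET.fromstring(xml_text)
    return {
        section.get("name"): [tool.get("file") for tool in section.findall("tool")]
        for section in root.findall("section")
    }


panel = tools_by_section(TOOL_CONF)
for name, wrappers in panel.items():
    print(f"{name}: {len(wrappers)} tool(s)")
```

This only reads the file. Changing the layout itself is done in the integrated tool panel file, as described above.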
For example, you might want to rearrange tool sections into a different order. These are small snippets of the tool_conf.xml and shed_tool_conf.xml files. The tool_conf.xml file contains paths to local tool wrapper XML files, and each tool element is within a section telling Galaxy where to put the tool in the panel. The shed_tool_conf.xml file also contains paths to the tool wrapper XML, within the directory structure that Galaxy uses for the shed tools. In this case the file is shovill.xml, and the shed_tool_conf.xml file also contains metadata for each tool.
The Tool Shed is Galaxy's app store. It's a separate entity from your Galaxy instance, and it's a free service that hosts repositories containing Galaxy tools. It's not a development platform: tools are usually maintained in open source GitHub repositories and uploaded to the Tool Shed. The Tool Shed can be found at this address, and this is what the Tool Shed interface looks like. It has thousands of tools, almost 8,000 as of October 2020. You can go to the Tool Shed site to search the tools by name or to browse by category.
You can add tools to your Galaxy instance manually, which is useful for tool development, or from the Tool Shed. Installing from the Tool Shed can be done via the admin user interface or using Ephemeris, which we'll talk about soon. To add a local tool by hand, store the wrapper somewhere on the Galaxy machine and add the path to that wrapper to the tool_conf.xml file. There's also a variable in the Ansible Galaxy role for doing this. In this case the dependencies need to be installed separately, whereas with Tool Shed tools the dependencies will be installed along with the tool.
Installing tools through the admin UI is pretty easy: an administrator can look up any tool from the main Tool Shed in the admin panel and click the install button. Installing from the Tool Shed will also install the tool dependencies.
This is typically a virtual environment containing every package that the tool requires. A Tool Shed tool might have multiple revisions, and this is important for reproducibility. If you use a tool in an analysis on a public Galaxy server, it will be there forever. If you need to rerun your analysis in a year's time, the tool version that you used will still be there, even if a newer version of the tool has been installed since.
When we talk about the Tool Shed we're talking about the main Tool Shed, but there can be other tool sheds. The test Tool Shed can be used for repositories that are not yet production ready, and these repositories are public. The Tool Shed is a web app backed by a database and anybody can run one, but running a local tool shed is discouraged. By default Galaxy only accepts tools from the main Tool Shed. The list of accepted tool sheds is in a file called tool_sheds_conf.xml, and the default file contains only the main Tool Shed, though there's a commented-out test Tool Shed entry in case you want to install tools from there.
When installing a repository from the Tool Shed, the repository is downloaded to the Galaxy machine, the tool's dependencies are installed if needed, reference data tables are installed if needed, an entry for each tool is created in the Galaxy database or the tool install database, depending on the configuration of Galaxy, and the tools are added to the shed_tool_conf.xml file. When installing with Ephemeris, you find the repository that you want in the Tool Shed and install it with the Ephemeris shed-tools command, which we're about to talk about.
Ephemeris is a Python library for Galaxy management. It can be used to install tools, reference data, workflows, and data libraries onto a Galaxy instance. It can also be used to run tool tests, and it's a package that can be installed with pip. It contains commands that can be used to manage tools through the Galaxy API.
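Going back to the list of accepted tool sheds for a moment, here is a rough Python sketch of why a commented-out entry has no effect. The XML below is modelled on the layout of the sample file as I understand it and is an assumption, not a copy of any particular release; the point is only that an XML parser skips the commented-out test Tool Shed.

```python
import xml.etree.ElementTree as ET

# Assumed layout of a tool sheds configuration file; the test Tool Shed
# entry sits inside an XML comment, so parsers do not see it.
TOOL_SHEDS_CONF = """
<tool_sheds>
    <tool_shed name="Galaxy Main Tool Shed" url="https://toolshed.g2.bx.psu.edu/"/>
    <!-- Commented out, so it is not an accepted source of tools:
    <tool_shed name="Galaxy Test Tool Shed" url="https://testtoolshed.g2.bx.psu.edu/"/>
    -->
</tool_sheds>
"""


def accepted_tool_sheds(xml_text):
    """Return {name: url} for every tool shed entry that is not commented out."""
    root = ET.fromstring(xml_text)
    return {shed.get("name"): shed.get("url") for shed in root.findall("tool_shed")}


sheds = accepted_tool_sheds(TOOL_SHEDS_CONF)
for name, url in sheds.items():
    print(f"{name}: {url}")
```

Uncommenting the second entry would make it show up in the result, which matches the idea that enabling the test Tool Shed is a matter of editing this file.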
There's no need to run Ephemeris commands from the server running Galaxy, though you can. There are links here to the Ephemeris code base and the Ephemeris docs.
One of the Ephemeris commands is get-tool-list. This can be used to get a list of installed tools for any public Galaxy instance. For example, you could get a list of tools installed on Galaxy Europe. An API key isn't required to run this, but some options are not available unless an admin API key is provided. Any Galaxy user can have an API key; you can get it from the user preferences. An example of output from get-tool-list is this YAML list of tools: Column Maker, BWA, and Tabular-to-FASTA. You can see that for BWA there are two different revisions, corresponding to different version numbers that have just been added here as comments, and each tool has a name, an owner, and a section label. You could actually use a file like this to then install the tools on another Galaxy instance.
A Galaxy administrator can install tools with the shed-tools command by providing their admin API key. They can specify the name, owner, and section label, or provide a YAML list of tools, and there are actually a lot of different options available for shed-tools install. As an example, take installing Circos. These two examples, one and two, are equivalent. Circos can be installed with name, owner, and section label provided as command line arguments, or alternatively they could be provided in a YAML file. The advantage of the second approach is that there could be lots of tools listed in this YAML file, and they could all be installed at the same time. If you wanted to install a different revision of Circos, you could supply the revisions argument, but by default the latest revision of Circos will be installed.
shed-tools also has a command to run tool tests. A good tool will have tests.
These are instructions within the wrapper to run the tool with test input and see whether the tool produces the expected output, and also to check that the dependencies have installed correctly and that nothing's gone wrong. You need to be an administrator to install tools, but any Galaxy user with an API key can run tool tests. So you could go and test tools on Galaxy Australia. The output from running shed-tools test will be a list of tool tests that have passed, a list of those that have failed, and a more detailed JSON file of data from the test jobs that's useful for debugging. There's a Python library, Planemo, that can be used to generate a user-friendly report from the JSON data.
From a downloaded Galaxy workflow file, workflow-to-tools can be used to generate a list of tools, and that's all of the tools required to run that workflow. Ephemeris also has a command to set up data libraries, to upload shared data to Galaxy. This is an example of an input YAML file to set up data libraries. It describes two folders with one file in each, and each of these files can have its contents downloaded from a public URL.
There's also the galaxy-wait command, which sends an API request to a Galaxy server to check whether it's running and able to accept requests. If the server is ready, it will return straight away. If not, it will keep sending requests. This is useful if you want to run any of the other commands, such as shed-tools install, and you don't know when Galaxy will be ready.
And so a little more on tool management. This is a picture of the directory structure of a simple Tool Shed repository. It contains just one tool, and it has a tool wrapper, remove_beginning.xml. It has an accompanying Perl script, which is called from code in the wrapper. It has an input and an output test file, and a file that contains tool metadata, which is the .shed.yml file. The repository name and owner are set in the metadata file.
The file also contains the development URL for the tool, and the development URL is displayed in the Tool Shed as a link to the tool's files within the development environment. This is the GitHub repository you would go to to raise an issue about the tool or to make a pull request to improve it. And then we've got VarScan, which is a more complex Tool Shed repository. In this one, there are three tool wrappers related to the same software, and installing this repository would add three tools to the tool panel.
Sometimes tools will need reference data such as genomes. Loc files and data tables are used to link tools with reference files. If CVMFS is installed, a lot of your reference data needs will be taken care of. There are also data manager tools in the Tool Shed for installing reference data for tools, and these can be run from the admin panel or using Ephemeris.
There are also repositories in the Tool Shed that are suite repositories. Suite repositories can be used to install multiple tool repositories at once. For example, there are many tools associated with SAMtools owned by the IUC, such as samtools view or samtools mpileup, and installing the SAMtools suite will result in all of these SAMtools repositories being installed. There are quite a few different tool suites in the Tool Shed.
Without going into too much detail about this file, dependency_resolvers_conf.xml: at runtime, Galaxy will look for installed dependencies in an order determined by the dependency resolvers conf XML file. The file shown in this slide is the sample configuration, and it is quite similar to what is used on Galaxy Australia. What it means is that, given a set of requirements, Galaxy will first look for an installed Tool Shed package that meets those requirements. If Galaxy finds this, it will source the package and look no further.
If Galaxy does not find this, it will look for a Galaxy package with the required version, followed by a Conda package with the required version, followed by a Galaxy package of any version, followed by a Conda package of any version. Typically on Galaxy Australia, the packages used to run a tool will be Conda packages. A more recent development in Galaxy is the use of Docker or Singularity containers to resolve dependencies, and there's some further reading on this and also a tutorial.
So the key points from this slide set are that there is a Galaxy Tool Shed containing thousands of tools that you can install in your Galaxy if you want to, that as a Galaxy administrator you can choose which tools are installed and how they're arranged, and that Ephemeris can be used to manage tools. Tool installation with Ephemeris is best practice because you can script it and keep a record of what you've done, and it allows for the automation of tool management tasks. Thank you to the Galaxy Training Network and all of the contributors to this tutorial.
Hello again. I'm Catherine from Galaxy Australia, and we're going to go through a tutorial on Galaxy tool management with Ephemeris. This tutorial will introduce you to one of Galaxy's associated projects, Ephemeris, which is a small Python library and a set of scripts for managing and bootstrapping Galaxy plugins, tools, index data, and workflows. Its aim is to help automate and limit the quantity of manual actions admins have to do in order to maintain a Galaxy instance, and it is used a lot on usegalaxy.* style instances.
As a background scenario, you could be an administrator of a Galaxy server, and a colleague might approach you with a request to run a specific Galaxy workflow. In order to do this, they need to have the tools installed. So you need to identify what tools are required for the workflow and install these tools and their dependencies on your Galaxy instance.
So the first thing we're going to do is install Ephemeris on our virtual machines, and we'll do this in a Python virtual environment. The first thing I'll do is make a directory for this tutorial and go into that directory. Then we create the Python 3 virtual environment, and we'll put it in our home directories. It's okay to copy and paste that block, or just type it in manually. Then source the virtual environment and, in the environment, install Ephemeris. So this is done, and you can see that Ephemeris version 0.10.6 has been installed.
Ephemeris has a command for extracting tools from Galaxy workflows. Galaxy workflow files are complex JSON documents. Workflows consist of many steps, and steps have tool IDs. A tool ID looks like this: it's got the tool shed, the owner, the repository name, and the version of the tool. But Ephemeris installs Tool Shed repositories rather than individual tool IDs, and workflow-to-tools makes this conversion. So for FastQC, it turns into this YAML block with name, owner, and revision. The section label is something you can choose, but the default is "Tools from workflows".
We can try this on a real workflow by downloading the mapping workflow from the GTN. So copy and paste this command to get the mapping.ga file, and I'll just show you what this looks like, just the first 15 lines. It's a JSON document of a Galaxy workflow consisting of many steps. Now use the workflow-to-tools command to extract the tool list from this workflow into a file named workflow_tools.yml. This links to the Ephemeris docs, and you can see that workflow-to-tools takes three arguments: the workflow or list of workflows, the output file, and the panel label, which defaults to "Tools from workflows" if it's not provided. So the command is workflow-to-tools with the -w flag set to mapping.ga, the -o flag set to workflow_tools.yml, and the -l flag set to Mapping. You could choose anything for the section label. So I'll copy and paste this command.
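To give a feel for the conversion described above, here is a minimal Python sketch, not the Ephemeris implementation. It assumes that each tool step in a workflow file carries a `tool_shed_repository` block with `name`, `owner`, `tool_shed`, and `changeset_revision` keys. The workflow below is a hypothetical, heavily trimmed example, the revision hash is a placeholder, and the output key names are my assumption about the tool list format.

```python
import json

# Hypothetical, heavily trimmed workflow; real .ga files have many more keys.
# The changeset revision is a placeholder, not a real Tool Shed revision.
WORKFLOW_JSON = """
{
  "steps": {
    "0": {"tool_id": null, "type": "data_input"},
    "1": {
      "tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72",
      "tool_shed_repository": {
        "name": "fastqc",
        "owner": "devteam",
        "tool_shed": "toolshed.g2.bx.psu.edu",
        "changeset_revision": "0000000000aa"
      }
    },
    "2": {
      "tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72",
      "tool_shed_repository": {
        "name": "fastqc",
        "owner": "devteam",
        "tool_shed": "toolshed.g2.bx.psu.edu",
        "changeset_revision": "0000000000aa"
      }
    }
  }
}
"""


def repositories_from_workflow(workflow, section_label="Tools from workflows"):
    """Collect one repository entry per distinct Tool Shed repository used."""
    repos = []
    for step in workflow["steps"].values():
        repo = step.get("tool_shed_repository")
        if not repo:
            continue  # input steps and built-in tools have no repository
        entry = {
            "name": repo["name"],
            "owner": repo["owner"],
            "revisions": [repo["changeset_revision"]],
            "tool_panel_section_label": section_label,
            "tool_shed_url": "https://" + repo["tool_shed"],
        }
        if entry not in repos:  # the same tool may appear in several steps
            repos.append(entry)
    return repos


tool_list = repositories_from_workflow(json.loads(WORKFLOW_JSON), section_label="Mapping")
print(json.dumps({"tools": tool_list}, indent=2))
```

The sketch prints JSON rather than YAML only to stay within the standard library. The two FastQC steps collapse into a single repository entry, which is the behaviour you would want from a list that is meant to drive installation.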
You can see now there's a file called workflow_tools.yml, and if you have a look at it, it's a list of tool repositories.
The next thing we'll try is installing tools on our Galaxy instances. To install a tool you need the URL of your Galaxy server and also the API key of your admin account. So open up your Galaxy page if you don't have it open. From the user menu you can go to Preferences and Manage API Key, and if you don't have a key yet you can create a new one. Copy your API key and keep it somewhere, because you might have to use it several times during this tutorial. I'm going to make a variable for it called API key so I can call on it when I need to.
We'll start by installing a single tool, using the Ephemeris shed-tools command to install pilon, owned by iuc, into a section named Assembly. This links to the documentation, where you can see what arguments you might use for that command. The command in this case should look like this: shed-tools install, your Galaxy URL, your API key, and then the name, owner, and section label flags for pilon, iuc, and Assembly. So I'll copy this one, and here I will replace "your Galaxy" with the Galaxy URL I'm using, which is gatt2ostraining.galaxyproject.eu, and add the API key that I've stored. Now, if you've taken the Singularity tutorial you'll have Singularity set up on your Galaxy and this won't take very long. If you haven't taken that tutorial, this will be installing the environment with Conda and it might take a couple of minutes. So this was done in 20 seconds.
This is good for a one-off installation, but it's not convenient if you want to install many tools. To install many tools you might use a YAML file like the one we just generated from the workflow. There is an option to watch the installation proceed by running journalctl -fu galaxy in a separate remote shell. We're using byobu, so we could actually split the screen and watch journalctl on the same screen.
There is a tip here for opening a split screen in byobu, moving between the splits, and closing the split that is in focus. So I'll do that now: Shift-F2 to open the split, then journalctl, and we can see the Galaxy logs. I'll go back to the first screen with Shift and the arrow keys.
Now, using the shed-tools command, we can install all of the tools from the workflow_tools.yml file into Galaxy. Going back to the documentation, this will be similar to our last command, but instead of using the name, owner, and section label we can just use the -t, or tools file, flag. So the command looks like this: the Galaxy instance, the API key, and -t workflow_tools.yml. Instead of copying this, I will use the up arrow to find my previous command, get rid of the name, owner, and section label, and instead install from workflow_tools.yml. In the logs you can see that something is happening, and Galaxy is installing the tools, starting with FastQC. This all happens fairly quickly with Singularity. If you haven't got Singularity set up then this might take a little bit longer. The last one is BAMtools filter, and it's done in less than one and a half minutes. So now there are some new tools on Galaxy that we've just installed, and we can have a look at those in the Galaxy tool panel.
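If you wanted to script the two install commands we've just run, one option is to assemble the argument list in Python and hand it to `subprocess`. The following is only a sketch under a few assumptions: the helper name `shed_tools_install_args` and the `GALAXY_API_KEY` environment variable are made up for this example, and the flags are the ones used in this tutorial (`-g`, `-a`, `-t`, `--name`, `--owner`, `--section_label`). Nothing is executed here; the function just builds the list.

```python
import os


def shed_tools_install_args(galaxy_url, api_key, tools_file=None,
                            name=None, owner=None, section_label=None):
    """Build the argument list for a shed-tools install call (hypothetical helper)."""
    args = ["shed-tools", "install", "-g", galaxy_url, "-a", api_key]
    if tools_file:
        # Batch form: everything comes from the YAML tool list.
        args += ["-t", tools_file]
    elif name and owner:
        # Single-tool form: name and owner identify the repository.
        args += ["--name", name, "--owner", owner]
        if section_label:
            args += ["--section_label", section_label]
    else:
        raise ValueError("Provide either tools_file or both name and owner")
    return args


# Read the key from the environment rather than hard-coding it in a script.
api_key = os.environ.get("GALAXY_API_KEY", "<api-key>")

single = shed_tools_install_args("https://your-galaxy", api_key,
                                 name="pilon", owner="iuc", section_label="Assembly")
batch = shed_tools_install_args("https://your-galaxy", api_key,
                                tools_file="workflow_tools.yml")
print(len(single), len(batch))
```

To actually run one of these you could pass the list to `subprocess.run(batch, check=True)` from inside the virtual environment where Ephemeris is installed. Keeping the key in an environment variable is the same idea as the shell variable used in the walkthrough.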
So if we refresh our Galaxies, there are two new tool sections: Mapping, containing the tools from the workflow, and Assembly, containing pilon. We can actually import this workflow into Galaxy by right-clicking or control-clicking on the link and copying the link address. Going back to Galaxy, if we select Workflow at the top, we can import the workflow by pasting the link here. So now the workflow is in Galaxy, and we can have a look at it in the workflow editor. I'll zoom out a bit so that we can see it. The text is very small now, but the workflow is here, and it has the input steps and the tools Trim Galore!, FastQC, MultiQC, Bowtie2, Samtools stats, and BAMtools filter. We can run this workflow on our Galaxies now because all the tools are installed.
The next thing we're going to do with Ephemeris is tool testing. It's very important to be able to say that our tools are working, and we're going to use the shed-tools command for this to test pilon. The command in this case looks like shed-tools test, then -g with your Galaxy URL, your API key, and then the name and the owner, but we don't need the section label this time. I'll copy and paste this. The terminal is a little too big. That's better. And then the API key that I've saved. So it is testing pilon, and you can see the job running on Galaxy. What it's doing is uploading the test input files and running test jobs. We can see it in the logs, and we can also see it in Galaxy. So if I zoom in again and click on Analyze Data to get back to the main page, we can look at the test history in the view of all histories. There are two test data files that have been uploaded and two queued pilon jobs. The first time a job is run for a newly installed tool it'll take a little bit longer, because the container needs to resolve. If we look back here, the first test has run and passed, and it's now running the second test. You can update this page by refreshing it, and it's got three queued jobs for the second test, checking not only that the jobs
have run without error but that their outputs match the expected outputs. I'll refresh again. It's taking a little while, and the last jobs are running. The tool tests are very important: a production Galaxy will contain hundreds if not thousands of tools, and it's great to have an automatic way to ascertain that they're all working as expected. It looks like this is done. If I refresh the test history again, it's all done and all of the jobs have run without error. If we look at this, we've got two passed tool tests and zero failed tool tests, and looking at what's in the directory, there's also a file, tool_test_output.json, with job data. So pilon is working.
The last thing in this tutorial will be obtaining a tool list from a public Galaxy instance. We're going to get this from usegalaxy.eu, or Galaxy Europe, using the Ephemeris get-tool-list command. I can open up the docs for this, and there are a few options to include and exclude details. The only arguments we need are the name of the output file and the Galaxy URL. So the command is get-tool-list, with this URL for Galaxy and our tool file called eu_tool_list.yaml. Copy that and paste it here, and it returns almost instantly even though there are a lot of tools in the list. If we have a look at the first few lines, it's a YAML list of all of the tools on Galaxy Europe, and if I count the lines there are 11,000 lines in this file, because Galaxy Europe has a lot of tools. We will not install all of the tools from the EU Galaxy server, as that server likely has more tools than any other Galaxy instance, and it would take up hundreds of gigabytes of space, which we don't have on our virtual machines.
And now a bit about production best practices. Both Galaxy Australia and Galaxy Europe have automated scripts that regularly update tools that have updates available, and systems for automatically installing new tools using Jenkins servers, all using shed-tools. Galaxy Australia runs shed-tools test on tools at the same time as installing them.
Galaxy Main in the US has a Jenkins server that once a week runs shed-tools test on all of the tools on Galaxy Main, Galaxy Europe, and Galaxy Australia. Galaxy Europe's Jenkins server runs a script that uses the Ephemeris setup-data-libraries command to keep shared data from Galaxy Training Network material up to date on all of the usegalaxy.* instances. There are some more details here about how AU and EU maintain their YAML files. One note here is that Ephemeris does not have a function to uninstall tools, although there is a function for that in BioBlend. Contributions are appreciated and new contributors are extremely welcome, as this is all open source.
So for the key points: Ephemeris and automation help with tool management on Galaxy, and there are best practices to learn from. There's no need to manage your Galaxy tools manually. Using the scripts makes life much easier and gives you a way of keeping track of what you've done. We really love feedback and we're always trying to improve these tutorials, so there's a feedback form at the end of the tutorial. Congratulations on successfully completing this tutorial.