Okay, so we are now live. This is part of the Galaxy Resources for Different Communities series, and this is the third of four. This one is for tool developers; the next one, two weeks from today, is for admins and infrastructure providers. Today we have three presenters, and here's our agenda. Dan Blankenberg, Galaxy PI from the Lerner Research Institute, is going to introduce us to resources for tools in the Galaxy ecosystem. Anthony Bretaudeau, and I apologize, Anthony, for that pronunciation, is going to talk about Dockerizing Galaxy for tool publishing. And then finally Peter Novak of RepeatExplorer is going to talk about publishing your tools in your very own public Galaxy server. So let's see. Okay, let's get out of there. I'm going to stop my share. Okay. I'm going to turn it over to you.

Great. Thanks, Dave. So today I'm going to be showing you where you can find resources for making tools available within the Galaxy ecosystem. A brief outline of what we'll be talking about in this portion of the webinar: first I'll talk a bit about standard Galaxy tools. These are basically the command-line tools that people generally think of when they think about Galaxy tools. I'll touch upon Galaxy interactive tools, then data source tools, and then external display applications. These are all different ways that you can connect your software with Galaxy. So first we're going to start with Galaxy tools. Where can you find all this information later? Everything's available from the Galaxy Community Hub, which is at galaxyproject.org. At the top here, if you click on Learn, you'll then be able to click on Teach with Galaxy, and that will take you to the Galaxy training materials, or you can go directly to training.galaxyproject.org.
And so we're going to be going through a few of these options. Right now we'll be going to the Tool Development and Integration into Galaxy set of training materials. I'm just going to switch over here to the Galaxy Community Hub, like I said, and we can click on Teach with Galaxy, which will take us to the Galaxy training materials. Then we're going to come over here where it says Galaxy for Developers and Admins, and under Development in Galaxy we're going to click on that link. That will take us to a list of topics that we can use for developing with Galaxy. Let me see if I can make this bigger. And so then we can see all these various topics that are available. What we're going to be concentrating on right now is this Tool Development and Integration into Galaxy. There are two links here. One link, which we'll look at first, will take us to the tool integration slide deck. And then there's also a bunch of hands-on tutorials that you can follow along with as part of the Planemo documentation, available here. So we're just going to start by taking a look at these slides. Note that if you want to, you can view the slides in various languages just by clicking on this drop-down. That will generally use Google Translate unless manually translated options are available.

And so here we can see that we now have our tool development and integration slides. So let's go ahead and take a look. First we're going to talk about what a tool is for Galaxy, then how to write a best-practice tool, and how to deal with the tool environment, for example dependency handling and so forth. Inside of Galaxy, hopefully everyone's familiar with the Galaxy interface. If you're not, there are tutorials available for getting familiar with the Galaxy interface from that same training.galaxyproject.org link.
But here's your general UI, your user interface inside of Galaxy. On the left-hand side you have your list of available tools. They're arranged in sections: you click on a section and it will expand, then you can click on the link for a tool, and when you click on that link it will put the interface for the tool inside of your middle pane. You can select your input datasets and configure your input options. Then when you click Execute, the job will be executed on some compute resource, a cluster somewhere, or on your laptop, for example, if you're using a development instance. When the job is executed, a new box will appear on the right-hand side, which is known as your history, with your datasets. They'll start down at one and work their way up in a sequential fashion, recording all the steps that you've done as you perform your analysis.

One of the big things that we'll concentrate on here is this general idea of the Galaxy tool wrapper, which is what we use to control the generation and the viewing of the Galaxy interface. So here we have, for example, a Galaxy tool wrapper, where we have our input file. This is really just going to be a wrapper around a command-line executable. In this case it's a Python file for GraPhlAn, but this could be compiled C code, it could be Java code, it could be a Perl script, it could be a shell script, just any sort of command-line executable that accepts input options. And so basically we can generate this inside of Galaxy, and this would be your typical command line. When we think about these inside of Galaxy, you can see here that we have some tool on some compute resource. Our Galaxy instance generates a command, the inputs are passed in, and that command is pushed out to the compute cluster. Outputs are defined within the tool, and that's what ends up being pushed back to your Galaxy instance.
So when this actually happens, we have a very simple XML description here. It's a very simple hello world tool that's just calling echo, and it's going to output the string hello world. It's going to use a placeholder here to replace this mystring with the value of this input text box. You can see that this mystring parameter is passed over to this mystring placeholder, and this output1 dataset that's defined is put over here on the command-line redirection for the output to store it. When you click on Execute, for example, it would just execute echo hello world. If you put "you are amazing" within that box, it would then store that within your Galaxy dataset and put out hello world you are amazing, and that would be the item that appears in your history.

And so when we talk about how to invoke the tool, it's important to think about your underlying dependencies. There's a requirements tag, where we need to define a package and then a version for this package, and in this case we have GraPhlAn. Then we have a command line that's generated here, and it's generated using the Cheetah syntax. So basically we can use a nice templating format in order to replace all of these items for the Galaxy tool that you're trying to develop. You can see here we have another example. If this is a requirement from an actual package, you can generally just call that directly as an executable. We could also execute this if we had a Python script that we wrote ourselves, just called graphlan.py, that we held next to the XML file; we could then just use Python as the underlying requirement and call Python on that file. And this is the Cheetah templating language, so we can have control statements like if/else and for loops. You can have all sorts of different standard constructs within that syntax.
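To make the hello world wrapper concrete, here is a minimal sketch in Python, assuming a wrapper shaped like the one described on the slide. The tool id, the label, and the output file name are made up for illustration, and `string.Template` only stands in for the simplest `$name` substitution; Galaxy itself renders the command with Cheetah, which supports far more (if/else, for loops).

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical wrapper modelled on the hello world tool described above.
# The id, name, version, and label are illustrative, not taken from the slides.
TOOL_XML = """
<tool id="hello_world" name="Hello World" version="0.1.0">
    <command><![CDATA[
        echo "Hello $mystring" > '$output1'
    ]]></command>
    <inputs>
        <param name="mystring" type="text" value="World" label="What to greet"/>
    </inputs>
    <outputs>
        <data name="output1" format="txt"/>
    </outputs>
</tool>
"""


def render_command(tool_xml, values):
    """Fill the placeholders of the <command> block with concrete values.

    string.Template is a stand-in for Cheetah here; it only handles
    plain $name substitution.
    """
    root = ET.fromstring(tool_xml.strip())
    template = root.findtext("command").strip()
    return Template(template).substitute(values)


# Use the default value declared on the text parameter, as the form would.
ROOT = ET.fromstring(TOOL_XML.strip())
DEFAULT_TEXT = ROOT.find("inputs/param").get("value")

COMMAND = render_command(TOOL_XML, {"mystring": DEFAULT_TEXT, "output1": "hello.txt"})
print(COMMAND)
```

The point of the sketch is the data flow: the `name` of each input and output in the XML is the placeholder that appears in the command template, and Galaxy fills those placeholders at job time.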
And so here we have various different types of inputs. In this case we have a parameter that's of type data, so it will be an input dataset, of format text. We can restrict the format that's accepted by the tool by changing this entry here: if we only wanted to include FASTQ files, we could put fastq here, and so forth. We give it a name that we then use for internal placeholder purposes, a label, which is what gets displayed to the user in the user interface, and then we have help that can show up under that parameter input. There are lots of different types of parameters. For example, here we have an integer parameter. We also have an example of a floating-point parameter, and optionally with these numerical types you can specify a minimum and a maximum, along with the starting initial value. If you set a minimum and maximum, you then also get a slider along with that input box. So you have a lot of control over how you can design your input forms. Then text is one of these very basic types. We can also generate select lists, and select lists can be generated dynamically or statically. In this case we have a statically defined select list. With the options, you have the internal value that will end up getting passed to your underlying tool, along with the text that's actually displayed to the user within their interface.
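The parameter attributes just described (name, type, format, label, help, value, min, max, and select options) can be sketched as follows. This is only an illustration built with Python's standard `xml.etree.ElementTree`; the parameter names, labels, and values are hypothetical and not taken from the slides.

```python
import xml.etree.ElementTree as ET


def make_param(name, type_, label, **attrs):
    """Build one <param> element; extra keyword arguments become XML attributes."""
    param = ET.Element("param", {"name": name, "type": type_, "label": label})
    for key, value in attrs.items():
        param.set(key, str(value))
    return param


inputs = ET.Element("inputs")

# Input dataset restricted to one format; "help" is shown under the form field.
inputs.append(make_param("input_file", "data", "Input file",
                         format="txt", help="Any plain text dataset"))

# Numeric parameters: a min and max together give the user a slider.
inputs.append(make_param("iterations", "integer", "Iterations",
                         value=10, min=1, max=100))
inputs.append(make_param("threshold", "float", "Threshold",
                         value=0.5, min=0.0, max=1.0))

# Statically defined select list: "value" goes to the tool, the text to the user.
select = make_param("mode", "select", "Mode")
for value, text in [("fast", "Fast"), ("accurate", "Accurate (slower)")]:
    option = ET.SubElement(select, "option", {"value": value})
    option.text = text
inputs.append(select)

print(ET.tostring(inputs, encoding="unicode"))
```

In a real wrapper you would normally write this XML by hand; generating it here just makes the attribute set easy to inspect.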
So you can do multiple different types of select. In this case we have another type of select, a multiple select, where you can have multiple options selected. You can also have Boolean parameters, and you can have these really nice things called conditional parameters, where you have one of these select lists and, based upon which item you select here, paired or single end, it can display different sets of options under that conditional parameter. We also have repeat parameters, so that you can define one set of, for example, series. Here we have your input dataset and then the column that you're going to be defining for that input. This is actually using the metadata value from your input dataset to decide which columns are going to be available for that selection, based upon the dataset that's being selected.

And then how do you actually define what sort of outputs you want to produce? Here we have an example where you have two different outputs being created: a tree file and an annotation file. You can have any number of outputs that you would like, you can have dataset collections, and you can filter the outputs that you want to have as well. There are a whole lot of different ways to configure these tools. One very important part I just want to point out is that you also want to include citations with your tools. This is a great way that, if you wrap a tool for Galaxy, when someone uses that tool inside of Galaxy they'll automatically see the citations and be able to extract the citations later on from a history where they used the tool. All of the information is in here, and I'm just going to quickly move through most of it so we can get to Planemo. Planemo is this really great command-line utility to help you build Galaxy tools.
There's a whole lot of documentation available, along with tool-building tutorials, and if you're really interested, I definitely recommend following these tutorials and the documentation. So I'll just briefly outline what Planemo can give you as a Galaxy tool SDK. It has a nice planemo tool_init command that will automatically initialize a Galaxy tool wrapper for you. It also has functionality for linting your tools, making sure they adhere to best practices, and for uploading them to the Galaxy Tool Shed so that other people can then automatically install your wrapped tool into their Galaxy instance. So here's an example of the init command, and planemo lint in order to lint your tools. When you're building up a tool you can use this nice planemo serve command to pull up a Galaxy instance that you can use to work on the tool that's under development. This is a very iterative process. And of course you should write tests. So if you're really interested in adding a standard Galaxy tool to your Galaxy instance, I recommend that you go through these tutorials and these slides fully.

But I just want to quickly pop back to here and give a little bit more information about the Galaxy Tool Shed. If you want to have your tools available within Galaxy, you really want to make them available to everyone, and so you want to add them to the Galaxy Tool Shed. So what is the Galaxy Tool Shed? Think of it as the app store for Galaxy. It allows you to install tools, datatypes, data managers, and so forth.
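The iterative Planemo loop mentioned above (initialize, lint, test, serve) can be sketched as a small Python helper. The subcommands `tool_init`, `lint`, `test`, and `serve` are Planemo's own; the tool id and name are hypothetical, and the helper defaults to a dry run so that nothing is executed unless you ask for it.

```python
import shlex
import shutil
import subprocess


def planemo_steps(tool_id, tool_name):
    """Return the Planemo commands for one pass of the develop/check loop."""
    return [
        ["planemo", "tool_init", "--id", tool_id, "--name", tool_name],
        ["planemo", "lint"],    # check the wrapper against best practices
        ["planemo", "test"],    # run the tests declared in the wrapper
        ["planemo", "serve"],   # start a local Galaxy with the tool loaded
    ]


def run_steps(steps, dry_run=True):
    """Print each command; run it only if dry_run is False and planemo is on PATH."""
    for step in steps:
        print("$ " + shlex.join(step))
        if not dry_run and shutil.which("planemo"):
            subprocess.run(step, check=True)


# Hypothetical tool id and name, for illustration only.
STEPS = planemo_steps("hello_world", "Hello World")
run_steps(STEPS)
```

Note that `planemo serve` keeps running until you stop it, so in practice you would run it in its own terminal rather than from a script.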
It allows the Galaxy administrator to install and update tools, so you can have different versions of a Galaxy tool, with multiple versions all installed at once. If you have an old workflow that's using an older version of a tool, it can continue to use that old version, or you can upgrade it to use a newer version of the tool, for example. If you're really interested in getting access to the Galaxy Tool Shed, I recommend you follow along with these Galaxy Tool Shed slides as well.

Now we're going to move on to what a Galaxy interactive tool is. The tools we have seen before are all command-line tools that are executed, perhaps on a cluster somewhere, basically just sent off in batch mode. For example, if you were submitting to your Slurm queue, it would run somewhere and come back with the results. But what happens if you have a tool that already has its own graphical user interface? You can create a Galaxy interactive tool for that. The big difference with a Galaxy interactive tool versus a standard Galaxy tool is that the dependency is required to be inside of a container so that the networking can be handled. You can specify one or more entry points, each composed of a port and optionally a URL. If you want to see several examples of those, they're available within the main Galaxy source repository. So here's an example: we have a familiar Galaxy XML file, we have a requirement on a container somewhere, and we specify an entry point. Now a user selects that interactive tool, it appears in their middle pane, and they can configure that interactive tool just like they normally would. They click Execute, but in this case they end up with an interactive web server that's been launched inside of a Docker container, which they can then interact with, play around with, look at their data, and so forth.
Basically, you're relying on an external executable that creates an interface that's not inside of Galaxy, and we can integrate those tools inside of Galaxy as well. Here's just a quick view of how this works, and thanks to Helena for providing the slide. If you're really interested in Galaxy interactive tools, there was actually a webinar just a few weeks back, and that's available on the YouTube channel, I believe.

Now let's talk quickly about data source tools. Let's say you want to get data into Galaxy, but you're running something like a data warehouse, and you don't want to require a user to download all that data and re-upload it to the Galaxy server. We do support some protocols for this, where a user can click inside of their Galaxy instance, get forwarded to an external resource, in this case the UCSC Table Browser, configure the Table Browser, and then click on a button to send the query back to Galaxy. This all happens inside of the browser, and Galaxy will then later go and fetch that data. If you're interested in seeing more about this, go ahead and check these links as well.

So let's quickly talk a little bit about Galaxy external display applications. Let's say you have a web server out there on the web somewhere that accepts input data from users and can display it in its interface, or it could even be another analysis platform if you wanted. You can connect a Galaxy user's datasets to this external resource using Galaxy external display applications. Here we have an example of a Galaxy history and a dataset within that history. In this case this is a BAM file of about 30 gigabytes, and we can see that we have multiple different display applications available. In this case we have a view at bam.iobio.
If the user clicks on this link, they'll be forwarded in their browser to bam.iobio, along with a URL that refers back to their dataset. In this case we have a BAM file, so we have the BAM file and the index file both available. And we can see here an example of a simple display application. What we have defined here for bam.iobio basically points to their server and provides a BAM file. We have defined our BAM file here from our Galaxy history item, and then also the BAI file, the index file that's been defined as part of the metadata for that BAM file. The way you make these available is that you assign these Galaxy external display applications to a datatype. So here, inside of our datatypes_conf.xml file, we have our bam format, which is of type binary Bam, and we have a display entry that points to our bam.xml file, and that's how we tell Galaxy to load it. This is an example of a static application, but you can have more complex, dynamically defined applications. If you want to see more examples of those, you can look at the supplemental material from this paper that came out recently, and if you want to see a tutorial about how to build these external display applications inside of Galaxy, there's additional information available here.

And of course I want to convince everyone to join the Galaxy community. Build tools, get your tools into Galaxy, get them used by people. It would be really great to see what everyone else can build inside of Galaxy. It's really exciting, it's a great community, and I invite everyone to come along.
One other thing I did want to point out quickly is that when you are at the training materials, you'll notice, here where we're at the development topics, that there's a chat available that uses Gitter. You can click on Open Chat, and that will connect you to this Gitter application so that you can join in the chat and so forth. You can sign in using various different accounts. There's also a whole bunch of different channels available; this connects us to our dev channel. If you're interested in other channels, click on this globe in the upper right-hand corner. It'll show you a whole bunch of other channels that are available for various other topics for the Galaxy Project. With that, I want to hand it off to our next speaker.

Thank you, Dan. Anthony, you're up.

I'm ready, just sharing my screen. That's it. So you all see it?

Yes.

Perfect. So let's talk about Docker and Galaxy now. The first thing you need to know is that there's a Docker image available to launch a Galaxy instance very easily within seconds, or maybe minutes at most. It's quay.io/bgruening/galaxy. Just by running this command you can launch a Galaxy instance on your PC or Mac or whatever, which you can access at this address. It's a Docker image based on Ubuntu 18.04, and it includes everything you need to run a Galaxy server, which means the Galaxy web app, a PostgreSQL database, an FTP server, and every other component that may be needed. As with any other Docker container, you can throw it away when you have finished using it, but you can also persist all the data you generate while using Galaxy by just mounting a single directory, which is /export.
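As a rough sketch of the launch command described here, the Python helper below assembles a `docker run` invocation for the image named in the talk. The host port 8080, the container web port 80, and the host directory are assumptions for illustration; check the image's own README for the ports and options that apply to the version you use.

```python
import shlex


def galaxy_run_command(export_dir, host_port=8080,
                       image="quay.io/bgruening/galaxy"):
    """Build a `docker run` command that persists Galaxy data in export_dir.

    Assumptions: the container serves Galaxy on port 80 and keeps its
    state under /export, so mounting a host directory there is what lets
    you throw the container away and relaunch it later with the same data.
    """
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:80",          # Galaxy at http://localhost:<host_port>
        "-v", f"{export_dir}:/export/",   # single directory to back up
        image,
    ]


# Hypothetical host directory, for illustration only.
COMMAND = galaxy_run_command("/home/user/galaxy_storage")
print(shlex.join(COMMAND))
```

Relaunching with the same `-v` mount is what brings back histories, installed tools, and the database from the earlier run.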
So you can launch your Docker container, do whatever you want, and back up this single directory, and you can come back later and relaunch the same container with all the data you generated earlier. The Galaxy image is developed at this address in a GitHub repository, so you're free to go and look at the Dockerfile and contribute if you want to improve it. You have access to every setting that you would find in the galaxy.yml file for configuring Galaxy's behavior; everything is done by using environment variables. When you run jobs from the Galaxy interface, by default the jobs are run inside the Galaxy container, and you can also configure it to run them on external compute resources, like a Slurm cluster, for example.

By default the container is launched with all the default Galaxy tools from the Galaxy source code, but as we will see now, you have the possibility to create Galaxy flavors, which are specialized Docker images for particular scientific fields. So, for example, you can build your own image. There's a catalog of existing images, for example to perform genome annotation, RNA analysis, or metagenomics. Each one of these has a specific Docker image with preinstalled tools, workflows, and data. It's available on this page.

Now we'll see how you can easily create your own flavor with your tools and your data. When you're inside the container you have access to a specific command, which is install-tools. All you need to do to install tools is to write a YAML file listing all the tools you want to install from the Tool Shed. You have an extract of that on the right side here. In this case we wanted to install the tool named compalignp, from the rnateam owner in the Tool Shed, and we wanted to install it into the RNA Alignment section in the tool list. If you have a few other tools that need to be installed, you list them too, and you can even specify a specific version of a tool.
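A tool list of the kind described above can be sketched like this. The Python helper writes the YAML by hand so that it needs no third-party library; the keys `name`, `owner`, `tool_panel_section_label`, and `revisions` follow the tool list format as I understand it, while the tool names, owners, section labels, and revision hash are placeholders, not real Tool Shed entries.

```python
def tool_list_yaml(tools):
    """Render a minimal tool list for install-tools as YAML text.

    Each tool is a dict with "name", "owner", and "section"; an optional
    "revisions" list pins specific Tool Shed changesets.
    """
    lines = ["tools:"]
    for tool in tools:
        lines.append(f"  - name: {tool['name']}")
        lines.append(f"    owner: {tool['owner']}")
        lines.append(f"    tool_panel_section_label: '{tool['section']}'")
        if tool.get("revisions"):
            lines.append("    revisions:")
            lines.extend(f"      - {rev}" for rev in tool["revisions"])
    return "\n".join(lines) + "\n"


# Placeholder entries: replace with real Tool Shed names and owners.
TOOLS = [
    {"name": "example_aligner", "owner": "example_owner", "section": "RNA Alignment"},
    {"name": "example_plotter", "owner": "example_owner", "section": "Plots",
     "revisions": ["0123456789ab"]},
]

YAML_TEXT = tool_list_yaml(TOOLS)
print(YAML_TEXT)
```

Saving this text as, say, `tools.yml` gives a file of the shape that the install-tools command reads.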
Once you have written this YAML file, you just need to run the install-tools command, which will perform the install step into Galaxy automatically for you. You can do almost the same for workflows. If you have a collection of workflows that you have exported from a Galaxy instance in the .ga format, which is the default one, you just need to run the workflow-install command and give it the directory with all these .ga files, and the workflows will be installed into your Galaxy instance and made available for users. You can also add data libraries. To populate data libraries you just need to write a YAML file, once again with a standard format like this, where you specify the different sections, the URLs to fetch the files from the internet, and the datatype of each file. Once you have written this YAML file, you just need to use a specific command once again, which is setup-data-libraries, which is quite simple. And finally, you may want to add some reference data, like your reference genomes, to your Galaxy instance. To do this, you usually use data managers in Galaxy. There's a special command, run-data-managers, which takes as input a list of data managers that need to be run with specific options to populate all this reference data into your instance.

So when you put all these together, it's very easy to create your own Galaxy flavor using this YAML syntax and specific commands to execute the tasks you need. All you need to do is create a structure like this: a directory with a Dockerfile where you say that you want to create an image based on the official one. You can customize it with a brand name here. Then you say, okay, I want to install a few tools that are defined in the YAML file just beside my Dockerfile, I want to populate the data libraries by using the YAML file here, and I want to install a few workflows here. That's it.
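The flavor layout described here can be sketched as a Dockerfile generated from Python. This is a hedged illustration, not the official template: the base image is the one named in the talk, `GALAXY_CONFIG_BRAND` and `$GALAXY_ROOT` are assumed to be the variable names used by that image, and the file name `tools.yml` is hypothetical. The workflow and data library steps are left out because they need a running Galaxy inside the build and their exact options are best copied from the image's documentation.

```python
# Dockerfile text for a minimal flavor; rendered with str.format below.
DOCKERFILE_TEMPLATE = """\
# Start from the official Galaxy image named in the talk.
FROM quay.io/bgruening/galaxy

# Brand name shown in the Galaxy masthead (assumed variable name).
ENV GALAXY_CONFIG_BRAND="{brand}"

# Copy the tool list that sits beside this Dockerfile and install from it.
ADD {tool_list} $GALAXY_ROOT/{tool_list}
RUN install-tools $GALAXY_ROOT/{tool_list}
"""


def render_dockerfile(brand, tool_list="tools.yml"):
    """Return Dockerfile text for a flavor with one tool list."""
    return DOCKERFILE_TEMPLATE.format(brand=brand, tool_list=tool_list)


# Hypothetical brand name, for illustration only.
DOCKERFILE = render_dockerfile("My RNA flavor")
print(DOCKERFILE)
```

With the rendered text saved as `Dockerfile` next to `tools.yml`, running `docker build` in that directory would produce the flavor image, assuming the variable names above match the base image you use.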
And the last step, if you want, is to customize the welcome page, the first page that is shown when you open your Galaxy instance. You can change the logo and the text that is displayed, and it's quite easy to do. Once you have written all these Dockerfiles and YAML files, you can of course build your Docker image locally by using the docker build command. But you can also share it: put all these files into a GitHub repository and then have the image built automatically by a Docker registry like the official Docker Hub or quay.io, for example. All this machinery is used automatically for the Galaxy Training Network, for every topic like assembly, genome annotation, transcriptomics, or metabolomics, for example. So there is a Docker image which is built automatically for each topic and that can be launched like this. This is just a screenshot from a topic page on the Galaxy Training Network. And it's very easy to launch these containers in a few seconds. I think that's it. Now I'm going to hand over to the next speaker.

Thank you, Anthony. Peter, you're up next, talking about setting up your own Galaxy server. Please ask questions in the Q&A. We will get to them at the end. Okay, we can see your slides, Peter. I can't hear you. Okay, there we go.

So Dave asked me to show, on the example of our tool, why we use a public Galaxy server and why we actually moved our tool to Galaxy. Before I show our story I will just introduce our group. I'm from the Czech Republic, in the Laboratory of Molecular Cytogenetics, and we investigate mostly plant genomes. We are interested in repetitive DNA, and we want to study genome composition with respect to some functional aspects, like centromere function and so on. If you look into plant genomes, you will see that there is a huge difference between the genome sizes of various plants.
There are actually about three orders of magnitude of difference in genome size. The smallest genome is in Genlisea nigrocaulis, the largest genomes can be found in the genus Fritillaria, and most of the size is actually determined by the repetitive content. There are various types of repetitive DNA. There are some, like satellite DNA, which occur in localized regions in the genome, and then we have dispersed repeats, like retrotransposons, which can be dispersed across whole chromosomes. We are really interested in the composition of these genomes with respect to repetitive sequences, in what the differences are between individual taxa, and in whether we can better understand the evolution of the genome.

For this we needed some methods to study this repetitive DNA. Originally we could study assembled genomes, but the problem is that the repetitive part is quite often missing or underrepresented in these assemblies. We could also use some wet lab approaches, which are usually biased, and you don't get too much information. Then in 2005, when next-generation sequencing was introduced, it turned out that this is really a great tool, at least for our group, to study repetitive DNA, because when you do sequencing, even shotgun with low coverage, you are getting a lot of unbiased information about the repetitive DNA. You are basically sampling the chromosomes randomly, and because we are studying something which is repetitive, you can get some idea of what repeats are in the genome. However, there were no computational tools at the time, so we were forced to develop our own tools. And I have to point out that none of the members of our group is really an informatician or computer scientist by training; we are biologists who are interested in bioinformatics. We gradually developed a pipeline.
Originally, this pipeline was utilizing some programs which were originally designed for expressed sequence tag analysis. Later on, we developed some graph-based clustering algorithms, and we developed the pipeline which we call RepeatExplorer. This pipeline is targeted to yield as much information as possible about the repetitive content in low-pass shotgun next-generation sequencing data. The history of the pipeline is shown here. We developed the concept of how to analyze this data in 2007, and then we improved our algorithm, but we were still using the command-line version of the program. This tool was actually quite hard to use for an inexperienced user, especially for biologists, and it was mostly biologists who were interested in it. We wanted to provide this tool to biologists other than our collaborators, so we decided that we would need to make this tool public somehow, also because this tool requires quite some computational resources and you normally wouldn't be able to run it just on a desktop. At the time, there was not really good availability of such computational resources. So we decided that we would set up our own web server, which would run RepeatExplorer as a service.

The question was what type of platform we should use. Galaxy was one of the platforms we were considering, and in the end we found it to be the best one, because it solved a lot of things for us. We needed job management and scheduling for users, because these computational jobs took some time to run, so you need to somehow make sure that your server is not overloaded. You need user management so users can log in, upload their data, and then run a job, and the users also need to share data with each other. And we also wanted to reuse our old PBS cluster.
And we found that Galaxy was actually developed with that in mind; it could easily be configured to be used with a PBS server. There is also good documentation about how to use the Galaxy platform and how to set up your tool. Here on these pictures you see our first cluster, which was offered as a public service. We needed to write a tool definition; this is something Dan was talking about. We needed to specify what type of environment these jobs need. We needed to configure the PBS cluster so it works well with Galaxy, and set up the storage, the FTP server, and the Postgres database. Most of these tasks we managed ourselves. We needed some discussion with some IT specialists, but it wasn't really that difficult.

So this was the first server. Because we were able to publish or announce this server in 2013 in the journal Bioinformatics, people started to use it more often. But then we had a problem that this server wasn't really designed to handle that many requests. We were lucky that our institute later became a member of the ELIXIR infrastructure, which helps life scientists with storing, processing, and analyzing their data. We also have a lot of collaborators in ELIXIR Czech Republic, who helped us with moving this service to a new server, and now it is a part of the services provided by ELIXIR. So we moved on, we got better hardware, and we don't have to deal with this hardware anymore because it is administered by IT professionals. And here you can see that, even though this type of analysis is quite specialized, it's oriented on plants and on repeats, and not so many researchers are interested in this type of topic, the number of running jobs was really growing.
And we think that Galaxy really helped us to increase the visibility of our tool, and also made it possible for people, especially biologists, to use these tools without any difficult installation. I think a Galaxy server is also good in that it provides a really good training platform. We are now organizing a workshop every year where we train usually about 40 participants who are interested in repetitive sequences, and for all this we are using some local instances of Galaxy. What is also great here is that at the workshop you get some feedback from users.

I will also mention additional benefits of having RepeatExplorer on a Galaxy server. The great thing is that, once you have data in your Galaxy, you also have available a lot of tools through the Tool Shed repository. We are providing some specialized tools, but we can always reuse tools which were already programmed by other programmers and which can be installed into the Galaxy server with basically one click. Also, you can create complicated workflows. So you can now use the Galaxy server for all the steps which are necessary for your analysis, including data preprocessing, then the analysis itself, and visualization of the results. By putting your tool, which is specialized in one thing, on a Galaxy server, you're always getting all these other tools which were programmed by others. What is also great is that you can share workflows and protocols in Galaxy, and this basically increases the reproducibility of your results. Also, what I like about Galaxy is that you can easily file a bug report, which goes together with your data, for when a user fails to do an analysis with your tool.
Then they can file a bug report. What we get is access to his or her data, what the conditions were, how exactly the data were analyzed, and what the problem was, and this helps us to improve our pipeline. And I think what you are also getting by putting your tool on a Galaxy server is a really lower barrier for less experienced users, because our users are mostly biologists who are really not interested in bioinformatics. So far, since we moved to this Galaxy server, we know from published papers that the server was used for analysis of over 10 hundred plant species, also for some comparative analyses, but also for whole genome assembly projects, because our server can also be used for annotation of complete genomes. And we also had a somewhat bigger impact, because even though we designed it mainly with a focus on plant biology, there were a lot of people who actually used it for analysis of mammals, fish, insects and worms. So I want to conclude that if you have any complicated tool that you really want to make visible, it's really good to make it available in the Tool Shed. And then you can negotiate its publication either on a usegalaxy server, or you can set up your own public Galaxy server. And this is our small team. Jiří Macas is the head of our team, and basically all of us are mostly biologists with some interest in bioinformatics. So thanks for your attention.
Thank you, Peter. And thank you, Dan and Anthony. Let's see. So I want to throw out some stuff as to why you would actually want to do this. So let me share my screen. Which screen am I sharing, that one? Okay. Let's do it. Share. Okay. So a couple of things. The stuff Dan talked about was how to do this: the mechanics and the resources for defining your tool to Galaxy and getting it in the Tool Shed.
Why might you want to do that? Well, you know some of the obvious reasons. It means anybody can then put that tool on their Galaxy servers. But it also seems to be that the act of putting it in the Tool Shed will greatly increase the adoption of your tool. Okay. Because it then makes it easy for anyone running their own Galaxy, including people just doing it on their desktop, to import the tool into their Galaxy server. And so it becomes much easier to use. And I recommend taking a look at this preprint, which Dan and I are co-authors on, for that. For the stuff that Anthony and Peter talked about, we highlight any Galaxy instances we know about, any Galaxy platforms that we know about, on the Galaxy platform directory, which you reach by clicking on "Use" here. And public servers like RepeatExplorer go here. And if I search for "repeat", yep, right there. And I click on that, and it tells me information about it, and so on. Also, for Docker images, we highlight those too. And so there's a bunch here that have been created. And as Anthony showed, these are fairly easy to set up. And that means anybody can then go and set that up on their local instances, on their local infrastructure. So if you go through the effort to create a public server like RepeatExplorer, or a Docker image, we will highlight it. We'll also highlight it in the monthly newsletters. It makes it much easier to find, because we actually want to get your work out there. So really, yeah, think about the impact of doing this. It really is a great way to get your tool out there. So let's see. No questions yet. Okay, that means I get to ask the questions, which is an utter mistake. Let's see. But here we go. Okay, Anthony, I have a question for you. If you haven't set up a Docker image before, do you have an estimate of how long it would take, if you're compute savvy, and you have a tool, and you've got it defined in the Tool Shed?
How long would it take you to set up a Docker image that runs that tool, that includes that tool? Any estimates? And you're muted. Oh, there we go.
Okay. I'm here. It's very, very easy to do. If you have never tried Docker before, maybe half a day or a day to learn how it works. But if you know it, it's just a matter of minutes to write it, maybe a few more to build the image, but it's really short.
Thank you. Most of the Docker images that we list we found out about through publications. Yeah, anything that's an easy way to make your tool accessible. So, yeah, no questions. Dan, you talked about several different classes of tools, starting with command line and then going through different levels of visualization. What's the step up, and I'm assuming it is a step up in effort, to go from, say, a command line tool to a visualization tool to a data source tool? How much more work is required for those other cases?
Yeah, absolutely. So in the case of the data source tool, you need to have some buy-in from that external resource, because it does involve this callback URL cycle. Basically, when the user clicks on the tool inside of Galaxy, they get forwarded in their browser to that external website, along with a Galaxy URL link. On that website the user configures their dataset, choosing what they want, applying filters and so forth. At the end of the transaction, when they click "send data to Galaxy", that external site then posts back to Galaxy. So that is a bit more involved, and it does involve some buy-in from that external resource. But as far as writing the actual tool inside of Galaxy, that part is about the same amount of effort, so it's pretty straightforward, I would say.
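The callback URL cycle just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hostnames and the tool id are hypothetical, and the parameter names `GALAXY_URL`, `URL` and `data_type` are assumptions made for this sketch, so check the Galaxy data source documentation before relying on them.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_redirect_to_external_site(external_site, galaxy_callback):
    """Step 1: Galaxy forwards the user's browser to the external site,
    attaching the address the site should post back to (assumed name: GALAXY_URL)."""
    return f"{external_site}?{urlencode({'GALAXY_URL': galaxy_callback})}"


def build_post_back(redirect_url, dataset_url, data_type):
    """Step 2: after the user has chosen and filtered their data, the external
    site reads the callback address and prepares what it posts back to Galaxy."""
    query = parse_qs(urlparse(redirect_url).query)
    callback = query["GALAXY_URL"][0]
    # Assumed payload: where Galaxy can fetch the data, and what format it is.
    payload = {"URL": dataset_url, "data_type": data_type}
    return callback, payload


# Hypothetical addresses, for illustration only.
redirect = build_redirect_to_external_site(
    "https://data.example.org/browse",
    "https://galaxy.example.org/tool_runner?tool_id=example_source",
)
callback, payload = build_post_back(
    redirect, "https://data.example.org/export?id=42", "tabular"
)
print(redirect)
print(callback)
print(payload)
```

The point of the sketch is the buy-in mentioned above: the second function is code that has to live on the external site, not in Galaxy, which is why a data source tool needs cooperation from that resource.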
If it's going to be the first Galaxy tool you're writing, you should probably start off with one of the standard tools, one of the standard tutorials, just so you can wrap your head around the way that works. Now, the other question was about the external display applications, perhaps. Those use a completely separate framework from the standard Galaxy tools, so they have a somewhat different XML configuration. But if there's an external resource, a web server anywhere, that can accept a GET parameter that points to a URL, you actually do not need any buy-in from that external resource, because you can just leverage existing functionality of that resource. That's assuming it's able to natively consume URLs. A lot of these external resources allow you just to create links, and you can think about the external display applications as basically just a fancy way to create a link, based upon a user's dataset, that they can then click in their history. They click on that, and then they are just forwarded in their browser, along with a URL to that dataset that does have some token authentication involved within the URL. Now, creating a Galaxy interactive tool. The first hump you have to get over is that you have to be able to create a Docker container, or there needs to be a Docker container that already exists for that resource. And that step can be difficult if you're not familiar with Docker. But if there is a container released for the resource by some group, a lot of times you can just use that Docker container directly. And then writing the tool is more or less just as easy as writing a standard Galaxy tool, with the exception that you then just need to map some ports.
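To make the external display application idea from a moment ago concrete, here is a minimal Python sketch of "a fancy way to create a link". Everything named here is hypothetical: the viewer address, the dataset address, the `url` GET parameter, and carrying the token as a query parameter are assumptions for illustration, not Galaxy's actual URL scheme.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def make_display_link(viewer_base, dataset_url, token, param_name="url"):
    """Build a link to an external viewer that receives, as a GET parameter,
    a URL to the user's dataset with an access token embedded in it."""
    tokenized = f"{dataset_url}?{urlencode({'token': token})}"
    return f"{viewer_base}?{urlencode({param_name: tokenized})}"


def read_dataset_url(link, param_name="url"):
    """What the external viewer does on its side: pull the dataset URL
    out of the GET parameter so it can fetch the data itself."""
    return parse_qs(urlparse(link).query)[param_name][0]


# Hypothetical values, for illustration only.
token = secrets.token_urlsafe(16)
link = make_display_link(
    "https://viewer.example.org/view",
    "https://galaxy.example.org/display/dataset/abc123",
    token,
)
print(link)
print(read_dataset_url(link))
```

Note that nothing in this sketch has to be installed on the viewer's side beyond its existing ability to open a URL, which is why this route needs no buy-in from the external resource.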
Configuring the Galaxy instance is a bit more difficult. There is a little bit of effort that you need to put into configuring or administering your Galaxy instance in order to have it be compatible with the interactive tools, and to make sure the proxy is all set up correctly and so forth. So that administration is a bit more work, but the designing and creation of the tool, I would say, is about the same once you have your Docker container.
Thank you, Dan. That's great. We now have three questions, so yeah, we're now swimming in them. Good. Thank you all for encouraging those questions. First question, from Anonymous: is there anyone I can contact to review my tool? Anybody want to take that?
If no one else is jumping on, I would say absolutely. The best tools available are basically released by the IUC. There's a group called the Intergalactic Utilities Commission. They developed the best practices for tools. They also accept pull requests for tools. And so if you have wrapped a tool, you can always go ahead and try to create a pull request against that repo. It's somewhat of a badge of honor, I would say, if you can convince tools-iuc to accept your tool. I think it's a great thing. It's a great community that's really working to make the best tools available. And they will definitely give you great feedback on how to design your tools.
Thank you. We have a question from Crystal: is there any difference in performance when using Docker instead of conda in tool wrappers? That shows some advanced knowledge, Crystal. Does anybody want to tackle that one? Does anybody know how to tackle that one? Nobody's taken it. I can make stuff up, Crystal, but I can't answer it.
Yeah, so I can say something if you wish. Well, it's not about Docker, but we are actually using Singularity containers. And there is basically no difference, I think nearly no difference, in the performance.
I don't know if that's the same experience, but...
Yeah, so I mean, once you have your Docker container downloaded, pulled and installed, or your Bioconda package installed, the performance tends to be pretty similar. Now again, depending upon where you have things executing, admins are typically more forgiving as far as installing conda packages. You might have some difficulty convincing a cluster admin to enable Docker, but you can go with Singularity there and so forth. And so there are definitely some benefits to both, or reasons to have different options. Now, the great thing about Docker and Singularity, of course, is that you get an even more reproducible level versus conda. You get great reproducibility with conda, but there's still some sharing of packages and so forth. And so if you look at the scale, you have VMs, then you have Docker, then you have conda, at the different levels. So from Galaxy's perspective there's no difference, but conda and Docker have their own differences.
Thank you both. Umit, and I apologize for that pronunciation: can I use cloud storage such as Amazon S3 in a Docker container?
I think it's possible to connect S3 as an object store in Galaxy. I haven't ever tested it, but it should be possible.
Okay. That's a strong recommendation. Yeah, so you can, but you might be out there on very new territory. So, okay. Let's see. If my tool is in the IUC, what is the next step to get tools into usegalaxy.org or usegalaxy.eu? Okay, can I answer this one, because I kind of know, a little. So to get it onto a large public server like that, it needs to be a widely useful tool, and it needs to be well supported. And so if you have documentation, if you have a user help forum for your tool, that argues it should be there.
I don't know if there's a formal process on usegalaxy.eu. I think there was a formal process on usegalaxy.org. You can always post something in the help forum that says, "Hey, I'd really like to have this tool X on a usegalaxy.org server." Sorry, a usegalaxy.* server. And then people can upvote it. But I think the main requirement is that it needs to be widely useful, and it needs to be well supported and well documented. Anybody else?
I would just point out that if you look at some of the repositories available, you can actually make a pull request to add your tool to, for example, usegalaxy.eu. And if it's in the IUC, there's a good chance it will be accepted. So in addition to having these conversations, once a tool is in tools-iuc it's considered a very good tool. And so you can actually go ahead and make a pull request against some of these repositories that are available, in order to have it installed. I think if you look in the chat, someone included a link to the actual repository.
So, yeah. Oh, thank you, Meryl. So Meryl is one of the admins for usegalaxy.eu. Okay. We are at eight o'clock, sorry, eight o'clock my time, so we're at the hour, I should say. It's only eight o'clock in my part of the world. So if the presenters need to go, I'm going to say go. I'm going to keep recording and keep asking questions, but we are at the hour, and I don't want to keep anybody past that because that's our commitment. And before we go, Peter, let's see. You talked about the RepeatExplorer workshops. Did you talk about the one coming up? Because this is a chance to plug it.
Well, last year we canceled it because of COVID. This year we decided that we are going to do it online. But the resources are limited.
So after four days of open registration we had about 100 people registered, over 100 actually, so we had to close the registration. I think our workshop will start on the 25th of May, and it will go on for three weeks. There will be some interactive sessions, but we will also post some tutorials on YouTube. We have a channel, so people can use it later as a manual.
Thank you, Peter. Congratulations, that's a good problem to have. And Anthony, you have upcoming tutorials as well, is that right?
Oh, we have the GCC training sessions. Yep.
Yes, that's right. Thank you. Yeah. So while we're here, and before we lose even more people: GCC, the Galaxy Community Conference, starts on June 28 and runs through July 10, and it's all online. We have a week of training, and the week of training is what it starts with, so the 28th of June through the 2nd of July. Part of that is wrapping tools and part of that is Galaxy admin. Now, Galaxy admin is a separate registration. Wrapping tools, I think, is part of the main "how to be a Galaxy developer" part. If you're interested, registration is really cheap. And early registration ends in three weeks, I think June 1. Okay. So think about that. Okay. Where are we here? Okay. So I'm going to declare victory, and then I'm going to ask a couple of remaining questions. Okay. I answered that one. Answered, done. Okay, so from Ben Wah, and again I apologize: how difficult is it to publish on the Tool Shed and to make dependencies available if they are not present in conda? So, if what you need is not already in conda for your tool, how hard is it to get your tool into the Tool Shed?
My first recommendation would be to actually add your tool to Bioconda.
What's really nice is that if you add your tool to Bioconda, the BioContainers project will actually go ahead and make a Docker container for that tool. So then you'll have it available through conda, and you'll have it available as a Docker image that you can run in Docker or in Singularity. And that's just for the underlying dependency. So I would recommend taking that step and actually adding it to Bioconda, or to conda, depending upon which tool you're looking at. And if it's not available, you can always just roll your own Docker image and specify your namespace and then the name of that image, if you want to. But I would definitely recommend trying first to get it into Bioconda.
Thanks, Dan. Thanks for the question. And a final question. I'm going to declare this the last one. So, someone I think I've actually met, but whose name I hesitate to try and pronounce, Aaron is Misha. I don't know. And I apologize. Can interactive environments be easily integrated as an interactive tool in our personal Galaxy instance? I see it has been disabled on usegalaxy servers for some time. Thank you.
Thanks for that as well. Yes, so before we had Galaxy interactive tools, we had this really cool thing called Galaxy interactive environments, where you could launch, for example, RStudio and Jupyter notebooks and so forth on a Galaxy dataset. You can actually take those existing underlying Docker containers that were built for that process, and then just write a Galaxy tool that references that same container. And I think it's quite easy. There's a little bit of effort you have to put in to create that new Galaxy tool wrapper, but it is a relatively straightforward process.
Thank you. And I'm going to declare victory. Thank you all. Thank you all for participating in the call. Thank you, Peter, Anthony and Dan, for presenting.
It's a huge favor. I owe you all drinks with little paper umbrellas in them. And what else? Let's see. The recording and the slides for this will be posted online this week. The next webinar in this series is in two weeks, which is resources for Galaxy administrators and infrastructure providers. That will be presented by Lucille and Mauro, and I know Mauro is on this call because he asked a question. So that'll be in two weeks' time, same time, same registration process. And thank you all for being here, and I hope to see you again online. Bye bye. Bye.