Hi, everybody. My name is Alex Ostrovsky. I'm with the JHU Galaxy team, and I'm going to be presenting a lot of work that has been done by Alex, as well as some updates on that work by myself and Enis Afgan, on the continuous testing of tools.

In recent years, methods of deploying Galaxy have improved significantly. Furthermore, a lot of deployments are single-user instances that are not set up by the standard, traditional methods. A lot of them are cloud deployments, the most common being AnVIL, built on Terra, in which every Galaxy deployment is a single-user instance. To that end, a lot of tools run within these environments are run differently than they normally would be, and we have to account for that with testing.

Alex Mahmoud therefore (this is not switching slides) set up a new GitHub repo, the Galaxy testing on the AnVIL GitHub repository. It uses GitHub Actions to set up a GKE cluster, deploy Galaxy KubeMan on it, access that instance as a remote single user, pull all of the tools from a list kept in a YAML file within that repo, run all the standard tests for those tools, and then generate a report and update a README available in this repo.

And this is what it looks like. You can see on the left the reports for all of these runs. It breaks its list of tools into 14 groups and runs the tests twice a day, so over the course of one week it tests all 14 chunks. So every week it tests all tools available on these instances. And it stores these results in the README, so you can access the results at any time for any given test.

Currently this testing is being performed on 211 tools, with 1077 tests running every week. Last week 55 of them errored and 91 of them failed, for various reasons. Some of them failed for lack of test data that the test requires.
So reference data that is normally available on a Galaxy instance, outside of these containers, is not available here. That's one reason. Or it could be an actual issue with the tool that we need to follow up on. The thing is, that's not fully apparent from this repo. And this is where I have been improving on this.

So, reacting to some of the failings that need improvement in this repo: we have very little temporal resolution for tracking tool failures, that is, whether it's a transient issue where a tool fails a single time or an actual problem that we need to address. Furthermore, there's no real way to see at a glance how many tools are failing, or whether there's an actual issue going on or just a single tool having a problem.

So, some solutions. We are converting the landing page, as it exists now from a couple of slides ago, to keep track of the tools that are failing on that front page, and moving the current list of recent testing sessions onto a deeper page. And we will add some more resolution by listing errors and failures back to the most recent pass, so we can see if a tool is failing several times in a row. If it has failed only a couple of times, or if it's just a single error since it last passed, we might not need to check it out right now.

Summary: a new repo for cloud deployment for tool testing, which continuously tests all tools on a deployment and generates user-readable reports. And we will be adding at-a-glance summaries of the current state of all tools on those instances, as well as temporal resolution of those tests. Thank you very much.

Are there questions from the people in the room? Yes. Is there a way to figure out what happened since the last pass? Is there a way to find out? No, I was waiting for you to finish the question. Yeah, just, what went wrong? Is there a way to track previous fails? Yeah, back to the last pass. Currently, it's not particularly easy.
That's part of what this is meant to fix. Currently the front page shows the most recent run and the one before that. All of the tests are stored there in the standard Galaxy tool test HTML and JSON files, but you would have to dig and dig to find them, which is one of the issues we're hoping to fix with this work. Thank you very much.

All right. So our next speaker is Stephen Shank, who's going to present the Observable HQ project for Galaxy.

Thank you. Hello everyone. So yeah, my name is Stephen Shank. I'm a software developer in Sergei Pond's lab at Temple University. I'd like to thank the organizers for the invitation and for the chance to tell you all about the Observable Galaxy, which is our effort to integrate the Galaxy project with a new JavaScript notebook platform known as Observable HQ.

Okay. So first, I just wanted to say a little bit about visualization in JavaScript. JavaScript has a rich ecosystem for user interface design and data visualization. It's the language of user interfaces for the web, so it receives a lot of attention from big players like Google and Facebook. It also has an ever-growing and very strong data visualization component. This gentleman right here: if you don't know his name, I guarantee a lot of people have come across his work at one point or another. His name is Mike Bostock. He's the creator of the D3 framework, which is a very low-level framework of graphical primitives that spawned a bunch of other higher-level libraries. And he's now the chief technical officer at Observable HQ, where he's putting all his chips on this notebook platform. And so, in addition to D3, there are the frameworks it spawned; for instance, Vega-Lite is an interactive grammar of graphics.
There are also interactive structural viewers that are programmable, like NGL. There are packages for multiple sequence alignments (our lab has one called Alignment.js) and for phylogenetic trees as well, and really more than I could fit on a single slide.

And so, just a little bit about the platform. I mean, why a notebook platform? Well, number one, it's JavaScript. A lot of people might think Python or R in this space, but JavaScript is also a strong contender with a lot to offer. It's very feature-rich; again, more features than I could go into on a given slide. But one that I've found really pleasant is that they've solved this problem of rerunning the code when variables change. They use what's called a reactive data flow model, where all your variables are parsed and a directed acyclic graph is built up, so that whenever one variable changes, only what needs updating is updated. So it solves this problem of which cells I need to rerun when something changes, and in what order. And they have widgets for exploring this.

It also has a really rich, well-abstracted, stateful UI where the state is pushed all the way down to the level of the variables. So you can get a range slider with some JavaScript code, a one-liner, and a viewof directive immediately binds the input of that widget to a variable n, and that variable n is now accessible anywhere in your notebook.

There's a lot there. There's fork-and-merge functionality, similar to GitHub, for collaboration. There's import and export modularity, so you can import data or code from one notebook into another. There's Markdown and LaTeX support, and plenty more.

And so, with that little introduction, it's my pleasure to introduce our efforts to integrate the two: the Galaxy platform and the Observable notebook platform.
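As a rough illustration of the reactive data flow model described above (not Observable's actual implementation), the following Python sketch keeps a dependency graph of cells and, when one variable changes, recomputes only the cells downstream of it, in topological order. The `Notebook` class and the cell names `n`, `squared`, `title`, and `label` are hypothetical, chosen only for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Notebook:
    """Toy reactive notebook: each cell is a function of named inputs."""

    def __init__(self):
        self.values = {}  # current value of every variable or cell
        self.cells = {}   # cell name -> (function, tuple of input names)

    def define(self, name, func, inputs):
        self.cells[name] = (func, tuple(inputs))

    def _downstream(self, changed):
        # Collect every cell that depends, directly or indirectly, on `changed`.
        affected = {changed}
        grew = True
        while grew:
            grew = False
            for name, (_, inputs) in self.cells.items():
                if name not in affected and affected.intersection(inputs):
                    affected.add(name)
                    grew = True
        affected.discard(changed)
        return affected

    def assign(self, name, value):
        """Set a variable, then rerun only the cells that depend on it."""
        self.values[name] = value
        affected = self._downstream(name)
        # Map each affected cell to its affected predecessors so that
        # static_order() yields dependencies before their dependents.
        graph = {c: [i for i in self.cells[c][1] if i in affected] for c in affected}
        recomputed = []
        for cell in TopologicalSorter(graph).static_order():
            func, inputs = self.cells[cell]
            if all(i in self.values for i in inputs):  # skip cells not yet computable
                self.values[cell] = func(*(self.values[i] for i in inputs))
                recomputed.append(cell)
        return recomputed


nb = Notebook()
nb.define("squared", lambda n: n * n, ["n"])
nb.define("label", lambda title, squared: f"{title}: {squared}", ["title", "squared"])

first = nb.assign("title", "Result")  # "label" still lacks "squared", nothing runs
second = nb.assign("n", 3)            # reruns "squared", then "label"
third = nb.assign("title", "Total")   # reruns "label" only; "squared" is untouched
print(second, third, nb.values["label"])
```

Changing `title` leaves `squared` alone because `squared` does not depend on it, which is the "only what needs updating is updated" behavior the talk describes.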
So this is an example of an intrahost variant calling workflow that I'll have a little more to say about later, just to give an idea of what we've done here. We've created a datatype and an associated display application to integrate these two. It helps you productize these workflows and create shareable links over the web that are easily accessible from your Galaxy history via this display application. The idea is that, with minimal demands on your users, such as the ability to load accessions and upload a reference, they can run a workflow, get this display application, and link out to a dashboard that can also be shared with a relatively short URL. So this can be shared over email or social media, et cetera. We really wanted to try to seamlessly integrate these two platforms, and I'll spend the next two or three slides telling you how we did that.

So, like I mentioned, there's a dataset and a tool. The dataset is just JSON with a few required fields, and there's a tool to generate it. One required field is the notebook: it takes the username and the name of the notebook on the Observable platform that you're going to be sending data to. And this is available to you in the tool. Then the tool will automatically extract the history ID.

And I think some things are cutting out at the bottom. I'm not sure. I don't know if there's a full-screen mode. Thank you so much.

So yes, the tool will automatically pull the history ID out, and then it'll also automatically pull the ID of this JSON dataset itself. And this is really all you need to seamlessly interconnect these two. I've given that its own name, what I call the payload, because it really gives you everything you need. So you have a variable number of key and dataset ID pairs.
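A minimal sketch of what such a payload could look like, written in Python for illustration. The field names (`notebook`, `history_id`, `payload_dataset_id`, `datasets`) and all IDs are assumptions; the talk only states that the JSON holds the notebook (username and notebook name), the history ID, the ID of the JSON dataset itself, and a variable number of key and dataset ID pairs.

```python
import json


def build_payload(notebook, history_id, payload_dataset_id, datasets):
    """Assemble a payload-like dict. All field names here are hypothetical."""
    user, sep, name = notebook.partition("/")
    if not (user and sep and name):
        raise ValueError("notebook must look like 'username/notebook-name'")
    return {
        "notebook": notebook,                      # where Galaxy sends the user
        "history_id": history_id,                  # lets the notebook link back
        "payload_dataset_id": payload_dataset_id,  # ID of this JSON dataset itself
        "datasets": dict(datasets),                # key -> dataset ID pairs
    }


# Made-up identifiers, for illustration only.
payload = build_payload(
    "example-user/variant-dashboard",
    "hist001",
    "ds000",
    {"variants": "ds101", "reference": "ds102"},
)
print(json.dumps(payload, indent=2))
```

Because everything else is reachable from the payload, a single dataset ID is enough to build a short shareable link.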
You can select, either in your history or from a workflow, a variable number of datasets, and this tool allows you to assign a key to each of those datasets and automatically extracts the dataset IDs. So, as I mentioned, there's a tool, and there's an associated display application as well.

And so, what's happening on the Observable side: we've built up some code and some functions to make this easy for our users. We utilize the Galaxy REST API, and we have some utilities for auto-fetching the data. I tried to make this as simple as I could; it's a two-liner. You import the auto fetcher from the main utility page and then call the auto fetcher, and it will immediately populate a JavaScript object on the Observable side with the datasets associated with the keys.

There are a few more utilities than I can go into right now, but we have functionality to link back from Observable into Galaxy. So if someone's looking at your notebook and they'd like to know: what was the Galaxy history that generated this? Can I tinker with this in the notebook? Can I tinker with the datasets in the history? There's functionality that, again, we tried to make easy, to connect right back to it.

So we're really aspiring here to reproducible, what I call clear-box, biological big data analysis: clear-box in the sense that you can look at any aspect, shine a little light with a notebook on some dataset. I would say that you really want to do the giga- and terascale data processing within Galaxy and cut it down to kilo- and megabyte scale for Observable, just because we will be sending data over the web via the REST API. That's a caveat.

It's a natural question to ask: why the payload, and what does this really give you? I hope I can convince you that it's for shareable, seamless integration and enhanced productivity in moving data in and out of these two platforms.
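The fetching step described above can be sketched in Python. The actual utility is JavaScript running in Observable, and its name and signature are not given in the talk, so this is only an approximation of the idea. The sketch assumes Galaxy's `/api/datasets/{id}/display` endpoint and the `key` query parameter for API-key authentication; the function names, the server URL, and the dataset IDs are hypothetical. The HTTP call is injectable so that the example runs offline.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def dataset_display_url(galaxy_url, dataset_id, api_key=None):
    """Build the URL that returns a dataset's raw content."""
    url = f"{galaxy_url.rstrip('/')}/api/datasets/{quote(dataset_id, safe='')}/display"
    if api_key:
        url += "?" + urlencode({"key": api_key})
    return url


def http_get(url):
    """Default transport: a plain HTTP GET returning the response bytes."""
    with urlopen(url) as response:
        return response.read()


def auto_fetch(galaxy_url, datasets, api_key=None, get=http_get):
    """Return {key: content} for every key/dataset ID pair of a payload."""
    return {
        key: get(dataset_display_url(galaxy_url, dataset_id, api_key))
        for key, dataset_id in datasets.items()
    }


# Offline demonstration with a stub transport that echoes the requested URL.
fetched = auto_fetch(
    "https://galaxy.example.org/",
    {"variants": "abc123", "reference": "def456"},
    get=lambda url: url.encode(),
)
print(fetched["variants"].decode())
```

Against a real server, one would drop the `get=` stub and pass an API key or use a shared history, as the talk notes in its discussion of privacy and sharing.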
Anything you need in order to share, you just put it all in this Observable HQ JSON, and you have the dataset ID for that, which allows you to go and fetch everything else. For instance, the notebook URL tells Galaxy how to send the user to Observable through the display application. The history ID allows you to fetch history information, so that you can link from Observable back to Galaxy, or fetch data from Observable. Multiple dataset IDs allow you to pull these datasets out of Galaxy. And it's really just one ID, for the dataset that holds all this information; that's all you need to get a nice short URL and multi-dataset visualization from Galaxy, which is a continued topic of interest.

So, a little bit about how it integrates with workflows and collections. You can use this in a workflow. In fact, I found it convenient to have several of these per workflow, because there might be several different visualizations that you would like to see from a history. And, like I said, you can have multiple datasets per payload. There's a work-in-progress integration with collections. At the moment, you can run this on a collection and you will get an Observable HQ JSON, or payload, associated with each element of the collection. And, much more a work in progress, we're doing some work trying to automatically extract the dataset IDs out of a collection, so that they can be fed into Observable and Observable will have all the IDs coming from that collection. There's a proof of concept there; it's not quite as seamless, but it can be done, and it's something I will probably try to pick some people's brains about this week.

Okay, so I want to go through two examples. So again, this is the intrahost example.
We had the privilege of working with the Galaxy team as part of the response to the COVID-19 pandemic, and there are some workflows for intrahost variant calling on deep sequencing data for coronavirus that were published in Nature Biotechnology. There was an associated dashboard that was built up and used for several different datasets; the Boston dataset is a popular one, and there were a few others. But it really always involved several of the key people who knew the different stages of the workflow to go from raw reads to this dashboard. What this abstraction really helps you do is that anyone can run it if they have accessions and a reference, and then they can share the results of such an analysis in whatever way they deem appropriate.

And just to show what it looks like on the Observable side: you have a table here, a list of the accessions that you ran. We have a dropdown here in the lower left corner, so you can type in an accession and that will give you more information and visualizations on that particular accession. You have variant frequencies according to their mutation class. You have a genome-wide variant browser with brush functionality, so you can select and zoom in on a particular region. Once you're zoomed in on that region, you can hover over a particular variant and get a lot more information from a tooltip. And these variants are clustered according to the frequencies at which they were called. So it really enables you to do a deep dive on lots of data and drill down to exactly what's going on.

And then, as another example, this is SARS-CoV-2 structure and evolution. This is the HyPhy fixed effects likelihood method run on 8,000 genomes from the ViPR project, the Virus Pathogen Resource database. We do it on Galaxy.
We map these to a reference and compress them down, extracting a few based on diversity and phylogenetic clustering. We build trees and alignments and do statistical tests for either positive or negative selection. Then we map those selected sites to structure using the NGL viewer and show on the spike protein where sites under selection lie relative to an antibody in SARS-CoV-2. We're interested in doing this because we have some colleagues who are interested in the implications of negative selection for vaccine design.

So, in terms of privacy and sharing: it currently requires either sharing your history or utilizing your API key. At the moment this is on our instance, where we have enabled CORS, cross-origin resource sharing, for all of our users. This does admit a small attack vector. You can, for one example, cover up your API key in Observable. So if you're comfortable with this as an administrator you could share this, but there is a vector there if you're working with very sensitive data. Something like single-use access tokens would be preferred, which I understand is in the works, and we're currently exploring full embedding, but that work has not yet begun.

So, for future work (there's a demonstration on Wednesday): get better collection integration and utilize the API better. There are more utilities to be developed on the Observable side, and there's embedding it fully in Galaxy. And there's nothing really special about Observable; you could target an arbitrary URL and then build some client libraries, maybe just a pure JavaScript npm library, to more seamlessly consume datasets from the REST API.

So, in summary, we've demonstrated a proof-of-concept integration of the Observable HQ notebook platform with Galaxy. We've created an associated datatype, tool, and display application. It's available on GitHub and in the Test Tool Shed.
We've integrated it with several charting frameworks, a structural viewer, sequence alignments, and trees, and productized existing workflows. And so with that I'd like to thank my boss Sergei Pond and Anton Nekrutenko, whose enthusiasm for the idea motivated the development; Alex Ostrovsky, who I thought had a really great idea that I was really interested in developing; Hadley King, who helped with administration and has some really good work tying researchers and full reproducibility in BioCompute Objects to this idea; some collaborators at Temple who helped get the analyses together; the Galaxy team, who had really great suggestions on how to use the platform; our collaborators at George Washington University, who motivated some of the analyses; and the funding from the FDA. And with that, I thank you so much for your time.

Thank you very much. We have time for one question from the audience, and I will kindly ask you to repeat it for the people online. So, there are no questions, and also no questions online. Then I would have a question: did you consider integrating other popular visualization tools, such as Plotly or other Python-based tools?

So the question was, did we consider integrating other popular tools like Plotly or other Python-based tools. The utility of this would probably mostly be in JavaScript at the moment. I believe there is a Plotly client in JavaScript that you could use in Observable, or you could write a JavaScript client, a website that you wrote yourself that uses Plotly, that could consume the data. That goes back to the arbitrary URLs. It's definitely adventurous, but I could see multiple ways.

So, if there are no more questions: thanks, everyone. Our next speaker is Jada, and the topic of his talk is Galaxy in notebooks. Please.
So, as the title of my talk suggests, in this project we try to combine Galaxy and notebooks together. I want to tell you a little bit about Galaxy and notebooks. As we all already know, Galaxy and notebooks share some functionality: both tools are web-based applications, and both provide a data analysis system. Both tools are excellent in terms of functionality.

Next slide. So, basically, GiN is an open-source extension which is implemented based on the Galaxy API and is accessed via the extension interface. It's an interactive, GUI-based interface. GiN provides the capability to interact with local as well as public Galaxy servers from JupyterLab via user-friendly widgets. With GiN, a researcher can graphically interact with various Galaxy instances and can access the tools, histories, and different Galaxy objects in a notebook. So this is how the Galaxy interface looks in a notebook, where the user can access a Galaxy instance, and as you can see on the left side, you can access all the tools available on a particular Galaxy instance.

Here we try to demonstrate how, with GiN, you can run a tool which is present inside the Galaxy instance. Once you run the tool, it shows you the status updates, and once the job is completed you can also access all the results and data inside JupyterLab.

I also want to talk a little bit about some of the unique features of GiN. You can access multiple Galaxy instances in a notebook, and at the same time you can access all the tools present in the different Galaxy instances. I also want to share something about the data sharing tool, with which you can share data objects between tools that are not present in the same Galaxy instance. Here is a small demonstration of it. Here I'm trying to share a data object which is present on the Galaxy main server with a local server. And not only can you share data objects between different Galaxy instances, but also with other servers. Here the example is the GenePattern server, where you can send the data from a Galaxy instance to a GenePattern server.

So I want to talk a little bit about the implementation details. GiN has been implemented based on the Galaxy API, and the GiN extension is an npm package which is implemented in Python and JavaScript. It can access the data objects and all these things through API calls, and it can populate the tool form, execute the job, and upload data from a Galaxy instance.

There are some limitations with this tool right now, though. You can't currently access workflows through it. Also, the client-side components of Galaxy are currently not available as a package, so we have to build the forms and the functionality on our own. And data sharing between different Galaxy instances and between different servers first needs to download the data into the JupyterLab server and then upload the data to the other servers. In the future, we would like to implement the data sharing so that it can stream the data directly instead of downloading and uploading.

So, in summary, GiN is an open-source Jupyter extension which is implemented in Python, JavaScript, and Node.js modules. It allows researchers to intermix the Python programming language with graphical interaction, and users can access all the different Galaxy instances in JupyterLab. But currently it's not available as an npm and PyPI package, and it can also be run via .
I also want to acknowledge my team at and the re-input team.

What's your quota? So actually, currently there is no particular limit in terms of the file size you can upload and download.

I also have a question. You mentioned in the talk that this currently doesn't yet work with workflows. What would be needed to make this work with Galaxy workflows? So we actually haven't tried it; it's still in the process of implementation, so in the future we have... Yes, one more question from John.

So these features could be set up in a lot of websites, so it's a piece that can have this history on different websites and so on? Can you repeat your question? So, these nice embeddings: can you use them on other websites? Actually, it is based on Jupyter, and all the code can be implemented for a website also. You can use the same code if you have some API functionality for other websites. So it can be done, but you have to write the code; it can be integrated as an extension. Thank you very much once more.

All right. Our next speaker is Sveinung Gundersen from ELIXIR Norway, and he's going to present the Galaxy ProTo 2.0 Redux. Thank you very much.

Yes, thanks. So, Galaxy ProTo 2.0 Redux: dynamic tool prototyping and use, using interactive tools. So why the 2.0 Redux? I presented Galaxy ProTo in 2015 in Norwich, and there was a demonstration of real-time coding which failed, and we remember that. I managed to use an uppercase character instead of a lowercase one. I found it. Anyway, there will be no live demo today.

I don't have time to delve much into the backstory here, other than that this was started a long time ago. In 2008 we developed the Genomic HyperBrowser, which was at the very beginning of Galaxy, so we rushed off and did our own thing related to how tools were defined. Then we started trying to bring our approach back in some years ago, and we are trying again with the 2.0 Redux.

So basically what we are suggesting now is a new type of tool to be added to Galaxy. The ProTo part is containerized, and that's not really what we want to do here; it's more general than that. If we get this accepted and merged, then we can add hundreds of tools that we have for analysis of genomic tracks. So there's a poster. For now it's actually at the print house, but it's also online, so I'm just going to go through some parts of the poster to present what this is actually about.

There are currently, in my mind, three types of tools in Galaxy. There are actually 33 of them, but for the user those three are the main experience. One type of tool is called the data source tool, and that's where you branch off from Galaxy and look at external websites for the actual data, and then you come back with a call to the API, and a tool is started to actually import the data. That can be anything that you can do on a website. Then you have the interactive tools, which are really neat. That's where we start up a container in Galaxy, and you open a new window and you can do whatever you like there. And then you have the general, regular tools, where things open up in the middle panel; those are more limited.

So what we want to do is to mix all of them up. We came up with a new type of tool, a combination of the data source tool and the regular tool, which we call the interactive client tool. Basically, instead of branching out into a new window, you get the contents in the middle panel. So it's actually a data source tool, but the source of the data is not external, it's internal: it's the interactive tool that was started that provides the data. The tool itself works as a server, and you would have the tool running permanently, really, and every time you start a tool it just contacts the running server. And then you have the regular tool mode: when you execute from the interactive tool, it starts a regular job that runs in the normal way.

So we have implemented this and are quite close to having a tool that's ready. I will not go into the details. This is basically a way to go from your code to practical use.

And there's more to this. We also managed to use this as a way of programming, so we can use Galaxy as an integrated development environment and do tool prototyping on the fly on a Galaxy server that you don't administer, just as a regular user. Basically, what you do is open up an SSH tunnel to the interactive tool running as a software container, and then you can use simple syncing, which syncs the code from your laptop onto that container, and then you get the session inside Galaxy. I can go into that in detail in the demonstration, and there's also a poster tomorrow.

So I think, if I have a moment, I can also show it. This is on a server, so I just need a browser; this is not running locally. So basically you open up the test tool. Well, it says now that there's no interactive tool running, so we can start up this service, and then the interactive tool is started. Yeah, so the server container is now starting, and if I open up this tool here... it needs some more time, I suppose. Yeah, so that was that. Okay, so on Wednesday you will see this live.

Thank you very much for this preview of this really cool technology. Are there questions from the audience? Are there any questions from the online community? It doesn't seem like it.

If not: I mean, you can have this as a finished tool if you like, or you could create a regular Galaxy tool, depending on the tool, if you are using functionality that you can freely translate to a Galaxy tool.

I would have one more question also. When you are done prototyping, is it possible to export it to an XML and have it as a normal Galaxy tool? Everything is possible, but that's not possible now. Okay, so you have to do that yourself.

Do you think it is really a lot easier for people to prototype tools in this way compared to the classic XML-writing way? I'm not sure; that would be up to the users to decide. What is very powerful in this method is that we can have the functionality running very quickly for other users to try out. So if you are developing for other researchers, they can try it out where we have the large datasets, and keep them running on the Galaxy, and you can just go in and change it and apply it.

Okay, thank you very much for this really quick talk, and this will be the end of the panel.