Good morning, everybody. Glad to see that so many people found their way here for day two. My name's Helga, and I'll be the session chair for the first session today, which is about Galaxy in Education. We have five talks coming up. Without further ado, I would like to introduce Marisa Loach, a first-year PhD student at the Open University, who is going to tell us why we should use Galaxy in the first place. So please, Marisa.
Yeah, so I am a PhD student, and I'm working on a project where I'm re-analyzing single-cell RNA sequencing data from public repositories using Galaxy. But that's not actually what I'm here to talk to you about today. I'm going to talk about a question I've been asked a lot this year, which is why I'm using Galaxy, and how I've added to my workload as a PhD student by performing my own review of these platforms so that I can actually have a good answer to give people. Because the simple answer for me as to why I'm using Galaxy is that that's what my supervisor told me to do. But I didn't think that was really going to be an adequate response when I got to my final PhD viva. So I have decided to perform a systematic review of platforms that let you run different tools and create your own workflows. I found that there's not a huge amount of literature that compares these platforms in terms of the features that matter to me as a student and a biologist who doesn't really have a lot of programming experience. There are quite a few reviews that look at the technical side of these platforms: what container systems they use, what programming languages, and how they manage computing resources. That is meaningless to me. As long as I can put my data in and get some results, I don't really need to know how the tools are working behind the scenes.
So I'm more interested in what it's like to use these platforms as a user. Because there was limited information in the literature about this, I've also included in my review a direct usability evaluation of the platforms. So I'm going to the platforms as a new user, working through their introductory materials, getting an idea of what features they have, and scoring them against my own criteria. In order to decide which platforms I was going to evaluate and which papers I was going to include, I had to come up with my definition of what I mean when I'm talking about platforms like Galaxy. So I'm talking about a workflow management system, which is basically a piece of software that lets you choose between a selection of different tools. It helps you to use them, and it gives you ways of linking them together to build your own workflows. It probably gives you some tools for managing and organizing your workflows. And ideally, for users like me, it should also take on a lot of the computational decisions about how computing resources are managed, and deal with software dependencies and things like this that I as a biologist do not want to have to learn how to do for myself. When I started reading around the literature, I found that there were a lot of different platforms. There's a small selection of them here; I didn't want to inflict on you the whole confusion that faces a new user. I'm in the preliminary stages of evaluating these platforms, so I've focused on the big three, which are Galaxy, Snakemake and Nextflow. These are the ones that come up a lot when you're looking through the literature, because people have used them in their analyses or developed tools for them.
So I wanted to start off by evaluating these ones so that I can refine the criteria that I'm going to use to evaluate other platforms. In order to evaluate them, I developed my own series of seven key characteristics, or criteria, that emerged from the literature as being really important to users when they're choosing or using a platform. So first off, accessibility. Can I actually perceive and interact with the interface, and can I be sure that all of my students or all of my employees are going to be able to access the interface? Sustainability. If I learn how to use the platform now, am I still going to be able to use it next year or in five years, or am I going to have to learn another platform because this one has disappeared? Reproducibility. How easy is it going to be for me to reuse my workflows and to give them to other people, either to replicate my analysis as closely as possible or to reproduce it in a slightly different environment with their own datasets? FAIRness. Are the data and the workflows I'm producing going to meet the FAIR criteria? Are they going to be findable, accessible, interoperable and reusable? Usability. In terms of my criteria, I'm defining this as how quickly I can get started using the platform, what features it gives me to organize my workflows, and, when there's an error, how good it is at helping me to fix it. And then learnability. How much of a learning curve is there before I can start running my own analysis, and what training materials are available? You can see I've run out of room here, and I promised you seven characteristics, but that is all part of the plan, because the seventh one is a little bit different, so we'll get back to that a bit later. For each of these six characteristics, I gave each platform a score from one to five based on how well it achieved it.
The simplest one was sustainability. A score of one for this meant a newly developed platform, or one that hadn't been updated recently, whereas a score of five meant a platform that was more than five years old but had been updated recently and also had an active community, so you could rely on it to still be around in the future. For each of these characteristics, I gave each platform a score, which I then plotted on these spiderweb diagrams, and then added a line which gives you a visual representation of which areas that platform is doing well in. The first ones I evaluated were Nextflow and Snakemake, which actually ended up with identical plots on my criteria, so I might need to refine the criteria in order to differentiate between them a little bit. But basically, for a user like me, these are very similar platforms, because they're both text-based workflow editors where you're typing in: this is the input I want to use, this is the output I want you to generate, and this is the tool that I want you to use to do that. You string together these lists of processes or rules to make a workflow, and they do offer various tools to help you produce reports about which tools and versions you used, to help with reproducibility and things like that.
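The scoring and plotting described above can be sketched in a few lines. This is only an illustration, not the code used for the review: the function name `radar_vertices`, the order of the criteria around the diagram, and the example scores are all assumptions made up for the sketch.

```python
import math

# The six scored criteria from the talk. Their order around the
# diagram is an assumption chosen for this sketch.
CRITERIA = [
    "accessibility",
    "sustainability",
    "reproducibility",
    "FAIRness",
    "usability",
    "learnability",
]


def radar_vertices(scores, criteria=CRITERIA, max_score=5):
    """Return the (x, y) corners of one platform's spiderweb polygon.

    Each criterion gets its own axis, spaced evenly around a circle,
    starting at the top and moving clockwise. A score of max_score
    sits on the unit circle; lower scores sit closer to the centre.
    """
    n = len(criteria)
    points = []
    for i, name in enumerate(criteria):
        score = scores[name]
        if not 1 <= score <= max_score:
            raise ValueError(f"{name}: score {score} is outside 1..{max_score}")
        angle = math.pi / 2 - 2 * math.pi * i / n
        radius = score / max_score
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Hypothetical scores, invented for illustration only.
example_scores = {name: 3 for name in CRITERIA}
example_scores["sustainability"] = 5

polygon = radar_vertices(example_scores)
```

Joining the points in `polygon` in order, and closing the shape back to the first point, gives the line drawn on the diagram. Overlaying two platforms is just a matter of computing one polygon per platform on the same axes.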
For Galaxy it's a similar kind of shape, because I think a lot of these platforms don't score that highly on accessibility. But you can see when I overlay them that the inner ring is the Snakemake or Nextflow plot and the outer one is Galaxy. So you can see Galaxy is doing better on some of these criteria, particularly the ones on the left side like learnability and usability, because the interface is so different. When you're using Galaxy, you're pointing and clicking on things and entering a limited number of parameters. And the key thing for me is that there's not this huge learning curve of having to learn a new language and having to learn how to interact with the interface. A lot of the training material you get for Snakemake or Nextflow is basically telling you what you have to type in to make something happen, whereas on the Galaxy Training Network you have some introductory tutorials that introduce you to the interface, but a lot of them are actually about the biology: these are the analyses you can run, the sorts of results you can get, and how you can interpret them. As a biologist, that's what I really want to be learning. I don't want to be bogged down in the details of what I'm typing into things.
So you might think, well, this is the answer to the question of why you should use Galaxy: it's clearly scoring higher on some of these criteria. But it's not quite that simple, because these criteria don't matter equally to every sort of user. If you are an educator, having a higher accessibility and learnability rating is going to be really important for you, so Galaxy is clearly a good choice for you. But if you are a researcher who's quite experienced in bioinformatics, you might not care so much about those criteria. You might really be interested in sustainability, reproducibility and FAIRness, and on those these three platforms basically score the same, so it's difficult for you to make your choice.
This is where the seventh characteristic that I promised you might come into play. I'm calling it a gradient between flexibility and support. The reason I haven't included it on the plot with the other ones is that it's a little bit different: on the other ones you can say one side of the scale is bad and one side is good, but on this one I don't think you can say the same, because it really depends on who you are and what you want to do. If you are an experienced bioinformatician with really good programming skills, you're going to want to get in there behind the scenes, decide how things are being managed, and adapt or develop your own tools. Then something like Nextflow or Snakemake is going to be beneficial for you, because they're designed for people who want to take charge, make their own decisions, and really get in there and adapt things. And you might think, well, flexibility sounds great, that's clearly the positive end of this scale. But that's not always the case, because for users like me, having too much flexibility is too much of a responsibility, and it forces me to make more decisions than I need to be making. If I just want to run a standard tool and set a few parameters to adjust it for my data, then having too many decisions to make and too many things to type in just increases my chances of making an error and then having to go back and try to find it and fix it. Whereas when I'm working in Galaxy, I have much more support. You only have to make a few decisions on parameters, and there's usually guidance as to which range you might want to use for them. So you feel like you have a much stronger safety net underneath you to prevent you from making unnecessary errors and to support you when something does go wrong. And you also obviously have the training materials, so you can develop your own workflows and analyses and adapt things without having to think about all the things that biologists don't really want to be spending their time on. You can really just pinpoint what you're interested in.
So I think unfortunately there's no simple answer as to why you should use Galaxy rather than another platform, because it really just depends on what you're interested in doing. Some users just enjoy programming, coding and typing things in, or they really need that flexibility, and they do want to be working in something like Nextflow or Snakemake. Whereas I think for biologists and students, getting started in Galaxy is a lot easier, and it allows us to make the decisions that we need to make without forcing us to think about the things that we don't really need to change in our analysis.
My next step for the review is to perform usability evaluations of more platforms, and while I'm doing that I want to refine these criteria so that they can be useful measures for comparing these platforms, with the ultimate aim of creating guidelines to help other people in the future to re-evaluate them as platforms change, whether they're users who want to choose a platform for their students or for their own research, or potentially even some of the people who work on developing platforms like Galaxy. Maybe it will help you to understand what biologists like me actually need from a platform. I'd just like to finish by thanking my supervisor, Wendy Bacon, who is the person who made me start using Galaxy, and also the GCC Fellowship for enabling me to come here and talk to you all in person.
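The earlier point in this talk, that the six criteria do not matter equally to every user, can be made concrete with a weighted average. The sketch below is an assumption-laden illustration rather than part of the review: the function `weighted_score`, the two weight profiles, and every score in it are invented for the example.

```python
def weighted_score(scores, weights):
    """Weighted average of criterion scores for one kind of user.

    A weight of 0 means the user does not care about that criterion.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical scores on the 1-5 scale, invented for illustration.
text_based = {
    "accessibility": 2, "sustainability": 4, "reproducibility": 4,
    "FAIRness": 4, "usability": 2, "learnability": 2,
}
graphical = {
    "accessibility": 3, "sustainability": 4, "reproducibility": 4,
    "FAIRness": 4, "usability": 4, "learnability": 4,
}

# Hypothetical priorities: an educator leans on accessibility and
# learnability; an experienced researcher only counts the other three.
educator = {
    "accessibility": 3, "sustainability": 1, "reproducibility": 1,
    "FAIRness": 1, "usability": 2, "learnability": 3,
}
researcher = {
    "accessibility": 0, "sustainability": 1, "reproducibility": 1,
    "FAIRness": 1, "usability": 0, "learnability": 0,
}

educator_view = (weighted_score(text_based, educator),
                 weighted_score(graphical, educator))
researcher_view = (weighted_score(text_based, researcher),
                   weighted_score(graphical, researcher))
```

With these made-up numbers the educator profile separates the two platforms while the researcher profile rates them the same, which mirrors the argument above: the raw scores alone do not pick a winner until you say whose priorities are being applied.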
I'll be taking part in the poster session after this, so if anyone does want to share their opinions on Galaxy and other platforms with me, I'd be very happy to talk to you. But I think we do have time for some questions now, if anyone has any.
Thank you very much, Marisa, for this great overview. We have a first question coming up from Anton.
Thank you. Actually, I would love to have these graphs, because they're very helpful for grant proposals, for example. Another thing I would advise you to look at is the GitHub stats; for example, the number of contributors looks very different. And you should also consider commercial things like DNAnexus or Seven Bridges, or whatever it's called right now, because it's also important to understand why some people prefer commercial products. They have this strange idea that if they pay for something it's better, but you do need to understand what those products do, because they do a lot of things well as well.
Yeah, thank you. I think that's good advice, because I'm thinking a bit more like a student, and we're always looking for free things, whereas if you are an employer or a commercial company you probably do appreciate things that you pay for a bit more.
Hello, we have a question from online, from Lucille. It is: what could you suggest to Galaxy developers to improve accessibility?
So I think the accessibility of Galaxy is already higher than for the text-based ones, just because it is this online interface. I think some of the problems are easily solvable, because when you run an accessibility analysis like the Google Lighthouse one on the website, you can see that some of the issues are just that it's missing certain tags and things that screen readers need.
I think as well, just the idea that for a lot of what you do on Galaxy you can choose to use either the keyboard or the mouse is a big plus for accessibility, just to give people options. I think someone was talking about the workflows yesterday and how you can now use the space bar and the Tab key to move between them. So I think it would be great if that could become a feature where you can do the whole workflow editing thing using just your keyboard or just your mouse. That's the sort of thing you're looking for in accessibility: options for people to choose from if they have a preference or different accessibility needs.
Thank you very much for answering the questions. If we have no more questions, I think we can finish on time and move on to our next speaker. So thank you very much, Marisa, for your very interesting talk.
Thank you.
Next we have Julia Jakiela, who is going to tell us about her journey from being a new user to becoming a training community contributor.
All right, so hi. I'm going to talk you through my journey from a new user to a training community contributor. I am aware that most of you are already very advanced in using Galaxy, but this is just to give you some flashbacks of how it was at the beginning. So, me as a newcomer to Galaxy: I am still studying medicinal and biological chemistry, but I am a self-taught coding fan and I like to train in different fields. I was lucky enough to get some funding to develop some tutorials and to come here. So just to point out some features that I think are quite important for newbies. Obviously the GTN, with so many tutorials on so many topics, with those amazing question boxes as well as different snippets which accelerate learning and make it easier. Also the requirements section, which guides the user through the tutorial series, and now it's even easier with the learning pathways.
Also the open and inclusive Galaxy community, which is always happy to help and answer any questions on different platforms. The next step for the new user would be to identify the gaps within the field. In my case those gaps were, for example, existing tools in the ToolShed but no tutorials associated with them. Then there might be users' needs for specific tools or specific analysis methods, but no tutorials explaining how to actually do it. Or the aim could be to create tutorials for users on different levels, for those who prefer Galaxy buttons as well as those who prefer using the console. So here we come to the point where we can actually break this user-developer wall. For me, I think all that is needed is curiosity and enthusiasm, because the GTN actually has lots of tutorials that can bring you from a user to a developer. Of course a computing background is helpful, but you just have to enjoy it. In my opinion it was worth getting involved, because you can feel fulfilled, you can feel the satisfaction of sharing your knowledge and passion with others, and you can use your programming skills in a creative way. Obviously I started from small things such as testing tutorials, updating them and upgrading them with new features, to finally be able to become an author of tutorials and slide decks. Along the way I found it very useful to be able to reproduce an existing analysis from code to Galaxy buttons and the other way around. It's especially helpful when we want to produce tutorials for users who don't really like Galaxy buttons and prefer interacting with the console. And obviously I faced lots of problems, so there was a lot of troubleshooting on the way. Usually people only see the PR opened and the tutorial published. However, there are a lot of issues that you have to solve.
So developing a tutorial is like an iceberg, and you have to go through a lot of draft histories to actually develop a tutorial that works. What I found very, very useful was the Contributing to Galaxy tutorials, which taught me how to write a tutorial in Markdown so that it's rendered nicely, how to test tutorials using Gitpod, and also how to test workflows. That finally led me to the point where I could make my first contribution to the GTN, and I want to stress here that for me it was very important to have someone above me who would give a review, because I was very afraid that I would break Galaxy with my first PR. And yeah, after that you can call yourself a tutorial developer, which is exciting. It might sound like a success story, but not everything works at first. There's a lot of troubleshooting, which takes lots of time, and persistence is also needed to be successful. What kept me on track was, as I mentioned, the help from the Galaxy community, which is amazing; my lovely supervisor (thanks, Wendy); and trying alternative routes, because when something doesn't work you just have to try something else. And of course, motivation to keep on going. Here's the expected timeline for this kind of journey. For me it took an eight-week full-time internship to develop a tutorial and a slide deck as a complete newcomer. It of course depends on your programming and biological background. So finally, Galaxy opens many doors to further development. For me that was, for example, becoming a trainer for summer schools and for EDI courses, and starting new projects. So now I'm just looking to develop even more, in new fields or in developing tools. And last but not least, during this journey I learned how important community development is, and how important it is to have this communication between users, training developers and tool developers, so that we can actually maintain this kind of Galaxy circle of life.
And with that, I would like to thank you for your attention, and if you have any questions, I'm happy to answer if we have any time left.
Thank you very much. I think we have time for one question. Is there a question? No question online? Oh, okay, Anton.
Do you think it would be a good idea to establish some kind of an editor-in-chief for the GTN, a person who would read all these things, and essentially have an editorial process for new tutorials, where they get submitted, reviewed and so on, and then they're published? Do you think it's a good idea?
I think it's already kind of like that, if not officially stated. But having someone who reviews your tutorials and has a broad overview of what is available in Galaxy would be very useful, because we could improve newly developed tutorials by introducing some advances that new users and new developers are not necessarily aware of.
Thank you very much.
Thank you. I think our next speakers don't need a lot of introduction. If there is anyone entitled to be called editor-in-chief of the GTN, it's them. So please welcome Helena and Saskia.
Thank you. So I'm Saskia, and together with Helena we'll do our traditional GTN update talk. For anyone who might be new here this year, a very quick introduction to what the GTN is. This started in 2016. Before that, there was no real structure. A lot of us were using Galaxy to teach, but we were all doing our own things and not communicating. So I had a PDF with a transcriptomics tutorial somewhere, and someone else had almost the same thing in their Word document. In 2016, Bérénice Batut from Freiburg decided: no, we need to organize this, get a central repository and work together. I immediately thought this was a good idea, so I jumped on that, and a little bit later Helena came along and was like, hey, you could automate a lot of this stuff. So she really helped us to improve the infrastructure there.
At first we really focused on getting these tutorials together and making it easy for people to find them and learn from them. We published this, and then afterwards we were like, okay, we need to really focus on how to make this good for teaching, for education, whether you're running a master's curriculum or educating your researchers. So we focused more on that. Then of course the pandemic happened, so we were like, okay, now we really need to work out how we do this remotely. So our focus changed again. We have two papers, one of which came out recently, if you want to read more about this. And of course the three of us just lead this, but most of it is by the community. We have over 300 contributors who wrote tutorials, helped with the infrastructure, or just kept tutorials up to date or tested them, all of that. And we have a bunch of topic maintainers, who are this editor-in-chief type thing for a topic. So this is it: training.galaxyproject.org. Go there if you want to learn how to do data analysis in Galaxy. Like I said, this is really a community-driven project, by the community, for the community. We now have 37 topics and 358 tutorials, and every time I present, the slide is already out of date, no matter how recently I made it. We have lots of FAQs, and 309 contributors, but that's already out of date. And it's been going on for eight years. Every tutorial is really meant to be a hands-on journal article. A lot of them literally follow a scientific article that was recently published and take you through it, give you all the background, and tell you what to do in Galaxy to reproduce it. So it's structured like a paper, with all the authors and editors. It starts with some metadata: what the purpose of this is, the learning objectives, and other things you may need.
Some have slides to familiarize you with the background, and there are question and answer boxes. For teaching, again, this is very useful to test your students' comprehension. And of course we want you to get credit, so every contributor has a page that lists everything they've done, so you can really show off to the people who might not see the amount of work you put into this. And every tutorial can be cited if people use it for their analysis.
Okay, quickly on to teaching with the Galaxy ecosystem. Galaxy, we think, is a fantastic platform for teaching and training. Oftentimes when we talk about this, we use the phrasing: we're not teaching Galaxy, we're teaching bioinformatics. We're using Galaxy, but we're teaching bioinformatics. It really takes away a lot of the complicated portions of working with data, working with your analyses and running workflows, so you can actually focus on the science. That's what we always want to do: focus on the science, and on how we can teach these concepts of what an assembler is, what a tool is. So Galaxy is fantastic, we love it. You can just bring a web browser. It's great for teaching. There are a lot of supportive resources for teaching with Galaxy across the ecosystem. Of course there's the crazy number of GTN tutorials and FAQs. These are fantastic resources for support staff who are helping people with issues when they have to learn to do things like rename datasets or change data types in a collection. We have FAQs that you can easily link to and say, here are the step-by-step instructions. We also have the video library, of course, which is really a new addition as of the pandemic, and it's grown quite large. I forget if we have a slide on that, but we have something like 116 hours of videos that are a combination of human-recorded and automated videos.
Galaxy itself, of course, has a lot of features for teaching, including data libraries, which make it really easy for you to put all of your teaching and training data in one place. We attempt to have a GTN data library that brings all of the datasets used throughout the GTN into one place, where you can quickly import them on any of the usegalaxy.* servers. We also have, of course, the click-to-run tools and click-to-run workflows. The workflows are a new feature this year. This is all part of the tutorial mode. When you go to Galaxy and click on the little hat icon, you activate tutorial mode, and then you can interact with the tutorial in a more direct way, saying, okay, I'm going to launch a tool, I'm going to launch a workflow. And of course, TIaaS has been fantastic. TIaaS is a community project from a lot of the sysadmins across Galaxy, and funders, saying: how can we take all of these cool resources that we have and make them more available for teaching? So TIaaS, Training Infrastructure as a Service, makes it really easy to run a dedicated job queue for individual training events. All of your users go to Galaxy and get the same Galaxy that they would use otherwise, so you can really easily migrate users from "I'm a learner" to "I am an actual user". It all happens on the same Galaxy. They have access to all of their same datasets and resources, which makes that continuum quite easy. There's a nice form. This is the old version of the form; I think Cameron has made it look a lot nicer. Thank you, Cameron, we really appreciate that. You fill out this form and you get a nice link. People click the link and they're in the TIaaS group. Very easy, you don't have to do anything. As a teacher, you don't have to collect their email addresses, their Galaxy identities, anything like that. And for educators, of course, there is the TIaaS dashboard.
This has made remote teaching really fantastically easy, I think. In the old world, we would go around the room and say, okay, are you done? Are you done? We'd look over people's shoulders to see whether they had finished this step. Now we can just check a website and see immediately that everyone's done. It's made training a lot faster for us as well. We started TIaaS in 2018, and since then we've had 508 events across the four main TIaaS instances, covering 143 countries. We have a nice paper, you should go read it. We finally got it published after six months. And it's served, in theory, 24,000 learners. We think that number is roughly accurate. It's a fantastic number of people who've been helped by all this free infrastructure that's available. The GTN Video Library: 116 hours, as we said. All of these videos are recordings of instructors teaching their tutorials, their materials, the things that they know best. It's great, please use it. We make it easy to embed in your course materials if you are using one of the major course platforms, like Blackboard. There's an easy "embed this in your course platform" option. This has also been wrapped up into the course builder. We've taught a bunch of online courses these past three years, and if you want to rerun or remix one of these, we have a little button that says "remix this course", where you can edit the description, edit the schedule, add or remove different modules from the library, and get a course. Is this one you?
Yeah, so if you want to learn more about this Training Infrastructure as a Service, TIaaS, and how you can use it for your own education, there will be a webinar on July 25th, in Australian time. But I think the video will also be available afterwards, so you can watch that, and we'll go into more detail. So now, quickly, some new features that we've added this past year that you may or may not know about.
First of all, lots of updates and lots of new tutorials. We have over 746 PRs merged in the last year. That's really a lot of new tutorials by you, a lot of updates to existing tutorials to make sure they're always up to date with the latest tool versions and the latest state of the art, and a lot of infrastructure work. So that's really great. One thing I want to highlight is the automated video slides. If you provide a slide deck, put good speaker notes on every slide, and then just say, yes, I want a video made of this, we will use automatic text-to-speech software to create a video in which a voice narrates the slides based on the speaker notes. So it's really a video you can watch. Often this is based on what real instructors have said while teaching these slide decks. You can even choose your own accent, so if you want an Aussie accent, you can specify that. And this really makes it easy to keep up to date, because if you record something as a person and the slides change, you have to reinvest that whole time, whereas here you just submit the pull request and the video is updated automatically. That's great. Learning pathways are also something new. You may have noticed this new button at the top of the GTN, "Learning Pathways". This is really a journey that takes you through multiple tutorials around a topic, really from zero to hero, and it can span different topics in the GTN. This is often what you would see if you took a week-long course around single cell, for example. So whether you want to learn, or whether you want to teach and just get an idea of what other people are teaching, you can go here to get some inspiration. And of course, if you want to add a new learning pathway, we're very happy to include it.
If you're already teaching something, courses of a few days or whatever, please just add what you like to do and it'll be useful for others. Choose your own adventure is something you can now do in a tutorial as well. You can add this sort of choice here. This is from the RNA-seq tutorial, and users can choose which tool they want to use, for example STAR for the alignment or featureCounts for the counting, and then, based on what they click here, the rest of the tutorial will change. You can also do things like, okay, I want to use this reference genome or that one, or use it for the long version and the short version of a tutorial. So yeah, this is a very nice new feature. Tutorial mode was already mentioned. If you click on this hat icon in Galaxy, it'll open the GTN inside Galaxy. And the nice thing here is that you can now click on a tool name to open that tool directly in Galaxy. And since recently you can do the same with workflows. You just click on the workflow, it'll import it into Galaxy and open it for you, so again you can run workflows in one click. Helena's done some work to improve the tutorial search, and every tutorial now also has a persistent identifier.
Okay, and a quick tour through the accessibility options. The GTN cares a lot about accessibility. We test with screen readers. We test with color blindness. I'm color blind, right? We test under a lot of different circumstances. Where possible, we have support for a lot of different impairments. We're really proud of this, right? All of the automated videos that Saskia mentioned earlier have perfect captions, of course. There's no need to manually caption those videos, because we know exactly what's being said in them. So fantastic, easy. We also have support for cognitive accessibility issues.
This doesn't just affect people who currently have those impairments, but also all the rest of us who maybe don't remember how to do some particular operation, because we are busy, overworked people and we don't remember how to change the data type of a collection, something like this. So we always have these FAQs. Maybe people who are confident in their analysis skills don't need them, but there are lots of us who do need them and benefit from them. So we make sure we include them wherever possible. Contributing to the GTN is fantastic and easy. As Julia has told you, okay, maybe it takes eight weeks for someone who is new to this. We're trying to make this easier, so let us know what would help. We'd be happy to help. We've got lots of events if you want to learn, like the webinar mentioned, and we've got lots of features to make your life easier as a contributor. If there are pain points you encounter, we want to know about them so we can reduce those barriers to entry. All of the tutorials have feedback forms. We would love your feedback as a teacher, as a student, as a learner. And with that, we would like to acknowledge everyone in the Galaxy community, ELIXIR for funding us, and the GTN community. Thank you all. You've made this truly a fantastic resource, not just within Galaxy, but outside of it. Thank you so much.
Thank you very much for this amazing review of the GTN history and developments. We have a first question.
Okay, thanks. A lot of this is really, really cool. With the pathways, do you have insights into people using this for complete courses, or is there a sort of vision for doing complete courses, like semester-long courses?
We don't have any learning pathways that are currently semester-long courses. We'd love to have them. I've taught semester-long courses, but they were not completely GTN material, so we haven't included them.
We should do that though, because all of the Python modules that are currently in the data science topic were created for a semester-long Python course. So, fantastic suggestion. Thank you for the reminder. I will absolutely add that.
Yeah. This feature is only a month old, so hopefully we'll get some input from the community. So please, add your learning pathways. We're really happy to have them.
Thank you very much. And we have another question.
Can you tell me a little bit about the slides-to-video feature that you have, and who can use this? Once the videos are created, how can people access them? And can the whole world use this? Because that's kind of a lot, isn't it?
I mean, yeah. There are some competing options there that you might want to use. Ours is very deeply integrated into the GTN format. It just takes these slides, looks at the individual blocks, the individual slides, and extracts them. It renders them, page by page, into a PNG. It renders all of the captions, the speaker notes that are attached to each slide, into audio. Then it combines all of this together. It's really easy. There's nothing terribly complicated in there other than the muxing of all the data together, and that's just a bit of FFmpeg. Anyone can use this, of course. Anyone who is contributing to the GTN gets this for free. It does use AWS for the audio and GitHub Actions minutes, of course, for the completion of the video. But yes, anyone is welcome to take advantage of this technology.
Thank you very much once again. Our next speaker is Natalie Kutcher, and she's going to tell us how we can use the AnVIL platform in education. So please, Natalie, the floor is yours.
Thank you. Hi, everybody. My name is Natalie Kutcher. I'm at Johns Hopkins University in Mike Schatz's lab, working in a couple of these different groups.
So I'm really excited to talk to you today about using Galaxy and AnVIL to diversify the genomic data science workforce. As you are all very well aware, the growth of the genomics field and the increasing amount of data being generated every day provide a lot of opportunities for bioinformaticians and data scientists. There are a lot of spaces for students to get involved, with a lot of applications, like we heard about yesterday, in conservation as well as biomedical research. But traditionally, this hasn't really included folks at institutions that don't have the funding to sequence and generate their own data, or the resources for storage and compute. And so, being uniquely positioned with AnVIL and Galaxy, we've organized the Genomic Data Science Community Network, a network of faculty at institutions across the United States that are predominantly undergraduate institutions, such as historically Black colleges and universities, Hispanic-serving institutions, tribal colleges and universities, and community colleges, to help provide exposure to genomic data science to students who haven't really had it before. The main goals of this network are to introduce these faculty to one another, since they often face very similar barriers, and to share knowledge about how to overcome the issues they encounter; to expand their access to resources and data through Galaxy and AnVIL; and to develop educational resources, which I'm going to focus on today. Since these faculty have a lot of responsibilities and a lot of things they need to do, helping to create materials that they can reuse and share across the network is really a big goal. We published a paper about this last year. If you'd like to learn more, please check it out.
So like I mentioned, a main goal here is for faculty to contribute the exercises that they've developed and use with their students to the Genomic Data Science Community Network. We have folks who will adapt these materials to run on AnVIL. And then this is something that we continually test and maintain over time to support its use in the classroom. One example is this activity, SARS-CoV-2 variant detection using Galaxy and AnVIL. It was developed by Robert Meller, who's at Morehouse School of Medicine and is one of the members of the GDSCN, and adapted by Ava Hoffman, who's part of the AnVIL outreach team at Fred Hutch. The key goals of this activity are to introduce students to basic genomics concepts, introduce them a little bit to computing and bioinformatics, and then give them this authentic experience of discovery. In this example, students align viral sequence data to a reference and can visually identify the Delta variant in the virus. We've adapted this activity to run on AnVIL, which has been used in a lot of really cool research activities lately, like the Telomere-to-Telomere Consortium's completion of the human genome, as well as the Human Pangenome Reference Consortium. So we're giving students a really authentic research experience using resources that the top researchers in the world also use. As for the materials we've developed as part of this, there are a number of lecture videos that cover some of those introductory genomics and computing concepts. We've also developed a student activity guide that our outreach team uses. And thanks to Helena, as of last night we've also been able to convert this into a GTN tutorial. So we're really excited to contribute that to the GTN as well.
So one of the main ways, I think, that we can continue further integrating these resources that we're developing is to add an arm to this fun robot otter, OTTR, that's been developed by the AnVIL outreach group, which stands for Open-source Tools for Training Resources. It automatically pushes the content that we develop to a number of publishing and MOOC platforms. And I think that adding an arm that will also automatically add these tutorials to the GTN is going to integrate our efforts really nicely, making these resources accessible for students to run freely on Galaxy as well as on AnVIL, where we're providing resources for the educators to use. So, future connections with Galaxy: like I mentioned, we want to integrate OTTR to push and publish tutorials to the Galaxy Training Network. And then we'd also love to further involve the GDSCN with the Galaxy community. As we've heard in the last talk and the others earlier today, I think there are really nice ways to collaborate with these groups and involve them more in the Smörgåsbord and GCC. A lot of the folks are also really keen on making connections with other researchers, so we're trying to identify the communities of practice that are most applicable to them. And I'd also encourage you all to partner with the GDSCN to help support them in this endeavor to expose students to genomic data science and provide pathways for them to continue. So with that, I want to thank the GDSCN folks, the AnVIL team, and the whole Galaxy community. Thank you.
Thank you very much. We have time for one question.
Yeah, sorry. Thank you. How do you manage personally identifiable information within this environment?
Yeah, so AnVIL is a cloud-based platform that runs on Google Cloud. On top of GCP, AnVIL is built on Terra, and so they've undergone FedRAMP authorization to be able to securely handle protected human genomic data sets.
I think there are additional processes and certifications that it needs to undergo to handle more sensitive clinical data. But at this point, the FedRAMP authorization in the United States certifies this platform to handle human genomic data.
Thank you very much. I'd like to welcome our next speaker, Sarah Williams from QCIF, who's collaborating with a core facility to build research-scale workflows. So please, Sarah.
So yeah, I'm going to be chatting about some experiences we've had working with a core facility to develop workflows in conjunction with them, talking about the challenges we've encountered and the solutions we've tried out. The situation was that the Griffith Central Facility for Genomics has been getting some more machines for its sequencing facility, and they're looking to establish some pipelines to process their data and share with the users of the facility, to add value all round. They've seen Galaxy. They're not Galaxy users, but they're interested, and they're quite willing to see this as a good solution. So the goal of this project was to develop practical pipelines for the routine processing of some metagenomics and single-cell RNA-seq data. The stakeholders involved are, of course, the Griffith facility people and their users, and then there are people from QCIF, myself included, who were involved in the workflow development side of things. The goal with this work is also to make it more broadly reusable and useful for the wider community around Galaxy Australia. One of the first challenges we hit was that we didn't actually have a pipeline in place. This was all very new, so where were we going to start? This is where the Galaxy Training Network was a lifesaver, because it's like, oh, where do we start? Oh, okay, well, we're starting there.
And we can adapt that kind of material to do what we need to do. Trying out tools and all of that sort of stuff is just a matter of time. The solutions here were obviously communication, which is super important, and sticking with the toolkits that are already in Galaxy and already do suitable tasks, and getting support from Galaxy folks and the broader Galaxy community. Another challenge, when you know roughly what you want to do, was making it work end to end. Sometimes the tool we wanted wasn't available on Galaxy Australia, so we needed support from the Galaxy Australia team: can we please install this tool, tool wrapping, and that kind of thing. Another thing I want to touch on is that, as a bioinformatician, some of the Galaxy programming logic, and I am saying programming logic deliberately, can feel a little bit awkward. As an example, say you've got a series of results, columns of data for different samples, and you just want to join them all together into one big table. In R, I'm like, oh yeah, cbind, done. In Galaxy, you really have to think about the steps involved in making your collection, getting your names, joining it all together, and testing what you've done. It's a matter of perception, but it's just one of those challenges that you do hit. And yeah, you are a little bit more removed from the debugging facilities, because you don't have direct access to the machine you're using. The solution there was obviously a lot of support that we received from the Galaxy Australia folks and the online Galaxy community, all of the useful help that really enabled us to get over these hurdles. And yeah, no workflow is really useful unless you understand what it's doing and you can drive it.
So these things really need to be documented and appropriately shared, and we've tried to use all the good resources available through the Galaxy community: the inbuilt reporting, WorkflowHub hosting, and just documenting things. And again, there was support involved in getting this to work. You can see I've said support several times, because that is really the real solution. Having that little bit of hand-holding sometimes is what has enabled this project to get this far. And yeah, there's a real appetite for using Galaxy in a context like this, to produce these kinds of workflows for routine use, and the frameworks are there. So that's it. If you're interested, there are some links. We're approaching version one, so it's still being actively developed, but there's some information there. And lastly, thanks to everybody who's been involved in this project, especially Ahmed and Valentin, who have been developing the shotgun and 16S metagenomics workflows, Mike, who's been wrapping Cell Ranger for Galaxy, and Amanda, who's been our contact at Griffith Uni, and who's been testing all of this stuff and really helping with the discussions and lots more. Thank you.
Thank you very much, Sarah. We have time for questions. Any questions from the audience? First, Marius. This time, sorry, he's going to go first.
Yeah, thank you. I want to acknowledge that, really, thinking about workflows in Galaxy is a little different from thinking about workflows like the ones you'd write on the command line, and the sample tracking is different. I mean, maybe we can also discuss this later, but do you see a way to make that logic a bit more digestible?
I'm not sure. I don't know enough about the workings of Galaxy to understand. I think sometimes it's a matter of the way a tool is wrapped.
You know, whether it takes a collection and outputs a collection, or it takes a collection and outputs something like a summary file, yeah. Maybe it's a matter of tool wrapping, tool by tool.
Is this in production already? Do you have some feedback from users?
Not yet.
Okay. Yeah.
Okay, if I can sneak in one more question from my side. What has been your experience of using Galaxy Australia as a sort of institutional processing platform? Did you consider using a custom Galaxy that is integrated and built in your facility?
Oh, yeah. So that was just a no, because the Galaxy Australia platform is there for Australia and should be used. And yeah, we just didn't have the compute within the group already to be doing that. So that was one of the big draws of Galaxy Australia: it is provided for Australian users.
Okay. Yeah. Then do we have more questions? Questions online? All right. If we have time, yeah, sure. Why not?
So for debugging things, I mean, you mentioned that the debugging is a little complicated. One could imagine that admins get different permissions and could, for instance, enter a running job with a shell terminal or something like this. Is that something that would help, or do you have other ideas for how the debugging could be made easier?
Yeah. So, a particular example with the debugging thing: I think maybe having access to the actual Singularity container or, you know, VM, and again, I'm showing my ignorance here, that's actually running the commands. Because there was an example where I thought, okay, maybe this is due to an underlying library used by the Python package that's wrapped, which might be wrong and causing an error, but...
Or maybe, you know, just get a script to redo it on your machine or something.
Yeah. Well, it's just one of those ones where you run that exact command and you get something different.
Thank you very much.
And with that, I think it is time to close the session. Thank you very much, Sarah, for your presentation. Thank you.