Alrighty, can folks see my screen? Okay, great. So today I'm going to be talking about the Genomic Data Science Community Network. This is an effort organized by folks at Hopkins and the Fred Hutch Cancer Center, and it's been going on, I think, since around 2020. We've been meeting with the network for a couple of years, and they're really excited about some of the work that we're doing. There are some exercises we've used that leverage Galaxy on AnVIL, and a lot of our network members, like Mike was describing at tag, love Galaxy for education. So I think this is another venue that we can use for genomic data science education.
A little bit of background, as I'm sure this group is well aware: the amount of genomics data that's been collected is already massive and growing massively. I think this provides a lot of opportunities, as well as a need, to analyze the data, and there's a lot that can be learned from it. So this is definitely an important area of research, and one where a lot of folks who are new to genomics and genomic data science can get involved. But there are a lot of barriers and disparities in this space. A lot of institutions lack the resources to set up computing centers and places where they can house all the data. At institutions that are more under-resourced, there isn't as much outreach to students to give them access to education that can even expose them to genomic data science. Relative to the whole population, students from minority groups are matriculating at a lower rate for bachelor's degrees in biological data science. So there's this gap, with lots and lots of opportunity but really limited access. This is what motivated the creation of the Genomic Data Science Community Network.
So far this is about 26 members from underserved institutions across the United States. These are faculty and educators at historically Black colleges and universities, Hispanic-serving institutions, tribal colleges and universities, and community colleges, with a focus on undergraduate students. We have, I think, a pretty fair geographic distribution across the US, representing 18 states, Puerto Rico, and Washington, DC. It's been really exciting to meet these network members and learn about their research interests and the kind of education they're really excited about. Together with this network, the goal is to help connect these faculty members to one another. They have a lot of rich information on the institutions and systems they're within, and really great knowledge of how to do education in a way that's exciting to these students. We're also helping them create these connections across institutions, as well as expanding access to resources and data. One of those barriers, of course, is access to data, but also access to resources like computing centers, and so this is a place where it can fit in to help students and educators access the sorts of resources that other institutions don't really have the capacity to create and support for them. So at the Hopkins group, in Jeff Leek's group, which is now at the Hutch, they're really incredible at developing educational resources that are scalable and modular. One of our goals has been to create educational content that the faculty members and educators in the network can plug into their courses, content that can be really exciting for students and help expose them to the field and pique their interest.
Earlier, I guess the middle of last year, we published a manuscript about all of our efforts and what we've identified and learned from the network members about what is needed, over time, through reaching and educating students, to diversify the research community in genomic data science. There's a lot of really good information in this manuscript, and you can access it at the link here. The main ideas, in order for us to get to the point where we're really supporting students, are to leverage the resources that are available at R1 institutions and to connect with stakeholders at underrepresented institutions, like administrators and funders, in order to support faculty, who have this really rich knowledge about education, in supporting students. This involves recognizing the strengths that they have at their respective institutions. A lot of this is around belonging and involvement: people get really excited and have really strong networks at underrepresented institutions. It also involves sharing resources and being able to collaborate across institutions to develop resources that we can share. Within the network we've met, I think, four or five times virtually, and had our first in-person meeting at the end of the summer last year. A lot of our efforts have initially been about building up this community and building up those connections and collaborations. And last year we worked on setting up some working groups to let the network members take the network in the direction that is most exciting to them. There are a number of members who are really interested in microbiome research and education, some who are really interested in developing curriculum, and some who are part of community colleges and interested in finding ways to bring genomic data science education into the community college space.
For the rest of the presentation I'm going to focus a little bit on our educational resources. This has been really exciting to build. We've been able to leverage some of the courses that our network members have created and adapt them to run on Galaxy and AnVIL, and we're creating a system where this is something we can do scalably to create FAIR educational content, which I know is a really important goal in Galaxy. The content developers in our group have developed some educational infrastructure that supports this work being done in a scalable way. One is OTTR, or Open-source Tools for Training Resources. This is a resource that can create courses from Markdown files, and these can be published in Bookdown, which is the primary format that we use, but can also be published on Leanpub and Coursera. So if educators are interested, they can create a Leanpub course to direct their students to as they go through the materials. Another really cool piece of infrastructure is this Mario package that's been created, which can render videos with an automated voice from Google Slides. It takes the slides, runs them in order, and converts the speaker notes to automated spoken text. That way we can quickly create these videos, but also easily update and adapt them as AnVIL and Galaxy on AnVIL change and things start to look a little different. It makes it a bit quicker to create a new video when there might be just one slide or one piece to update. I'll just highlight a couple of the activities we've created. One is this SARS variant detection activity using Galaxy on AnVIL. It's a great activity because students have all heard about COVID and understand its importance to society.
This is something that maybe they've heard about but haven't really had the exposure to understand the different variants. So this activity is on a really big topic and helps pique their interest. It also introduces students to some essential skills and analyses in genomics, exposing them to concepts that can be applied more broadly beyond this one application, and introduces them to genomic data science and using the cloud for computation. Just a quick overview of this workflow: students get signed into AnVIL and launch Galaxy. They run FastQC to check the data quality. These are SARS viral sequences; you can pull in any sequence in this activity, but one is provided. They align the reads to create the viral sequence and then visualize the sequence using JBrowse, so students can see exactly what it looks like when you're looking for a variant and can do a little bit of that exploration manually. It helps students understand these real-life questions. They've heard about a couple of different variants, so it gets them to concretely understand some of these things and hopefully gets them interested in other applications of genomic data science. The activity also includes some lecture videos introducing students to variation, sequencing, alignment, genomic data structures, and cloud computing. There's also a pre-lab lecture video that walks students through everything that's going to be done. So the goal is to create something that educators can use, but also something that students can do individually as an activity, and that can be reused in different settings. Another, more advanced set of books that we've created is based on the Statistics for Genomics course. This goes into some more advanced concepts and more programming skills.
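As a rough illustration of the data-quality step in the workflow overview above, the following Python sketch computes the mean Phred quality of each read in FASTQ-formatted text and flags reads below a cutoff. It is not how FastQC itself works and covers only one of the many checks FastQC reports; the function names, the cutoff of 20, the Phred+33 encoding, and the two example reads are all assumptions made for illustration.

```python
from statistics import mean


def parse_fastq(text):
    """Yield (read_id, sequence, quality_string) from FASTQ-formatted text.

    Assumes the simple four-lines-per-record layout.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, sequence, _separator, quality = lines[i:i + 4]
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return mean(ord(char) - offset for char in quality)


def low_quality_reads(text, threshold=20):
    """Return the IDs of reads whose mean Phred score is below threshold."""
    return [
        read_id
        for read_id, _sequence, quality in parse_fastq(text)
        if mean_phred(quality) < threshold
    ]


# Two made-up reads: one with uniformly high quality, one with very low quality.
EXAMPLE = """@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTACGT
+
########
"""

print(low_quality_reads(EXAMPLE))
```

In the activity itself, students would read the FastQC report inside Galaxy rather than write code; the sketch only shows the kind of per-read summary that sits behind such a report.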
These cover topics like differential expression, RNA sequencing, single-cell RNA sequencing, and PCA, using Bioconductor in RStudio. Up next, we're polishing up Statistics for Genomics and also creating a course about epigenetics. Some of the feedback we heard from the network is that it can be really challenging for instructors to find these materials, learn them, and be really prepared to implement them in the classroom. So we hosted a train-the-trainer activity that walked through the whole exercise and showed them exactly what it might look like to run it in their classroom. We're really encouraging, and hoping to support, the network members in plugging this into their classes and actually running the activity. We're making ourselves available through the outreach working group, and we have a Discourse forum like Galaxy does. We're also opening up office hours and providing support while they test out the activities. So I'm going to stop there. I am really, really excited about a lot of the work that the GDSCN has done. Again, if you want to read more about that and some of the other lessons learned, you can check out the link to the paper here. I think the slides are linked in the agenda. But yeah, thanks, and I'm happy to take any questions. I see your comment in the chat about creating a blog post. Yeah, I think this would be a great thing to share more widely.
Yeah, it's great. I didn't know about that, and I think we should link it out to various other initiatives and say, hey, this is also used to actually teach and train students. I have a second question. Your network, the GDSCN, is it an opportunity to give talks?
So Berenice and Berenice's students have created wastewater detection workflows and two other microbiome workflows, and they're searching for a venue to present that work. Maybe that would fit. I mean, you also do a lot of microbiome data analysis in this network. Is that correct?
We have a lot of members who are really interested in microbiome research and education, so we have a working group that's collecting various educational materials specifically on the microbiome at the undergraduate level. I think that would be a really great place to get connected.
Yeah, I can make that connection. I think that would be a perfect fit. Thank you for sharing that work.
Cool. And then, just to prompt discussion: the Mario stuff is really cool, right? You give it a slide deck with just notes and it'll automatically render it into videos. I don't love the automatic voices, the automatic reading, but it's pretty good. And then there's an index, so if I just need to watch a snippet, like under which of these 50 menus is this thing hidden, having a video is useful because then you can actually watch it play out. That technology renders from Google Slides today, and maybe Markdown; I'm not totally sure. I'm wondering if we have anything equivalent from the GTN, or if it makes sense to open up a discussion about how we could have that. Having just gone through this training at tag, where we were talking about how to launch this very complicated workflow with like a million steps involved, it would be so wonderful if we had some videos that were nicely organized, where if you just needed the 10-second version or the 10-minute version or the 10-hour version, we could do that on the fly. I'm just wondering what's available.
James had been working on some stuff a couple of years ago, and at the same time Helena was working on automatic slide and video generation for all the GTN stuff. I don't see her on the call. But yeah, I know there's interest.
And this was the one thing I was going to ask about: is there a way to merge the two efforts somehow, or have them really complement each other? It's not going to be a five-minute job, but now that they've gone through all this work to bundle it and sort through the issues, I bet there are a lot of lessons we can learn from them. It'll still be work on us, but it would be so wonderful if we could take some of the GTN resources and have those videos automatically generated.
So the GTN slides are automatically put on YouTube. You have the slides and you have the presenter comments, the comments are turned into an automatically generated voice, and, to my understanding, they are then uploaded to YouTube. So every slide deck should have an attached YouTube video from the GTN.
Okay, I guess that's very comparable to what's available through Mario. Can you script interacting with the Galaxy UI in that way, or is that beyond what's available?
We kind of can, right? We have tours, and this is one of the places we wanted tours to go eventually: you have a whole analysis that's executed in front of you that you could talk over, or whatever. It's whatever we want to build it into, though.
I see, I see.
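The idea in this exchange, a scripted walkthrough whose step text doubles as narration, can be sketched in a few lines of Python. This is only an illustration: the tour below is a made-up dictionary, its `title`, `content`, and `element` field names are assumed for the example rather than taken from any tour file, and `seconds_per_word` is an invented pacing constant. It does not call Galaxy, the GTN tooling, or any text-to-speech service; it only turns step text into a timed script that a voice generator could later read.

```python
# Hypothetical tour definition; field names and selectors are assumptions.
TOUR = {
    "name": "Check read quality",
    "steps": [
        {
            "title": "Find the tool",
            "content": "Type the tool name into the search box in the tool panel.",
            "element": "#tool-search",
        },
        {
            "title": "Choose the input",
            "content": "Select the uploaded reads from your history as the input dataset.",
            "element": "#input-select",
        },
        {
            "title": "Run the tool",
            "content": "Click the run button and wait for the new dataset to turn green.",
            "element": "#execute",
        },
    ],
}


def narration_script(tour, seconds_per_word=0.4):
    """Build one narration segment per tour step with estimated timings."""
    segments = []
    clock = 0.0
    for number, step in enumerate(tour["steps"], start=1):
        text = f"Step {number}: {step['title']}. {step['content']}"
        duration = len(text.split()) * seconds_per_word
        segments.append(
            {"start": round(clock, 1), "duration": round(duration, 1), "text": text}
        )
        clock += duration
    return segments


for segment in narration_script(TOUR):
    print(f"[{segment['start']:>5.1f}s] {segment['text']}")
```

Keeping the narration next to each step means that when one step of the interface changes, only that step's text needs to be edited before the script is regenerated.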
So maybe that's the right technology I should be thinking about, because for some parts it's like: click on this menu, go here, then go to this text box and enter this, then watch it run, and then click here to view the results. The more abstract parts can be presented through static slides, but I think it is useful to be able to see in the UI exactly where to go and what to do.
I'm trying to find Helena's repo. There's a training on how to add auto-generated videos to your slides, so I linked it in the chat. Yeah, Helena and the GTN folks put a lot of effort into making that really, really smooth for contributors as well.
Oh, cool.
So that's an amazing technology, and they are using Polly, I think. Yeah, from Amazon.
Yeah, Mario uses something very similar. I forget what it is on the back end. This is so cool.
So I think you could just join one of the GTN calls or maybe directly get in contact on the GTN channel. I've already pinged Helena with a pointer to Mario. So maybe we can learn from each other.
Yeah, I'd love to see that. I mean, maybe they're so different that they don't need to be fully integrated, but it'd be nice to compare notes; I'm sure there are lessons learned in both directions.
Should we mirror... So the lessons that you have for, I'm going to botch the acronym... are those on the GTN too, or is that content we could mirror?
Okay, there are the first couple of GTN tutorials that have the base information. Honestly, it's been in flux quite rapidly recently, so we have some catching up to do to make sure that the tutorial matches the current versions.
And where it gets more tricky is that there are many levels. There's the scientific goal of getting these amazing genomes and doing comparative genomics. But then there's the nuts and bolts of how you interact with histories, extract workflows, make sure you can pull data from buckets, write data to other places. There are so many levels upon levels that you would really need to go through. So it's hard to do it all in one event. You can either focus on the science or focus on the nuts and bolts, but trying to do everything is overwhelming at times. The audience was mostly quite comfortable with Galaxy, so there, on purpose, we focused mostly on the science, but I was just thinking it'd be helpful to have resources at both levels.
I have a question in this direction: if we have tours actually working for all these tutorials, is that addition actually needed, or are tours actually the preferable way instead of a video?
I think you do it for different types of content. If it's strictly interacting with the UI, probably the tour. But if you want to interact with the UI and then interleave a scientific discussion...
Yeah, and I think videos can be shared through different resources and linked differently, so there will always be an advantage there, while for the tours you have to be active in the UI to run them.
I mean, if you can run a tour, though, you can record a video of it, so it's not like we have to pick one or the other.
Yeah, I guess you just record the tour and then take that content, edit the video, and cut in your other chunks or something like that. I honestly don't know how to do this, so I really appreciate input. But, you know, at other events...
So, well, I did an event with Natalie where we were interacting in the UI, doing live analysis, then we flipped over to static slides to provide more scientific context, and then flipped back to the UI. It works okay, to a point. But I feel like people get exhausted if you flip back and forth too many times. You're trying to follow along, you get a little bit behind, and you're...
Yep.
I mean, that's the beauty of a video, because if you get lost you just go back. Given these types of trainings, I'm curious if people have any strategies that they really like, or maybe a strategy they tried and didn't like.
I can only second that it's a challenge. I'm thinking back to my trainings, which are more on the dev side of Galaxy, and that adds a third variable, which is the shell or the editor, whether it's an IDE or Vim or whatever. So you jump between the slides, which give you the concept, and the UI, where you operate Galaxy, and then there is the editor, where you do the actual core of the tutorial. That's a challenge, and whether it's virtual or, like at the last conference, in person, the challenges are the same. And sorry, I haven't found any way around it.
The complexity is really intrinsic. We can try to hide it, we can try to dress it up, but some of these things are just really involved and you have to get your hands dirty. I don't know, does anyone else have any positive or negative experiences they wanted to share? I'm really genuinely interested in ideas here.
Galaxy Studio.
What's that?
What we're describing: we want to have different windows and portals into the application, all in sync with each other. If you go to, like, RStudio, right, there are different windows that...
Oh, I see, you're saying RStudio, yeah.
Jennifer is right. I was about to say split screen is the only thing that makes it slightly more palatable.
Have people seen any really effective... I was thinking about this a little bit on the flight home, and it felt a little bit like trying to learn, I don't know, Photoshop or Illustrator, or some other piece of software that has a very complicated and sophisticated UI. I personally hate Photoshop. I've been unsuccessfully trying to use it for 20 years. Sometimes it's necessary; the problem is I don't use it often enough, and then a year later I try to go back and do it and I've totally forgotten everything. But I'm wondering if there are other examples, maybe from other scientific software, or maybe not even science, just other software packages. Is there something out there that people really like, that has been effective?
This may or may not be relevant. I taught Photoshop over the course of five years at my previous university job, and my students were not, how should I put it, the brightest. And Lynda tutorials were the most helpful for them. Lynda is a portal which has thousands and thousands of tutorials on all imaginable software packages, Adobe included. They would have a sequence of short videos, each describing a feature of Photoshop, and the total sequence would be, say, three and a half hours, split up into videos three, four, five, six minutes long. Those would be walkthrough videos: not a headshot of a person, but the actual UI of the software, with the narrator very slowly describing and showing what's happening on the screen. What was helpful to my students was that they would get this huge, overwhelmingly complex topic, but split up into meaningful chunks roughly five minutes long.
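The chunking approach described here can be written down as a small Python sketch: group an ordered list of tutorial steps into segments of at most five minutes and print a table of contents. The step titles and durations are invented for the example, and the greedy grouping rule is just one simple assumption about how the chunks might be cut.

```python
def chunk_steps(steps, max_seconds=300):
    """Greedily group (title, seconds) steps into chunks of at most max_seconds.

    Step order is preserved. A single step longer than max_seconds
    ends up in a chunk of its own.
    """
    chunks, current, total = [], [], 0
    for title, seconds in steps:
        if current and total + seconds > max_seconds:
            chunks.append(current)
            current, total = [], 0
        current.append((title, seconds))
        total += seconds
    if current:
        chunks.append(current)
    return chunks


def table_of_contents(chunks):
    """Return one table-of-contents line per chunk, with its length in minutes."""
    lines = []
    for number, chunk in enumerate(chunks, start=1):
        minutes = sum(seconds for _title, seconds in chunk) / 60
        titles = "; ".join(title for title, _seconds in chunk)
        lines.append(f"{number}. {titles} ({minutes:.1f} min)")
    return lines


# Made-up steps with durations in seconds.
STEPS = [
    ("Log in to the workspace", 90),
    ("Launch the analysis environment", 150),
    ("Upload the reads", 120),
    ("Run quality control", 240),
    ("Inspect the report", 60),
]

for line in table_of_contents(chunk_steps(STEPS)):
    print(line)
```

The printed list then serves as the overview, and each entry maps to one short video that a learner can go back to on its own.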
So they could pick and choose and then go back to what they did not understand. And needless to say, the assignment was never "watch and learn these three hours of videos"; the assignment would be "learn this and this and this technique; here are the three hours for context, but these five videos are what you need to figure out how to do this." So it's the ability to go back to exactly that part. That's something I found to be much, much more usable than, say, Coursera transcripts, where you can go to a certain part, but it's not as explicit as when you have an actual table of contents split into three- or four-minute chunks.
I like that. So you divide up whatever it is, two hours of content, into five-minute chunks. The nice part is that the table of contents becomes the overview: oh yeah, these are the 20 things you need to do, and we'll walk you through each one in five-minute intervals. Five minutes is a digestible amount of content that you can understand and work through. Yeah, I think that's a really good framework that could be effective. I just totally got the sense that some people were really trying to pay attention and follow along, and they got lost and got frustrated, so we're trying to get ahead of that. Plus, those little snippets are easy to share online, so if someone wanted to do this they could do it asynchronously when they had a chance. So Lynda.com now redirects to LinkedIn Learning. I guess LinkedIn, a.k.a. Microsoft, bought them out, but I'll check that out. Does anyone else have any favorites, other sites that they really like that have been successful?
Kind of different, but for bioinformatics software I like the Seurat vignettes they do.
They have a series of, I think, like PCA analysis, or single-cell RNA or ATAC-seq, and it's very focused, direct R code, explaining why you do it that way.
Is that built out of Bioconductor, or I guess an R vignette, or is there technology on top of that?
I believe it's built out of an R vignette.
Okay.
It's been a little while since I've done them. I remember when I was learning it, it was nice because it was super focused, super narrow in scope. It's a little similar to how we do some of our tutorials for bioinformatics, except rather than the entirety of a single-cell RNA-seq run, it's just one really narrow focus that you dive into, and it's like, here's the exact code, what's happening, and why you're doing it, and the plot you get out of it. It's a quick way to learn and go through it.
Cool. Yeah, that's a good reminder. A lot of the Bioconductor packages have really nice vignettes. I've used those all the time; they walk you through an example analysis really step by step, and from that you can really understand and follow along.
I'd be curious to know, and I don't know if this falls into the same category, but has there ever been any discussion among the group of doing, like, a Coursera or Udemy course for some of the basic Galaxy... Okay, go ahead.
It wasn't Galaxy-specific, but there was a JHU biostat, whatever... James did a course with a whole series of stuff that used Galaxy and things like that. And I mean, that was, or it is, enormously successful; millions of students have gone through that.
It was part of the Genomic Data Science sequence on Coursera; it was one of, I believe, eight courses. I suppose it got out of sync; it got outdated very fast. My guess is keeping a Coursera course up to date is a challenge.
Most of the courses in that Genomic Data Science sequence, which was put together by Johns Hopkins, are quite old, but some of them hold up. Galaxy would never... excuse me, Galaxy would never last for... Sorry. Keeping it for more than two years would be a challenge, so it would need to be constantly updated.
Well, that's why I think some sort of automatic rendering system would be ideal. The parts that are the same, same API calls, same entry points, would work, and then you could incrementally update just the parts that needed to be updated, but it would always be... But I understand, that's no small task to get there, but it's something to aspire towards.
With Coursera or Udemy or something like that, you have to keep it up to date because the consequences are severe. The Galaxy course was the lowest rated one among all the eight, and the difference was significant. I don't mean statistically, but you could see it: it was three-something stars versus 4.9-something stars for the rest. Why? Only because it was outdated, and eventually they took it off.
I think with that one we had some issues with the cloud resources too. Yeah, I remember people starting instances and maybe not knowing how to shut them down, that kind of thing. So that was rough.
We didn't specify properly that the instances had to be shut down after the course was run, I think, something like that, or that you paid for them. I don't even remember.
Is it possible to come up either with a Galaxy course at one of the universities that we're in, or a bioinformatics course that uses Galaxy as a tool for all the chapters, if you will? I mean, I like online courses better because there's a bigger audience, but maybe we can start from Penn State, Hopkins, Freiburg, somewhere, with some course and then build on that. It probably has to be like,
I don't know, a 500-level course or something, because it's not a basic science or anything that we want to teach, but more the tool.
So Galaxy is used a lot in teaching. If you look at the TIaaS stats, it's amazing how many courses Galaxy is used for in teaching, for all kinds of different students. So if you're interested in that, I just recommend looking up the TIaaS stats, and then you can get access to the EU one if you like. It's really impressive how many people do that. And here in Freiburg we also have week-long courses that we offer for our students. So if you're interested, get in touch, but we more or less use GTN material all the way down and offer a variety to our students.
Yeah, thanks for that, and thank you, Jordan. We ended up using exactly that service in EU for the VGP. Things are just a little bit in flux on Main here in the United States, so that was the platform we directed everyone to, and it was super stable. That worked out spectacularly.
Happy to provide that. Super cool to see that.
And if Kevin meant having a Galaxy-designated course, like a regular university course, then, at least thinking back to my previous life, that might be a little more complicated, because that has to do with a whole lot of formal course proposals and how the course will fit into a specific program of study. And in general, tool-specific courses, at least at the university where I used to teach, were not looked upon favorably. A more conceptual approach to the material was preferred, because how do you fit in a course which is, say, "learn this compiler" or "learn this particular tool," rather than "learn these so-and-so concepts"? So that would probably be a challenge.
I feel like a special topics course... those are more flexible and less restrictive in terms of what you can teach.
But if it has to be one of these official, up to 400-level courses, that takes a couple of years even to get into the...
Yeah, even special topics would usually be language-related, like C++; that would be a special topics course, and usually it would not even count, at least at my university, for the CS major, because it's too specific. This is something you would assume majors can learn on their own.
I have one more question, to get back to this. Would it be possible, regarding the automation of screencasts, to use the tours, small tours, and automatically record the videos for those? And maybe even, as a next step, if we annotate the steps right, automatically also add the audio? So, for starters, purely automatically generated short tutorials from tours: is that something which makes sense, or is it feasible?
Yeah, I mean, that's what I was thinking we could do. We would have to annotate the tours a lot more than they currently are, because we have just a little text that goes in the box and you'd want more explanation there. But if we were actually going to use these, then that'd be the way to do it. We could double-dip and use this for testing as well, right? If you had a particular analysis that you had perfectly specced out, and all of a sudden it didn't work, you'd know something was going on.
Okay. I feel like we have many of the ingredients; it's just a matter of piecing them together. I realize this is like years of work.
I mean, we could bite chunks off of it, though.
Yeah. Over the next... That's definitely the right approach, right? Let's aspire towards it, and then each quarter we get a little bit further down the road. But it's crystal clear to me that we've got to do this, right? Yeah, what I hope and think is, if you're just running one or two tools, great. That's straightforward.
But I like to think our Galaxy user base is getting more sophisticated over time, where the workflows are getting deeper and the types of operations they want to perform are just getting more sophisticated. So I really think we've got to break it up into these little bite-sized nuggets, and it sounds like tours are exactly the right framework to get started on animating this automatically. I'm sorry, I didn't mean to blow up the whole conversation around this, but it's really helpful to me. So there's the broader discussion about the GDSCN and trying to support new users. I really like your idea of workflows that are ready to go on microbiome-related activities. I think that would be of great interest. With the network, there are times when we're just trying to give information, but a lot of it is also this sort of matchmaking. It can be intimidating to work on a new topic and a new technology without someone to guide you through it. So I also thank you all in advance. A few of my lab members have met some of the GDSCN members, and that's gone spectacularly well. People are just so appreciative of all the help and support. I think, John, weren't you at one of the recent dinners? It's just really meaningful and impactful to do that sort of activity. I think that's what we're all about. Do you guys have any comments or thoughts about, I don't know, teaching, or maybe new workflows that are available, or new technologies under the hood that might be relevant that we should all be aware of?
Okay, well, I know we have this mortgage work coming up, and what were the final dates for that again?
I think they're aiming for end of May, the week of the 22nd. I don't know if that's been finalized yet.
Maybe that's a good target to take a half step down this path, where we could create a few, we'll start really modest, I don't know, three five-minute videos on maybe some of the newest features. You could take a baby step down this path to get some hands-on experience. Something like that could be a really good goal, I think. Obviously, if you do more, that would be great, but I just feel like we're going to have to make incremental progress on this. Thanks for your note, Marissa.
Okay, thanks. That was everything I had to share. Natalie, is there anything else you wanted to go over today?
No, I think that covers it. Thanks, everybody, for joining today's community call. The next one will be on February 9. We're going to have the release testing team share the outcomes of testing the 23.0 release. So thanks, everybody, for joining, and we'll see you in a couple of weeks. Thank you.