Okay, so thanks for the introduction, Natalie. As you said, I am part of the Vertebrate Genome Lab at Rockefeller University, and we are one of the production hubs for something you'll hear about in a little bit, the Vertebrate Genomes Project. We've been using Galaxy to do all of our reference genome assemblies, so today I'm going to tell you about our experience with it: our pipeline in general, and what it took to turn that pipeline into Galaxy workflows and bring it up to the scale our very large project demands. Before we get into that, I wanted to give some background on my own bioinformatics journey, for context on why I personally vibe with Galaxy's mission beyond how the VGP uses it as a partner. Next slide. My master's project was population genetics of field mice, which meant all the wet and dry lab work was done at a field station miles out from New York City. Our computational infrastructure was a single desktop iMac (I'm not sure what model it was, because we called it the trash can Mac), job scheduling was a Post-it note, and there was a lot of real-life nice-value negotiation among five or six lab members trying to get their ddRAD-seq and other short-read analyses done on the same machine in the same time span, all trying to graduate at the same time. I've gone from that to my current job, where our infrastructure is a veritable lasagna rack of nodes dedicated to generating genome assemblies.
On the bottom right here, and you'll learn a little more about assembly graphs later on, we are using a newer technology called HiFi reads, which are both long and accurate. In my experience that's a fairly recent thing that came about around 2020, and it blew my mind coming from the Illumina short-read world, but it has really revolutionized how we generate reference genomes of high quality and high contiguity. As I mentioned, we are a production hub for the Vertebrate Genomes Project, and the goal, as you might expect, is to generate a reference genome for every vertebrate. This project is part of a large blooming of initiatives that have recently sprung up, generally dedicated to creating reference genomes as resources. The VGP is in fact part of the Earth BioGenome Project, which is dedicated to generating reference genomes for all eukaryotic life on Earth, also exactly what it says on the tin. That's a big goal, so we've broken it down into stepwise phases of getting representation across the vertebrate tree of life. We're wrapping up phase one, hopefully this year, which is getting an ordinal representative from about 260 orders across birds, reptiles, amphibians, fish, and mammals. We also have a signifier for sharks, even though they're not their own clade at that level. In addition to just getting a reference genome for each of those representatives, we want them to be of high quality, to facilitate the sorts of conservation genetics and population genetics studies that really require high-quality reference genomes. For our lab, as one of the production hubs, here is what the process looks like.
From start to end, we first need to identify the sample source, and then do all the paperwork required to get that sample from wherever in the world it is over to us in New York City. The arduous task of data generation is undertaken by our very talented wet lab team. For every sample we generate at least two types of sequencing data: every sample has PacBio HiFi reads and Hi-C long-range information, and if the DNA is of good enough quality we also generate Bionano optical maps. The HiFi reads are the backbone of the assembly, because they make the actual contigs, while the Bionano optical maps and the Hi-C sequencing provide scaffolding information to help piece those contigs together into something approaching chromosome length. The part I'm responsible for, and the part we use Galaxy for, is the assembly step. We produce a high-quality draft assembly that then goes out for manual curation, to resolve parts the algorithms may have missed or mis-joined. After that is finished, the assembly gets submitted to NCBI, or for some parts of the VGP to the ENA (the European Nucleotide Archive), so that these genomes are actually available as public resources for others to use in their studies. Before I talk about the assembly pipeline, I have a little slide on what an assembly even is. For us, at least, we start with long reads that, being PacBio HiFi reads with roughly 99% accuracy, have essentially perfect overlaps when they come from the same region of the genome. hifiasm finds these near-perfect overlaps and assembles the reads into contigs we're very confident in.
Then you have these pieces called contigs, and we have orthogonal technologies such as Bionano optical maps, which label long strands of DNA at predictable motifs. Since you know where those motifs are on the labeled DNA, and also in your contigs because you have the sequence, you can use the maps to orient and order your contigs and join them together, because you know where they would sit on that larger map. We also use Hi-C, which is a form of chromatin conformation capture. It fixes chromatin as it looks inside the actual cell, and since DNA is linear, when it's all scrunched up it tends to interact mostly with sequences from the same piece of linear DNA. So sequences on chromosome 1, for instance, are much more likely to interact with other sequences on chromosome 1 than with sequences on chromosome 10, because those are in another scrunched-up ball. These are the scaffolding technologies that come later, and as I mentioned, they really help bring the contigs up to a chromosome-scale reference. I'm not going to talk about it as much in this presentation, but there is also purging, because the assembly process is imperfect (a lot of the time it's imperfect) and we can end up with false duplications in one of the haplotypes, where the same region of the genome is represented twice. That happens because the two haplotypes look different enough there that the assembler thinks they are two totally different regions of the genome, when really they represent the same gene or the same locus. Purging takes care of that by removing the duplicate, and it usually does so by looking at coverage analysis and other information from reads being mapped onto your primary assembly. At this point, I have a little example snippet from an assembly graph.
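The coverage intuition behind purging can be sketched in a few lines. This is a toy illustration of the idea, not the actual purge_dups algorithm; the contig names, coverages, and threshold below are all invented. Contigs whose mean read coverage sits near half the diploid coverage peak are suspicious, because that suggests only one haplotype's reads map there, i.e. the region may already be represented elsewhere in the assembly.

```python
def flag_candidate_duplicates(contig_coverage, diploid_peak, tol=0.25):
    """Flag contigs whose mean coverage is near half the diploid
    coverage peak, a hint that only one haplotype's reads map there,
    i.e. the region may be duplicated across contigs."""
    half = diploid_peak / 2
    flagged = []
    for name, cov in contig_coverage.items():
        if abs(cov - half) / half <= tol:
            flagged.append(name)
    return flagged

# Toy example: diploid peak at 60x, so contigs near 30x look haploid.
coverage = {"ctg1": 58.0, "ctg2": 31.0, "ctg3": 29.5, "ctg4": 61.2}
print(flag_candidate_duplicates(coverage, diploid_peak=60.0))  # ['ctg2', 'ctg3']
```

Real purging tools combine this coverage signal with self-alignment of the assembly before removing anything.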
The assemblers don't natively put out FASTA files at this point; they put out an assembly graph. Each node here represents sequence, and the edges represent potential overlaps. When we create the actual FASTA file that people can use downstream, the assembler walks this graph, going, say, along this path. This path is where the genome is homozygous, meaning it looks the same on both the maternal and the paternal haplotype, so both assemblies will have it. Then there is some variation here, and the assembler has to pick a certain way to go, and keep going, making a choice every time it reaches a little bubble like that. Sometimes it reaches a really tangled, complicated region and has to break, and that's where that contig ends and a new contig starts. These bubble regions are a bit of a bone of contention in assembly, because the assembler doesn't know which variant belongs to the maternal haplotype and which to the paternal one if you don't know what the parental haplotypes look like a priori. The classic approach for fixing that has been simply sequencing the parents, but I'll talk later on about a newer approach that uses the same Hi-C information to help identify that these two variants actually sit on the same haplotype together, which helps phase these contigs properly. So I'll start talking about our actual pipeline now, and it has quite a few steps to it. The first step isn't actually contigging; it's a quality control step, just to make sure the data going into the pipeline matches your expectations, because at least for us, since we're sequencing vertebrates only, genomes have pretty predictable patterns based on what clade they're in.
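To make the graph idea concrete, here is a minimal sketch of reading the two core GFA record types that assemblers emit: `S` lines (segments, the nodes carrying sequence) and `L` lines (links, the overlap edges). The tiny bubble graph below is invented for illustration; real hifiasm GFAs carry extra tags and much longer sequences.

```python
def parse_gfa(lines):
    """Collect segments (nodes) and links (edges) from GFA records."""
    segments, links = {}, []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":            # S <name> <sequence>
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":          # L <from> <orient> <to> <orient> <overlap>
            links.append((fields[1], fields[2], fields[3], fields[4]))
    return segments, links

# A toy "bubble": utg1 branches into utg2a/utg2b, which rejoin at utg3,
# mimicking a heterozygous site between two haplotypes.
gfa = [
    "S\tutg1\tACGTACGT",
    "S\tutg2a\tTTTT",
    "S\tutg2b\tTTAT",
    "S\tutg3\tGGGG",
    "L\tutg1\t+\tutg2a\t+\t4M",
    "L\tutg1\t+\tutg2b\t+\t4M",
    "L\tutg2a\t+\tutg3\t+\t4M",
    "L\tutg2b\t+\tutg3\t+\t4M",
]
segments, links = parse_gfa(gfa)
print(len(segments), len(links))  # 4 segments, 4 edges
```

Walking the graph then means choosing one branch at each bubble (utg2a or utg2b here), which is exactly the decision the phasing approaches below are trying to make correctly.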
Birds are usually around one gigabase pair long; they're not going to be super big. Mammals are two to three gigabases, and amphibians can really run the gamut, from smaller frogs at one to two gigabases to bigger-genome frogs at eight gigabases. So it's a way of going into the assembly process with set expectations based on the data you're generating, and it double-checks that there's no obvious sample swap. For instance, if you think you're sequencing a bird but GenomeScope tells you the genome length is actually something like six gigabase pairs, you might want to look into what happened and make sure the data is correct. Beyond genome length, GenomeScope also gives you things such as coverage, so you know how much sequencing coverage you have of your genome; we usually try to go into the assembly process with at least 30x of PacBio HiFi data. There are various other statistics, such as the percentage of unique sequence in the genome, which can help you know how repetitive it is. A lot of the larger amphibian genomes tend to be very repeat-heavy, which is a pretty distinctive pattern, and it can also help you predict how much of a pain the assembly process is going to be for that sample. The next step after getting this general picture is the actual contigging step, which is what I mentioned before with hifiasm. Just a few terms to get out of the way: the real genome of the F1 individual we're sequencing will have a maternal haplotype and a paternal haplotype. A lot of approaches in the past have essentially collapsed them, because the assembler is agnostic to which haplotype a sequence is coming from. Others get around that with pseudo-haplotype assemblies that create a primary and an alternate assembly.
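The sanity check described above amounts to comparing a GenomeScope-style size and coverage estimate against clade expectations. Here is a rough sketch of that logic; the size ranges, clade names, and function are illustrative assumptions, not the pipeline's actual thresholds.

```python
# Rough expected haploid genome sizes in gigabases, by clade
# (illustrative values only, drawn from the ranges discussed above).
EXPECTED_GB = {
    "bird": (0.9, 1.5),
    "mammal": (2.0, 3.5),
    "amphibian": (1.0, 8.5),
}
MIN_HIFI_COVERAGE = 30  # we aim for at least 30x PacBio HiFi

def qc_check(clade, est_size_gb, est_coverage):
    """Return a list of warnings if the estimates look off for the clade."""
    warnings = []
    lo, hi = EXPECTED_GB[clade]
    if not (lo <= est_size_gb <= hi):
        warnings.append(
            f"size {est_size_gb} Gb outside {clade} range {lo}-{hi} Gb; "
            "possible sample swap?")
    if est_coverage < MIN_HIFI_COVERAGE:
        warnings.append(
            f"coverage {est_coverage}x below {MIN_HIFI_COVERAGE}x target")
    return warnings

print(qc_check("bird", 6.0, 35))    # flags the suspicious 6 Gb 'bird'
print(qc_check("mammal", 2.7, 32))  # no warnings
```

The point is simply to fail fast before burning compute on an assembly of the wrong sample or of underpowered data.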
So you'll have perhaps a block of maternal sequence here, but then it will switch to a block of paternal sequence. This results in an overall sequence that wasn't actually present in the genome, because of that switch: these markers should travel with these markers, and vice versa. This can happen when a pseudo-haplotype assembly is just trying to piece things together; it sees that two pieces overlap, so they go together. As I mentioned before, one way to get around this is to actually sequence the parents, so you know what the maternal and paternal haplotypes look like. That's called the trio binning approach. It's been done in the past with hybrids, which make it a lot easier because the two haplotypes usually look much more different from each other, but you can also do it with just a child and its two parents. The approach works by getting short-read sequences of the parents and finding markers that are present only in the mother and the child, and markers that are present only in the father and the child. That lets you bin the reads from the child into ones coming from the maternal haplotype and ones coming from the paternal haplotype, and assemble them separately. This is really the ground-truth, cleanest approach we have for generating haplotype-resolved assemblies that don't show that switching of false sequence where the maternal and paternal haplotypes are interleaved with each other. It's a very good approach, and it's the standard for haplotype-resolved assemblies, but it's also pretty prohibitive, especially for wild species, where you have one sample and you're not necessarily going to go find the parents; you can't be sure of them, and you may not even be able to get sequence from more than one individual. It can also be cost-prohibitive, because you have to do two additional library preps and sequencing runs.
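In spirit, trio binning reduces to the k-mer test sketched below. This is a simplified illustration of the idea, not the actual implementation in any assembler; the k-mer sets, reads, and tiny k value are made up. A child read carrying maternal-only markers is binned to the maternal haplotype, and vice versa.

```python
def kmers(seq, k=4):
    """All k-length substrings of a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def bin_read(read, mat_only, pat_only, k=4):
    """Assign a child read to a haplotype by counting hits against
    k-mers unique to each parent; ties and no-hits stay unassigned."""
    hits = kmers(read, k)
    m, p = len(hits & mat_only), len(hits & pat_only)
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"

# Toy parent-specific marker sets (in reality: millions of k-mers from
# parental short reads, with k-mers shared by both parents removed).
mat_only = {"ACGT", "CGTA"}
pat_only = {"TTGC", "TGCA"}
print(bin_read("AACGTAGG", mat_only, pat_only))  # maternal
print(bin_read("ATTGCAGG", mat_only, pat_only))  # paternal
```

The two binned read sets are then assembled separately, so each resulting assembly contains only one parent's sequence.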
It's also a bit more computationally intensive, so there is a recent approach that uses the same Hi-C scaffolding reads to phase the contigs instead. We have seen pretty good success with that so far, and it's our new default. I'm going to show you some of the results with a zebra finch. These are called blob plots. On the left, we have the results from the trio phasing approach with the zebra finch: we assembled the genome of the zebra finch child knowing what the mom and dad look like, so we were able to fully separate out the haplotypes. In the middle we have the pseudo-haplotype approach I mentioned before, which can show those switches between maternal and paternal blocks of sequence. And on the right-hand side, we have the results from the Hi-C phasing approach. Just a quick rundown on how to interpret these plots: each blob is a contig in the assembly, and the blobs are colored by which assembly they come from. For instance, in the trio phasing plot, the red blobs are all contigs from one haplotype assembly, call it assembly A, and the blue blobs are all contigs from assembly B. The axes are parental marker counts: say the x-axis is maternal markers and the y-axis is paternal markers. That means all the blobs from assembly A have only maternal markers and no paternal markers, and all the blobs from assembly B have only paternal markers and no maternal ones. This is what a properly phased set of haplotypes looks like. Most importantly, there are no blobs in the center, because a blob in the center would mean a contig has markers from both the maternal and the paternal marker sets, and that means a switch is happening. And that's exactly what we see in the pseudo-haplotype approach.
There are a lot of these larger blobs in the middle that show, say, 500,000 maternal markers and 1,000 paternal markers. That means switching is happening, and switches are not real sequence. That's what we're trying to avoid by using trio phasing or Hi-C phasing. Now that we know what the ground truth looks like, and what the previous approach looked like, we can compare them to the Hi-C phasing approach. You'll notice there are far fewer of these blobs in the middle with switch-error contigs; the contigs are properly phased. You'll also see the colors alternating between blue and red along each axis, whereas with trio phasing each axis was all from one assembly. That's fine, because even though Hi-C phasing can tell you that two sequences come from the same run of DNA, it doesn't know whether that DNA belongs to the mother or the father. So it can't put the haplotypes into properly labeled mother and father files, but it can ensure that the contigs it makes contain only maternal or only paternal sequence. And that's exactly what we see here, because all the blobs sit right along the axes: each contig contains only maternal markers or only paternal markers. It's a really revolutionary upgrade to genome assembly, and since we do have Hi-C data for all the species we're generating at our lab, we have switched to using this approach, and it's the current default in the Galaxy pipeline. So that's a lot of words about the contigging step. There is also scaffolding happening afterwards, which takes the contigs, which are much longer than Illumina-era contigs but still separate from each other, and resolves them toward chromosome length using additional scaffolding information.
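The phasing check behind those blob plots boils down to counting parental markers per contig. Here is a hedged sketch of flagging the off-axis blobs that indicate switch errors; the contig names, marker counts, and threshold are invented for illustration.

```python
def flag_switch_errors(marker_counts, min_minor_frac=0.05):
    """A contig with a meaningful fraction of markers from BOTH
    parents (an off-axis blob) is a candidate switch error."""
    flagged = []
    for contig, (mat, pat) in marker_counts.items():
        total = mat + pat
        if total and min(mat, pat) / total >= min_minor_frac:
            flagged.append(contig)
    return flagged

# Toy counts: (maternal markers, paternal markers) per contig.
counts = {
    "ctgA": (500_000, 120),     # cleanly maternal: sits on the x-axis
    "ctgB": (90, 300_000),      # cleanly paternal: sits on the y-axis
    "ctgC": (500_000, 60_000),  # both: off-axis, switch suspected
}
print(flag_switch_errors(counts))  # ['ctgC']
```

A properly phased assembly would yield an empty list here, which is the visual "all blobs hug the axes" pattern described above.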
We have tools like Bionano hybrid scaffolding and SALSA. As in the image I showed earlier, the Bionano maps help you orient and order the contigs according to the motifs Bionano detects on the long optical molecules, and the Hi-C data gives you long-range interaction information, operating under the assumption that sequences from the same linear strand will interact more often with each other than with sequences from another linear strand. The QC we use for this is called a Pretext map; it's basically a heat map. On the left-hand side we have before Hi-C scaffolding, and on the right-hand side after Hi-C scaffolding. This is a pretty well resolved one here: each box represents one sequence, say chromosomes 1, 2, 3, 4, 5, and the plot shows that chromosome 1 is largely interacting within itself, showing intrachromosomal interactions and not interactions with any other sequence, because cross-sequence signal might indicate that pieces need to be joined. That was the case here in the before picture: you'll see that sequence 5 is interacting with sequence 11, which isn't supposed to happen if these were truly separate pieces. The Hi-C scaffolding process joins them to make these larger contiguous parts, but there's still some work left to be done afterwards. There's still some off-diagonal signal, and the chromosomes also need to be named, sometimes based on size or on synteny with closely related species. That's what happens when these draft assemblies produced by our pipeline get sent out for manual curation. So that's a lot of words about the nitty-gritty of the pipeline itself; the way the pipeline is actually implemented in Galaxy is thanks largely to Delphine's work porting this entire pipeline into workflows.
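The "off-diagonal signal" intuition from the Pretext map can be sketched numerically: from a scaffold-by-scaffold contact matrix, pairs whose cross-signal is high relative to each scaffold's own intra-scaffold signal are candidate joins. The 3-scaffold matrix, names, and ratio threshold below are all invented for illustration.

```python
def candidate_joins(contacts, names, ratio=0.2):
    """contacts[i][j]: Hi-C read pairs linking scaffold i and j.
    Flag pairs whose off-diagonal signal is a large fraction of the
    weaker partner's intra-scaffold (diagonal) signal."""
    joins = []
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):
            weaker_diag = min(contacts[i][i], contacts[j][j])
            if contacts[i][j] / weaker_diag >= ratio:
                joins.append((names[i], names[j]))
    return joins

# Toy matrix: scaffold_5 and scaffold_11 share strong off-diagonal
# signal, suggesting they belong to the same chromosome.
names = ["scaffold_1", "scaffold_5", "scaffold_11"]
contacts = [
    [1000,   10,   12],
    [  10,  800,  300],
    [  12,  300,  900],
]
print(candidate_joins(contacts, names))  # [('scaffold_5', 'scaffold_11')]
```

Real Hi-C scaffolders also use the distance decay of contacts within a scaffold to decide join order and orientation, which this sketch ignores.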
I know I mentioned only a few programs before, but there is actually much more software involved in the pipeline, because besides all the data generation and data processing steps, we also have all the quality control run automatically within the pipeline. Each workflow (this is an example of a hifiasm one, I believe) will run hifiasm and then run all the QC on it afterwards, such as BUSCO, Merqury, and basic contig statistics, and have all the outputs ready for the user to view in their Galaxy history right then and there. Our pipeline is organized by that flow chart we saw before into modular steps: it starts with the QC, then goes to contigging and then the scaffolding steps, and they can be mixed and matched depending on whether you have Bionano data or not, and whether you have parental information or not. So it's really adaptable to any user's situation based on data availability. This pipeline is what I've used to make 50-plus assemblies so far for the VGP. The species we've sequenced so far in phase one really run the gamut across the IUCN designation scale, from critically endangered species like the Australian corroboree frog, which is being decimated, like many other species, by chytrid fungus, to least-concern species that are still important for our ordinal representation.
These really highlight our goal of contributing toward conservation resources by creating the reference genomes you need for molecular population genetics studies. The more contiguous the reference is, the more it can enable studies such as selective sweep analysis; you can detect longer runs of homozygosity, and you can also identify population-specific structural variants, which you can't really do with a highly fragmented reference genome. Of these roughly 50 assemblies, about a fifth were run on the Galaxy EU server, and I'm very thankful to the Freiburg team for letting me use up a lot of their terabytes and a lot of compute for a while, before we were able to get our own instance running within the VGL's local infrastructure. I currently maintain an instance that takes advantage of the compute we have within that server rack: 28 nodes of 32 CPUs with about 400 GB of RAM each, and one big node of 64 CPUs and 1.4 TB of RAM, which is where the really large assemblies go, such as that corroboree frog I mentioned before. In addition to speeding up the process, because we can use the resources we bought for genome assembly to actually do genome assembly, this also helps me teach lab members in my own lab how to do assembly and how to actually analyze the data they're generating. I think that goes a long way toward helping them feel ownership over the very painstaking work they've done generating the HiFi, Bionano, and Hi-C data that enables these high-quality reference genomes.
It also helps us act as a service center on campus, because sometimes we'll have collaborators outside the VGP but on the Rockefeller campus who want to sequence and assemble their own genome. Instead of them sending their study species to us, having it disappear into a black box for a bit, and then just getting back an assembly file to run with, they generate the data and then assemble it on their own on this Galaxy instance, and I can provide customized, one-on-one help. Since I'm the admin, I can actually look at their history, and I don't have to go through the peer-to-peer bioinformatics experience of trying to figure out how people organize their own directories, because everything is laid out very nicely in a history and standardized in that way. That leads me into how much I enjoy using Galaxy for teaching. The VGP has two high-quality tutorials available on the Galaxy Training Network: a short version focused on using those workflows I mentioned, and a longer version that goes more in depth. The longer one is really for self-teaching, and it explains why we use some of the parameters we use and why we make some of the choices we make in the pipeline. They're both available on the GTN, and they're also what we use when we teach workshops on genome assembly. These workshops have ranged from smaller in-person tutorials we run on campus to larger workshops of around 40 attendees, like this one run with Physalia, with participants across the world. Those had to be run through TIaaS, with resources generously provided by Galaxy EU. It was super helpful to actually be able to see where everyone was at, so we could all coordinate and keep the course moving at the same pace.
It helped mimic the experience of teaching a workshop using my local instance, where I can also see where everyone's job is at and make sure everyone is following along properly. And if not, we can see where people are getting stuck and where I really need to drill in, go back, and make sure people understood. That's personally my favorite part of using Galaxy: the really streamlined experience of teaching people how to do bioinformatics. My experience in grad school was a lot of peer-to-peer teaching, and when you tried to teach someone bioinformatics, it would end up being teaching them how to use the command line, which is a different thing and, in my experience, way more frustrating and, worst of all, discouraging. My cohort was a lot of ecologists who wanted to answer ecology questions; they ended up getting sequencing data and having to learn the command line, and while it's not insurmountable, it is a barrier and a point of frustration for many people who approach this field completely new to sequence analysis. I just really love that Galaxy helps me teach people without having to first teach them a bit of computer science. I might have sped through that a bit, but that was my presentation. I want to give a big thank-you to all the members of my lab who help generate the data these reference genomes need to exist, and also to the Galaxy team, who put up with my very silly questions and constant pinging, and helped us get this workflow up and running at the level it is now, to enable all those genomes you saw before. I have a QR code you can scan for a Linktree, which will take you to all of our GTN tutorials, workflows, more information, and so on.
And also, if you know a genomics core facility manager who wants to manage a crew of genome assemblers in New York City, let me know. We're nice, I promise. Thank you all for your attention.

Thanks so much for your amazing presentation. Now I'm going to open up the floor. Does anybody have any questions? I have one. So, super great presentation. I'm new to the Galaxy team; I'm a communication specialist, and I just started a couple of weeks ago. I'm also getting my PhD at Florida International University, and I'm currently doing ddRAD and will be transitioning to Hi-C, but I work with invertebrates. Is there a team similar to yours that does invertebrate studies?

There's the Sanger team, which does the Darwin Tree of Life, sequencing all the life in the British Isles. They have more experience with invertebrates than us, especially on the actual extraction side. But I have run our pipeline, with some modifications, on mosquitoes and spiders, and it has worked fine. There are some checks that look for genes expected in vertebrates, so we change that sort of thing, but it's a good starting point, I would say, and I think the GTN tutorials have a lot of good knowledge as a reference if you want to read up on that.

I just went to your QR code and saved it to reference later.

I'm glad people use it!

I'm super excited about this. My lab right now is building the crab tree of life, and this reminded me a lot of that. We've mostly done single-gene sequencing, for funding reasons, but my project will be transitioning to Hi-C and whole-genome sequencing, so I'm super excited to bring that in. Thank you so much for sharing; this was great.

I'm excited to hear more about the crab tree of life in the future.
Oh yeah, a publication should be coming out soon, and they're presenting (well, I'm not going, but they are) at the crustacean genomics conference in New Zealand, so it'll be out soon. Okay, thank you. Yeah, thank you.

That was really cool. I wonder, how much effort was it for you to set up the Galaxy instance and have it churn through these many months of data?

The setup was mostly just a lot of learning on my part, but the admin training really helped; I attended that last year at the Smörgåsbord admin training. So it was a lot of self-teaching on my part and some coordination with IT, because our HPC team, they're great, but they keep kind of a hands-off approach. Everything is in one user space I have to manage; I think they installed about four packages for me, and then I had to install Postgres locally. There was something. It happened, though. It was a learning experience, and I'm happy it happened.

Very cool to hear that the admin training gets you on a good path.

Yeah, definitely.

So all this data goes into GenomeArk?

Yes. Oh yeah, I forgot to mention that the assemblies are all in GenomeArk and publicly available through the bucket. Part of the workflows that Delphine designed are also export workflows that upload all of the final assemblies as well as all the intermediates, in case you want to do an analysis such as comparing scaffolding methods; then you can just start from the contig file instead of the scaffolded one. They're all in GenomeArk in a very predictable structure and publicly available to use pursuant to our data use policy, which is basically just don't publish on it before the collaborators do. We're always open to emails about that in case anyone has a specific question.

And then I have a last question. What's the thing that frustrates you the most about Galaxy?

Let's see.
You're really putting me on the spot with that one, huh?

Well, that's good, then.

I mean, it's very personal, but I guess it was learning how to do all the sysadmin parts of it. When I talked to the HPC guys about it, they were familiar with Galaxy from a couple of years ago; they had looked into getting it for the campus, but then they realized that would be a full-time position, so they didn't do it. That was my main hurdle.

I mean, you've proven that's not the case, right?

Yeah.

But is this more a failure of documentation?

It wasn't a failure; it was just hard. I haven't run into any major failures yet.

Thank you for an awesome presentation. I have about a million questions, but I'll try to keep to just a few. On a different note, you mentioned you've been teaching and have run these tutorials; could you talk a little more about how that goes for you? Any thoughts or comments on how to simplify them, or maybe the other way, make them more detailed and take away some of the magic? You tell me: what, if anything, would make that process even easier?

Are we using the tutorials for teaching?

Both.

Okay. We're using them mostly for teaching workshops with an instructor, not asynchronous workshops. We've been doing them by having an hour or so of lecture on the theory and concepts behind a step, and then a bit of practical: after we've just learned how hifiasm and assembly graphs work, the students can actually run hifiasm on their own, get out that GFA file, and look at it in Bandage to see what the assembly graph looks like.
I like the approach of having lecture interspersed with practical parts, to give students a bit of a break from the onslaught of material and let them actually use that knowledge practically. That's my opinion for synchronous teaching. With regard to self-teaching off these tutorials alone, everyone self-teaches a little differently, but I think the current long tutorial, with practical parts to follow along, is a good approach.

And for the synchronous version, it sounds like you're not running the whole workflow; you just run individual steps?

Yeah, we do. The way I had it set up, we alternate. For the contigging and the Bionano scaffolding, which are fairly quick, the students run the steps themselves. For the more involved parts, like purging, which is, as you know, a whole bunch of steps as a pipeline, or the prep for BWA alignment and the Hi-C scaffolding, I'll have them do it once, and then I'll show them how a workflow can do it faster, which also brings in the ideas of workflows, pipelining, and reproducibility.

We've been talking about potentially having a really simplified interface where, and we don't have it today, but you can imagine, you just drag and drop things in.

I think the answer to that question depends on who you're asking. It would be good for people who are a bit newer to it, but once they want to go further, maybe look at some of the intermediate files or what's going on, they might get a bit frustrated. Personally, I like how the interface looks now. I think it's an appropriate level of detail.
I agree with you, but I no longer have that sort of beginner's mindset; it's obvious to me that, oh yeah, we have tools there, we have history there. So I've kind of lost touch with that mindset. But it sounds like you also feel there's an appropriate level of detail, where it's not overwhelming but still exposes the key results? Yeah, because I think sometimes, if people just run something and only get one result out when they click, it gets a little black-boxy; people want to know what else comes out of the tool. The way it is now, I can show them all the other intermediate results and some of the other outputs. Like with Bionano, instead of just one file, I can show them the hybrid scaffolding report and what happened in it. Or with the purge_dups workflow, instead of just getting a single purged assembly, I can say: oh, this is the actual mapping that happened, and this is the coverage, and so forth. It really depends, I guess. Yeah. I think for people who are working on their own project, that is helpful, because they want to know what's happening to all of their data throughout. Yeah, I guess if people are actually signing up for a longer-form tutorial, they're quite invested in the outcomes and want to be successful. Yeah. And I don't want to monopolize all the time here, but what about on the research side? I mean, we're at sort of phase one, at the level of a few hundred genomes, and then the ultimate goal is, was it 70,000? How do you see that playing out? Are you just going to keep growing and growing your cluster at Rockefeller? I think we need to have other sequencing hubs.
Yeah, I mean, ideally area-local sequencing hubs. A collaborator is setting up one in the Middle East, the Sidra hub. They've got their own little Galaxy instance; he's assembled the Arabian horse, the dromedary camel, and MacQueen's bustard there, all very locally important species that he has a connection with. I think that's the approach we'd have to use to really scale things up. That's awesome. Are you in communication with them as they get stuck or need help? Yeah, I gave them all my notes and material on how I had set up our own cluster to interact with Slurm; I don't remember what the Sidra HPC uses. Yeah, because he's a long-term collaborator of ours, yeah. So let's award you a medal for being our ambassador. Well, thank you; I should thank you so much for all your work as well. So, in the lab, who's the primary user, is that you? It's mostly me, and then some of the lab team will do assemblies from time to time as their own time permits, but I'm the one pushing a lot of them through, yeah. Just following up on Mike's question about a simplified interface, how might that work for them? Yeah, I mean, maybe you could have an option, like a toggle. It doesn't necessarily need to hide information; it's just bringing it closer. Because, and I have experienced this too, you have to push users to using the workflows; you have to tell them explicitly, go do this with a workflow now. They know how to use the tools because the tools are really in your face, always on the left side, but you have to tell them, you know, don't waste your time, use the workflows. If the workflows were just a little more prominent, I think that would already make a big difference. Split the sidebar between workflows and tools? That's what's happening, more or less, yeah. Thank you for agreeing to do this.
Thank you for all your time and attention. I mean, now, if you have any requests whatsoever, you have our full attention. That's what I want to know. Kind of building on Marissa's question, is there anything that we can help with, from, I don't know, installation, management, functionality, display? You tell me; you have our full attention. Is there anything we can help you with? I can try to collate some feedback, because I also don't have as much of a beginner's mindset anymore, but I can try to think of what some of my students have said about the interface and gather feedback on that. Right now, off the top of my head, I don't think I have anything, sorry. I mean, I think you all know how to reach me, and I know how to reach you, so you could really drive some feature development. Oh, I want the checkboxes back on the main history page; I don't know if I'm just getting something wrong. I like using the big history view in the middle, not the sidebar one, but the checkboxes disappeared from the main view. I think I was going to make an issue before today. Maybe you want to come to one of our UI meetings to explain the pluses and minuses for us? If you just send me the details, sure. So there's a hand up from Francis. What's up? Yeah, just one way of getting feedback is actually asking the students when you're doing course development: at the end of your workshops, have a few questions directed specifically at the UI or how Galaxy works and so forth. Yeah, I'll do that in the future; thank you for bringing that up, because when we do ask for feedback it usually ends up being about our teaching. We can always throw in more questions about the actual UI, what would make it easier, and so on. Thanks.
Were most of the tools in your work already in Galaxy? Sorry, the first half cut off. Sorry: were most of the tools that you're using in your pipelines already in Galaxy, and if not, what was the process like to bring them in? When I came into the project, it was late 2021, so they were already in Galaxy, because that was, I think, what everyone was working on before I got there. So I came in at a good time. That said, a couple of tools we've had to add; for instance, we've changed scaffolders from SALSA to YaHS, and we've also just continually had to update tools as they've been upgraded. And I enjoy that process; I've been a little more involved in pushing little tool updates, I get to use GitHub, and I'm no longer a little GitHub baby. So it was enjoyable for me to figure out how tool updates work. It's also good to know how the tool wrappers are actually working, and since I'm basically in charge of the instance, I like knowing what's happening on the back end; it helps me troubleshoot if something goes wrong. Sometimes it's as simple as someone providing the wrong file type, or the genome is just too big and it isn't going to work. Yeah, that's awesome. Sorry, just one more thing from me. I'm trying to highlight some more of the work that's being done in Galaxy, and yours is super awesome; would you mind if I reached out to you so I can write up a little more about it for our communications? Of course; my email is on the slide, and I can put it here too. Okay, thank you so much, I'd really appreciate it. Of course; I feel like this is something a lot of the community would enjoy. I just put my email in the chat. Awesome, thank you so much. Any questions while we've got her here? If not, thanks, everybody, for joining today's community call.
The next one will be on May 4, when we'll have Jeremy talking about Galaxy for cancer. So we'll see you on the May 4 community call. Thanks, everybody. Thank you. Thank you. Thanks.