Hi everyone, my name is Brian Robinal, aka the pineapple pirate on Slack and on GitHub. I'm a postdoc in Dan Blankenberg's lab at the Cleveland Clinic, and today I'm hoping to talk to you folks about what we've been doing to bring quantum computing into the Galaxy ecosystem. So first of all, why quantum? Why are people pursuing this? Well, the fact of the matter is, there's a ton of problems that we can't properly address today. Within that, there's a subset of problems that we can somewhat address using classical computers, but a much larger subset consists of problems that we could possibly address using quantum computers and classical computers together. All right, old-school computing: we've got binary bits, zeros and ones, and that is the basic unit of information. Then we also have classical logic circuits, which are sets of gate operations on bits; this is the unit of computation in a classical computing framework. Quantum bits, or qubits, on the other hand, are a little more intricate. When you look at one, it's basically a sphere: you have zero and one, which I call the north and south poles, but also a bunch of different values in between that are described by spherical polar coordinates. So right off the bat there's an immediate advantage, because your basic unit of information isn't just a binary representation; it can take many, many different forms. A quantum circuit, in turn, is a set of quantum gate operations on these qubits, and it's the basic unit of computation. As the name suggests, these computers use essential ideas from quantum mechanics, one of them being superposition. Superposition is creating a state.
I guess a quantum state where you have a combination of both zero and one, and these conditions allow us to map the state onto a Bloch sphere. Now what's really cool is that if a and b are non-zero, then the qubit state contains both zero and one; this is what people mean when they say that a qubit can be zero and one at the same time. Where things get really interesting is when we introduce entanglement. If you have two or more qubits, we can get combinations like zero-zero, zero-one, one-zero, one-one, where "zero one" means that the first qubit is zero and the second qubit is one, and each combination has a coefficient, a, b, c, or d, which are complex numbers. If two or more of these coefficients are non-zero, then we can't separate the qubits: they are entangled, with perfect correlation, and they are no longer independent of one another. This is what Einstein famously referred to as spooky action at a distance. By the way, the reason these zeros and ones have little brackets and pipes around them is that they really are solutions to wave functions: solutions to Schrödinger's equation written in Dirac notation, which tell you the energy levels and the dynamics of subatomic particles, in the case of qubits, electrons. Another important concept is interference. One of the things that makes quantum mechanics really hard, especially for me, is that things are no longer deterministic; they're probabilistic. Interference allows us to increase the chances of getting the right answer while decreasing the chances of getting the wrong answer. And if you're wondering what the hardware actually looks like, this is Osprey, a 433-qubit chip, IBM's latest piece of hardware. All the spheres are your qubits, and this is a connectivity map.
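As a quick aside before the hardware: the amplitude picture just described can be sketched in plain Python. This is a didactic sketch, not Qiskit; the separability check (a product state must satisfy a·d = b·c) is the only assumption beyond what the talk states.

```python
import math

# A single-qubit state a|0> + b|1> as a pair of complex amplitudes.
# Measurement probabilities are |a|^2 and |b|^2 and must sum to 1.
def probabilities(state):
    return [abs(amp) ** 2 for amp in state]

# Equal superposition: a = b = 1/sqrt(2), so each outcome has probability 1/2.
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
print(probabilities(plus))  # [0.5..., 0.5...]

# A two-qubit state (a, b, c, d) over |00>, |01>, |10>, |11> factors into
# two independent single-qubit states only if a*d == b*c; otherwise the
# qubits cannot be separated, i.e. they are entangled.
def is_entangled(a, b, c, d):
    return abs(a * d - b * c) > 1e-12

# Bell state (|00> + |11>)/sqrt(2): a = d = 1/sqrt(2), b = c = 0.
print(is_entangled(1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)))  # True
print(is_entangled(0.6, 0.8, 0, 0))  # False: (0.6|0> + 0.8|1>) ⊗ |0>
```

The second call shows a product state: knowing one qubit tells you nothing extra about the other, so it is not entangled.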
All those lines in between are the gates connecting each qubit. Now, obviously there have been a lot of questions, like: is this some sort of hype, is it actually going to work out, is this all just theoretical? Well, more and more papers are being published that show proof that it's actually going to be advantageous compared to classical computers, and IBM put out a paper last month that shows clear advantage by producing much more accurate expectation values than classical computers. So it's one of the first pieces of evidence that yes, as we scale things up, things are going to get much, much faster. All right, forget about everything I just said. What exactly makes these machines so special? Every day you've got a bunch of stuff to do, right? You've got to work, maybe you want to travel, maybe you want to make some food; at some point you've got to eat, and what about some exercise? But the fact of the matter is, there just aren't enough hours in the day to do all this. What if I were to tell you that you could do all of these things at once? That's exactly what quantum superposition is. What if I were to tell you that you could somehow coordinate what all these different copies of yourself are doing at any given time?
That's exactly what entanglement is. Now, what if you gave this power to a computer? The result, ladies and gentlemen, is the ability to compute things simultaneously, which means you can deal with much, much larger matrices, and by extension you can handle some of the world's hardest optimization problems. And what better optimization problem to try to tackle in biomedical research than the protein folding problem? Like all organic matter, proteins tend to adopt the most thermodynamically stable conformation, as we heard very eloquently from Kate yesterday. What this means is that they adopt the state with the lowest free energy. Every spontaneous event in nature, literally anything you can possibly think of, from planets rotating around a star, to electrons jumping between energy levels, to molecules randomly changing their shapes, to proteins adopting their stable conformations, happens with a negative change in free energy. Along these lines, another thing to consider is the second law of thermodynamics, which is the concept of entropy: entropy always wants to be maximized. When a protein folds into its shape, its entropy decreases, because entropy is a measure of disorder when you think about it, and you're going from something highly disordered to something folded and more ordered. So you'd think that violates the whole idea of entropy being maximized, but the fact of the matter is that when this happens, the water molecules around the protein actually adopt a more disordered orientation in the bulk solvent, and this is what actually drives the protein folding process. This is the very basis for the hydrophobic effect, which is the reason oil and water don't mix, why many things don't cross the blood-brain barrier that easily, and so on and so forth. Now, Mother Nature always takes the optimal path to protein folding, but computationally, as Kate mentioned yesterday,
we're really not concerned with that. We're just concerned with trying to find the most optimal solution, because no matter how you get there, the protein is going to find that low-energy state. In other words, we're trying to get to the bottom of the well no matter what path we take. So just real quickly, why is this important? Didn't we already solve this problem? Well, no, we didn't. As was explained over the last couple of days, programs like AlphaFold were trained on the Protein Data Bank, which is a database of about 200,000 proteins, a relatively small number compared to all the biodiversity that's out there; I think we've sequenced somewhere around 300 million genomes. These programs don't know what to do when you give them an anomaly, like some sort of mutation. Handling that matters for science, if you find a new exotic species or a genome we haven't seen before, and it also matters for healthcare, if you have a patient with a rare genetic disorder that ultimately leads to mutated proteins. It means that if you use one of these programs, you're not going to get an accurate representation of those proteins, and therefore it will be hard to design targeted therapies. So how hard is this problem? It is extremely hard, and that's why programs like AlphaFold were developed in the first place: if you try to do this using a physics-based approach, most programs out there that have tried tap out at around 30 amino acids. So when you think about it, just think of a Rubik's cube.
I mean, Rubik's cubes are pretty hard to solve. I've never been able to solve one, and I don't know anybody who has; I'm not too smart, and my friends aren't that smart either. But if you keep increasing the dimensions of a Rubik's cube, the problem gets harder and harder: the number of possible solutions increases dramatically, and the time to solution really blows up. To put it in perspective, for a small 100-amino-acid protein it would take approximately the age of the entire universe to sample all 3D conformations before you found the most optimal solution. This is, of course, an NP-hard optimization problem, and it turns out that by way of superposition and entanglement, quantum computers are supposed to be better at handling these as well. So we're using a quantum algorithm developed by IBM that starts out with a tetrahedral sublattice, where each possible turn is 120 degrees. I know what you're thinking: this is really rudimentary, amino acids can rotate in 60-degree increments. But you've got to simplify the problem somehow, and 120 degrees is enough to discriminate between crawling along a path of negative or positive free energy, and you want this thing to crawl toward the lowest free energy. It's one-hot encoded, so each of the possible turns you could take, zero, one, two, or three, is associated with a four-qubit bit string, as you can see over here, and there's one qubit assigned for the turn that is actually taken. Now, as for the mathematical framework, I'm not going to go into too much detail, but there's a Hamiltonian, and a Hamiltonian is really just a fancy word that physicists use to describe an energy function.
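The one-hot turn encoding just described can be sketched in a few lines. This is a simplified illustration of the idea, not IBM's actual implementation: each of the four possible lattice turns maps to a four-qubit bit string with exactly one bit set.

```python
# One-hot encoding of tetrahedral-lattice turns: turn t in {0,1,2,3}
# becomes a 4-qubit string with a single "1" at position t.
def encode_turn(turn):
    if turn not in (0, 1, 2, 3):
        raise ValueError("turn must be 0..3")
    bits = ["0"] * 4
    bits[turn] = "1"          # one qubit assigned per possible turn
    return "".join(bits)

# A coarse-grained fold is just the sequence of turns between
# consecutive alpha carbons, concatenated into one long bit string.
def encode_fold(turns):
    return "".join(encode_turn(t) for t in turns)

print(encode_turn(2))          # "0010"
print(encode_fold([0, 3, 1]))  # "100000010100"
```

The point of one-hot encoding is that every qubit has a direct physical meaning (did we take this turn or not?), at the cost of using four qubits where two would suffice numerically.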
There are three terms in this Hamiltonian. You have a growth-constraint Hamiltonian here, which basically prevents the protein from collapsing onto itself as the computation grows the chain. You have a chirality-constraint Hamiltonian over here, which enforces chirality; as many of you might know, chirality is a fundamental property of proteins. But the most important term, in my opinion, is the interaction Hamiltonian, which reproduces the free energy and the potential energy of how amino acid A likes to interact with amino acid B. So this is what the workflow looks like. There's a paper that came out in Nature that really highlights the main point: we're not trying to make classical computers obsolete here, but we are trying to give them the hardest 10 to 20 percent of a scientific workflow. In the case of protein structure prediction, that is predicting the coordinates of the coarse-grained alpha-carbon backbone, a very rudimentary representation of the protein. So what we're doing is handing this off to an IBM quantum computer, then using some Python code that we wrote to convert the result to a PDB file. Then we rebuild the all-atom structure of the protein, and then we validate it: we compare it to AlphaFold or to experiment. If it's bad, we rerun it and change a couple of things; if it's good, we move forward and run molecular dynamics. And just to look at the initial results:
We saw some interesting stuff. We gave it the catalytic site of the Zika virus helicase, a very vital protein: what it does is unwind double-stranded RNA into single strands, allowing the infection to go forward. Anyway, long story short, we gave this sequence to the quantum computer, and it gave us a more accurate prediction than AlphaFold, as well as than running this with a classical brute-force solver and a heuristic solver. It was more accurate by almost a factor of two, and what's interesting is that this program had never seen this sequence before. Remember, it's not machine learning; it is just simulating the sheer physics of why proteins fold the way they do. The other metric we measure is the radius of gyration, which basically tells you how open or how closed a protein structure is, and as you can see here, compared to AlphaFold, our loop is much closer in structure to the X-ray, or experimental, structure, which we hold as the ground truth. One thing you might be asking yourselves: okay, that's all fine and dandy, but how exactly does this scale? Well, it turns out pretty well. We have a quadratic relationship in the number of compute resources we need as a function of the number of amino acids, that is, how large the protein is. What I want you folks to focus on is that yes, all three curves are quadratic, but the slope on a quantum computer is much more gradual; this thing really blows up when you try to do it on classical hardware. And over here on the right we have the time it takes to generate the Ising Hamiltonian, the initial formulation of the problem: a giant matrix that contains all the bit strings for every possible configuration the program is trying to sample. So I've been working on getting a workflow ready.
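As a quick aside on the radius of gyration metric mentioned above: it is straightforward to compute from a set of atom coordinates. A minimal sketch (coordinates here are made-up illustration data, not the Zika helicase):

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of points from their centroid.

    Smaller values mean a more compact (closed) structure; larger
    values mean a more open one. `coords` is a list of (x, y, z),
    e.g. alpha-carbon positions from a PDB file.
    """
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2
             for p in coords)
    return math.sqrt(sq / n)

# Four points on the corners of a unit square: centroid (0.5, 0.5, 0),
# each point at distance sqrt(0.5) from it, so Rg = sqrt(0.5) ~ 0.707.
print(radius_of_gyration([(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]))
```

Comparing Rg of a predicted structure against the experimental one is a quick sanity check of how open or closed the fold is overall.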
Thanks to Dan, who did the bulk of the work initially, we made a Docker image that contains JupyterLab and Qiskit, which is IBM's software stack. It's currently working and implemented, and I'm hoping to have the final version by the end of the week. We take the output from a quantum computer, then we use all the classical tools to rebuild the structure and conduct molecular dynamics simulations, and then we look at basically an entire biophysics workflow. If you have IBM credentials, all you have to do is pass your IBM token through user preferences, and you're good to go: you can submit jobs to a quantum computer from Galaxy. How's it working out? I've got about 40 percent of it working pretty well. I ran into some issues at the molecular dynamics phase of things, but most importantly, the quantum computing part is working. You pass your API token, you specify the backend, and you can use a machine with anywhere from five qubits all the way to 127 qubits. You can build your circuits and have a lot of fun. The way we're making the Docker image, it actually pulls in all the tutorials from Qiskit, so if you're really new to quantum computing, you can hit the ground running fairly easily. Does this work? Yes, it does. Right on. This is what it looked like when it was executed at IBM Quantum Lab, at the top, and at the bottom, this is what it looks like when we submitted the job from a Galaxy instance. As you can see, the final conformation is exactly the same. Of course, none of this would be possible if it wasn't for a great team. I'm incredibly fortunate to be part of Dan's group: Dan, Fabio, J.
These guys are phenomenal programmers. Sorry, my screen just died, but that's cool. It's been an absolute pleasure learning from them, as well as from my counterparts at IBM. But most importantly, I want to thank the Galaxy Project; you guys are a pretty awesome group of people, and it's quite the unique opportunity to be part of this. So thanks to everyone for being super cool and obviously providing support along the way. I've only been doing this for about a year and a half; I didn't know what an XML file was when I started all this. I was just used to being stuck in a broom closet, connected to a terminal, running my molecular dynamics simulations, completely oblivious as to how to do any of this. I'm still learning, and I'm still annoying most people, but it's cool. Thanks also to the organizers of GCC, Garrett, Jen, and the entire Australia crew; this is pretty cool. I never in my wildest dreams thought I was actually going to be in Australia, and here I am getting to talk science. This is wonderful; thank you for the opportunity, and thanks to the university, this is a really beautiful campus. I got to pet an Australian possum the other night; that was really cool, didn't expect it as I was walking through campus. It nibbled my finger a little bit, but I think I'm okay. And yeah, that's it. Up next is Steven, and I can't read his title, but I imagine it's going to be related to AnVIL, because I came in here quietly to do something during the last training, and naturally Mike and Steven started talking AnVIL, and I lost two hours of my life to a very, very good training session. So, thank you very much. Well, thank you for that nice introduction; I'm super excited to be here, and thanks Natalie for giving an awesome workshop together with Mike. Thank you. Okay, so today,
I'm going to talk to you, in about five minutes, really really fast, about how we can plan for costs when doing genomic data analysis and leveraging commercial cloud resources. So first, before I go any farther: AnVIL is a very big project, spanning several institutes in the United States, and in particular a number of folks who are here today have done a lot of work and put in a lot of effort to get this done; I'm just the spokesperson. So what's going on with AnVIL? What is it? Well, we're trying to invert the model of how we share data. In the olden days, you'd create lots of duplicate copies of the data and share them around between different institutions, but we're trying to find a way to bring the researchers and the compute to the cloud, so that everyone can come and work together on the same data set, and that way we can move forward collaboratively. We've laid out this vision in an article in Cell Genomics, so please do have a look; there we talk about the AnVIL vision and how AnVIL can relate to some of the work being done in the space of human genetics, genomics, and beyond. So today, perhaps we've seen this more than once, but very quickly, there are three big pillars that we want to point out. AnVIL is data: this is how you can get access to a ton of human genetic data, and you can pull it together collaboratively with your team, your colleagues, your research lab, as well as people across the country and across the world, and do the data analysis in your favorite tools, any way that you would normally do it.
We want you to be able to do that in the cloud, on AnVIL, with Galaxy in particular front and center. So again, AnVIL is based on Terra, which we can think of as an orchestrator of cloud resources, and by using the Terra Data Repo we can pull together synthetic cohorts, pulling together data sets that have the samples of interest we really want to study. We can pull them from publicly available data, from my own data, from your data, because we're collaborating, make one cohort, and do our analysis. Critically, we can pull in workflows already existing in Dockstore, and at the click of a button we can launch an analysis across 10,000 or more samples. And of course, we want to point out that you can do the things you know and love: you can run Jupyter, you can run Galaxy, your own Galaxy that you don't have to wait in line for; it's yours and ready to go. Critically, this is all in a secure perimeter, so that we can do this analysis on human genomic data in a way that is consistent with applicable laws and protects the rights of the donors of the samples. Critically, we've got five petabytes of data across a number of consortia. The data ingestion team is working incredibly hard right now, and we are anticipating the ingestion of another five petabytes of data by the end of the year, so it's breakneck speed right now. Over the next year we're anticipating maybe a doubling of that, so now we're looking at, I don't know, 20 petabytes of data in a year or two. And together with all the other data that we're looking to interoperate with across other NIH cloud platforms, that number starts to grow to many, many petabytes, so some really true high-scale opportunities exist here. So we're re-envisioning how we do this computation, and it requires us to re-envision how we're going to fund that compute. Cloud computing has several pay points: you've got to pay to store data, you've got to pay to transfer data to the compute node if
it's in another geographic region, and then you have to pay for the node itself for the analysis. And what kind of analysis do you need to do? What kind of compute resources should you get? We have access to GPUs, but maybe that's not necessarily the most efficient way to get across the line. So how can we do this more efficiently? How can we pull together the information we need to know how to do the analysis in front of us? We have some basic tooling that will let you prepare a budget justification and come up with some numbers about how much an analysis will cost. But what are the exact most efficient resources to pull together to do an analysis? If we take a closer look at the RNA-seq workflow from the Galaxy Intergalactic Workflow Commission, we can see that there are a number of steps, and if we stratify this by the size of the input data, we can see that the time required to run this process mostly comes down to Cufflinks and RNA STAR. And if we really start driving in, we can see across different compute architectures that we can bring the run time down, but it's inversely proportional to cost, right?
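The trade-off just described can be sketched with a toy cost model. All numbers here are made up for illustration; real cloud rates and per-sample run times would come from benchmarking.

```python
# Toy cost model: bigger instances finish each sample faster but bill
# at a higher hourly rate. Total cost = rate * hours/sample * samples.
def run_cost(hourly_rate, hours_per_sample, n_samples):
    return hourly_rate * hours_per_sample * n_samples

# Hypothetical numbers for a 10,000-sample analysis:
small = run_cost(hourly_rate=0.10, hours_per_sample=4.0, n_samples=10_000)
large = run_cost(hourly_rate=0.50, hours_per_sample=1.0, n_samples=10_000)

print(f"small nodes: ${small:,.0f} over {4.0 * 10_000:,.0f} node-hours")
print(f"large nodes: ${large:,.0f} over {1.0 * 10_000:,.0f} node-hours")
```

Here the cheap node wins on dollars ($4,000 vs $5,000) but takes four times the wall-clock node-hours, so which configuration is "best" depends on the urgency of the work, exactly the tuning point made next.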
So critically, we can tune a tool's resources to impact its run time and cost. Modest reductions in time can be expensive over thousands of samples, but inexpensive compute can be really lengthy over a lot of samples, so this might be a really important consideration: fine-tuning for the priority or the urgency of your work. And then we can enable users to select the most efficient tools by doing this benchmarking. Some tools are more performant than others, finishing jobs faster with less resources; so if you're going to choose an aligner, I'd encourage you to look at HISAT, since across different compute configurations and across different sequencing coverages it seems we get the best results there. Okay, and then across different compute architectures, just be aware that that could also cost money. So I want to thank you all for listening, and I apologize for running a little bit over. Thank you. Thank you very much, Steven. Ollie is up next, sporting a very nice t-shirt. Ollie is from AARNet and has been one of our recent partners in Galaxy Australia, part of a very useful collaboration together. So take the floor, Ollie. Thank you. Can you guys hear me? Can you hear me at the back as well? Still awake? Amazing. Congratulations on making it this far. I'm going to present something maybe a little bit different from all the science stuff that's been going on; I mean, it's still science, computer science, but... does this clicky thing work? Let's try that. Yep. All right.
Let me talk a little bit about AARNet. I kind of wish Garrett would tell me what I can do to have the logo put up here at some stage, maybe. We're the Australian Academic and Research Network. Quickly on this: founded in 1989, a not-for-profit organization owned by 38 Australian universities and CSIRO, which is pretty much the biggest research body in Australia. We're a licensed telecommunications carrier, which means we can dig up your lawn to put a fiber in there if we want to; we have the right to do that. We're basically the ISP for all education and research institutions, and for galleries, libraries, archives, and museums. We also provide cybersecurity and data collaboration services, the last one being what put us in touch with Galaxy. Now, just a little bit about me, very quickly. You can call me Ollie; whatever spelling, anything else, I might not really like how you pronounce my name, so just stick to this. Originally from France, as are all the French people in the area. Yeah, hello. I studied telecommunications and networks, systems and networks, and spent some time in Japan. Koko ni nihonjin inai ne? ("No Japanese people here, right?") No one's from Japan here, not this time. Maybe we can push Galaxy to Japan as well; that'd be nice. I worked at Red Hat, at the Asia Pacific Network Information Centre, and at AARNet. I'm a Linux user, Linux forever, since 1997, starting with this thing; if anyone's actually ever seen this, come and talk to me, we'll be friends, because that's just unheard of. Galaxy at AARNet: it's a partnership. It started out of a long-standing relationship we've had with BioCommons. Before 2019 I wasn't part of it yet; at the time AARNet was involved with BioCommons for a proof of concept that eventually opened the possibility of hosting Galaxy in a more operational capacity. That translated into Galaxy officially moving, in parts, not the entire infrastructure, but in parts, in March 2022, up and running in May 2022.
Yeah, there's a bit of a difference there, you know, March to May; I'll get to that in a sec. This is Australia. It's pretty big. We're currently here, in Brisbane. This is where the data was, and we had to bring it back all the way over here: it's 3,500 kilometers, essentially, from the Pawsey Supercomputing Centre in Western Australia to a data center we have in Melbourne. At the moment it's hosted between Pawsey and Melbourne, but before that it started at the University of Melbourne, then it went to QCIF, correct me if I'm wrong on this, and then it went back to Melbourne. So the next step is pushing it to Sydney; maybe at some point we'll put it in the Northern Territory, I don't know. Large user data volume: okay, not five petabytes, I'll get back to that later, only 150-odd terabytes that we had to transfer, and that took forever. We used Globus, which is a tool that AARNet provides, essentially, to move data around. So it's currently still running in Melbourne, with 99.9 percent uptime; I might be responsible for the missing 0.1 percent, maybe, we'll never know. What's happening next? Well, Sydney. We have a data center in Sydney that's better connected, with more security, physical and logical, and a newer version of OpenStack. Yes, we're running OpenStack, not sorry about it. It's already deployed and being tested. What is it made of? Okay, we like hardware. We have a few of these for the control plane, I know some people here know about OpenStack, a few of these for compute, and a bunch of these for disks. That gives us a lot of disk and RAM and CPU in a few chassis. Network: nothing too fancy, but still running 25 gigabits per second with NVIDIA and Mellanox gear. OpenStack, yes: OpenStack because private cloud, because AWS costs bread, but with OpenStack it's my cloud and I do what I want with it. There's lots of internal talent and expertise at AARNet.
We actually run multiple large-scale, complex deployments, with various versions of OpenStack, and have for years. These are the versions that we run: we have Focal/Wallaby in Melbourne, and next we're going to go to Focal/Yoga. We have vendor support for these as well, especially in Sydney. I haven't been dinged yet; that's good. How do we deploy to OpenStack? We use Pulumi. I've spoken to a few people about that. What it is, essentially, is Terraform-style libraries with a Python interface; we could use JavaScript, we could use C#, I just went with Python, as it's historically what was used in Melbourne. I rewrote it from scratch to use YAML definitions, that kind of stuff, so I can have one definition per client, let's say, and this is tracked in Git: easy to modify, easy to track, etc. So we have multiple users. (Two minutes? Ding, yes.) We are working with Galaxy predominantly, but we are also working with other people, and we might actually bring CVMFS onto this platform as well; more about this later. We can do a full deploy in under 10 minutes for 20 Galaxy VMs with all the network, security, storage, etc. It's pretty seamless. So Pulumi: not too bad, I must say, when you know what's happening. Security and performance improvements: base image hardening, so all the VMs come from an image that has been tuned; strict OpenStack security groups; host and guest performance tuning to squeeze some more performance out of it; storage, we got some more from that as well; improved authentication security; and a few other things, I'll save those for later. Performance testing, because we learned in Melbourne that we needed to enable the hardware enablement kernel; so this time around we're testing side by side, Melbourne and Sydney, testing network, disk, and CPU. And this is one of the problems we have: people ask me, what is the use case? What kind of data is Galaxy using? What's the workload?
And I'm like: it's everything, files. At least we know how it behaves; we know it behaves better with a one-megabyte block size. Base testing is completed, and the Sydney tests will happen at some point when we redeploy. The future has started already: hardware and control plane configured, deployed, tested, playbooks adjusted, thanks so much. Normally we would be working on migrating right now, but instead we came to the conference. Continuous user data synchronization, so we don't have to wait a month for it to finish, and we'll do a final Pulumi push when we're ready for that, then testing, then switch over when it's all set. Also user data storage and archiving; I don't have five petabytes, I wish, but no, not yet, maybe someday. A disaster recovery site, improvements to the Galaxy infrastructure, other collaboration projects. It's a priority for us to actually work together with BioCommons in Australia to get more of this happening; as I said before, CVMFS is one of them, and other things as they arise. Perfect. Just a last personal statement: I want to thank Simon for inspiring me to be part of this journey with Galaxy, because, as he says, Galaxy is ace. Thank you, guys. Alireza is up next, and apologies, I missed the title of your talk, but you're going to tell us all about it, I promise. Okay, so good afternoon everyone. This is my last talk at GCC, as I said before. My name is Alireza; I'm a software engineer at Galaxy Freiburg, and in this presentation I'm going to introduce you to the new notification system in Galaxy. I will talk about what a notification system is in general, what we have in Galaxy now that it has notifications, what types and channels are supported right now, and at the end I will do a live demo. You can also do the live demo yourselves; I will tell you later how. All right, so what is a notification system?
In its simplest form, we can say a notification system is one of the crucial parts of any platform, any software, including Galaxy, and its main purpose is to deliver important news, announcements, events, or anything else to the users in a timely manner. That's the simplest way I can put it. I'm going through this quickly so we have time for the live demo at the end. So this is what the new notification system looks like in Galaxy, and as you can see, users can access all their notifications: see them, mark them as read, delete them, filter them, and so on. Previously, we only supported push notifications and email, and only for job completion, but from now on we can have more and more things in more areas. In this new system, you can send notifications automatically, you can schedule them to be sent to users later, as an administrator you can select individual users and communicate with them, or select a group of users and tell them specific things, and you may want to tell all the users something, right? Say you have scheduled maintenance some day; then you can use a broadcast. You can also use Markdown to make it more beautiful, add links, and so on. Currently we support two notification types: the message type and the shared item type. Message is the way administrators can get in touch with users, or, for example, send a notice from a tool. The shared item type is for when a user shares something with another user: the other user gets a notification about what was shared and who shared it with them.
For example, if they share a workflow, a history, or a page in the application. So we are working to bring more and more channels to Galaxy, which makes it more convenient for users. Right now we are supporting email and push notifications, and we are really working hard to bring it to Matrix. Later we can add more and more channels and put them in the hands of the administrators, so they can customize them for their own users. So that's it; let's dive into the live demo. You can get access to the instance that has notifications enabled with this IP address, or scan this one. So I'm creating this. If you want access as an administrator, use the user koala and the password one-two-one-two-six, and as a normal user you can use kangaroo. Okay, this is the koala user, who is an administrator of this instance. So let's send some notification to the kangaroo, and let me sign in as kangaroo here. I'm doing this fast. So here, just as an example, I'm sending a "hi and hey" to this user. I can say, okay, just send this message to the role of kangaroos, and you can also see a preview of the final notification that the user gets. And here, you see, we've got the notification saying hi. Or, for example, I want to share this Lone Pine Data history with koala: going through the share center and typing the email, koala at sign koala dot e-d-u. Here you've got a notification that says "history shared with you by kangaroo": this user kangaroo shared the Lone Pine Data, and you can click on it and you are on it. And also, as an admin, you can send some broadcast notifications and tell all the users about, for example, updated terms. Select the variant of that message, like warning or info, and depending on the variant we show it differently. For example, for the urgent one we stick the banner to the top, which draws more awareness from the users. And say okay; it's done in almost a second. So, yeah, we got it.
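The admin flow demonstrated here, a targeted message to a user or role plus a site-wide broadcast with a variant, could be sketched as payloads sent to Galaxy's REST API. This is a minimal sketch: the endpoint path, field names, and payload schema below are assumptions for illustration based on the talk, not the verified Galaxy API schema.

```python
import json

GALAXY_URL = "http://localhost:8080"  # hypothetical instance, as in the demo
API_KEY = "admin-api-key"             # placeholder admin API key


def build_message_notification(subject, message, user_ids, variant="info"):
    """Build a payload an admin might POST to a notifications endpoint.

    The recipients/notification structure is an assumed shape, not the
    exact Galaxy schema.
    """
    return {
        "recipients": {"user_ids": user_ids},
        "notification": {
            "source": "admin",
            "category": "message",
            "variant": variant,  # e.g. info, warning
            "content": {"subject": subject, "message": message},
        },
    }


def build_broadcast(subject, message, variant="warning"):
    """Build a site-wide broadcast payload.

    Per the demo, an urgent variant pins a banner to the top of the page;
    the variant names used here are assumptions.
    """
    return {
        "source": "admin",
        "category": "broadcast",
        "variant": variant,
        "content": {"subject": subject, "message": message},
    }


# The demo flow: koala (admin) greets the kangaroo user, then broadcasts
# a notice about updated terms to everyone.
msg = build_message_notification("Hi", "Hey kangaroo!", user_ids=[2])
bcast = build_broadcast("Updated terms", "Scheduled maintenance this Sunday",
                        variant="urgent")
print(json.dumps(msg, indent=2))

# Actually sending would be a plain authenticated POST (left commented,
# since it needs a live Galaxy server and a real admin key):
# import requests
# requests.post(f"{GALAXY_URL}/api/notifications",
#               headers={"x-api-key": API_KEY}, json=msg).raise_for_status()
```

The point of separating the payload builders from the HTTP call is that the same message body can be previewed (as the UI does before sending) without touching the server.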
So that's it. At the end I want to say thank you to David, my lovely colleague from Germany, who is currently in Germany. And yeah, thanks for staying, everybody.

So, the very last talk. We've saved the best for last, of course. It's going to be a complete change of direction, because we've had four days of an incredible display of how Galaxy has been taken and used all over the planet in amazing ways. What I want to talk about, in Galaxy's 18th year as a project, is how we can sustain the project for another 18 years, maybe. So, to do that... whoopsie, how do we get to the next slide? Right. To do that, we need to understand what it is that needs to be sustained, and that means we need to think about what the actual Galaxy project is. Now, you all know what Galaxy is, because you just had four days of it. But when you think about what you see when you think about Galaxy, you see communities, and there are dozens of them. I just want to give a few examples of possible anatomical categories, as it were, for thinking about Galaxy: the communities, the grants, and all the hardware that underlies all of the services that we provide. The dogfooding of the source code has been going on since 2005, so the source code is now served at enormous scale, as dog food, by the usegalaxy servers. That hardens the code, and all the user feedback, which we take very seriously, makes it even more fit for purpose; every time we get an idea, we try to incorporate it. So those services are an enormously important honeypot. They brought me into the project, and I guess many of you here in this room found Galaxy through those online services.
So they're a really important part of the way the project succeeds. There are all the open source outputs; we could go on for days about those if we had time. And then we're only just scratching the surface of what's downstream from Galaxy. But you know, the really interesting thing about this exercise is that no matter how you dice and slice, you won't find the project. It's not here. Yet all of these things are part of the project. But where's the project? The reason it's so hard to understand is that it's the sum of all the parts, in a way, and it's more than the sum of all the parts, because every one of those individual components is independent of all the others. The deliverables can only arise through incredibly efficient work that's collaborative and global, and crosses all of those institutional and individual-paycheck boundaries. Without that, the globally distributed work couldn't possibly function efficiently. And so the most important message that I have for you is that there is, somewhere, a virtual entity, which we'll call the Project with a capital P, that coordinates all this incredibly complex and intricate effort across all of these different institutional boundaries. So if you think about all the things that are going on, our little sample, what I want to sell you as an idea is that the Project itself is a kind of underlying communication network that enables all of these components to work together efficiently, to deliver all these wonderful, fabulous deliverables that we've been talking about for the last four days. And I guess the most important point I want to make, and convince you of if I can, is that our future sustainability doesn't depend on individual components; it depends on the Project infrastructure. And I say that because this infrastructure connects all these independent components. None of the independent components has control of any of the others. All the grants, they're independent.
There isn't a PI who controls all of the grants, so all of that coordination has to happen at a higher level, and it happens because the collaborators all agree that we're going to work through this kind of project entity. And efficient collaboration is surely the major element of our next 18 years. I have a public service announcement. Some of you may not be aware, but participation is in your interest: if Galaxy is useful to you in your work, it's rational to participate. There's this concept of enlightened self-interest, which is sometimes described as self-interest correctly understood, where, if you want to keep using Galaxy, you should think about becoming engaged. Because you'll make the deliverables better by contributing, and those better deliverables will increase in scientific value. That's going to attract more users, that's going to attract more usage, and it's user demand that drives all this global research investment, tens of millions of dollars a year. That investment will sustain the project for all users. So it's up to you, really, to do your bit if you can. And there are lots of things that I think we can do, where your bit is going to vary depending on what you do. For me, the most important thing is that we need the Project infrastructure to maintain efficient collaboration. Participants: be ambassadors, show Galaxy to your friends, show it to your boss, run a training session for your lab. Contributors: make Galaxy better for everybody. And probably one of the things we really need to focus on is community leadership, because that leads to enormously increased scientific value for all of our users. I'll stop there; I could go on for days, and Enis would yell at me. I just want to thank everybody who's been involved over the last 18 years, because that's how we got here. Okay, thank you.

Do we have time for questions? Should we give Ross time for questions? Yes, we have time for questions.
Oh, thank you. There's a CoFest; I'm trying to organize a CoFest, if anyone cares. That was my alarm telling me I should stop talking, so I should stop talking, but I'd like to organize a CoFest. Okay, questions? Come and talk to me during the CoFest.