Okay, so thanks for coming to the Galaxy Developer Roundtable. This week we have the GPU team here at Penn State, who are going to talk about GPUs and Galaxy. I'm excited to see what the conversation looks like, so I'll just hand it off to you two. Do you want to introduce yourselves and what you're doing, and take it away?

Hi everyone. I'm Yashwant. I'm a PhD student at Penn State, where I work on cloud computing under my advisors, Dr. Kandemir and Dr. Das; Anton is also on my PhD committee. A junior PhD student, Gulsum, and I have been working on an interesting project about bringing GPU support into the Galaxy environment, so that tools which already have GPU capabilities can be accelerated when they run inside Galaxy. Both of us will be presenting the work we have done so far and the progress we have made on that front.

Let me start with my presentation. Can you share the screen? Can everybody see my screen? Okay. So, as I said, we have been working on accelerating bioinformatics tools in Galaxy. We have written a workshop paper out of this, which has been accepted at HiCOMB and will be published soon, and I'm going to talk about the work we did there. Next slide.

As we all know, Galaxy is a very widely used software framework for running a wide variety of applications, from computational chemistry to genomic sequencing to bioinformatics. Some of the key applications that benefit from Galaxy are genome sequencing and drug and vaccine discovery; with COVID-19, for example, a lot of the sequencing and vaccine discovery work has been accelerated by running experiments in software frameworks like Galaxy, so speeding these applications up is extremely important.

Sorry, can I interrupt you? You don't need to convince anybody here that it's important; let's just get to the point, because that way we'll have more time for discussion.

Yes, that starts right on the next slide. Given that accelerating these applications is important, one of the key factors in play is the compute infrastructure, namely graphics processing units: with GPU support these applications can run much faster. The current Galaxy software infrastructure, though, has no capability to run on GPU infrastructure. The tools themselves do support running on GPUs in a bare-metal environment, but when they run inside Galaxy they cannot use a GPU setup, and from prior research we know that GPUs give large speedups for example applications like the ones shown here. So in our proposed design we make Galaxy GPU-aware: we make use of the existing plugin points within Galaxy itself, so that Galaxy can learn about the GPUs present in a cluster.
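(As a hedged illustration of that discovery step, and as a reference point for the sketches further down: on an NVIDIA node the GPU inventory can be read from nvidia-smi. This is a minimal sketch assuming nvidia-smi is on the PATH, not the project's actual code.)

```python
# Minimal GPU discovery sketch; assumes the NVIDIA driver's nvidia-smi
# CLI is installed on the node.
import subprocess

def gpu_count():
    """Number of NVIDIA GPUs visible on this node (0 if none, or no driver)."""
    try:
        out = subprocess.check_output(["nvidia-smi", "-L"],
                                      universal_newlines=True)
        # nvidia-smi -L prints one line per device,
        # e.g. "GPU 0: Tesla K80 (UUID: GPU-...)".
        return len(out.strip().splitlines())
    except (OSError, subprocess.CalledProcessError):
        return 0
```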
Further, we also studied how Galaxy behaves when there are multiple GPUs in the cluster: how it handles scheduling and running applications in that case. Of course, Galaxy does not usually run in standalone mode; it is typically run on a cluster with a cluster scheduler. In this work, though, we have focused on Galaxy's local runner to see how it can be made GPU-aware, and we have upcoming work on Galaxy interacting with an HPC-style cluster scheduler and on the role GPU availability plays there. Now I'll leave the floor to Gulsum, who has worked predominantly on the engineering aspects of how we brought in GPU awareness.

Thanks; we can start from the next slide. Most of you probably know the overall system flow for running a tool on Galaxy. The end user presses Execute in the web UI, and the job configuration XML file tells Galaxy which destination to use: either the user specifies a script that maps jobs to destinations, or there are readily available static destinations. At this point we introduce a new destination mapper, which maps jobs to a GPU or CPU destination depending on the conditions. From that script we expose an environment variable that is used throughout the system flow: in the local runner, in other files like evaluation.py, which assembles the command line that executes the tool, and finally in the tool wrapper. From the tool wrapper, a tool developer can use the value of this GALAXY_GPU_ENABLED environment variable to let Galaxy choose between a CPU executable and a GPU executable, depending on availability.

This introduced several challenges. Our first challenge: how can a tool developer tell Galaxy that their tool requires a GPU? To do this we introduced a new requirement type, "compute", whose value can be GPU or CPU. Using this information, our second challenge is how to expose the GPU availability to the Galaxy runner. To solve it we implemented the CPU/GPU mapping script, which, as I said, introduces the environment variable: the script learns the GPU availability of the infrastructure our Galaxy instance runs on, reads the requirements of the tool to be executed, and sets the environment variable according to the requirement and what the infrastructure provides. Our next challenge was doing this for containerized GPU tools; for that we made some small additions to the readily available Docker and Singularity utility files, which allow nvidia-docker to be used. Our last challenge, challenge four, is implementing this in a multi-GPU setting, because some tools are embarrassingly parallel and can benefit from multi-GPU infrastructure. To solve it, we enabled multi-GPU computation mapping support, which allows the user to specify the IDs of the GPUs, not only the GPU requirement.
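(To make challenges one and two concrete, a sketch of the two pieces follows. The XML mirrors what the talk describes; the Python is a hedged sketch that reuses the gpu_count() helper from the earlier sketch, and the Galaxy-internal attribute and constructor details here are assumptions for illustration, not the actual patch.)

```xml
<!-- In the tool's macros.xml: the new "compute" requirement type. -->
<requirements>
    <requirement type="compute">GPU</requirement>
</requirements>
```

```python
# destinations.py sketch: a dynamic destination mapper that checks the
# tool's "compute" requirement against the node and exposes
# GALAXY_GPU_ENABLED to the rest of the system flow.
from galaxy.jobs import JobDestination

def gpu_cpu_mapper(app, tool):
    # Challenge 1: does the tool declare a GPU "compute" requirement?
    # (Requirement attribute names are an assumption for illustration.)
    wants_gpu = any(req.type == "compute" and req.name == "GPU"
                    for req in tool.requirements)
    # Challenge 2: is a GPU actually present on this node?
    # gpu_count() is the discovery helper sketched earlier.
    gpu_enabled = "true" if wants_gpu and gpu_count() > 0 else "false"
    return JobDestination(
        id="local_gpu" if gpu_enabled == "true" else "local_cpu",
        runner="local",
        env=[{"name": "GALAXY_GPU_ENABLED", "value": gpu_enabled}],
    )
```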
It then lets us obtain real-time GPU information, which we use to design an allocation strategy for the multi-GPU setting. Sorry, I can't change the slide.

First, I will explain how we specify the new requirement type; I will go through the example of the Racon GPU tool throughout this presentation. Here we have the macros.xml file, which is imported from the tool's XML file, the Racon XML file. You can see that we now have a new requirement of type "compute", and its value can be GPU, or CPU too. This allows Galaxy to recognize that the tool requires a GPU.

Next, I will explain how we create the environment variable. Here is the destinations.py script we wrote: our GALAXY_GPU_ENABLED environment variable starts out false; we get the requirement value and the GPU count, and if we have more than zero GPUs and the tool has the requirement, we set it to true. This is the job_conf file; it just shows that we use our dynamic mapper in the job configuration.

Now I would like to explain how we expose the new environment variable to the tool wrapper. You can't just read OS environment variables from within the tool XML wrapper files. What we did was find the build_param_dict function in the evaluation.py script and add a small change: we read the environment variable's value on the Python side and create a parameter dictionary entry for the tool. We can then access that parameter dictionary entry from the tool wrapper file, and if it is true, we use the GPU executable of the tool; otherwise we use the CPU executable.

This is a sample execution in Galaxy. Here we were using a node with GPU infrastructure, we had both the GPU and CPU executables of the tool, and we used the tool wrapper I showed you on the previous slide. When we executed it, Galaxy found that it needed to run the GPU executable, and this is the result: as you can see, it is using the Racon GPU executable. Our environment variable can also be used to selectively expose a tool's parameters, because the Racon GPU tool, for example, has extra parameters that are only used in the GPU version; you can use the parameter dictionary entry to choose which parameters to expose.

This is just a short slide on how we added this support to the Docker and Singularity tooling; but maybe let us pause here to check whether the audience was able to follow, or has any clarifications.

Yeah, does anybody have any questions or clarifications? Just a quick thought: I think you found the right places in the code to modify, basically, the parameters you're sending to the tool. It might be good to do that with job configuration parameters, though, things in job_conf, because you can imagine a heterogeneous system where some of your nodes are GPU-enabled and some are not, and a single environment variable that sort of runs within the Galaxy... I can't hear. Yeah, I think we lost his audio. ...the only concern so far; but otherwise it seems like you've created an abstraction for getting that to the tool, and I think that's great. John, I think we missed you for a few seconds. Yes, I just got a new Chromebook and it's terrible.
I'll just retract my comment and take things offline.

I agree with John. If you look at job_conf.xml.sample_advanced, you can see how you can define parameters for the job runners themselves, and that's probably where it should be defined; right, John? Yes. But after that point, everything looks great.

Okay, we can also discuss later how to fix it. What we did was learn the Galaxy implementation ourselves, so it may not always match what the developers have been doing; we can adapt and change our implementation to the proper way. Go on.

Okay, so now I will explain how we added this support to Docker and Singularity containerized tools. This was actually one of the easiest parts, because as long as we have our environment variable, we just check its value and append a new flag to the built Docker run command and the built Singularity run command; the two are very similar. If you have nvidia-docker installed on your infrastructure, which is a requirement for this to work, then adding --gpus all, or selecting GPUs like --gpus 1 or anything like that, exposes the GPU hardware and GPU drivers to your containers. These two flags allow us to add Docker and Singularity support to the GPU functionality we are adding. We tested this by creating a Racon GPU container, which can be accessed from this Docker Hub repository and built from this Racon GPU Dockerfile. This is docker2singularity, an open-source converter for turning a Docker container into a Singularity container easily; we used it to create our Singularity image from the Docker image. We then tested our Racon GPU image to evaluate it, and it worked as intended.

We evaluated our support on a node that has two NVIDIA Tesla K80 GPUs with a warp size of 32, meaning 32 threads execute at once on the GPU, and we used CUDA 10.2 and Python 3.6.9 at the time of testing. This is just the hardware architecture we used.

Now I'll explain what improvements we got. We saw very good benefits from using GPU tools in Galaxy: for the Racon GPU tool we got an almost 2x speedup, for both local and containerized execution, compared to the CPU-only version run from Galaxy. We also saw that our support adds no extra overhead, which is what we want. We then wanted to analyze why we were getting these speedups, so we tested Racon with a 17 GB dataset, the Alzheimer dataset, and profiled the tool from within Galaxy to understand why we could not get more speedup. It turned out to be reasonable: GPU tools come with extra API calls, and these calls, which transfer data from CPU memory to GPU memory and back again, explain the speedup we observed.

Next we wanted to test another well-known tool, Bonito. How many people know about Bonito and how many don't, so I can make this shorter or longer? I don't know anything about Bonito. Okay, then I will explain it in more detail. As we all know, Oxford Nanopore Technologies is a well-known new sequencing technology that can generate longer reads than next-generation sequencing.
Now I will explain what basecalling is in one sentence: basecalling translates the raw electrical signals coming from the sequencer into the nucleotide sequence. What Bonito does is use a deep neural network to basecall the reads. Bonito's DNN architecture is composed of a convolutional layer followed by three stacked bidirectional gated recurrent unit (GRU) layers, and Bonito uses PyTorch to execute this neural network. So it is one of the most suitable applications for testing our support: it has both a CPU version and a GPU version, it uses neural networks, which are becoming very widely used, and it is a very recent, frequently used tool.

We also ran experiments for Bonito. We used two datasets, one about 5x larger than the other, and compared CPU and GPU executions. As you can see, we get an almost 50x speedup on both datasets with the GPU version of Bonito. We profiled Bonito here as well: there are still some overheads because of the GPU, but the speedup we get far outweighs them. The speedup can be attributed to the deep neural network: DNNs are generally embarrassingly parallel when running on GPUs, so you get much more speedup compared to Racon.

Next I will explain the multi-GPU support, which I think is one of the most interesting parts of our functionality. For multi-GPU support we need several pieces of information from the infrastructure, which we get using our get_gpu_usage function, which resides in the local.py script. This function captures the executing processes on each GPU and returns a list of the available GPUs and of all GPUs in the system. To implement it we use the nvidia-smi query, which lets you query the GPU for several attributes; it's not shown here, but we query for the process info and so on. We then use BeautifulSoup to more easily process the XML we get back from the nvidia-smi query, and from that we find all the processes and the process info. We process this and return the available GPUs, where an available GPU is one on which no process is running. This was our first approach; I will explain the second one later. So when you run a new tool, it allocates an available GPU to that tool.

Next we have the command-line function, which also resides in the local.py script, as we all know. Here we check again whether the tool requires a GPU. And rather than adding a new XML tag, we reused one that is already available in Galaxy: as we all know, there is a version tag on tool requirements. Normally it is used for a software library version; we reused it to allow a user to specify which GPU devices they want to run their tool on. Using the requirement version together with the requirement type, we save the GPU ID that the user wants to use, and then we check whether that GPU is available. If it is available, good: we just allocate that GPU to the user. If it is not available, the tool still has to run, so we allocate one of the available GPUs to the tool.
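(A hedged sketch of this availability and allocation logic: the tag names follow the usual `nvidia-smi -q -x` output, the standard library's ElementTree stands in for the BeautifulSoup parsing described in the talk, and the exact code in local.py will differ. The least-memory fallback at the end anticipates the refined approach described next.)

```python
# Sketch of get_gpu_usage-style logic: which GPUs exist, which are free,
# and how much framebuffer memory each one currently has allocated.
import subprocess
import xml.etree.ElementTree as ET

def get_gpu_usage():
    xml_out = subprocess.check_output(["nvidia-smi", "-q", "-x"],
                                      universal_newlines=True)
    root = ET.fromstring(xml_out)
    all_ids, available, used_mem = [], [], {}
    for gpu in root.findall("gpu"):
        gpu_id = int(gpu.findtext("minor_number"))
        all_ids.append(gpu_id)
        # "Available" in the first approach: no process runs on the GPU.
        procs = gpu.find("processes")
        if procs is None or not procs.findall("process_info"):
            available.append(gpu_id)
        # Instantaneous framebuffer memory use, e.g. "60 MiB".
        used = gpu.findtext("fb_memory_usage/used", default="0 MiB")
        used_mem[gpu_id] = int(used.split()[0])
    return available, all_ids, used_mem

def pick_gpu(requested_id=None):
    """Honor the user's requested GPU ID when it is free; otherwise fall
    back to any free GPU, or (overflow case) to the least-loaded one."""
    available, all_ids, used_mem = get_gpu_usage()
    if requested_id is not None and requested_id in available:
        return requested_id
    if available:
        return available[0]
    # Overflow: pick the GPU with the least instantaneous memory allocated.
    return min(all_ids, key=lambda i: used_mem[i])
```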
This definition of availability will change, though, and I will explain that too. Here are four cases of experiments we ran to test our support. The first one: I have two tool instances and I want to run both of them at the same time; Racon wants GPU 0 and Bonito wants GPU 1. If we run this, it works as intended: Racon is allocated to GPU 0 and Bonito to GPU 1. This is the base case.

Next, we have two instances of the same tool. We know that Bonito wants GPU 1, but if I execute one instance of Bonito, it is running on GPU 1, so GPU 1 is not available anymore. If we start another instance of Bonito, it is allocated to GPU 0, because GPU 1 is not available. That's the better allocation: if you schedule both of them onto the same GPU, there will be context switches and you will not get the performance you intended.

Next, we tried four instances of the same tool on a two-GPU machine, so there is an overflow, obviously. First Racon is allocated to GPU 0 and another instance to GPU 1, and then, since none of the GPUs is available, the third instance, the one with PID 41105, is divided across the two GPUs. So in our first approach we were actually dividing tools between GPUs, which can cause overheads, because one of those GPUs may have more memory allocated or be more heavily used; it's not a very efficient way to allocate in the overflow case.

So we proposed another way to allocate the extra instance of the tool: we get the instantaneous memory allocated on each GPU, and we give out the GPU with the least instantaneous memory allocation. In this case, we execute Racon and then an instance of Bonito, so Racon is on GPU 0 and Bonito is on GPU 1. What if we execute another instance? In this overflow case we allocate GPU 0. Why? Because that was the GPU with the least memory allocated, at 60 megabytes. This is our latest approach for multi-GPU support, and we believe it is the better approach, for the reason I mentioned. Are there any questions about this? I guess not, so I will continue. These are just the command-line outputs of those cases; I wanted to include them to show that everything works as intended.

And that's it for our presentation. If you have any questions, please let us know; we always welcome suggestions and improvements on the implementations we have done.

I have one question; somebody has a lot of feedback, so could somebody mute? How are you going to deal with specialized containers? Basically, what we do is: we have these software requirements that say, whatever, Bonito, and then we automatically get the correct image from Docker Hub or Quay.io. These would need to be annotated as GPU compatible, assuming these are not the same containers, because you need the GPU libraries, the NVIDIA-specific things. So how do you see that, and is that something you're going to address?

I'm actually still trying to understand your question. By specialized containers, you mean containers that have some very specific GPU libraries built into the container image itself? Yeah; I mean, you know this, right? Your Bonito container probably contains the nvidia-smi kind of things.
So you need to have this specialized container, right? And there are a lot of tools that have a CPU and a GPU mode. And all the dependencies that Galaxy uses are in Conda, right? So you would need a variant of the container that is GPU compatible, because if you're running on CPUs, you don't want to carry all those extra libraries.

Yes, I understand. I think the simplest way of doing it would be to have two different versions of this specialized image: if you do want to run it on a GPU, we build the image that contains the GPU-specific libraries, and we have another version of the image without those packages built in. And as long as the image has a tag which says whether it is GPU compatible or not, we can use that tag to do the mapping in Galaxy, I believe.

And how would that work? How does Galaxy know which one to take? I'm saying we would make use of the container image tags. And I think the user also has to do some kind of annotation there on the tag, saying these are the two existing images and which one they want to use; based on the node availability in the cluster, Galaxy can choose between them, right? I mean, you would need to produce a GPU-specific tool then, right? Yes, yes, we do. It would be a manual effort, yes, but I think that would be the initial way of doing it.

Let me jump in a little bit. Marius, do you think it's better to have two separate wrappers? If I'm understanding the problem correctly, it's that if they add the requirement for the GPU libraries to the wrapper, the tool will always carry those and the container gets bigger. I mean, all of our dependencies, all our containers, are built from Conda packages, right? Yeah. So there would need to be an infrastructure change for there to be GPU-ready containers available, right? I don't know if it's as simple as adding a different base image or adding another build stage. We would also need a convention in Galaxy for how that would look: this GPU tag you mentioned, for instance, how is it going to look? We would need to think about how we want to do that and how the community is going to build those containers. Yes, they'd need to change the image, as you said.

Are we talking about tool-specific GPU requirements, or about something general to be able to run GPU tools? Because I think those are going to be two separate things. Something like the NVIDIA drivers, or whatever GPU drivers and the binaries to use those drivers are needed, can be uniformly added to the base, like you said. But in the scenario where there is a specific binary to run this tool with GPU versus CPU, where it's a requirement specific to this tool, how do you think that should be handled? I don't know how common this is; if it's a common thing, it's probably something we need to talk about with the Conda people, right, if we want to continue using that as our ecosystem. Because it's the same in Conda: if you install something in Conda that's supposed to run on GPU and needs additional things, that needs to be figured out at the Conda level. Don't make two separate wrappers, right?
So there would be one wrapper, or one app, I don't know which. Because in Conda you also have different builds for different R versions, for different Python versions, and so on. Okay, so something conditional that Galaxy can resolve. Yeah, exactly. So, I mean, if it's going the GPU route, and there's a GPU tag for that container, then you want it to take that GPU container, right? So should the mechanism be somewhere in the way it determines the mulled container, something like that? Yes, but that also requires that on the Conda side there is something we can use. Our mulled machinery is just checking which requirements are needed in which versions and building a hash out of those requirements, so it would be easy to add, say, a CPU-specific hash, and a third version of the mulled hash could include whether it's CPU or GPU. But those things need to be built, unless we want to do that one by one, which I don't think we want to do. As a start, it's fine. As a start, it's fine, right.

So that's sort of the challenge there. For the driver side, we do have an operator that just installs drivers on each node. Oh, so I think what we need to worry about is the tool-specific things more than the cluster-specific things. I mean, what is different between the Bonito GPU container and the CPU container? Because I'd assume those drivers don't help if they're only on the node; they need to be in the container, right? No, they also have to be on the node as well, and there would be some kind of redirection between the container image driver and the actual node driver. But it will need something to interact with that. Yeah, I think that would be an interesting space to explore, to see how to come up with a much more efficient, automated way of doing things.

And one more thing I forgot to mention in the talk: a lot of these implementations we did were based on feedback and suggestions from Björn, who has helped us a lot throughout this entire process, so I just want to mention that. So, what other clarifications or improvements would you suggest?

There are a couple of comments in the chat. One, along the lines of what Marius said about the Conda support: I think that was explored at some level, not in Bioconda yet, though, but in Anaconda or conda-forge for sure. And the other: we try to keep the Galaxy tool requirements as general as possible, without specifying anything apart from the version of the tool that's required. That's one of the goals that also let us move from the previous packaging system we used to Conda, and now to Singularity and Docker, without having to change the tools. So ideally, being able to keep this as general as possible would be good; I understand that in a first stage it's probably tricky.

So the inherent assumption we have here is that the tools to be executed are known in advance to run either on a CPU or on a GPU; we assume that the container images are built for the specific hardware they are going to run on. But I think what you are suggesting is: I just have the tool in general, and it might have both ways to execute.
So we would want a generic system which can say: okay, this tool does have GPU capability, so I need to pull or package a GPU container image and then try to run it on a GPU, right? That is something we can try to look at, yes.

Also, another general question; I'm not a GPU expert at all. I've seen that in your example, Racon was able to run with just one switch choosing whether it was GPU or CPU. But, and maybe you already answered this in the talk, depending on which underlying GPU hardware you have, it may need different switches; just a binary GPU-enabled-or-not may not be enough. Yes. So there we have pre-compiled executables of Racon, one for CPU and the other for GPU, and we make the assumption that the node we are going to run on has the necessary packages for that pre-compiled Racon. But that is a scope for improvement, as you said: knowing whether it is going to be compatible with the node you're going to run it on, yes. Thanks.

Maybe I can quickly say something about the Conda status. conda-forge is building packages against specific CUDA versions, and in theory we can do that in Bioconda as well; most of the packages, like TensorFlow and so on, already exist for GPU usage. The problem here is that you not only need to build against NVIDIA but against specific CUDA versions, and that just increases the complexity: we need to know not only that it's a GPU but also the architecture of the GPU, or the CUDA version, and then pull down specific packages or specific containers. That might be a problem, or at least we need to take it into consideration: we cannot simply assume that if it runs on one GPU it runs everywhere on GPU. That's the problem for all packages.

How does that look when you say "build against"? I've forgotten the terminology, but is that at the build stage, or is that just another base image? For the Conda package, you mean? Yeah. That's the build stage. I'm really not an expert here, but I think it's like compiling against a specific libc version: you need to have that version around, you need to have that CUDA version around, otherwise it will not work with the next CUDA version, or something like that. That's my understanding currently. Which is okay, right; it's in the Conda hash.

So that means we could also possibly exchange just the underlying layer, transplant sort of the file system of the container? No, because your binary is still compiled against the specific driver version, and we cannot control the driver version. And as Gianmauro figured out, specific CUDA versions only work with specific driver versions. So this is an entire mess when we talk about NVIDIA. The good news is that we have direct contact with NVIDIA and we can actually ask them for support: they are eagerly trying to get into the scientific market and are willing to put engineering effort into this. But it stays closed, proprietary software, so we need to deal with that.

Is there any effort on ARM... whatever the new thing is, is it ARM64? Yes. conda-forge is compiling a subset of packages against ARM, and we currently have two grants open for Bioconda to actually support that on the Bioconda side as well.
So we have some hardware, and we also have an opportunity to access a supercomputer with ARM to compile packages on. But I think the larger problem is to actually work with the community to port C/C++ code over to the ARM architecture; that is the way bigger problem, I think. But most of the packages, the basic stuff like Python and so on, already work on ARM and we could submit to ARM. So if we really want Bioconda packages on ARM, we are close to it; we will support it soon, I guess, in the next year or whatever.

I'm asking because I consider these sort of different problems of the same class: different architectures and different compute environments. Yes. And I think whatever solution we come up with should be extensible. Same thinking here.

It's also a little bit... I mean, we also need to discuss how far we would like to go, because essentially it's a reproducibility nightmare, in the sense that you have one tool but, depending on the environment it runs in, it actually produces different results. And one way is to restrict that; the other way is to actually annotate it for the user, so that you know you cannot rely on this tool producing the same result on usegalaxy.org as on the EU server, because they have different architectures underneath. That's a good point. But I'm not sure where to start this discussion, actually; I mean, this seems like a good place. So maybe we can pull out the architecture after the fact, the architecture it was run on. We should be recording anyway whatever dependencies we used for executing a job, so maybe we need to also pull out the architecture and make it sort of a first-class job parameter. We do record that with the metrics plugin, right? But let's assume we export a workflow, and the workflow gets transferred to a different Galaxy server and executed; we have the assumption that it's kind of reproducible. But I don't know, if you have an ARM cluster, and I've never tried that, how different are the results? Are they supposed to be different? I don't know; I don't know if we need to explore that. It's just something that comes to my mind right now.

I think most GPU applications and GPU tools also need to know the CUDA capability of the GPU devices, just to figure out whether those are the right GPUs the tool can run on. I think Galaxy should expose this information to the tool as well; just an idea. Yes, I kind of agree with what you said: we do need to know the CUDA capability of that node for the tool. Yes, that's right, the CUDA capability, that's right. That's part of the scheduler; I mean, in the ideal world, just before the scheduler there would be another component to check this configuration compatibility, configuration management. In the cloud-computing world we typically have tools which try to do that even before going to the scheduler, but I'm not sure how that happens in the Galaxy setting. Can you say how it works in that setting? I mean, you want to submit a tool and the tool checks the node manually? No, no, I don't think the tool itself would check, but we do have some external tools which you can use to check the configuration compatibility between the node and the tool.
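(As a hedged aside on what "exposing the CUDA capability" could look like in practice: this sketch assumes the pynvml bindings for NVIDIA's NVML library, and it is not part of the work presented above.)

```python
# Sketch: read each device's CUDA compute capability, which a
# pre-scheduling compatibility check could match against a tool's needs.
# Assumes the pynvml package is installed.
import pynvml

def cuda_capabilities():
    pynvml.nvmlInit()
    caps = {}
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
            caps[i] = "%d.%d" % (major, minor)  # e.g. "3.7" for a Tesla K80
    finally:
        pynvml.nvmlShutdown()
    return caps
```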
I don't know about other runners, but in Kubernetes you can just specify a GPU resource request, the same way we specify requests for memory or CPU, and it just routes the job to whatever node can satisfy it. Which is what the scheduler does in HPC, right: you don't ask what is available, you just say "I need this". Yes, that's right. With Slurm or anything like that we can do that. Sorry, I think I interrupted you.

No, it was a similar point. I think we need to find a way to match the expectations of the tool with what's available as resources in the cluster, GPU or whatever, because right now there's always the middleman, which is the admin, who decides: okay, this tool will use eight cores because otherwise it's too slow, or this tool needs to go to this queue because it needs this amount of memory. And if we need to do that also for all the possible combinations of CUDA capabilities and whatnot, it becomes yet another issue for the admin to find out which is the right queue for the right tool.

Sorry, I didn't want to interrupt you. I think we also discussed having expressions within requirements, so that you could say, based on the input, give that much CPU, memory and so on; I think we could try to go that route. Or we could do what people in nf-core do: the Nextflow community distributes files that specify the resources required for workflow runs. We could get inspired in that direction and start assembling files that specify, for a given environment, what resources should be allocated, and make them reusable. If you look, there are for instance different HPC centers that just submit their definition of what should be available for given workflows. I think that's maybe something to investigate, because it seems more flexible than embedding the expression directly within the tool.

What we are trying, but we need solid metrics for that, is to take our database and the metrics we have collected over the years for all tool runs and create a model, a machine learning model, out of that, so we can actually predict it a priori, and in the individual case the admin doesn't need to do it anymore. Our idea is to have a web service where you submit the tool ID and you get back configuration options that we know have already succeeded on one of these Galaxy servers or in other environments: you get back the CPU, memory, and GPU requirements for that tool ID. We are currently looking in this direction, so maybe someone wants to help. The problem we are currently facing is that the metrics we get back from cgroups were not super reliable on our old cluster.

Wouldn't it work, or be an advantage, if we did a bit more of a manual process, so that when the resources that are available are different, you can sort of choose and specify your own things? Instead of assuming historical data from a database, which, you know, is sort of closed because only you have this data. I mean, if you create a repository of resources, that seems like something that is a bit more open and friendlier.
With the Galactic Radio Telescope we collect all the metrics, and we create models out of that, so in the end it will be a giant database, if you think about it, with a nice API that you can query to get this information. I mean, currently we do that with Sorting Hat, and Nate also has a YAML file somewhere around where we do a best guess. And this best guess should just be automated and backed by real, proper data from our database; that's just our thinking. But of course, if you can collect metrics for these tools, you can provide them.

All right, guys, it is one o'clock and we are out of time. I encourage you to continue this conversation; there is a pull request. And I just wanted to thank Gulsum and Yashwant for their presentation. Thank you for coming to the roundtable today. Yeah, thank you a lot; thank you for the opportunity.