Good afternoon and welcome to the latest webinar in the BioExcel webinar series. My name is Adam Carter and I will be hosting today's webinar. I'll spend three or four minutes giving you a quick introduction to BioExcel, for those who are not familiar with the project and the centre and what we do, and then I'll hand over to Michael for today's main presentation.

So, a very quick overview of what BioExcel is doing and what we are. We are a centre of excellence for computational biomolecular research, and there are three pillars to what the centre does. The first, an important one, is excellence in biomolecular software. The centre supports development work on popular biomolecular research codes that are widely used in Europe at the moment, including GROMACS for molecular dynamics simulations, HADDOCK for docking, and the QM/MM interface in CPMD for quantum mechanical calculations. We have some of the core developers of these codes in the project, so we are able to help develop these codes and to support them as well. Another aspect of the project is usability, and an important part of usability is workflows. We've been looking at various workflow environments and the data integration aspects of that; we've been involved in some development work on the Common Workflow Language, for example, and we're also making available building blocks that can be used in workflow tools such as KNIME, Galaxy and so on. Finally, we're also involved in various consultancy and training activities; we've been running some summer schools and things like that. And if there are things you would like to speak to us about, in terms of working together in the future, you can contact us at the address that you'll see at the end, or contact myself or Rossen, and we can talk about how BioExcel can help you with the work that you are doing.

We're happy to take questions for today's webinar; that's an important part of what we would like to do, but we'll leave those to the end so that we don't have to switch between questions and the main presentation. You can record your questions at any time in the GoToWebinar control panel: there's a questions tab at the side (it probably looks slightly different from what you can see in this picture here). If you type your question in there, then at the end, if you've got a microphone, I will invite you to ask your question directly to the speaker; otherwise I can read your question out and the speaker will answer it. So you can type those in at any time and we'll take them at the end.

Today's presenter is Michael Gecht from the Max Planck Institute of Biophysics in Frankfurt. Michael received his bachelor's and master's in biochemistry from Goethe University in Frankfurt, and his master's thesis addressed biological questions using combined theoretical and experimental approaches. He's currently a PhD candidate in the theoretical biophysics department with Professor Gerhard Hummer at the Max Planck Institute of Biophysics, and his main focus is the interaction between proteins and lipids within biological membranes. Michael is also interested in workflow automation, data visualization, and the modern web. So I'm now going to hand over to Michael, who will give the rest of today's presentation.
So Michael, I'm just about to make you the presenter, and you will be able to take it from here.

Okay, so thanks, Adam, for the introduction. My name is Michael; I'm a second-year PhD student with Gerhard Hummer at the Max Planck Institute of Biophysics in Frankfurt. Together with two colleagues, Marc Siggel and Max Linke, we've created a tool called MDBenchmark. It helps us scale molecular dynamics simulations across a different number of nodes on high-performance clusters, so that we can gauge how much performance we can get from those settings.

I guess everyone listening is someone who uses high-performance computing to do some kind of calculations, and you are therefore aware that computing resources are not abundant and are extremely expensive. At the same time, the questions we're interested in, in our case biological questions, require lots of computational resources. So we have a problem: computational resources are expensive, but we actually need lots of them to do our calculations in a timely manner. Everyone using computational resources should therefore use them in an optimal way, so that they gain the most performance in a short amount of time while not wasting the resources of their colleagues or the funds of their PI.

Another thing one has to think about when doing molecular dynamics calculations, or computing in general, is that running any computation on a single node will give you a certain performance. You can increase the number of nodes, the number of computers your calculation runs on, and the performance will increase, but it will not continue increasing as you add more and more computational power; at some point the performance will actually start to degrade. So there is some optimal number of nodes that you're supposed to run your simulations or calculations on, such that you are running them optimally.

There are different factors one can optimize in this regime, one of them being the number of nodes, as I've just explained. You can also run your simulations or calculations only on CPUs, or you can run them on GPUs, which will give you different performance depending on the software you're using. You could also try to optimize cluster-specific settings like hyperthreading, MPI and OpenMP, or software-specific settings; for example, if you're using GROMACS, you could try to do some optimization with PME and the distribution of tasks. MDBenchmark, the tool that we wrote, tries to help you with a few of those things. Looking at this short list of things one could optimize, MDBenchmark helps you with the first two. It is able to scale your simulation across a different number of nodes (one node, two nodes, three, four or five nodes) and also to run those simulations either on CPUs or GPUs, so you can compare how those work out. We haven't yet implemented features for parameterizing any cluster- or software-specific settings, but we're looking into having that in future versions of MDBenchmark.

So what is MDBenchmark? In brief, it's a command line interface, a CLI, that we've written in Python. It's a tool that you can call from your terminal, from your shell, to create benchmarks. It's open source, freely available on GitHub.
We're using the GNU General Public License, so if you want to do any editing on the code, you're free to do so. We have extensive documentation online that tries to cover all the usage patterns we could think of, that we think are valid use cases, but if you have specific use cases that are not covered by the documentation or by MDBenchmark, we're happy to receive any feedback. And generally, it's installable on any computer that is running Python; basically every high-performance computing system should have Python installed, so you're free to install it there.

What does MDBenchmark do, now that I've talked about it? Currently it boils down to four different functions. First of all, you're given a system that you want to simulate, for example some protein in a water box. In GROMACS, you will have a TPR file, a run input file, that defines the whole system. You can give that file to MDBenchmark and tell it to generate a bunch of different benchmarks on a number of different nodes, so that it runs the benchmark on one node, two nodes, up to five or ten nodes, and you can also tell it to run either on CPUs or GPUs; you'll end up with a variety of different benchmarks to run on your queuing system. Speaking of queuing systems: after the jobs are generated, MDBenchmark is able to submit those jobs to your queuing system, and here we don't care which queuing system you use, because MDBenchmark will try to figure out which one you're using. Whether it's Sun Grid Engine (SGE), the Slurm workload manager, or the old IBM LoadLeveler, MDBenchmark will figure it out and use the correct submission commands to submit those jobs for you. After you've submitted the jobs, you can immediately start looking at them with the analyze command, which will tell you which jobs are currently running, but will also go through the log files from your jobs and grab the performance that, for example, GROMACS writes in there. In a nicely formatted way it shows you: on this number of nodes with CPUs you've got this performance, with GPUs you've got that performance. From this table you can then create a plot that lets you compare different benchmarks with different settings in a very quick way, maybe put it in some report or grant application, or just share it with your colleagues.

I'm showing what one of those plots can look like on this slide. Basically, a plot in our case shows the performance, nanoseconds per day, as a function of the number of nodes. Here I'm showing two different parameters: benchmarks running on CPUs (blue line) and benchmarks running on GPUs (orange line). Going through the first two points, we have a fit that shows the optimal linear scaling you would expect from those first two points. We can clearly see that the benchmarks on CPUs manage to scale almost linearly up to 10 nodes, only losing some performance at the end, whereas the benchmarks on GPUs are optimal for the first few nodes but quickly start to lose performance, as additional computational power does not yield any more performance in this case.
So running on two to three nodes would give you the best performance on GPUs, whereas on CPUs you could actually think about running on seven nodes, which in this simulation would give you almost 100 nanoseconds per day of simulation time.

To get MDBenchmark, there's a variety of ways, but the two most popular ways of installing Python applications are through Python package managers. The two most used are pip, which accesses PyPI and which you could use to install MDBenchmark with the command pip install mdbenchmark, and the other way, which we highly recommend if you're a user of the Anaconda distribution, a scientific distribution of Python: conda install mdbenchmark, providing the conda-forge channel for the installation. We do have documentation covering all the different installation methods, so you don't have to write this down.

When installing the Python package, we think it's important that you isolate the installation of our package, or of any other package, from your system. Python provides a few different ways of doing that, and in our documentation, at the URL you can see here, docs.mdbenchmark.org, you can find everything on the installation and all the usage. Basically, what we're saying is that you should create an isolated Python environment to install MDBenchmark in, which will pull in all its dependencies independently of the packages that are on your system. So it doesn't collide with your system in any way, and it will always function as it should, because it's in this isolated environment. Our documentation goes over the few steps you have to take to do the installation. In this case, you first create a virtual environment, a conda environment, with the command conda create -n benchmark, where benchmark will be the name of the environment. Afterwards, you install the mdbenchmark package inside this very environment; this will take a few minutes depending on your internet connection, your processor and so on. After the installation has finished, you can activate the environment. Activating the environment means that the Python interpreter you're calling is the one located in this isolated environment you just set up, not the Python interpreter located on your system. So you activate the environment, and then you're able to use MDBenchmark with its isolated dependencies. After activation of the environment with the command source activate benchmark, where benchmark again is the name of the environment, you can just type mdbenchmark in your terminal and start using it.

And now I'm going to quickly switch the presentation to the terminal, where, I hope, this is working. Yeah, okay, I guess this is working now. I'm currently in the terminal where I have installed MDBenchmark, and I've already run the source activate command, and it shows me that I'm in the benchmark environment. So now I can just go ahead and run mdbenchmark in this environment, and without any options I will get the short help provided by MDBenchmark, showing the commands and options I can use to call it. Basically, if I don't give it anything, it will show this message, which is also accessible with --help.
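Collected in one place, the installation and activation steps just described look roughly like this (a sketch; the environment name benchmark is just an example):

```bash
# Option 1: install from PyPI with pip
pip install mdbenchmark

# Option 2 (recommended): isolated conda environment via conda-forge
conda create -n benchmark                               # create an empty environment
conda install -n benchmark -c conda-forge mdbenchmark   # install into that environment
source activate benchmark                               # make its interpreter active

mdbenchmark --help                                      # show the short help text
```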
MDBenchmark provides you with four different commands, all of which I've just shown you. First you generate a benchmark, given some input file (for GROMACS it's a TPR file), with the generate command. After generating the systems, you submit those benchmarks to the queuing system with the submit command. After submission you can do the analysis with the analyze command, and last but not least, after retrieving the data, you can plot it with the plot command. Inside this one tool, MDBenchmark takes care of everything that you need.

Just so you know, I'm going to clear this terminal window every now and then, and I'm typing all the commands at the top, so don't worry if everything disappears. Every command that I just listed also provides a --help option that lists the options you can supply to that command. The generate command has a somewhat lengthy help text talking about the different options, what they do and what they are used for, and then there's a summary of all the options again in a tabular overview, with the option name and a short description of what each option does.

To run the generate command you have to use three specific options. The first is the --name option, where you provide the name of, for example, your TPR file. Then there's the --module option, which expects the name of the module that you're using, so it would be GROMACS, and the version of the module. And then it needs the --template option, which is the job template that you submit to your queuing system, and which you're probably already using. All the other options are optional, in a sense: running the generate command, you will always generate benchmarks for CPUs, but you could also supply the --gpu option to run on GPUs, and you can change the number of nodes you run on. By default we generate a total of five different benchmarks, running on one, two, three, four and five nodes, and each of those benchmarks runs for 15 minutes; you can change all of those parameters.

Now I'm going to show you the actual generation. I'm in a folder with a file called md.tpr. GROMACS users are familiar with this TPR file: it defines your system, everything that's needed, the topology of your system, to run the simulation, and it's the only file that MDBenchmark needs to generate benchmarks for GROMACS. So I'm going to type mdbenchmark generate --name md.tpr, then say which module I'm going to use, in this case GROMACS 2018.3, and tell it that I want to use the template named draco, which is the cluster I'm on. That command gives me a short summary of all the benchmarks it is going to create, and it also asks me whether I actually want to create those systems. The summary says it's going to create the benchmarks for GROMACS version 2018.3 on one to five nodes, with a runtime of 15 minutes each, with this template, and that it will not use GPUs, only CPUs. It then asks me whether I'm fine with that, and I can say yes or no. Here I'm going to say no, because I want to show a few more options before generating the benchmarks.
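For reference, the generate call just typed looks like this as a shell command (a sketch; the exact module string depends on how your cluster names its modules):

```bash
# Generate CPU benchmarks on 1-5 nodes (the defaults), 15 minutes each:
mdbenchmark generate --name md.tpr --module gromacs/2018.3 --template draco
```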
What I think is a pretty awesome feature of MDBenchmark: sometimes you mistype the module you're looking for, so maybe instead of typing 2018.3 you just type 201.3. MDBenchmark will actually notice that your cluster does not have a GROMACS module with that version, and it will tell you which versions you actually do have, and those are the ones that you can use. But if you think that MDBenchmark is wrong, you can tell it to skip this check with the --skip-validation option, which will simply ignore the validation completely; it will tell you that it's not doing any name validation for the module and that it will use this module to generate benchmarks. From there on you're on your own. If you want to do that, that's fine, but we're just trying to help you not make mistakes, like creating thousands of benchmarks for a module that's not there at all.

All right, another thing I want to show you is that we can also generate benchmarks for GPUs, not only for CPUs. As I said, if you don't supply any options, MDBenchmark will only generate benchmarks for CPUs, but with the --gpu option you can also tell it to generate benchmarks for GPUs. (Oh, this I hadn't thought about; let me just fix the module name.) Doing so, it will tell you that it's going to create twice the number of systems, basically five benchmarks without GPUs and five benchmarks with GPUs, and again ask you whether you want to do that. For the sake of the presentation we actually want to create ten benchmarks with GPUs and ten benchmarks with CPUs, so we're going to give MDBenchmark the --max-nodes option, tell it to generate benchmarks on one to ten nodes, and check it out here. The summary says it's going to do what we want: it's going to create a total of 20 benchmarks. And there's another option that is very handy in connection with the confirmation you're getting here: maybe you don't want the confirmation and don't want to always have to confirm. Then you can just use the --yes option to skip the confirmation, and the benchmarks are generated for you straight away, which is pretty neat. So if you're doing some batch scripting, you can just use the --yes option and there will be no prompts holding up your scripts.
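Put together, the GPU generation just shown looks roughly like this (a sketch; the flag spellings follow what was said in the talk, so check mdbenchmark generate --help for the exact names on your version):

```bash
# 10 CPU + 10 GPU benchmarks on 1-10 nodes, skipping the confirmation prompt:
mdbenchmark generate --name md.tpr --module gromacs/2018.3 \
    --template draco --gpu --max-nodes 10 --yes
```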
All right, so now that I have generated the systems, I have a new folder inside my previous folder, called draco_gromacs: draco being the template that I used, and GROMACS being the engine. Inside of that I have two more folders, one for the benchmarks that run without GPUs and one for those that run with GPUs. Looking inside of those, we see ten different folders, each representing a benchmark on a different number of nodes: this one runs on one node, this one on ten, two, three, etc. In each of those folders we have the TPR file that was copied there, and a job file generated by MDBenchmark with the parameters needed to run the job. If we have a look at this job file, it basically just defines the name of the job, which by default is the name of the TPR file, as well as the names of the log files produced by the queuing system. It tells the queuing system that we only want to run on one node and how long we want to run for, 15 minutes, which module to load, and that it should call GROMACS for this benchmark. And basically that's it.

Now that we have generated the benchmarks, we can go ahead and submit them. Using mdbenchmark submit, we again get a nicely formatted table that shows the jobs I'm about to submit; these are the exact same jobs that we just generated. Again it asks whether you actually want to submit those jobs or not. Just to showcase it: it's possible to tell MDBenchmark --yes to skip the confirmation and immediately submit all the jobs. Starting from this line, the output is generated by the cluster; this is just the queuing system answering to the qsub, sbatch, whatever command, and afterwards MDBenchmark tells you that everything was submitted and that you can go ahead and analyze those jobs.

I prepared a set of jobs, because we don't want to wait for these jobs to finish; that would take a bit, at least 15 minutes. So I prepared a few benchmarks to showcase how it would look. Think of benchmarks that are currently running, where a few benchmarks are finished, and you want to check in on the jobs, how the performance is doing and how things are going. You can just go ahead, in the same folder as before, and type mdbenchmark analyze. Doing so gives you again a table with an overview of the benchmarks that have been running, but now it gives you a new line for every parameter set you've been using. In this case we have run all the benchmarks with the same module, GROMACS 2018.3, on a different number of nodes, with the same runtime, either with or without GPUs, on the same template. From the log file it also grabs the number of cores, so on this cluster one node has 32 cores, two nodes 64, etc. The performance column is the most interesting one for you: it shows the actual performance you get in each benchmark, and you can see the performance grow as nodes are added, so there's some linear scaling, as we've seen before.

What we notice is that a few benchmarks on GPUs have question marks in them. These question marks indicate that those benchmarks have either not started yet or are still running, so there's no performance information that MDBenchmark can grab from them, and you just have to wait for them to finish to get the actual information. After waiting enough time for them to finish, we can run the analyze function again, and what it shows us now is that all the benchmarks have performance values associated with them; the missing values are filled in, which is pretty nice. If we look at the analyze function's help, we see that it also has an option that allows us to save the output of the table in a CSV file, a comma-separated values file. So let's do that: it's the --save-csv option, so mdbenchmark analyze --save-csv, and I'm going to call the file benchmark just for the sake of this presentation. If we look in the folder, we have a new file called benchmark.csv.
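As a sketch, the submit and analyze steps from the demo as shell commands (the CSV filename is just the one used in the presentation):

```bash
mdbenchmark submit --yes                        # submit all generated benchmarks, no prompt
mdbenchmark analyze                             # tabulate performance from the job log files
mdbenchmark analyze --save-csv benchmark.csv    # same table, saved as a CSV file
```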
If we look at it, we see the values we just got from the benchmarks: the performance values, the number of nodes, etc. Given that we have those values, we can go ahead and do the plotting. For the plotting we need a CSV file, so you always have to generate this first. And again, if you're lost, or if you don't have internet access, you can just use --help to get the help for the command, which again is a bit lengthy, explaining all the different options that you have. Here you can specify which CSV file to use, what the name of your plot should be, and what the format should be: by default we generate a PNG, but you can also specify PDF, SVG, EPS, whatever you want to have.

What's also possible, and this is one of the cooler features, I guess, is that you can filter your data by the parameters. You could say: I only want to show modules of a specific GROMACS version or a specific NAMD version. Or: I only want to show a specific template, so if you have benchmarks from different clusters and you compare them, you can restrict the plot to a specific cluster. Or you could say: I only want to show CPU or GPU benchmarks. And there are a few more options.

I'm going to showcase this by plotting the benchmark CSV to a file; oh, so it's --csv. It tells you that by default it's plotting all the data, all GPUs, all CPUs, all host templates and all modules, so this benchmark.png will be somewhat noisy. And I'm going to share my screen so everyone can see what I'm seeing. If I open this PNG file, I get the plot that I basically showed you before in the presentation: you get one scaling curve for the CPUs and another one for the GPUs.

But what if I'm annoyed by the linear fits that we put in the plots, what if I don't want to have them? In this case, the help function told us there's a --no-fit option, and we can just add that to the plot call. When we open the new benchmark.png that was generated, we don't have any linear fits anymore, so the plot is less noisy; if you want that, it's as easy as that. You can also go ahead and say: actually, I only want to see the scaling for the CPUs. So I'm going to say --no-gpu, and I'm also going to say --output-name benchmark_cpu.png, and now it's only plotting the CPU data into this file. So I can open that, and with the linear fit, I only have the CPU data in one plot. You can customize your plots to your liking, whatever you want to have in a specific situation, and you can combine different modules, if you have them, to compare benchmarks in a sensible way.
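The three plot calls from the demo, sketched as shell commands (flag spellings as spoken in the talk; check mdbenchmark plot --help for the exact names on your version):

```bash
mdbenchmark plot --csv benchmark.csv                 # everything: CPU + GPU, all modules
mdbenchmark plot --csv benchmark.csv --no-fit        # drop the linear-scaling fit lines
mdbenchmark plot --csv benchmark.csv --no-gpu \
    --output-name benchmark_cpu.png                  # CPU data only, custom output name
```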
All right, so now I need to change back to this. Okay, I'm having some problems restarting the... yeah, I think it's quite legible actually from the slide. All right, cool, so we'll just do it this way.

I've been talking about how, when you generate a benchmark, you can supply the --template option, and this option tells MDBenchmark to use a specific template that is customized for your high-performance cluster. Inside this template there are a few variables that MDBenchmark will replace with the values the user provides in the options for generate, which I'll come to in a second. There are different ways of providing those host templates: either system-wide, if you're a system administrator of your high-performance cluster, or per user, like I am. As an administrator you can provide it in the etc directory, where the template would be, say, my_hpc, and mdbenchmark generate --template my_hpc would be the call. As a user you can just create a folder in your home directory: if your username is user, you create a .config directory with an MDBenchmark subdirectory and put the same file in there. Our documentation talks about the different parameters you can use in your template: the name of the TPR file is put into the template, there's a Boolean for whether you want to use GPUs or CPUs, and we provide the module name, the number of nodes, and the time you're planning to run for. Basically, everything the user supplies to generate we provide to the template, and MDBenchmark then replaces the variables in the template, so you can customize everything that you need.

Going back to the terminal, I just want to showcase how this looks. Again, I'm in the folder where I want to generate a benchmark. If we run mdbenchmark generate --help, we get the help text, which shows me an option called --list-hosts; this will show me all the available host templates on the cluster that MDBenchmark is aware of and that you can specify as a user. So I'm going to go ahead and use that option with generate... oh yeah, it's --list-hosts. MDBenchmark even tells you, if you want to create a benchmark, which hosts it has available. In this case MDBenchmark ships with three default templates, which are for the high-performance clusters we are using ourselves, but you can also specify hosts yourself. I just created a host file called webinar in my directory: this is my user, inside my home directory I have a .config folder, inside of that I have the MDBenchmark folder, and if we look inside that folder, I have a webinar file. Looking into the webinar file, it's basically what I've shown you before, the file that MDBenchmark will put into the folder for each benchmark, but with variables. Those variables are written with double curly brackets with the variable name in the middle: in this case the job name, which would just be the name of your TPR file for GROMACS, the number of nodes, the number of tasks per node that you want to use, and then different options, like which module to use, etc.

If we want to go ahead and use that, we can just do so: we type mdbenchmark generate --name md.tpr --module, we know that we have to use GROMACS 2018.3, and then we just type --template webinar. If we do so, MDBenchmark again tells us what it's going to do, and it tells us that it's going to use the host template webinar for the creation of these benchmarks. So generating benchmarks with custom host templates works. On our documentation page we have a lengthy description of how to create those host templates, with examples for every queuing system that we support, SGE, Slurm and LoadLeveler, which you can use and customize for your needs; with the variables in there, you can just create a new one and reuse the variables.
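As an illustration, a custom Slurm host template along the lines described could look like the sketch below. This is a hypothetical example: the exact variable names and recommended directives are documented at docs.mdbenchmark.org, so treat the placeholders here as assumptions rather than the shipped template.

```bash
#!/bin/bash -l
# Hypothetical host template, saved e.g. as ~/.config/MDBenchmark/webinar.
# MDBenchmark fills in the {{ ... }} variables from the generate options;
# the variable names below are assumed from the talk, not verbatim.
#SBATCH --job-name={{ job_name }}
#SBATCH --nodes={{ n_nodes }}
#SBATCH --time={{ time }}

module load {{ module }}

# Illustrative GROMACS invocation; {{ name }} is the TPR file prefix.
srun gmx_mpi mdrun -deffnm {{ name }}
```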
And with this, I'd like to come to an end. So where do we go from here? So far, MDBenchmark enables you to generate benchmarks and scale your simulations across a different number of nodes, and it allows you to try out your simulations on either CPUs or GPUs. In the future, we're interested in adding features for parameterizing cluster-specific options: we would like MDBenchmark, in the end, to look by itself at hyperthreading, or at whether changing the numbers for MPI does anything. We would also like MDBenchmark to maybe start optimizing the PME or cutoff values in GROMACS, for example, which would be great: it could just run by itself in the background on your cluster and try to find the perfect parameters for your system, such that it runs as fast as possible and uses as few resources as possible.

Another thing that we think would be valuable for the users of the tool is some kind of Python API that would allow people writing their own scripts, their own pipelines, to just import mdbenchmark and call generate or submit from their own scripts, without using the command line interface in any way. This would allow people to write their own workflows without ever needing to use the terminal. Additionally, we currently support benchmarks with GROMACS and NAMD; if anyone out there is actively using AMBER or LAMMPS and could help with the specific details of what we would need to get MDBenchmark running with those engines, we would be grateful for any input from the community, so that we can extend the functionality to other systems.

To finish my talk, I want to acknowledge my PI, Gerhard Hummer, in the middle here, who allows us to work on this project in our spare time between doing science, and who gave helpful input into the project and helpful discussions; Max Linke, who actually started the project after having to do benchmarks with batch scripts; and Marc Siggel, who is also in the development team of MDBenchmark. I would also like to acknowledge the Max Planck Society for funding, and the Max Planck Computing and Data Facility for their help and access to the clusters. And just to mention: our code is available on GitHub at github.com/bio-phys/mdbenchmark, where you can find a README and several links, also some quick installation guides. Or, if you want to dive into all the fine details, you can head over to our homepage with the documentation, docs.mdbenchmark.org. With this, I'd like to thank you all for your attention, and if you have any questions, I'm happy to answer them. Thanks.

Thank you very much indeed, a really interesting talk, and I think that looks like a really nice tool. So I'm happy to take questions from the floor, if anyone has any questions about the tool, and while those questions are coming in, I can make a start.
Just out of interest, you mentioned Slurm, LoadLeveler, those queuing systems that you do support. How much effort is it to support additional ones? I'm thinking of PBS, for example; I don't know if you've got any familiarity with that.

I don't have any familiarity with that, but it's basically none at all, it's a few lines of code. We just look for the executables, whether they're available, and if they are, then we use those. So if there's any need for people to have different queuing systems, it's a no-brainer to add these. Those three were basically the systems that are used on the clusters of our institute and of the society, so those were our first cases.

Okay, good. So I've got a question in here, actually, from Rossen from BioExcel. Rossen's question was: how difficult is it for users to add their own MD engine? So if a user wanted to extend the code, I know you said you're very open to doing it yourself with things like NAMD, but if somebody wanted to make the change themselves, is there documentation, or an extensibility point, or anything?

In a sense, there's no documentation about that, because we're actively hoping for people to get in touch with us, so that we can integrate those engines into the code itself. From the way it's currently implemented, it's fairly straightforward: if you have a look at the GitHub page, there's a specific way, I hope, that you can add an engine, which basically requires MDBenchmark to know what kind of files are expected by the engine to run a simulation, and also where to find the performance results. More or less, that's all, so it's fairly straightforward, even for people who have never done any coding, to contribute in a sense. Again, if anyone is interested in doing anything like that, we're grateful for any help, and we can also guide people along in adding their own engines into the project, so that the whole community can profit from that.

That's great, yes, thank you. I have another question, but I don't want to get in the way of the people from the floor; if anyone does have a question, please type it in. So the other thing I was going to ask, and when you show somebody something nice they always want more: the example there, the graphing and the analysis, was based on a single benchmark figure, nanoseconds per day. If you wanted to plot other things, like the runtime, or the speed-up, or the parallel efficiency, had you thought at all about incorporating other kinds of metrics that could be plotted for performance?
It's a very good question, actually; we have not yet. The only thing we have thought about was to extend the plots with a bit more information about the different systems. Maybe it would be interesting whether the system was a fully soluble system, or, in biological applications, there could be membranes, which could slow down the benchmarks and explain poor performance. We have not thought about other ways of plotting the results, but it's actually a fairly good point, so that's something to think about.

Yeah, cool. And related to that as well, one piece of functionality I thought would be nice: instead of having a sort of linear increase in the number of nodes, 1, 2, 3, 4, 5, 6, I think it's quite common to do scalability tests where you look at 1, 2, 4, 8, 16, 32, 64. So either having an option for that kind of doubling each time, or an option to type in which sizes you want; I think that would be nice.

Right, this is something that we could also add. These are the small things that we haven't thought about yet. There was something that I tried for the plotting, where I tried to come up with use cases for options that people might want to have, for example changing whether you want to have the fit or not, so that it's as easy as possible to generate something; an option to specify a step size goes in the same direction.

Yeah, so the step size is interesting for the plotting, but also in terms of not wasting the runs. Yeah, definitely. So, no, I think I can definitely see the real value in a tool like this, so I wanted to wish you the best of luck in continuing to develop it, and thank you very much for presenting it to us here today. I don't think we have any other questions from the floor. If anyone does think of a question they want to ask later, or if indeed you're watching this video on YouTube and you have a question, we have the BioExcel forums at ask.bioexcel.eu, so you can start a question there, and if one of us from BioExcel spots your question, we can pass it back to Michael to follow up on. So if we don't have any other questions today, I'd like to thank our speaker one last time, and we will hopefully see you again for the next BioExcel webinar very soon; keep an eye on our website and you will see when that is. Thank you all very much for coming along, and we will speak to you again soon. Goodbye.