Hello and good afternoon. Welcome to the latest webinar in the BioExcel webinar series. Today we're going to be talking about GROMACS and its latest release, so the title of the talk is "GROMACS 2018: Overview of the New Features and Capabilities". My name is Adam Carter; I'm one of the members of the BioExcel project, so I'm going to take a couple of minutes to very quickly say a few things about BioExcel, and then I'll hand over to our main speaker today, Mark Abraham, to present the rest of today's webinar. Just a quick note to let you know that this webinar is being recorded, along with the question and answer session at the end, so you should be aware of that. It also means that you'll be able to catch up with it again later on YouTube and on the BioExcel website; we post it there after the webinar. So I've got one slide here to give a very high-level overview of what the BioExcel Center of Excellence is all about. There are three main strands to what we're trying to do. One is excellence in biomolecular software: we have three important biomolecular codes that we support in this project, and we're trying to improve the performance, efficiency and scalability of these codes. We have GROMACS for molecular dynamics simulations, which you'll obviously hear more about today; we have HADDOCK for docking; and we are also involved with CPMD for QM/MM calculations. The second strand of the project is usability. The idea is that it's all very well having powerful, scalable programs, but these programs also need to be usable, so we have various strands of the project looking into how to make biomolecular research software easier to use. One of the key approaches we're taking here is to integrate these codes into workflow environments, along with some data integration aspects as well, so that is another important aspect of what we're looking at. The final part is that we really aim to be user-driven in what we do, so we say here "excellence in consultancy and training". That basically means that we want to be the place people come to when they want to find out about all things related to biomolecular computation, particularly at scale. What we intend to do is promote best practices and train end users. One of the ways we do this is by supporting this webinar series. Another thing we do is run interest groups in various different areas, so if any of the interest groups on these slides look like they may be of interest to you, you should visit the BioExcel website, and from there you can see how to join any of these interest groups. Just a slide to let you know that we will probably be saving most of the questions until the end of today's webinar; it just makes it run a bit more smoothly. But you're able to type in your question at any time. There's a question box on your normal GoToWebinar control panel; it will look something like this, maybe a little bit different if you're not a host. You can type your question there, and at the end I will invite you to ask your question to the speaker directly if you have a microphone; if you don't, then I can read out the question to Mark and he can answer it that way. I'll remind you about that at the end, but you can ask a question at any time by typing into this box here.
And if you're watching the recording of this later on YouTube and you have questions about the webinar, the place to ask them is the BioExcel forum at ask.bioexcel.eu. If you post your question there, we will do our best to answer it as quickly as we can. Okay, so today's speaker is Mark Abraham. He's from KTH, the Royal Institute of Technology, and he is the development manager of GROMACS, which, as you're probably aware, is one of the world's most widely used HPC applications. His role there is as a core developer as well as development manager, and within the team he focuses on modernizing the code base to be a sound platform for fast, flexible and free molecular simulations in a rapidly changing world. I think the fact that things are always changing is quite important to the work that BioExcel is supporting with these codes. Mark is in charge of the strategic plan and the release management, and he also coordinates the global team of developers from his base in Stockholm. So Mark is an ideal person to tell us about what is new in a new release of GROMACS. At this point I will hand over the presentation to Mark. Mark, I'm about to make you the presenter, and then you can take over from here. There you go, Mark. Right, we can see your slides but we can't hear you yet, so you'll have to unmute yourself.

Thank you, Adam, and thank you for that kind introduction. I am Mark Abraham, one of the core developers of GROMACS here. Our topic for the day, as you all know, is an overview of the new functionality and capability that we have in the latest release of GROMACS. That's been out now for two months, so we've already started to get some feedback from users about the things that they like about it and the things that we can improve further in future. We're very keen to hear from you about the kinds of things that you like about what we have done and what you would like us to do for later releases of GROMACS. So, a quick overview of GROMACS's capabilities for those of you who may not be regular users of GROMACS yourselves. GROMACS implements molecular dynamics simulations of biomolecular systems. That involves setting up a description of, typically, a biochemical system in atomistic detail. Every individual carbon or hydrogen or oxygen atom, for example, gets assigned a position, it has a mass, and it interacts with the other atoms, perhaps via charge-charge interactions, perhaps via bonds. All these kinds of physical interactions with all the other atoms in the system model what happens in the real physical environment that is the object of the scientific study.
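To give a flavor of what that atomistic description amounts to, here is the generic functional form of a typical biomolecular force field; this is a textbook sketch, not a GROMACS-specific equation, and the force on each atom is the negative gradient of this potential:

$$ U(\mathbf{r}) = \sum_{\text{bonds}}\tfrac{1}{2}k_b (b-b_0)^2 + \sum_{\text{angles}}\tfrac{1}{2}k_\theta (\theta-\theta_0)^2 + \sum_{\text{dihedrals}} k_\phi\bigl[1+\cos(n\phi-\phi_s)\bigr] + \sum_{i<j}\left[\frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + 4\epsilon_{ij}\left(\frac{\sigma_{ij}^{12}}{r_{ij}^{12}} - \frac{\sigma_{ij}^{6}}{r_{ij}^{6}}\right)\right] $$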
In this particular example, we're looking at the GLIC ion channel, which is actually a model of the signaling behavior that takes place in all of our nerve cells. The ion channel has a key role in permitting ions to go from one side of the cell membrane (in teal) to the other only when the overall function of the nerve cell actually dictates that's a good idea, and there are lots of interesting mechanistic details that we'd like to be able to study here. That's very difficult to do in the laboratory, so it motivates doing simulations with GROMACS, because we are able to give the scientists fine control over how to set up the initial conditions, how to add maybe a drug that might modify the behavior of these ion channels, maybe an antibody that binds to it to change its behavior. All these sorts of things can be studied in a simulation, sometimes more easily than in an experiment, and often more quickly and less expensively too.

So, just a quick word about how we manage the releases of GROMACS. We aim to deliver one release per calendar year, and as I said a moment ago, we got the first release of GROMACS 2018 out in January this year. Just recently, in fact, we brought out the first patch release, which fixes a few minor issues that several of the developers, and indeed many of the users, brought to our attention. So thanks very much for that feedback; we're very keen to listen to your experiences to try and make life better for us all. We'll continue maintaining that release for the remainder of this year, and we'll address any and all issues that come up. We have another, older branch that's still under active maintenance: the one we released as GROMACS 2016. So even though it's 2018, we'll continue to make releases from it, incrementing the .x at the end to indicate that what we're doing is only making the fixes that come to our attention that affect scientific correctness, for want of a better word. If mdrun doing a particular simulation, or an analysis tool, has been producing some not-quite-right numbers, these sorts of things we will fix, but we won't make any other changes. The hope is that by the end of its lifetime there are no known correctness bugs in 2016. Unfortunately, that means there are a lot of older branches of GROMACS that aren't actively maintained anymore, so those of you who might be using GROMACS 3.x or 4.x or 5.x are strongly encouraged to update to more recent versions. That's because, unfortunately, those older implementations had issues that have since been discovered. Many of the techniques that we're implementing here are themselves research topics, and how best to implement them is also an open question, so we get things wrong from time to time. The best way for you to get the best performance on the new hardware that you likely have available, and also to take advantage of all the changes that we've made to improve the behavior, is to stay up to date. Our new numbering scheme, where the year of the release is the major version number, will hopefully keep people better in touch with how the progress of the code has gone. Really, if you're using a piece of code that's more than five years old, you should strongly consider using a more recent version. We plan to release the GROMACS 2019 version very early in the new year, so if you're a community member with perhaps a modified version of GROMACS, or a feature you'd like to contribute, we're going to need to hear from you by August or September. We'd definitely love to have code up and able to be reviewed by the core developers by the start of September, but preferably get in touch long before then, because often these things need careful consideration, design and negotiation so that we can deliver code on time that works well and that everybody wants to use. So, in GROMACS 2018 we have a number of interesting new simulation features.
One of the high-impact ones is the implementation of the accelerated weight histogram (AWH) method. I'm not going to talk in detail about that one here today; we might give a future webinar on it, because it's a very interesting implementation of, you could think of it as a fancy form of umbrella sampling, a method that permits a free energy calculation to very efficiently sample many different parts of a reaction coordinate, somewhat like metadynamics. We're hoping that people will find that one particularly useful, but please do give us feedback, as always. Some of our colleagues in the US, Pascal Merz and Michael Shirts, have done some very interesting work implementing physical validation tests. I'll speak a little more about these later in the webinar, but basically these provide another layer of confidence that the community can have that an individual researcher's science has been done well, and that the code used to do it has been implemented correctly. So this adds another layer to the overall impression of quality that we can have about the molecular dynamics simulations that have been done with these software suites. Berk Hess on the core team here at KTH has also improved the way that some of the integrators report a conserved quantity. Various of the coupling algorithms do physical work upon the system, and that work has an amount of energy associated with it which we can accumulate over the lifetime of the simulation, so that we can have a conserved quantity not just in an NVE ensemble but also if we have temperature or pressure coupling. Several of the integrators already reported these, and Berk expanded those to include the Berendsen pressure and temperature coupling algorithms and also the Parrinello-Rahman pressure coupling algorithm. Hopefully that will allow people to run very simple tests themselves to observe that conserved quantities are in fact conserved as well as they would like, but we strongly encourage people to also consider the more rigorous kinds of tests that Pascal and Michael have implemented; more about that in a moment.

As Adam foreshadowed in his introduction, we are also working very hard on trying to make these quite complicated pieces of software more usable. We have been greatly expanding the documentation; Paul Bauer in particular, together with Vedran Miletic, has been working quite hard on that. There are lots more developments in the pipeline for that, including a revamp of the GROMACS website, so we should have very good things there for GROMACS 2019; please watch this space. One piece of feedback we had from users via our fora was that it was sometimes difficult to understand how to specify where atoms were restrained in space by position restraints. Our pre-processing tool grompp (pronounced "gromp" or "grom-p-p", depending which school you belong to) would always take the restraint coordinates from the file you'd passed with -c, and optionally you could override that with -r, but people didn't understand very well that this option existed. So we made -r required. It's very easy for you to just give the same file again if that's in fact the behavior that you want, but overall I think this will be more usable for everybody, because now you have explicitly said "I am restraining the positions to these coordinates", and when you're extending a simulation it will be clearer that you can pass to grompp, via -r, the positions you actually want; something like the sketch below.
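As a concrete sketch (the file names here are just placeholders; the .mdp file is assumed to enable position restraints), a grompp invocation in GROMACS 2018 now looks something like this:

    # Positions to restrain to must now be given explicitly with -r;
    # passing the same file as for -c reproduces the old default behavior.
    gmx grompp -f md.mdp -c conf.gro -r conf.gro -p topol.top -o topol.tpr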
We also made some minor improvements to some of the helpful messages that grompp sometimes gives, including the one about checking the total charge of a simulation system; there were sometimes some rounding issues associated with reporting that, which we improved. Teemu Murtola, one of our long-time contributors, also added support for dynamic selections in the new gmx trajectory tool, which allows you to take your trajectory file and get plain-text coordinates from it. With a dynamic selection you could say "I would like all the atoms within 2 nanometers of this particular group or this particular atom", and you can get a text file out with all of that. My hope is that we'll be able to put more time in this year to porting more of the analysis tools to this kind of functionality, which will be an exciting way to have highly usable analysis tools in the future; Teemu has also been great in supporting us there. Something like the following sketch shows the idea.
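A minimal sketch of that kind of invocation (file names are placeholders; the -ox coordinate-output flag is assumed here by analogy with the older gmx traj tool, so do check gmx trajectory -h for the exact option names):

    # Write plain-text coordinates for all atoms within 2 nm of the ligand,
    # re-evaluating the dynamic selection at every frame.
    gmx trajectory -f traj.xtc -s topol.tpr -select 'within 2.0 of resname LIG' -ox coords.xvg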
We've also gained a number of performance enhancements, which I will spend the majority of the time talking through, so that people understand how best to take advantage of them. They include porting the long-range part of the PME algorithm to run on GPUs, which is an exciting new development. There have also been numerous algorithmic and implementation improvements in the construction and management of the pair lists that are necessary; I'll talk more about those shortly. Berk Hess, supported by several others of us here, added more support for running on the CPU using the SIMD capabilities of modern hardware to their greatest capacity, to make sure that the traditionally less time-consuming part of the MD simulation, which is where, having got the forces, we have to calculate updates to the velocities and the positions, also runs fast. Traditionally this didn't take much time, so no one bothered making it run fast; however, we've done so much good work on all of the force calculation parts that we now need to make sure that the update and some of the bonded kernels are also getting the best possible performance. So we're working on multiple parts of the code.

Just a brief reminder for those of you who haven't perhaps reflected on the implementation details of molecular simulation as much as some others: the critical part for all of these kinds of molecular dynamics packages is to calculate the non-bonded interactions. Here we're just considering the short-ranged ones, where we have a single atom that we're currently considering and we want to make sure that we consider all of its through-space interactions with all of the neighbors within a characteristic interaction radius. We might have different radii for the Coulomb interactions and the van der Waals interactions; sometimes they're going to be the same. That's really up to the developers of the force fields. GROMACS implements all of the common flavors of force field design that people have come up with; it's one of the attractive features of doing simulations in GROMACS that we support the whole modern range of force fields. However, the construction of the overall simulation is quite inefficient if at every time step you compare every particle's distance from the particle whose forces you want to compute. What we'd rather do is make a list of all the particles that are close enough, and only go through that list calculating interactions for all the items on it. However, the computation of that list tends to be fairly time-consuming, so we'd like to do it once and then reuse it many times. But we are doing a dynamical simulation, so particles move in and out of that list over time. So we actually need to make the list radius rather larger than the interaction radius, so that we can be statistically confident that over the lifetime of that list no particle out here has gotten in there, or vice versa, such that we would be calculating interactions inappropriately.

Now, in GROMACS ever since version 4.6, our Verlet scheme, which has been the default since GROMACS 5.1, groups the particles together for computational efficiency. So in fact we no longer consider a single particle the unit of computation, but rather a group of particles; here in the illustration a group of four is chosen, but depending on the piece of hardware that you're running on this would be four or eight. We still do the actual physical computation in terms of the individual interactions, but we group them together, so that rather than look at the interaction of this atom with that atom, we look at this cluster of four atoms with that cluster of four. We still don't have this atom interact with any of these central ones, because it's outside the interaction radius, but this gives us a version of the algorithm that is much easier to implement efficiently. We do have to do a little bit of extra work computing numbers that turn out to be ones we need to throw away, but overall the efficiency improvements from expressing the problem this way are much, much higher. Physically it's exactly the same, but the implementation is much better if we cast it this way. What this means is that out here, when we're constructing the neighbor list, we often have many of these clusters that have only a couple of their atoms within the radius, so we even get some effective additional buffering outside this list, which is also valuable: even though we say our interaction list is this wide, effectively it's a little bit wider. So that's a subtle aspect of how things work; more about this anon, when we come back to some of the improvements we've made in how we manage these lists behind the scenes.
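To put the buffering idea just described in schematic terms (this is a sketch of the reasoning, not the exact formula; in practice GROMACS sizes the buffer automatically from the verlet-buffer-tolerance setting, which bounds the allowed energy drift):

$$ r_\text{list} = r_\text{cut} + r_\text{buffer}, \qquad r_\text{buffer} \sim 2\,\sigma_v\, n_\text{list}\,\Delta t, \quad \sigma_v = \sqrt{k_B T/m}, $$

where n_list is the number of steps a list is reused for and Delta t is the time step; the buffer just has to cover the distance two particles can plausibly drift towards each other during the lifetime of the list.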
To remind you of the set of different sorts of work we have to do inside GROMACS, work that we spend a lot of time making faster and faster so that you can have your scientific results sooner and can sample more widely for greater scientific validity: a major part of the work historically has been the computation of these short-range interactions. They then have to get reduced with other contributions that might exist. In this case we're only looking at a computation on a single CPU; even if it's running on multiple CPU cores, whether via MPI or OpenMP doesn't really matter, the work is roughly the same. Once we've got the short-range work done, we move on to the bonded interactions, so these are all your bonds, angles and dihedrals, typically, and we accumulate those into the force buffers. Then we go and do the various components of the work that go into the long-range part of the PME algorithm; this is the de facto standard for treating the electrostatics within these biomolecular systems. Having got all those components of the force computed, we reduce them all together so that we have, for each atom, the total force that it should experience. We then use that force to go through the update phase, perhaps using virtual sites if you're using a model that implements those; we put those forces into Newton's law to determine the acceleration, so we know how to update the velocities of the atoms, use the updated velocities to update the positions, perhaps having temperature or pressure coupling applied in the meantime, and then we loop back to calculate the forces from the new positions. So this is the fundamental algorithm that everybody's molecular dynamics implementation follows: different ways, different algorithms, different flavors, but fundamentally everyone is doing this kind of operation.

Ever since GROMACS 4.6 we've had the ability to offload the short-range, or particle-particle, interactions to a GPU, and that has given us a dramatic ability to improve performance, because not only are we able to do some work on the CPU, we're able to do work on the GPU in parallel with that. This concurrent execution means that the overall length of time illustrated in this flow chart is now much shorter, because we're using more hardware. It does, however, mean that the execution of the code now has two different parts, which means it's more complicated for us to implement and sometimes a little bit harder to use. We're working hard on making that better, but those improvements will have to wait until at least GROMACS 2019. The good part here is that we're able to use the separate CPU and GPU resources concurrently, and many of you will be familiar with this workflow; we've had it available for quite a few years now.

One of the critical new pieces of functionality we've introduced in GROMACS this year is the ability also to offload the long-range part of the PME algorithm to run on a GPU. This will only run on a single GPU for the PME component at this time, though we may improve that in future, depending on how we prioritize once we've got some experimental implementations done. So this allows us either to offload both the particle-particle and the long-range work to the same GPU, or potentially to different GPUs. There is also a possibility to use multiple GPUs if you use domain decomposition for the short-range interactions on multiple ranks, plus a separate PME rank that targets just one GPU for the long-range component, so there's quite a bit of flexibility in how you can use the code, and we hope to improve that in future. Meanwhile, the CPU is only doing the small amount of work that is typically associated with the bonded interactions, some of which got those recent SIMD improvements, and perhaps any extra work that might be going on if you're using umbrella sampling or doing a free energy calculation; we have some other workloads that go in here. One of the big advantages of introducing PME on GPUs is that we need much less CPU resource, and so it can be much more cost-effective for GROMACS users to procure hardware: you now need a much less powerful CPU for an equivalently powerful GPU. Then we put all the forces together again, which is just a matter of doing the accounting correctly, and go to the update phase as before.
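As a minimal sketch of how this new offload mode looks on the command line in GROMACS 2018 (a machine with a single GPU is assumed, and -deffnm md plus the file names are just placeholders):

    # Offload both the short-range non-bonded work and the long-range PME
    # part to the GPU, leaving mostly bonded work on the CPU.
    gmx mdrun -deffnm md -nb gpu -pme gpu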
So, moving back to these pair interactions. What we've done here, through some very elegant algorithmic and implementation work by Szilard Pall and Berk Hess, is to have an inner list that works roughly the same as before, but we also have an outer list with a much longer lifetime than we were able to afford in earlier versions of GROMACS. What we do is run a very quick computation periodically during the simulation, every couple of steps, to find out which of these clusters has come inside the inner radius, and we update the inner pair list, which is the one we actually compute all the forces on. What this does is give us an effectively much longer neighbor-list lifetime, which is great because pair searching costs CPU cycles while we've got our expensive GPU sitting idle, and it removes from the user the need to tune this parameter for optimal performance; we can take care of that internally now, which improves the usability of maximal performance in GROMACS, one of our key long-term objectives. We would love for people to be able to just run gmx mdrun and have it work out by itself how to run best on the hardware that you have available. That's a big ask, of course, so we're doing it in stages. This slide is an illustration of the workflow there: we now have some extra kernels that run to take the outer pair list and prune it down to the inner pair list, which then gets computed on during the non-bonded force calculation. While the update is going on, particularly if we're on GPUs, we're able to run kernels in what was otherwise wasted time to do this pruning of the pair lists, so that we can find out whether any particles have come inside that inner list over the recent update stages. So this is a quite elegant way of using otherwise unused compute time to boost overall performance and efficiency quite substantially.

This is illustrated here, comparing the dynamic pruning setup running on GPUs with the old-style approach, where one would have a fixed pair-search frequency that you would set up. Running a moderately sized system on 56 MPI ranks, performance with the old scheme would reach a peak at a pair-search interval of around 15 to 20 steps: we're trading off the cost of having a much larger list, which a longer pair-list lifetime requires because you have to account for a much greater potential for particles diffusing, and beyond that point the overall efficiency starts to go down. However, with the dual pair-list setup with the dynamic pruning implementation, we get much more even performance across a wider range of pair-search intervals, and we can do the full pair search much less often too. So that's a very elegant dual pair-list scheme; Szilard Pall and Berk did the primary work there, coupled with the PME-on-GPU implementation, whose chief architect was Aleksei Iupinov, supported of course by Szilard and the rest of the development team here, including Berk and me.

So here we have some of the fresh GROMACS 2018 performance numbers. What we have here are comparisons between two different simulation setups, comparing consumer-grade GPUs paired with reasonable CPU hardware, on the ion channel from that introductory slide I showed earlier; this is quite a characteristic simulation size that a biomolecular simulator might use. What we have here is a plot of the number of CPU cores that were used against the simulation throughput in simulation nanoseconds per day. We're just looking at the GROMACS 2018 capabilities now, so that's including the improvements to the SIMD update and bonded kernels and the rolling pruning and all those sorts of nice things.
What we can see here is that as we add cores to the version where the PME calculation runs on the CPU, performance gets steadily better, which is nice. Here we're using a GTX 1080, which is one of the more powerful consumer-grade GPUs currently on the market. However, if we use the same GPU setup and instead send the PME work to the GPU, using the new feature available in GROMACS, we get an immediate performance boost that's quite dramatic, and we also reach peak performance with rather fewer cores. This is one of our key development targets for the future: to need fewer CPU cores for equivalent throughput performance. Already we can see here that just by turning on PME on GPUs we can almost get away with having 16 cores; you can definitely get away with having 8 cores, and even 4 would give you very cost-effective performance. If you move to a more powerful GPU, that effect is even more pronounced: here in blue we have the CPU-only version of PME, and if we move to the GPU version of PME we again see a dramatic performance benefit going from the blue line to the red line. That not only has a higher peak, because we're doing more work on a more powerful GPU this time, but the fraction of the achievable peak of that GPU you can reach with relatively few CPU cores has changed in a very favorable way too.

Looking over time, we can see that our CPU implementation has continued to improve through lots and lots of small changes; ever since GROMACS 5 we have been delivering steady, small improvements there. However, the capabilities of the newer hardware, the GPUs particularly, and our implementations on them have given us more dramatic improvements in recent times. We can see that when we're using both a CPU and a GPU we've had a similar but more dramatic trend in our performance over time, as we go from GROMACS 5 to 5.1 to 2016 to 2018 over a period of about four years. However, if we go to GROMACS 2018 and move the PME calculation to GPUs, we get a dramatic benefit. So, from our point of view, this was well worth our effort, and we're going to be working hard in the future to add more feature parity to the PME-on-GPU implementation, to support a wider range of hardware and setups on the GPUs, but also in a way that keeps all the benefits available on the CPU. We are working very actively to make sure that we have fast CPU performance not only on the high-performance cores from Intel but also on those from AMD, and indeed upcoming ones from ARM. We're very keen to make sure that we continue to have strong performance portability across all the hardware, because we literally do not know what hardware we will be running on in five years' time, so we want to make sure that your effort in learning how to use GROMACS and run it well will continue to pay off no matter what kind of computer hardware you have available in the future.

So, moving to a quite different topic: one of the other exciting features we have in GROMACS 2018 is the introduction of a physical validation testing suite. Pascal Merz and Michael Shirts have done a lot of work on this.
What I'm illustrating here is actually from some earlier work by Michael Shirts, showing that if we run two simulations at neighboring temperatures, maybe a couple of degrees apart, we might expect to see quite different distributions of the potential energy. These follow a not-quite-Gaussian distribution, of course, because we have limited degrees of freedom, but we expect a degree of overlap if we've chosen our temperature difference suitably, such that the probability distributions of the energy of the system overlap in a given way. Now, this particular plot by itself doesn't tell you very much, because if the simulation was doing something inappropriate the shape would only be subtly off, and it's very difficult to tell by eye whether, for example, the dotted line or the dashed line is a faithful representation of the correct distribution. But when we run two simulations at control parameters that are slightly different from each other, we can take the histograms that we form from these probability distributions, take their ratio, plot that on a log scale, and see whether the result (a) is linear and (b) has a slope that is characteristic of the temperature difference. This turns out to be a very valuable test of whether the simulation algorithm is in fact sampling what you, in your MDP file, or indeed in any other MD package's inputs, have actually asked for. This is an old result from a paper of Michael's from a few years ago, illustrating that, on the left, the Berendsen-style weak-coupling temperature-coupling algorithm produces a distribution that's roughly linear but has a slope that is very different from what you would expect for the two temperatures that we chose for this pair of simulations. By contrast, the Nose-Hoover temperature-coupling scheme within GROMACS shows the expected linear trend when you've done enough sampling, and it also has the correct slope. So we can have much higher confidence that the Nose-Hoover implementation is implemented correctly in GROMACS, for this particular simulation system, than the Berendsen weak-coupling implementation. This of course reinforces the well-known result that the kinetic energy distribution produced by the Berendsen algorithm is rather different from what you would expect for a properly sampled canonical distribution, and it's instructive to see that the wrong kinetic energy distribution it produces leaks into the potential energy distribution, which is what goes into the inputs here. Equipartition is working as expected, so we get a wrong distribution of potential energy not because we're computing the energies wrong, but because we're getting the kinetic energy distribution wrong. These kinds of subtle issues are among the things that keep the developers of MD packages up at night, trying to make sure that none of them creep in.

There are many other kinds of physical validation tests that they have incorporated. The kinetic energy distributions are another one that can be studied, because we know in advance that velocities should follow a Maxwell-Boltzmann distribution, and there are well-known statistical tests that allow us to identify whether or not the distribution of velocities that we in fact observe, for any sub-part of the system, is being correctly sampled. So there will be interesting work, I'm sure, in future, finding any corners of GROMACS where particular combinations of algorithms haven't been fully thought through, and indeed testing all of the other MD packages out there in the field.
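To make the ensemble check just described concrete (this is the standard derivation behind the test, sketched from the canonical ensemble, not a formula specific to this implementation): since each simulation samples P_i(E) proportional to Omega(E) exp(-E/k_B T_i), the unknown density of states Omega(E) cancels in the ratio of the two histograms, so

$$ \ln\frac{P_2(E)}{P_1(E)} = \left(\frac{1}{k_B T_1} - \frac{1}{k_B T_2}\right)E + \text{const}, $$

which is linear in E with a slope fixed entirely by the two chosen temperatures; a statistically significant deviation from that line or that slope indicates sampling from the wrong ensemble.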
Our hope is very much that you as users will have in your hands a test suite, so that rather than writing in your paper "here are the MDP settings I used in this combination" and relying on you and your colleagues, and perhaps the peer reviewers, to catch any errors, we can in fact have an automated suite that says: yes, your integration, set up with all of your different choices of parameters, does produce a sampling scheme that passes all of these validation tests. So we can move to having these tools available in your hands, rather than relying on the expert know-how of a relatively small number of researchers around the world. This is already available for GROMACS 2018: you can use it to verify several interesting properties, like the aforementioned ones, and also, for example, the fluctuations of the potential energy, which should scale with the size of the simulation time step. This gives you a way to have confidence in the quality of the implementation of the integrator: if we can show that the fluctuations in the potential energy do scale with the size of the time step in the expected quadratic way, that is a very good check for us. We will now be checking all of our future releases of GROMACS against both the current and future tests in this physical validation suite, so you can have confidence that what you get in the simulation package is what it says on the box.

So that brings us to the end of the formal material that I have prepared. Adam will now lead the audience Q&A session; are you there, Adam?

Yes, thank you Mark, that was a nice overview of everything that's new, so thank you very much for that. As Mark said, we are now open to taking questions from the floor, so if you do have a question it would be great if you could type it into the question box now. I can see we already have some questions there, so we are going to make a start on those, and you can keep typing questions in as we go along. In the first case we have a question from Joakim. Joakim, what I'm going to do is open your mic in a moment; if you have a microphone you can ask your question directly, and if you don't want to do so you can just remain silent and I can read out the question. Having said that, I now cannot find your name in the list of attendees, so let me read out the question just in case you dropped off; you can always hear the answer in the recording later on. The question that Joakim asked was: what do you think about the recent Ryzen CPUs for GROMACS simulation? Any tips to optimize systems under Ryzen?
We've certainly been actively following the developments on AMD's front; we're very keen on having a competitive hardware environment, with different vendors to sell to us, and we're very keen to make sure that GROMACS runs optimally on all of these pieces of hardware. We have made a considerable effort to make sure that our CPU-side SIMD usage is well suited to running on Ryzen, so we absolutely encourage people to consider both the latest Intel and AMD CPUs for their GROMACS simulations. There's not really a special trick to optimizing systems under Ryzen; it's essentially the same. One tip for the unwary is that those of you who used older AMD CPUs may have gotten used to the concept that there's one hardware thread per core; the Zen architecture has two hardware threads per core, much as Intel CPUs have had for quite a long time. So in a sense that got a bit easier; it's just different if you're used to AMD only. But really, yes, we strongly encourage people to go and buy Ryzen. You probably need a couple more Ryzen cores for the same performance you would get from Intel's high-end cores, but any kind of direct numerical comparison is quite tricky here; you have to consider issues of total cost and power and all these kinds of things. So yes, absolutely go and use AMD's new CPUs, they're great, and GROMACS will run great on them.

Thank you Mark. Okay, so the next question: I'm not entirely sure I completely understand this question, but I'm going to ask it as it is, Mark, and then maybe you can clarify if you need to. The question was: what would you recommend to optimize a system of 60,000 atoms under Ryzen and a GTX 1070? So I guess, given that hardware, what's the best way to set up your simulation?

So, 60,000 atoms under Ryzen: there are two ways to interpret that, namely how many CPU cores you would buy in the socket to pair with that GPU, and indeed, once you've got that, how you would run the simulation. That too is a question that unfortunately doesn't have an easy answer; this is one of the reasons why we want, in future, to do a better job of allowing mdrun to try a couple of different setups and tune this automatically, so that this sort of question doesn't arise. The first part is reasonably straightforward: I would definitely want at least 8 cores to go with that relatively powerful 1070. It's certainly not as powerful as the 1080, or definitely not the 1080 Ti, so you might be able to use a simple 4-core desktop-style CPU, but if you want to go for a server-grade socket there, with multiple GPUs, you definitely need the bigger multi-core Ryzen parts as well. As far as running goes, it's basically the same as running on Intel with the recent versions of GROMACS: you probably want to use the additional hardware threads that are available, with OpenMP, as that's likely to be good, and you can get performance benefits by running multiple MPI ranks sharing the same GPU; that's still a valid use case, both for Ryzen and for Intel. All the standard considerations apply, really; there's nothing special about Ryzen.
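A hedged sketch of the kind of single-node run just described (the option names are real GROMACS 2018 mdrun options, but the thread counts are an example for a hypothetical 8-core, 16-thread Ryzen with one GPU):

    # One thread-MPI rank, 16 OpenMP threads (both hardware threads per core),
    # short-range non-bonded and PME work offloaded to the GPU, threads pinned.
    # An alternative worth benchmarking: -ntmpi 2 -npme 1 -gputasks 00,
    # i.e. two ranks sharing the same GPU.
    gmx mdrun -deffnm md -ntmpi 1 -ntomp 16 -nb gpu -pme gpu -pin on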
Okay, thank you very much for that. So the next question I have, and I don't have a name against this one either for some reason, is: how to activate PME on GPU? So, is there anything special to know about activating PME using a GPU?

This is a difficult question, or rather, it is difficult for us to make the assignment work easily, because we don't have a good way of telling in advance how efficiently your simulation will run on your CPU and your GPU; it's a big, complex design problem that we are still struggling with. PME will run by default on the GPU if you have one GPU and you choose a PME implementation with interpolation order 4 and no Lennard-Jones PME, plus a few other minor restrictions; no free energy calculations, that's a big one, actually. It will run by default if it can, and if it can't run, the log file will provide feedback on why it is not running, so that you can consider whether that's an important part of your setup. If there are multiple GPUs, then whether we automatically run PME on a GPU gets a little less clear. If you would like to force PME to run on a GPU, you can do that: historically we had the -nb command-line option to say whether the non-bonded calculations should run on the CPU or GPU, and that's still available; to pair with that we now have -pme, which allows you to also say that PME must run on the CPU or the GPU. If you want to run PME on a GPU, you have to run the short-range non-bonded part on a GPU as well. If you're familiar with the -gpu_id flag, those details have changed a bit; I won't go into that now, because it's very difficult to talk about without a lot of detail (maybe we'll do a performance-optimization webinar in a month or two), but the nature of the information given by -gpu_id has changed, because we now have a more complex way of assigning work to the GPUs. So do have a look at the documentation for both -gpu_id and the new -gputasks option, which allow you to fully specify all of these kinds of things. If we now want to run PME on GPU on, say, a system that has several GPUs, the recommended way of doing that would be to set up 4 MPI ranks, for example, have 3 of the ranks do their particle-particle non-bonded work on, say, the first 3 GPUs, and have a separate PME rank: you want to use -npme 1 to then send the long-range part of the PME work to the fourth GPU. That would be an example setup; your mileage may vary, and you should certainly vary the different setup parameters to get the best performance.
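A hedged sketch of that example (a node with 4 GPUs is assumed; with -gputasks, each digit assigns one GPU task to a GPU, and the assumption here is that the particle-particle tasks of the 3 PP ranks map to GPUs 0-2 and the PME rank's task to GPU 3; check the mdrun documentation for the exact mapping rules):

    # 4 thread-MPI ranks: 3 particle-particle ranks plus 1 dedicated PME rank.
    gmx mdrun -deffnm md -ntmpi 4 -npme 1 -nb gpu -pme gpu -gputasks 0123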
Great, thank you Mark. So the next question is from Mohammed Asad, and Mohammed, I'm going to open your microphone in a second if you would like to ask your question directly. I can't hear Mohammed, so let me just read out his question. He says: can you please elaborate on the physical validation test update in terms of pressure and temperature coupling?

Thank you for your question, Mohammed. There's not much in the testing suite that is particularly sensitive to temperature and pressure coupling. If I might be permitted to read into your motivation: there has been a trend over time of comparing the quality of MD implementations by looking at their microcanonical ensemble, i.e. NVE, because that has a clearly conserved quantity, and there certainly has been literature published about whether or not this is a good idea and, if so, how best to do such comparisons. I won't go into that debate now, but the physical validation tests are intended to allow you to run your simulation however you want to, and then look at the observables afterwards to see whether you've got the things that ought to be expected given what you asked for. So, things like the fluctuations of the conserved quantity, whether that's the energy in NVE or something else in a temperature- or pressure-coupled simulation: whether the fluctuations in those quantities have the distribution that you would expect from the theory, given the symplectic integrator that we have. So there's not really a particular sensitivity to pressure and temperature coupling. What the implementation now gives you the ability to do is to run these kinds of tests in the same setup as your production run. You don't have to think "I can test NVE, but then I need to turn temperature coupling on; what if there's a bug in temperature coupling?" No: you've run your quality tests already on the same set of input parameters that you intend to run your production simulation on.

Okay, and there was a follow-up question: you mentioned also the update for reporting conserved quantities, and Mohammed asked for a little bit more explanation about what that update provided.

Okay. So, if you look at both the energy file and the log file, for simulation flavors that report a conserved quantity in whichever version you're looking at, you will see that there is an energy field that reports the conserved energy. If you are doing a microcanonical-ensemble simulation, then the energy itself is the conserved quantity. However, if you are, for example, using the velocity-rescaling thermostat, that has had a conserved quantity reported by GROMACS, with both leapfrog and velocity Verlet, for some years; the Nose-Hoover thermostat I think has had that for some years too, and likewise MTTK (I can't remember the names of all of the authors; Martyna, Tuckerman, Tobias and Klein is where the acronym comes from), which is a very high-quality pressure-coupling implementation. Unfortunately that one only works with a single MPI rank, so it is not very useful in GROMACS at the moment; we've got big plans to improve that. So the latest 2018 release added to that set the ability to report the conserved quantity for Berendsen-style temperature coupling, Berendsen-style pressure coupling and Parrinello-Rahman-style pressure coupling, particularly with the leapfrog integrator. So basically we just expanded the set of integrators and coupling algorithms that have a conserved quantity computed and able to be reported. Within your log file you will see that the energy section has an extra field if this is supported for the particular combination of algorithms you've chosen; if you find a combination that doesn't have one, please let us know on the mailing list, or on ask.bioexcel.eu, and we can let you know whether there are any plans to implement that, or perhaps how to get what is already implemented available to you.
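Schematically (a sketch of the bookkeeping being described here, not the exact expression GROMACS prints), the reported conserved quantity has the form

$$ E_\text{cons}(t) = E_\text{kin}(t) + E_\text{pot}(t) - W_\text{coupling}(t), $$

where W_coupling(t) is the work done on the system by the thermostat and/or barostat, accumulated over the run; up to the integration error of the chosen time step, E_cons should neither drift systematically nor fluctuate anomalously.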
Thank you Mark. We have a number of questions lined up, so I'm just going to read them out, rather than trying to hand over the mic, in the interest of saving time. The next question is maybe interesting in light of some of the plans that we have lined up for BioExcel. The question is: I was wondering if GROMACS 2018 can be interfaced with QM programs like ORCA or GAMESS for hybrid QM/MM calculations? If so, is there a particular program that works best with GROMACS 2018?

That's a good question for somebody other than me, actually; it has been a bit of a problem for a long time. We had interfaces to a number of QM packages contributed around GROMACS 4.0, and 4.5 was the last time these were substantially actively maintained. I don't have any direct experience, so I can't give you an immediate answer to that question, but I suspect that ORCA and GAMESS are not well supported. We did make some changes in GROMACS 2018 that removed support for the energy minimization and transition-state optimization possibilities within the QM/MM code; they were barely used as intended. This is one of the problems with integrating new features into GROMACS: as soon as you have feature A and someone adds feature B, people want the combination of A and B. That doesn't sound too bad when you have only two features, but when you have 40 features, suddenly the matrix of "does this support that combination" gets very uncomfortable. I can tell you that we have exciting plans for the future for improving the QM/MM functionality and support within GROMACS, some of it contingent upon BioExcel's next round of funding; we're going to look very carefully at having an implementation with CP2K and making that work very well. So fingers crossed we get the funding for that, but for the moment there's no CP2K support. Please do fire in a question, particularly on the gmx-users mailing list, where Gerrit Groenhof, our main QM/MM maintainer, is likely to see it and let you know what details he can provide.

Okay, thank you Mark. The next question is from Mathieu, and he asks: have you tried benchmarking the new professional GPU architecture, Volta, and if yes, can you compare it with the GTX 1080?

I don't have any numbers to hand, though Szilard has certainly done such comparisons, and yes, the Volta-grade hardware runs very well. It is considerably more expensive, so we would find it difficult to recommend for researchers procuring their own individual workstation or something like that; the value you get from the additional features of the professional-grade hardware is relatively low. We strongly encourage you to get the high-end consumer-grade hardware; the GTX 1080 Ti will give you significantly better price-for-performance. However, if you already have a cluster that's been provisioned with Volta-grade GPUs, because for example they run simulations using other packages that need some of the features available there, including for example double precision on the GPU, then by all means go and use it; it will work really well, no problem with that.

Okay, so the next question, from Flora: what about double precision calculations through the use of CPU and GPU? The concern, I think, is whether you'll be able to maintain the precision when introducing the GPU into the calculations.

Sure. We have two precision implementations within GROMACS: a fully double-precision one, and a mixed-precision one. Different MD packages also have various forms of mixed precision; I don't think any two packages actually have the same implementation of mixed precision, by the way. We do not have any support for GPU calculations running in double precision at this time, and we don't have any plans to implement that. Such implementations would make excellent use of the professional-grade hardware that has long been available from Nvidia, but we haven't seen a convincing analysis that says the value to the user from the extra computational cost is worth us doing an implementation that would run in double precision on the GPU.
We're certainly open to someone showing us that there is a relevant advantage to be had, and we're looking actively for those kinds of advantages all the time, but so far we haven't seen it, so we have prioritized the development resources such that, no, you can't use double precision on a GPU. However, there's a fully supported double-precision implementation on the CPU, so if you have a way to show that there is a simulation observable that is much better handled by running in double precision, we'll be all ears to hear about it, and we'll consider it very carefully when we're making our future implementation decisions.

Thank you Mark. So, it's two minutes to the top of the hour; I'll combine two questions into one, and then leave a minute at the end just to tell everyone about our upcoming webinars. So, the last questions that I'll take from the floor, combined, possibly from two different people: the first is asking whether there are any plans to create tutorials for the new GROMACS features, and the second is a general question about whether, for a CPU, GROMACS generally prefers more cores or more gigahertz.

We already have plans in place, particularly for the new AWH feature, to produce some new high-quality tutorials; we're very keen for people to try that feature out, so Viveca Lindahl will be working on that in future, supported also by BioExcel. We also have some other plans for updating our tutorial support over the coming months, now that we've gotten the 2018 release out of the way and have stabilized its quality with the first patch release. Yes, we should certainly do some more work on tutorials, and we would love to hear from you to find out in which areas you find there are gaps that you would like to have covered. So please do ask on ask.bioexcel.eu, or on the GROMACS users mailing list, about gaps that you find; maybe you'll find other users who will say "actually, there's this resource over here, please have a read of that". There are lots of high-quality tutorials and documentation out there, some of it supported by the GROMACS core team, some of it not. Please let us know what you would like to see, and we can put that out to the list.

There was the follow-up part: does GROMACS prefer more cores or more gigahertz? The answer to that is "yes". One of the issues is which particular kind of run setup you have, and whether the CPU is one of those that will dynamically change its clock speed, measured in gigahertz, according to the heat levels it has recently experienced. More gigahertz is valuable if you're running a simulation that's strongly limited by the CPU, particularly during the update phase, so that's valuable; but, as posed, the question doesn't have a single answer. It all depends on your simulation, your hardware, and indeed how warm it is on the day.

Okay, thank you very much for that. I'm afraid we have to bring the questions to a close for today; if you do have any other questions, do feel free to contact us, as Mark said, using ask.bioexcel.eu. Mark, if you wouldn't mind advancing to the final slide now, I'll give people a quick heads-up on the upcoming BioExcel webinars. We've got four in the pipeline at the moment. On the 11th of April we have a webinar entitled "MC-DNA: a web server for the detailed study of the structure and dynamics of DNA and chromatin fibers", from Jurgen Walther of IRB in Barcelona.
Then on the 18th of April, a webinar that might be of particular interest to this audience: we have a perspective on the Martini force field from the University of Groningen. On the 26th of April we have a presentation from BiKi Technologies entitled "Finding a tradeoff between speed and accuracy in protein-ligand binding description", and then on the 10th of May we have a webinar titled "High confidence protein-ligand complex modeling by NMR-guided docking: ensembles, early hit optimization", from Andrew Proudfoot of Novartis. So, a varied set of webinars there over the coming weeks and months; we hope you will be able to join us for some of them, and I hope that you found today's webinar useful. You should also have received emails about this webinar with a link that you can follow to provide feedback on today; we appreciate that as well. In the meantime, do keep in touch with BioExcel. Thank you for coming along today, and we'll hopefully see you again soon at one of these next webinars. Thank you.