Hello everyone, and welcome to the last part of the BioExcel workshop on Best Practices in QM/MM Simulation of Biomolecular Systems. Today is our panel session, which concludes the workshop. Some of you who are here will have attended the webinars of our individual speakers as part of the workshop; some of you may have just come to this panel session, so I'll just reintroduce the invited speakers who form the panel today. We have Maria Khrenova from Lomonosov Moscow State University and the Russian Academy of Sciences Research Centre "Fundamentals of Biotechnology". We've got Ulf Ryde from Lund University, Professor of Theoretical Chemistry. We've got Maria João Ramos from the University of Porto, who is in the Theoretical and Computational Biochemistry group. We've got Adrian Mulholland from the University of Bristol, who is at the Centre for Computational Chemistry. We have Janez Mavri from the University of Ljubljana and the National Institute of Chemistry, also in Ljubljana, and Carme Rovira, who is at the University of Barcelona and at the Catalan Institution for Research and Advanced Studies. All of these speakers' individual webinars in this workshop are available on our YouTube channel, and if you're interested and have not yet seen them, the slides are also available on the BioExcel website. As well as these invited speakers, we have my BioExcel colleagues and co-organisers of this workshop, starting with Gerrit Groenhof, who also gave a webinar as part of the workshop; he'll be chairing today's session and participating in it as a speaker. Then also my colleagues Emiliano Ippoliti and Dmitry Morozov, at Forschungszentrum Jülich and the University of Jyväskylä respectively, and I'm based at the University of Edinburgh. So with that, I will hand over to Gerrit to start the session with an introduction.

All right, wonderful.
So yes, thank you, Arno, for the introduction, and thanks to all the attendees for coming and attending this session. Of course my biggest thanks go to the panelists, who have already given a webinar and have now been found willing to also discuss best practices with you. I don't think it's really needed, but you have to start somewhere, so: my name is Gerrit Groenhof, and I will very briefly remind you what the purpose of the webinar series was and how we hope to conclude it today. So we had a series of webinars by the speakers, whose pictures you have already seen live a minute ago, and now they're here static, in a different configuration as you can all see; zero temperature, I guess. Together we have all been speaking in a QM/MM best-practice workshop series.

The first question when people get introduced to QM/MM is: why would you want to do QM/MM? I think for the audience as well as the panelists it's obvious that QM/MM provides us with a handle to study chemical reactions in condensed phases, allowing us to investigate biological matter at the atomic level, and that is something we need in order to understand not only how proteins work but eventually how organisms work. We have seen examples in this workshop of how QM/MM simulations can be used to understand a part of a bigger problem.

Now, due to what we call publication bias, only success stories tend to turn into papers. So you may think that QM/MM is a fantastic method, almost like a silver bullet, a magic bullet, but it is not, because it requires quite a lot of thinking beforehand. A friend of mine once said that if the amount of thinking that goes in advance of a simulation were as much as the amount of time people actually spend on computing stuff, the results in this field would be a lot better.
And therefore, before you can start a QM/MM simulation, you must be aware not just of the global picture of what you want to achieve, because that is hopefully clear at that point, but you need to think very hard about the smaller details. These are details that are often discussed in a little footnote, in a methods section or in the supporting information, where they are not directly accessible, and they tend to be outshined a little by the fancy results that are presented in the main paper. Yet it is these little details that make all the difference between success and failure. So what we hope is that this seminar series, as well as this panel discussion, helps you to decide whether or not QM/MM is the method of choice for the problem you wish to address and, once that is a yes, also helps you to set it up in such a way that the results are meaningful, to validate the results, and to make a really substantial contribution to the understanding of the problem you set out to work on.

Now, the webinars we have had: the organization consisted of these six webinars that we have all seen, actually seven including my own, and today the panel discussion. The aim of this whole series, as mentioned many times before, is to come up with best practices, to be able to write a so-called best practice guide, which in particular helps beginners, but also advanced users, to get into this field and to do a kind of sanity check on whether or not what you do is accepted practice. And of course best practice is something dynamic; maybe in 10 years we will think of best practices in a completely different way. The way to go here, we think, would be to first identify what the challenges are when you perform QM/MM simulations.
So what kind of challenges do you meet while setting up the QM/MM calculation? What kind of choices are there for which you don't know the answer a priori, for which you often have to test a couple of options and then decide on the best one? And what criteria do we use to decide which option, which set of parameters, which model system, which model Hamiltonian is the best one for you? All these are challenges, and all of them should have a best practice that people can rely on. What I hope, but don't actually expect, is that we can nevertheless reach some kind of consensus on these issues. That would allow us to make a very clear document that will be available from BioExcel, and if it's a really good document we might even consider trying to publish it somewhere, so that people can also access it when BioExcel is long gone.

Now some take-home messages from the webinars; what I wrote down are the most important ones. We have seen in the talk of Maria that you need to distinguish between a reactive and a non-reactive conformation, which is of course very important if you don't want to spend all your computational time on calculating the reaction barrier for a conformation that is not reactive in the first place. She has shown us that you can use the properties of the electron density, and in particular the second derivative, the so-called Laplacian of the electron density, to decide whether or not a configuration is reactive. Then the second seminar in the series was given by Ulf Ryde, and he explained to us the following; the speakers can interrupt at any moment if I put words in their mouths here.
It is okay, in order to optimize the structure, to use a smaller quantum system, but the moment you want to get the energetics correct you need to include a larger quantum system. He also gave a very clear specification of which residues to include and when: neutral residues up to a certain cutoff, and charged residues, maybe except the ones that are at the surface. And he gave a couple of very good guidelines on how to get converged energies for QM/MM calculations.

Then, I cannot see anymore what I wrote for Maria. Ah yes. One of the take-home messages of her lecture was that QM/MM is all about compromise, and I couldn't agree more. But in order to choose a compromise that everybody can live with, you need to do a lot of work. And very importantly, she also said something we should all know, but which is often forgotten; I see it around me, and I myself may be guilty of it as well. All the information you have, or in this case all the correct information, the experimental information that you can trust, all of that must fit with your model. You cannot selectively shop and say, ah, this paper fits kind of well with what I found, but this paper does not, so I ignore that paper. I think it's a very important message in order to guarantee that your results are meaningful and have something to contribute.
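As an editorial aside, the residue-selection rule attributed above to Ulf's talk (neutral residues within a cutoff, plus charged residues unless they sit at the surface) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Residue` record, the `select_qm_region` function, the residue names and the 4.5 Å default cutoff are all hypothetical placeholders, not values or code from the talk, and in practice the cutoff would be increased until the QM/MM energies converge.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    name: str        # residue label (hypothetical example names below)
    charge: int      # net charge at the assumed protonation state
    min_dist: float  # closest distance to the reacting atoms, in angstrom
    buried: bool     # False if the residue sits at the protein surface


def select_qm_region(residues, neutral_cutoff=4.5):
    """Return the names of residues to put in the QM region.

    neutral_cutoff is an illustrative placeholder, not a recommended value.
    """
    selected = []
    for res in residues:
        if res.charge != 0:
            # Charged residues are included unless they are at the surface.
            if res.buried:
                selected.append(res.name)
        elif res.min_dist <= neutral_cutoff:
            # Neutral residues are included only within the cutoff.
            selected.append(res.name)
    return selected


if __name__ == "__main__":
    example = [
        Residue("HIS57", 0, 3.0, True),
        Residue("ASP102", -1, 8.0, True),
        Residue("LYS200", 1, 15.0, False),
        Residue("LEU99", 0, 9.0, True),
    ]
    print(select_qm_region(example))  # ['HIS57', 'ASP102']
```

Repeating the selection with a series of growing cutoffs and comparing the resulting energies is one way to act on the advice that a small QM system may suffice for geometries while energetics need a larger one.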
Adrian actually made quite a provocative statement, namely that what is in most biochemistry textbooks in terms of enzyme mechanisms is probably wrong, and he literally said, I checked it in the recording, that intuition and well-informed guesses are often wrong. He furthermore mentioned that, based on the good agreement that you obtain without explicitly accounting for so-called dynamic effects, those effects cannot be large. These are effects where enzyme modes are somehow coupled to the reaction and in this way can drive it, an idea that is not so popular anymore, I believe, but was very popular about 10 years ago.

Janez even went a step further and said that there are no such things as dynamic effects, which I tend to agree with, and he warned all of us to beware of improper models such as such a dynamic model. He also introduced us, as the only one in the series, to the empirical valence bond method, which provides a very cheap, computationally cheap I mean, way to get a reference free energy surface or potential energy surface, which you can then systematically improve by fitting it to a higher level of theory. What was also interesting in your talk, Janez, was that you introduced how we can actually compute kinetic isotope effects, because those are often an important experimental piece of information that you have about your system.

And finally Carme showed a whole series of studies that her group has performed, and one of the important take-home messages was that already in the systems where they had the sugar rings, already in the educt state, where the substrate is bound in the enzyme, force fields do not provide the correct description of that educt state. That educt state was already slightly distorted, as I understood, in order to be able to resemble that
transition state. So also here, with the common practice of running a long MM equilibration prior to the QM/MM calculation, for example to select multiple reactive conformations, one has to be careful as well, because it might be that the MM force field distorts your educt state too much.

Now I want to go to the next slide. Okay, so the challenges. I'm sorry if I'm cutting the grass from under the feet of some of the people, but I will not go into detail here. When you do a QM/MM project, the first question you have to answer is of course: what is the scope of QM/MM calculations? Does my problem fit within that scope? Can QM/MM be used for this problem? Some problems are simply not doable by QM/MM. A good example is chemical reactions on surfaces: it is very difficult to cut, for example, a metal surface in a QM/MM fashion, so those tend not to be done at the QM/MM level as far as I know.

Then the other issue always concerns the model structure. So you have a structure, you have a Hamiltonian, and then you have to do some sampling. These three constitute the main beef, so to speak, of a QM/MM calculation. I have a starting structure, which can come from an experiment or can be a computational model, like a homology model. So then you have a starting structure; how good is that starting structure? Ulf Ryde showed us how you can actually use QM/MM to improve the quality of these starting structures. But the main problem when you want to do QM/MM calculations, in particular for enzymes with active sites, is that in the active site the pKa's are often shifted, so which protonation or tautomeric state of my residue should I consider? And what about heterogeneity, if the protein structure can exist in multiple configurations? All these problems you have to deal with before you make a starting structure. And once you have a good starting structure, the next step is of course to make a Hamiltonian, a QM/MM Hamiltonian, and there the
question arises: what is a good force field? What is a good level of theory, and how do I know if it's a good level of theory? How many QM atoms should I include, how many MM atoms? What should I do with the boundary? How should I treat the interactions, in particular the long-range interactions? These are all problems that you need to address before you can actually go ahead and run.

Then validation. We can have very long stories about that, but that is actually the key. I mean, if you can't validate what you've done, what you've done is probably useless.

Then an important question also concerns the hardware to run on, and once you have the hardware, that of course determines the software that you can use on that hardware. And there also, many of these methods are easy to use in the hands of those who have developed them, but they might not be that easy to use in the hands of someone who has not developed them, because what a developer like us finds easy to understand might not be easy to understand for an average user. Or worse, often we haven't thought about all possible combinations: maybe one user wants to use option A, option B and option C, and we never thought about combining these options. So there's a lot of uncertainty and, I would not call it stress, but a lot of options to choose from, and many options to choose from is not always better, I think.

Finally, or not finally, this should actually have gone before: in conjunction with the model and the Hamiltonian, you then need to decide what you want to obtain. Do you want to have an optimized transition state and just get the activation energy, or do you want to have a free energy? How important do you think the difference between one and the other is? Do you think entropy plays an important role? And once you decide to sample, either statically or dynamically, what is the reaction coordinate? You know, that requires again some informed, some intuitive
guesses perhaps about chemistry, but how do you check that? And finally convergence: how do you know that along your reaction coordinate, or more importantly perpendicular to your reaction coordinate, the system has converged, so that you can really call this a true free energy or not? So these are general challenges.

Then our panelists: we have asked our speakers to come up with a number of issues, a number of challenges that they would prioritize, and I would now like to give the word to the panelists, to the speakers, to make a short statement about what they would like to discuss during this panel discussion. I start with Maria, and now I have to give her... I know, Maria can simply open the microphone.

So can I start?

Yeah, you can start, sorry.

I was just wondering if it was, maybe... Yeah, of course there are many issues that should be considered when starting QM/MM simulations, but I tried to focus on a few problems. Of course the focus of QM/MM simulation is now shifting to free energy scans rather than potential energy scans, but still there are problems that should be kept in mind. First of all, the proper selection of the QM method. I mean that even when dealing with proper conformational sampling and all this stuff on the free energy surface, of course you should properly evaluate the forces between atoms, and so the electron density should also be properly described. Therefore I think that maybe, prior to the free energy calculations, one should do some benchmarks on the potential energy surface, like simple QM/MM calculations of the same model system, to study whether this method is okay, and only then move on to these QM/MM MD simulations, if that is required.

Another point is that biomolecular systems are really huge and they are multi-dimensional, and we cannot think of this model like a black box, where we would for example pick some reaction coordinate, run maybe a set of QM/MM MD simulations
and forget about the structure. Of course we should carefully check every structure that we obtain, and carefully check the active site and also the rest of the model system, to get meaningful results. As for my practice, we have many students in our lab, and it is the same case every year. I tell them: you should carefully look at the model system, at its geometry configuration. And they say, okay, fine, we will do this. But every year they make the same mistake: they just don't want to spend much time on analyzing the geometry configuration; they just want to get the output files, put these quantities into a table and move on. And it is really important. So I will finish my statement with that: it's really important to study carefully the structure of the model system you are studying. That's all.

Okay, thank you, Maria. So the bottom line: check your output. The next was Ulf. Ulf, do you want to comment on the statements that you made in terms of challenges in QM/MM calculations?

Yeah, I can do it shortly. Thank you for your introduction; you have made a perfect introduction, and you have essentially told us about the challenges you need to think of. In principle, when you start QM/MM calculations, you need to think of all these challenges. You need to set up the system, you need to select the size of the QM system, you need to select the QM method, you need to select the QM/MM approach and the software, and you need to think about sampling or not sampling, and so on. And in principle you need to solve all these problems before you can start the calculations. If I should select something here, then what we have thought a lot about is the size of the QM system, as you already mentioned, and the problems with the junction atoms, which are big problems, and another big problem in the setup
is of course to decide: should you do free energies, or should you do single optimized structures? But otherwise I don't need to say much about this. I look forward to the discussion.

Okay, thank you, Ulf. Maria, it's your turn now to tell us why you have chosen these four points on the slide.

Thank you. Hi everybody, it's a pleasure to be here with you again, and again I thank the organizers for setting up this online event. Just like you all, I hope, I am looking forward to hearing everybody's opinions and advice on best practices regarding QM/MM, in my case basically expanding on what we do within my research group when we need to establish how an enzyme works. So to me the big lines consist, and I said this before and it's written on the slide, of the choice of Hamiltonian, how we treat long-range interactions, how we should explore the reaction space, and whether the conformational space is totally determinant for what we want to study. And that's basically it, so over to you, Gerrit.

Thanks a lot. Yeah, the only thing I do is pass the word from one speaker to the other; we could actually skip my contribution here, but it doesn't matter. Adrian.

Well, thanks. I'd agree with everything that's been said so far. I think you've mentioned the dynamics question, and I think it's important here. Janez is absolutely right that this is not significant in terms of the rate of reaction, and it's certainly not a catalytic effect. But I think the debate about dynamics has somewhat distracted from some more important questions, and from what the role of dynamics actually is. I think for this audience it's very important that they realize that molecular dynamics simulations are actually very important. Now, that's not to say that dynamics affects the reaction rate and the chemical step, but you've got to do dynamics simulations in many cases. So, lots of the arguments, as everyone on the
panel knows, lots of the arguments about dynamics have come from unclear use of the term, and some of the arguments have not been very fruitful. But I think everything that people on this panel have said about dynamics, I completely agree with.

Lots of the challenges of a QM/MM simulation are common to any biomolecular simulation, and we've heard about these: correct choice of protonation states for titratable residues, conformation, environment. These affect an MD simulation too, and it's important to realize that there may not be a right answer, that biological systems are heterogeneous and exist in various forms. The challenge for a practical QM/MM simulation really is where you start from. As we've seen, if you pick a conformation that is not reactive, then you're going to struggle to make a reaction happen, even if you get everything else right. Carme's examples of the saccharide conformations are a very clear example of that: you can do a coupled cluster MD simulation, but if you start in a chair conformation you're going to get a huge barrier to reaction, because that's not how the reaction goes.

So there is no magic bullet, there's no perfect recipe, and what's very important is to test your findings. When you run a QM/MM simulation, don't expect it to give you all the answers, and that's where it's important to validate against experiment. So can you, for example, predict the rate of reaction of a series of alternative substrates? Can you predict the effect of a mutation on an enzyme? As for the question there of high-level QM with limited or no sampling versus low-level QM with sampling: well, I'm afraid it's going to depend on the enzyme that you're looking at and the question that you want to answer. I think it's really important to try to get at some experimental variable, or observable. Janez mentioned kinetic isotope effects; those can be very informative. You know, that's
something that allows you to connect to experiment, potentially predictively, and that is going to help you not only get a better answer but actually produce something that's useful to an experimentalist. The other things are more technical, but I think the validation should come at every stage: when you're building a model, you want to test it. And again, this is not for the panel but for the audience: what we should be doing is not thinking in terms of models being right or wrong, but in terms of testing the significance of your results. So if you change some parameter in your simulation, does it significantly change your observable? And if it does, is that a meaningful change? In other words, is it experimentally meaningful, or is it an artifact of the model, in which case you have to pin down which is the correct choice.

I agree with all you said, Adrian. Yeah, so these dynamic effects: I think this would actually constitute a different panel discussion, and we should probably also invite different people for that one, so let's not comment on that. Janez, is there something that you want to mention now about the challenges that you would like to see addressed during this panel discussion or later?

Oh, thank you. So one major problem I see in the multi-scale simulation of enzyme catalysis is simulation time. I mean, we have this empirical valence bond methodology that describes the quantum part on the level of molecular mechanics, so basically at the same computational cost as the rest of the protein; it's not a thousand times more expensive. And at least for our monoamine oxidases we realized that we obtain convergence in terms of free energy, I would say, if you have one and a half nanoseconds of simulation time. This basically corresponds to one and a half million steps, so one and a half million evaluations of the force and the energy if you proceed with ab initio QM/MM. And still, then,
it's fine to have a bit of a measure of what the uncertainty is. We can really afford 10 parallel runs starting from different starting points, and this gives a measure, so we have some sort of an error bar. And there are more problems with this. I mean, when we started our work, I said: guys, we have everything on the level of molecular mechanics, we can really obtain well-converged free energy profiles without applying position restraints. And it turned out that we obtained very, very odd profiles. The reason is that the Euler angles between, let's say, the active site and the substrate have a very, very long correlation time, and it's always necessary to apply some soft position restraints in order to prevent these internal rotations; otherwise you will not get nice smooth profiles. And I have the impression that most of you guys who are doing ab initio QM/MM do not face this problem, because currently, I think, if you have a simulation of 100 picoseconds, that's already a lot. But I still think the ab initio QM/MM community will face this problem in five or ten years, when computers become much faster.

The second thing is these dynamical effects, so this is the deviation from transition state theory. I was tacitly hoping that this story was over, you know, when Warshel wrote this, we call it the funeral paper, and Adrian wrote his Occam's razor contribution, so that this dynamical community would slowly accept that. But a year and a half ago we had the inauguration of our cryo-electron microscope, and Joachim Frank was giving the talk, and during the coffee break we had a somewhat longer discussion about the nature of enzyme catalysis. I was unpleasantly surprised when he asked me, well, what about the dynamical effects, and said that transition state theory is not really valid for enzyme catalysis. So that's it.

It's hard to kill off, isn't it? And
it so often does depend on what you mean. You know, we all care about dynamics, and your simulations are very extensive in terms of the MD, and that's why we know that it's not important in that particular sense, because you've done that sort of analysis.

Well, you know, I fully agree with you, because you said that dynamical effects are minor and they do not contribute to catalysis much. You know, in enzymology a factor of two when it comes to rate constants is pretty irrelevant, so that should be clear. So a 30% increase or decrease because of barrier recrossing is pretty irrelevant when it comes to 10 orders of magnitude.

Great, that's already a bit of a discussion here, which is nice. So I would like to move on to Carme, who has also provided a number of challenges. Carme?

Yeah, yeah, I'm here. So yeah, thanks, Gerrit, for this colloquium; I think it's being very interesting. I agree with my previous colleagues in what they pointed out. Just to add that, okay, QM/MM is a very powerful tool, I think, and somebody has defined it as the best of the two worlds, classical molecular mechanics and quantum mechanics. But I think that also means that you need to know, or to be as expert as possible in, these two worlds, to be able to understand the results that you obtain and to distinguish what is real from artifacts. So I would distinguish here two different types of backgrounds of people, because I think your challenges are different depending on where you come from, with which background. If you come, for instance, from the quantum chemistry community, and that was my case, you come from a system of 50 atoms and suddenly you are here with 1,000 atoms or 100,000 atoms, and you think you are losing control and things like this, and you need to learn all the other aspects. You know, you need to learn about the molecular mechanics world, about
molecular dynamics and statistical mechanics, the flexibility of the system and all these things, and these are your challenges. But if you come already from this field, maybe your challenges are other ones. Maybe your challenge is how to choose the basis set, how to choose the functional, and how to do the QM part of the QM/MM simulation. And okay, for me, coming from the QM community, it's been fascinating to learn all the statistical mechanics surrounding the molecular dynamics part of QM/MM, and I'm still learning a lot of it.

By doing this, what I have learned most is that you need to get an initial model that is correct and that you can trust, and sometimes that is difficult. Sometimes we try to trust the experimental structure just because it's experimental, and even if it's a good-resolution crystal structure, my colleagues who are crystallographers are always telling me: don't blindly trust the structure; look at the density, look at what is missing and what is not missing, and then equilibrate this structure properly before starting any calculation. That's been a challenge for me.

And I think the next challenge is to be able to do longer time scale simulations, because in QM/MM I think that the problem of the size of the system is partially solved, since we can treat systems as big as we want, but the problem of the time scale is still what can influence our results. This is of course system dependent: for some systems this is not necessary, for some others it will be. But this I think is a major challenge, especially when we study systems that have not been studied before. Because it's different if you study a chemical reaction that has been studied for the last 20 years, where we already know the reaction coordinate and just want to improve a little bit more, or a totally different reaction that nobody has studied; then this is also very challenging. And that's all. I
think during this session we will be able to discuss among us and answer the questions of the new users, and get to know their concerns and other challenges.

Okay, thanks a lot, Carme, and thanks to everyone. So I think this nicely sets the stage to launch this panel discussion. Before we do... now it doesn't work again... now it does again, sorry. Yeah, before we do so: the users could also contribute questions. So one question is shown here on the slide: how do I prepare input for QM/MM? This is already a little bit more towards the tutorials that we are organizing within BioExcel, and I don't know, Dmitry, can we comment on when the next one will be? Because there we will address this type of problem. Then the second question is whether nudged elastic band is sufficient or not, but that is something we can easily discuss among us. And finally, the last question is again specific: can we study AFM-like forces, when you pull one protein away from another, with QM/MM? There, I think we can agree, the answer is that QM is probably a little bit too expensive for that type of application.

Then there was one question now coming by in the comments. Okay, so the other question is: if someone wants to start working with QM/MM, he or she wants to get into QM/MM, what should he or she focus on learning? How to run the calculations, or the theories of QM/MM? So also here is something that we can take into consideration during this panel discussion.

Okay, now we're back. So we see the panelists but not the audience; still a little bit of a strange feeling, but that's how it is. All right. So this is a panel discussion, so at any point the attendees can interrupt, I mean, put a question out, and as we agreed before with Emiliano, Arno and Dmitry, your questions will
then be injected into the panel discussion at suitable moments.
All right, I think we can start then with the first issue, the model structure. What is the best practice guide we would like to formulate for that, based on our own experiences, on what we have seen of each other, and on what we have seen other people do? We can maybe also define what we consider bad practices, without mentioning names. So who would like to start here? Let's say I'm an undergraduate student and my supervisor got a fantastic opportunity to collaborate on a high-profile project. Let's say some collaborators are able to do a time-resolved crystallography experiment, and of course you want to be on it because this kind of work tends to get published in high-impact journals. I need a structure. What do I do next? Can I simply take the PDB structure? What do I have to pay attention to? Or better: you read a methods section where it says that protonation states were chosen according to the reference pKa values of the amino acids, except for amino acids 1, 2 and 3, for which the protonated or deprotonated states were used, without explanation. What would be a best practice, so we don't run into that type of methodology again, where it's obvious that it didn't work without the proton there but it is not really discussed? What kind of setup scheme would we consider best practice there? Who would like to start sharing their ideas on that?
I can tell you what we do, or what I do, whether it is the best way or not. When I know that there is some sort of PDB structure related to the system that we want to study, I ask my student to go to the literature, and I suppose everybody does that, and
do a very extensive literature search, go through it and understand it. As you said very well, and thank you for mentioning it, they should also look at all the experimental facts or results that are published.
Well, you said that actually in your talk.
Yeah, I know, but you reminded everybody, and I had forgotten it already. Anyway, then they have to go to the PDB and choose the best structure, because often there are many. So, good resolution; they have to open it and see whether everything has been resolved, whether all the atoms are there. Sometimes they're not, and I'm not only talking about the hydrogens, but whether there are flexible loops or not, and whether the thing looks more or less correct. Often there is a mechanism that has been published in the literature, often by experimentalists, so they've got to relate one thing to the other. They do spend quite a bit of time doing all this. I always think that it is very good to spend that time, as I think most people have said, and I think someone actually said this this time. It's like building a house, to me. Well, he didn't talk about the house, but he meant it, I think. If you have bad foundations, eventually the house is going to fall down, that's for sure, nothing holds. So you have to spend quite some time studying your system and knowing what you're doing and what you want to do. And then, when all that is solved, so to speak, the student should carry on and see which model is best for the QM/MM calculation, and also what you want to do with it. Do you just want to know the mechanism of a reaction, or do you want to go on? For example, I know many of you don't, but we do drug discovery. We need the transition states, and very good ones, with good
geometries, so often we have to do Gaussian ONIOM calculations, or some sort of static calculations, so to speak, without introducing the dynamics immediately, because we need the transition states with as good a precision as you can possibly have. We then have to do screening on the geometries of the transition state to model the inhibitors, and so on and so forth. So you have to think about what you want to do next, but for one thing or another you need a good QM/MM model. To choose the QM/MM model, to me there are two most important things. First of all, you have to keep in your QM part all the residues that are important for the catalytic mechanism. And secondly, you should never cut interactions that are important: double bonds, hydrogen bonds, and all sorts of important electrostatic interactions. So that's what we do regarding this initial part.
Yes, so one comment on the last thing you said. This is a detail perhaps, but very often you do cut through hydrogen bonds in particular, because these are non-covalent interactions, so it's very tempting to put one partner in the MM and one in the QM. Of course I understand that the hydrogen bond might be more than just pure electrostatics, because if it were only that, it should not be such a problem unless the MM force field has been really badly parameterized. So you would encourage people not to do that, to try and avoid it?
Okay, let me be a little bit more precise on that. Obviously there are times you have to, but I wouldn't do it if it is within the active center or too near it.
Okay, because otherwise it's almost unavoidable to have such cuts.
Yeah, but I'm talking about very near the active center, and more so within it. So that's
what we do.
Okay, but now I abused my privilege as the organizer to make the first comment. Somebody else wanted to comment as well, I saw.
How do we comment? Do we just raise the hand?
Well, it's a panel, right? I think we can shout to each other like a real panel does, so just speak, that's easiest I think. The audience has to raise the hand. I'm sorry, audience, but otherwise it will be complete chaos.
I think the question was about how to choose the protonation state of the residues that are a bit ambiguous. I think what we all probably do is use one of these pKa servers that predict pKa's, and then decide. That's always a good initial solution: check the pKa and assign the protonation state that corresponds to it. The problem is that sometimes these servers give strange solutions for some reason, so this is something I would also like to discuss: what to do in that case?
You just look at the experimental data, okay? And you know the residue has to be protonated, otherwise the reaction will not work.
But what if the pKa server gives you a pKa that is not the right one for the reaction?
No, I haven't found this problem, actually. Sorry Gerrit, sorry Carme, in the middle of my answer I actually forgot that the initial question was precisely that one. I started with the experimental part because the experimental part is very important. So I always tell them to look for the pKa's of all the aspartic acids, the glutamates, even cysteines and all sorts of things. And then they use PROPKA usually, or H++, or something like that, and then go and actually visualize that on the screen and see if everything fits together. So sorry, I won't take any more time.
No, don't worry, this was
still in the context of the original question, because protonation and the tautomerization of protons is a very important aspect of the input. Let me come back to Carme's question about not being able to know what the protonation should be, and then I go to Janez. What often drives me up the wall is when a mechanism starts, for example, with a deprotonated OH group on a sugar. You start with a deprotonated ribose sugar, for example: you take the proton off and you start the reaction. Then I often cannot find in the article what the energy penalty actually is for taking that proton off. If you start in a state whose pKa differs from the pH of the solution in that experiment, you need to account for a certain energy offset, and that will of course increase the barrier. But if you just don't say anything about it, you just remove the proton, and because you then get a lower barrier, people are happy and publish the paper. But then what does it mean? You still need to account for the fact that most of the configurations will not have a deprotonated OH group. So your barrier can match the experimental one exactly for the wrong reason: you are very happy with it, and you forgot an initial step. That would be an artifact in your barrier. Janez?
Okay, so protonation states are definitely associated with pKa values. If you know the pKa value of a certain ionizable group, then you have an analytical expression for the free energy cost to protonate or deprotonate it: pKa minus pH, times 1.38, and then you have it in kilocalories per mole. So the next task is how to calculate the pKa value of an ionizable group. X-ray structures usually do not provide this value, not least because crystallographers usually do not see protons, and the experimental method of choice is always NMR.
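As a rough illustration of the expression just quoted, here is a minimal Python sketch. It assumes that the prefactor of about 1.38 is ln(10) times RT near room temperature (roughly 1.36 kcal/mol at 298 K, 1.38 kcal/mol a few kelvin higher). The function name, the default temperature and the example pKa of 12.5 for a sugar hydroxyl are illustrative assumptions, not values given in the discussion.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def deprotonation_penalty(pka, ph, temperature=298.15):
    """Free energy cost (kcal/mol) of deprotonating a group with the given
    pKa at the given pH: ln(10) * R * T * (pKa - pH).

    A positive value means the deprotonated state is disfavoured at this pH.
    """
    return math.log(10) * R_KCAL * temperature * (pka - ph)


# Hypothetical example: a hydroxyl group with an assumed pKa of 12.5 at pH 7.
penalty = deprotonation_penalty(pka=12.5, ph=7.0)
print(f"Offset to add to the computed barrier: {penalty:.1f} kcal/mol")
```

If a mechanism is started from such a deprotonated state, an offset of this kind would have to be added to the computed barrier before comparing it with experiment.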
So then you say, this is the signal for my aspartate, and then you titrate it and you have got it. [unclear] was doing that very well. But calculations of pKa values are hell on earth. Deprotonation or protonation is formally an ionization, so we are creating a net charge, and this is extremely demanding. Naive attempts to just deprotonate an aspartate and perform thermodynamic integration usually fail. Arieh Warshel, in his MOLARIS, actually handled that with Langevin dipoles with double resolution, and it is pretty complicated. I tried to understand this paper a few times and gave up. This is also not coded in the Q program of Johan Åqvist and co-workers; they said it's too complicated. For monoamine oxidases we indeed calculated pKa values for ionizable residues, and it took us quite some time, so it's not trivial. Servers like PROPKA or H++ give reasonable values, but in some cases they fail badly. I don't know what your experiences are with that, but proper calculation of pKa values is still a major challenge for computation and simulation.
Yeah, and the main problem arises of course from the coupling between these sites. The simple concept of pKa is a bulk property: for an amino acid in bulk water you get a nice sigmoidal titration curve and everybody is happy. In enzymes, or in proteins in general, where other sites can also titrate in the range in which you want to titrate, these titration curves are no longer sigmoidal. They can even go up and down if you look at the microscopic pKa associated with a site, so that makes it a very complicated story indeed. And on top of that, to model it you have to get the dielectric response right. These are all highly challenging problems. But the issue is of course that the answer depends very strongly on which protonation state you choose. So it is nice, of course, that you have to write in the methods which protonation states
were unusual, but then we would also like to know why that residue was chosen to be protonated. I would say best practice includes statements in the methods detailing why the protonation states were chosen as they were. Even if they are wrong, the result may still be meaningful, and you can decide later that this is where it went wrong, because we now have NMR data and the NMR data clearly indicate that that site is protonated, for example in the dark state.
Yeah, and you could have different mechanisms at different pHs, couldn't you? I think many of the questions we're raising are actually not about QM/MM.
Of course, an interesting aspect of enzymes is that they don't have the sigmoidal titration curve an amino acid would have. They're usually active over a range of pH values, so they have some intrinsic stability against changes of protonation, which also makes them very interesting to understand. So maybe it's not such a problem, because maybe the protonation state is rather preserved over a decent pH range, since the enzyme has to perform not just at one pH value.
Right, absolutely. You go to acidophiles and so on that can operate at low pH, and they probably maintain their active site in the same configuration as a mesophile while everything else is changing. I think these questions are not QM/MM questions. They're very important questions, and they're probably more important than people freaking out about which particular sort of link atom they might use, because if you set up your simulation wrong, it doesn't matter if you're doing coupled cluster dynamics: you'll get the wrong answer. And I think what's been clear again is that there seems to be the sort of very
beginner approach of: is QM/MM useful or not, and then which method is the best one? And that's the wrong question to ask. It's like saying, is NMR better than crystallography? Well, it depends on what you want to do. So the first question is why you want to use QM/MM. It won't always be the best method. We saw one of the questions: I don't think anyone on the panel would think it's a good idea to use QM/MM methods to simulate AFM pulling experiments. In principle it's a good idea; in practice we will all be dead before you get a result. So the first thing is what hypothesis you are seeking to test, and then what the appropriate method is. We've seen different approaches, and the different approaches have their uses: different strokes for different folks, but different methods for different problems. I do think it's really important, as perhaps the protein structure prediction people have done as a community: they've set themselves the challenge of how different methods work in terms of their relative predictions of protein structure. It's useful to have benchmark guinea-pig systems, like chorismate mutase for example. So if someone presents a new QM/MM method, well, let's see how it does on chorismate mutase, let's see how it does on [unclear], because then you can compare it with other methods, and then you can say it has a certain range of accuracy and a certain range of applicability. The first thing, and this is not for the panel, this is for the audience: what question are you trying to ask, and how are you going to address it? How are you going to get a meaningful test of that hypothesis?
Yes, and also what you mentioned about comparing. When you're introducing a new QM interface, a new QM model or a new sampling technique, try to pick an older published work and see
if you still get the same answer or not. This is always a very good sanity check before you go on, even for your own sake.
There's far too much of: take a sexy protein and a new method, and then how do we know if it's reliable or not? Doing that important testing work on a well-established system, where we know the mechanism and there are other good results to test against, is very, very important. So if you're a student, it's a good idea to look at a basic example, a tutorial example. We all have our favorite enzymes, but something like chorismate mutase. See how you learn the method, see how your results compare to what's been published, and then take on the exciting new free-electron laser structure.
Yes, and adding to that: for everyone among us, panelists and audience, best practice in my opinion also includes making your input files available. Supporting information is almost unlimited nowadays. You have repositories where you get a DOI, so everybody can refer to it and access that repository. There is no argument for not sharing what you have used as input. Of course you expose yourself, and I know some people are a little bit worried about that, but it is science, and the more open things are, the easier it will be for newcomers to repeat your calculations and learn while doing, as Adrian stated. This is also a nice option that we did not have, let's say, 10 or 15 years ago, when sending around large datasets was not so trivial. But I also agree with Adrian that we're now drifting away from the actual QM/MM panel discussion on the first topic. So is there somebody from the audience, or sorry, not everybody has had a chance to say something: is there somebody among our panelists who still wants to say something about this initial structure selection and optimization?
Yeah, I want to say a little bit about the protonation states.
That's a very big issue, and as Janez said, it's essentially hopeless to calculate pKa values. So what we normally do is look at the crystal structure and try to deduce the protonation states from it. We look at the PROPKA calculations, but as you say, they are very often wrong, so we trust the eye much more than the PROPKA calculation. PROPKA is very good at pointing out residues that you should look at, but then you have to look at them. Second, we look in the crystal structure for buried charges, because in proteins you normally don't have buried charges, so if you have a single group that is not forming any ion pair, it's probably not charged. And if you really want to know whether a residue is protonated or not, what you could try is to run MD simulations and see if the structure remains close to the crystal structure. If you get the wrong protonation state, you normally get changes in the structure. So those are the three things we normally do to check protonation states. You could also say that protonation states are important close to the active site, whereas on the surface of the protein or far from the active site it actually doesn't matter what you use. We have tested that in a number of cases and it is actually like that. But close to the active site you really need to get the right protonation state.
Yes, I agree, and this is already a practical best practice, I would say: do this analysis in classical MD, whether or not your protonation states make sense, whether or not they conserve the structure that you started with. Maria Khrenova?
Yeah, actually I also wanted to add that a maybe simple but still efficient way to guess the protonation state is just to look at the X-ray structure. Usually, if two electronegative atoms like oxygen and nitrogen are located within three angstroms, of course there should be a proton between them, and then you just have to choose which of these two atoms
should be protonated. Another important thing is that we should carefully check the side chains of histidines, not only because of the protonation state but also because they can be flipped, so that nitrogen and carbon atoms are mismatched. The same goes for the side chains of glutamine and asparagine. Maybe for the X-ray analysis in general it's not that important, but when you build your molecular model, the orientation of the amide groups can be wrong, and in your further simulations that may lead to wrong results. This is important.
Thank you. Okay, is there a suggestion for further discussion concerning the starting structures in QM/MM simulations from the audience? Maybe Dmitry or Emiliano, if you see a question or a comment related to this.
There are quite a few comments on that, actually. A first comment, maybe, on the protonation state. There is a question: how much difference in RMSD is not acceptable when you change your protonation states? Is there any threshold that you should set?
Yeah, I think the question is how you should check that your system is not falling apart if you change your protonation state.
You should check both possibilities and see which one gives the lower RMSD. That's the simple answer.
Okay, I agree.
Yeah, but not the RMSD of the whole protein, the RMSD of the residue.
Yeah, the residue and perhaps the closest residues around it. And something I find is that people often focus on an RMSD as though that's the answer. Look at the structure. There's still no substitute for a human looking at a structure and seeing whether it is consistent with an experimental observable or not. You can't just trust simple metrics like RMSD.
And maybe another question, which is also connected to setting up your system. Yaskal Layan is asking
how QM methods could be used to develop a force field for MD simulations. Kind of a general question, but maybe someone has experience with it.
I did not get that, Dmitry.
How QM could be used to develop the force field for your MD simulations.
Okay, how it could help you. But this is already done, no? When you have a ligand for which you don't have parameters, you need to do a QM calculation and develop these parameters. Otherwise, for simple amino acids you already have good parameters. I don't know if the user means improving the parameters that are already available for amino acids, or developing parameters for new ligands.
Actually, I know that for the CHARMM force field there is a protocol, published on the [unclear] website, in which you construct a set of model systems with your molecule of interest and some water molecules around this target molecule, perform some calculations and obtain some quantities from them. So for CHARMM there is already a systematic protocol.
Yeah, I mean, if you have anything else in your molecule that is not an amino acid, you're going to have to parameterize it, unless that parameterization already exists in the literature, which sometimes it does. So you are going to have to run QM calculations to find out the charges, for example, of whatever it is, the cofactor or the metal, that is in your structure, because if you don't, you can't run the calculations anyway; there will be an error. And those charges are important. If you get them wrong, you're going to get the results wrong. It is very important to get them right, so QM calculations at a high level are necessary to do that.
And there is a question regarding this point about calculations at a high level, for example to decide how
the active site is configured, as was mentioned before, for example by Maria. The question is whether this high-level calculation has to be done in the gas phase, or whether there is another approach to mimic the reality of the active site. This is more or less the question: for example, when using Gaussian and trying to understand the active site and the exact transition state, which kind of level, in terms of continuum models or gas phase, has to be used?
Okay, I think this question pertains to what Janez was also emphasizing: when you want to understand how the enzyme catalyzes the chemical reaction, you need to compare it to the same chemical reaction in solution. And I believe the question pertains to what happens if in the gas phase the transition state is different. Then I would say, there you have your answer. Janez?
All right, so it's a very good question. A very nice and still efficient method to study enzyme catalysis, if you have just, let's say, Gaussian available, is the so-called cluster model. Fahmi Himo in Stockholm is using it a lot. The idea is as follows: you truncate your system to the active site, maybe four or five residues and your substrate, and the rest is treated as a dielectric continuum. Typically people set epsilon to four. Then you can locate your reactants, transition state and products. Then you increase your quantum region, so now you have maybe eight residues. Surprisingly, no thermal averaging is necessary. And typically, when you have, let's say, eight or ten residues and you add some additional residues, the energetics is not going to change anymore. We typically use this cluster approach when we are examining the mechanism, and then, when we see, let's say in our case, that hydride transfer is the rate-limiting step, we can proceed with the empirical
valence bond method, with the full dimensionality of the enzyme and thermal averaging. Robert Vianello, who joined us as a Marie Curie scholar, worked with this approach very successfully. I mean, people who are more or less just familiar with Gaussian can quite successfully proceed with this sort of multi-scale calculation.
Okay, I think in the interest of time, because we have really spent more than an hour, we should move on to the next point, and that is the Hamiltonian: the QM/MM model that we use to describe the interactions in our systems, and the challenges there. I think we can maybe start with the point that Ulf found the most important: the size of the QM region, how to cap the QM region, and what to do with the cut of the QM region. What constitutes, in your opinion, the best practice for that problem?
What we do is geometries with a rather small model, and then we do this big-QM approach, which normally gives you a model of 1000 atoms, and then we check that it is hopefully converged with respect to the size of the QM system and also with respect to the position of the junction atoms. Of course, the advantage of QM/MM methods is that you have the coordinates of all atoms, so it's very simple to enlarge the QM region for single-point calculations. So that's our approach.
So again, this is a best practice that is maybe not easy to do, but that is doable, and that would typically end up in the supporting information. But these are important things: not to just start and say, okay, these atoms are there, and without further motivation start running your QM calculation and presenting those results.
Mm-hmm.
Do the others... yeah, sorry.
If you see any difference between the QM/MM results and the big-QM results, you can get a clue about what you have missed in your original QM system. If they are the same, the original QM system is good.
Mm-hmm, good.
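The convergence check described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 1 kcal/mol tolerance and the barrier values are made up for the example and are not results from any of the panelists' work. The sketch takes single-point barriers computed with QM regions of increasing size and reports the smallest region that agrees with the largest one.

```python
def smallest_converged_region(barriers_by_size, tol=1.0):
    """Return the smallest QM-region size (number of atoms) such that it, and
    every larger region tested, gives a barrier within tol (kcal/mol) of the
    barrier obtained with the largest region."""
    sizes = sorted(barriers_by_size)
    reference = barriers_by_size[sizes[-1]]  # largest region is the reference
    converged = sizes[-1]
    for size in reversed(sizes):
        if abs(barriers_by_size[size] - reference) > tol:
            break  # this region and all smaller ones are not converged
        converged = size
    return converged


# Made-up single-point barriers (kcal/mol) keyed by number of QM atoms.
example = {60: 22.4, 150: 18.9, 400: 17.3, 1000: 16.8}
print(smallest_converged_region(example))
```

If the region used for the geometries is smaller than the size returned, the difference between the two sets of energies points to what was missed in the original QM system.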
Okay, so that concerns the QM subsystem size, and I think we all agree that the approach is to make sure that it is converged with respect to the number of atoms you need to include. But now a more important question, and the answer probably depends as much on whom you ask as on what you ask: what is the level of theory we need for the QM part? That is of course a question for which it is impossible to give a general answer, but how do you determine that what you have used is appropriate, and not just chosen because someone else said B3LYP? Actually, one of the ideas was to call this workshop "Beyond B3LYP", but we didn't dare to. What justifies the method? How do you justify your QM method, and why do you use that DFT functional and not another one out of the many choices you have?
Can I take that one, please? We use DFT more often than not for the QM part of the QM/MM. And what we nearly always do, we didn't do it in the past but we do now, is benchmark the functional to use. We take a small system which mimics the active center, and we do a very accurate calculation: basically we use a complete basis set and do coupled cluster calculations, and we get a really good result. We compute the mechanism for that small system, and then we benchmark a series of DFT functionals against that very accurate calculation with a very high-level methodology. What we find is that for different systems, different enzymes, the functionals vary quite a lot. If B3LYP is good for one, it is not good for another one, or it's worse for another one, and some functionals are really very badly suited for some reactions. So that's basically what we do.
Actually, it's also... oh, sorry, I don't see you. I think it's also important to check the literature
because of course all the functionals are, well, I mean, not parametrized, but suited for certain types of reactions, and it's really useful to check the literature to see which particular functionals are good for your type of reaction. In enzymes there are not that many types of reactions, and you can always find some similar reactions from conventional organic chemistry or metal-organic chemistry. If you think a functional is okay for that type, just transfer it to your QM/MM simulations. As for me, I think that PBE0 is more or less a good choice. It usually works well for organic reactions, and it is much better than B3LYP. There are many works showing that for some reactions, like nucleophilic addition, B3LYP sometimes fails, and that might be a problem, because nucleophilic addition reactions are widely found in proteins, in proteolysis or hydrolysis and so on. So, returning back: of course you should carefully review the literature on the corresponding organic reactions. Okay, that's it.
I think Carme was first there again, right? Sorry, you had a comment.
Just a comment: sometimes we give the idea that different functionals give very different results, but I think we need to clarify what we mean by different results. Are we not solving the mechanism of the enzyme? Sometimes by different results we mean a different energy barrier. When we say it doesn't work, we just mean they don't reproduce the experimental energy barrier. Sometimes we use a model that is too small, so that no functional is going to reproduce this energy barrier, because the model is crap. So I think we need to be a bit careful with this. It is not that one cannot use them: there are many functionals that will give you your mechanism. I think by failure we should mean
something much stronger than what we are meaning now: for example, we are looking at an SN2 reaction and it comes out as an SN1 reaction, or the first step is rate limiting and it comes out that the second one is. As long as all of them give a similar potential energy profile, and we're talking about potential energy, we should soften this statement that some functionals are bad and some are good. It's not black and white, this is what I mean.
To me, the problem is that for enzymes the barrier, and I'm talking from an experimental point of view, usually is between 16 and 20 kilocalories per mole, or 18 as an average. It has to be, because if you translate that into rates of reaction, anything more than that will translate into hours of reaction, which no enzyme can afford. So if we don't get an activation barrier that is within that interval, and I'm talking about what we do, because we all do different things and follow different lines of thought in doing and concatenating the calculations, we wouldn't be sure whether we were on the right mechanism or not. So we're always a bit doubtful if we go outside those limits, and for that we need the correct functional. Because you're absolutely right, often we're just talking about a few kilocalories. If we talk about two or three kilocalories, that's the error of our simulation, so we cannot differentiate one functional from another.
Of course, if one functional gives you 40 kilocalories, something is wrong. But with a difference of three or four, I would not be so categorical about something working or not working. The main objective is to solve a scientific problem, and sometimes I think we forget about it.
I agree with that. I think there are cases where some particular functionals will give a qualitatively wrong mechanism.
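The translation from barrier to rate mentioned above can be sketched with the Eyring equation of transition-state theory, k = (k_B T / h) exp(-ΔG‡ / RT). The Python below is a minimal sketch that assumes a transmission coefficient of one and a temperature of 298.15 K; the function name and the list of barriers are illustrative choices, not values from the discussion.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal, temperature=298.15):
    """First-order rate constant (1/s) from a free energy barrier in kcal/mol,
    using the Eyring equation with a transmission coefficient of one."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


# Illustrative barriers only: each extra 1.4 kcal/mol or so costs roughly
# a factor of ten in rate at room temperature.
for barrier in (16.0, 18.0, 20.0, 25.0):
    k = eyring_rate(barrier)
    half_life = math.log(2) / k
    print(f"{barrier:4.1f} kcal/mol -> k = {k:.2e} 1/s, half-life = {half_life:.2e} s")
```

Because the rate depends exponentially on the barrier, a difference of two or three kilocalories per mole between functionals already corresponds to one or two orders of magnitude in rate, which is why a window of a few kilocalories is a fairly loose check on a mechanism.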
There are examples of that, and in some cases all flavors of DFT will give the wrong mechanism; we know that. So that may be an issue. I think it is a good idea to test the sensitivity of a DFT QM/MM calculation to the choice of functional. If there's a strong dependence on the amount of exact exchange you include, then that's an indication that you should be concerned about your choice of functional. If, on the other hand, the barrier or the energy of the intermediate is changing by a couple of kilocalories, that doesn't matter very much. As Maria was saying, most enzymes have barriers between 10 and 20 kilocalories per mole; they have to, for natural reactions. And it doesn't matter if your functional overestimates the barrier, in many cases, or underestimates the barrier, in lots more cases, so long as the qualitative mechanism is correct, as Carme says. It is possible to go beyond just testing different functionals by doing something like exact embedding: if you embed an ab initio calculation in a DFT calculation, you can test your single-point energies. So you can go significantly beyond what's possible with DFT, and in some cases, for some mechanisms, that's useful. Sometimes it's probably qualitatively useful, for identifying mechanisms, but for most of the examples that we've seen it's going to be a case where it changes the barrier a little bit. If you want an absolutely correct barrier, you're going to have to do an absolutely correct calculation, and that means doing better than DFT, but usually that's not what biochemists are interested in.
For the density embedding, you would do it as an a posteriori analysis on your reaction path?
Yes.
That's an interesting approach, actually. Well, so far, getting gradients and so on is... so these are single-point calculations.
You know, whenever anyone's doing a
DFT calculation, it's a good idea to test your functional; that's obviously not limited to QM/MM.
No, and there are many ways of doing it. Each of us has his or her own way of doing it, but one way or another we always have to validate things and make sure that we know what we're doing. That's why I started by saying that each of us has an initial plan, which is different from other people's. There are no rules in this field of science, so we have to establish our own rules. What matters is that we have a correct line of thought; the way that we get there, well, it varies.
Okay, yes. Before going to Janez: related to this, and related to what you were discussing, Maria, about how to benchmark, there is a question from one of the users, Chao Shan, who is pointing out that you cannot apply coupled cluster to large systems. So if you have a large...
Just a moment, I didn't say that. Very quickly: I didn't say that it's impossible. I said you have to take a small system that mimics the active center; that's what I said. Otherwise it will be impossible. And then we do the calculation with that. That's it.
No, that was the question. So you need to find a small representative model for the problem you wish to describe, and you cannot do the whole exercise at the CCSD(T) level.
Of course, of course.
That's great. Then Janez is eager to mention something.
Yeah. At this point I would like to mention that the very first QM/MM calculations like this, on lysozyme, were obtained using experimental barriers. So people, let's say Levitt and Warshel, basically used the experimental barrier for the corresponding reaction in aqueous solution, and they inserted it into the empirical valence bond. In our lab we have very good experience with Truhlar's
M06-2X functional in conjunction with Pople's basis sets; it works reasonably well. I would say it's a good level of theory for performing quantum calculations with 50 quantum atoms, and then of course we fit that to the empirical valence bond, just for the reference reaction.
Great. Okay, again in the interest of time, I now want to move on to the next point. Once you have a good QM level, in the sense that this level works for the problem you're interested in, how do you describe the protein? Do you want to use electrostatic embedding? And related to that, an important question is periodicity. I guess this is not a problem for Carme, who is using CPMD, but for all the others it probably is an issue: you probably use truncated models, and you don't take into account the long-range electrostatics. Of course this is not the place, but then you can discuss how representative a completely periodic simulation box is for the actual situation in water. So do we want to expand on that issue of QM/MM modeling? Yeah, Maria.
Actually, in my case there are two different types of calculations. If I perform QM/MM MD simulations, I have a periodic cell, just the same model system as in a classical MD simulation but with a selected QM part, and in this case of course I have a cutoff on electrostatic interactions and so on to treat this periodic system. Then I get the free energy surface. The other way is to perform a classical MD simulation of the model system composed of the protein and the water box, and then to crop the central part, I mean the protein and also some layer, something like six angstroms, of solvent water molecules, and do the potential energy scan. In this case it is important to calculate all electrostatic interactions, because otherwise, if you have cutoffs, then from point to point on the potential energy surface, I mean during the optimization procedure,
when going from one point to another of this optimization, the atoms, the charges that are within this cutoff can change, and then we have these fluctuations of the total energy. So of course in this case you should include all electrostatic interactions. It really is a problem, because if you have maybe 10,000 atoms in your system, then of course it increases considerably the number of one-electron integrals, and the calculations become really long and time consuming. So I don't exceed maybe 10,000 atoms in my case. That's it.
Well, yes, we have two follow-ups: Carme and Ulf both want to follow up.
Just that I didn't understand, Maria, what you mean. You were warning about using a cutoff for electrostatic interactions, right?
I mean it is okay to use a cutoff if you have periodic conditions. If you have a single system without any copies, then the cutoff results in these fluctuations of the total energy from one point to another.
Sorry, rotations of the total energy?
Sorry, fluctuations, fluctuations. I mean that the point charges can be within the cutoff distance at one point, and then you move the atoms and they no longer contribute to the QM/MM energy, and that can be the problem.
Yeah, I understand. Of course, using a cutoff is always worse than not using a cutoff.
But actually, if you perform MD simulations, I mean with the umbrella sampling technique, you are not working with the energy surface but more with statistical distributions, and then it's okay to use a cutoff; a rough cutoff of some angstroms is enough.
Well, I think, to follow up: cutoffs always lead to trouble. You always get heating up, so only with a very heavy thermostat built into your system can you still get decent dynamics, but it's not the right physics. So it's in general
advisable to avoid cutoffs and to use, in a periodic system at least, an Ewald summation technique or a fast multipole technique. But most QM programs do not support such a treatment. That is one of the reasons why GROMACS switched to CP2K, because at least there we have the full periodicity accounted for. But this is a common problem that we all have: if I use a cutoff-based technique, then I get these fluctuations, and I can only get rid of them via thermostatting, and that is of course fixing something with the wrong tool. And now I'll let Ulf comment; sorry for interrupting.
Yeah. So we mostly do optimized structures, and if you do optimized structures, you don't have a cutoff, and you include the full protein and solvent, then this shouldn't matter, as long as you don't calculate pKa values and redox potentials. So if you don't change the net charge, then a protein system of 30 to 60 angstroms should be enough.
Why do you only optimize? Why don't you also do dynamics? You don't like it?
Yeah, we do dynamics if we want to calculate free energies, and of course then you have to think about it. But we normally don't start with that; we start by getting the structures.
Cool. But if you do the free energy perturbation calculations, like you presented in your QTC, I forgot the fourth letter, method, how do you treat periodicity in that situation?
Then it's normal MD simulations at the MM level, so then we use...
Yes, okay. And then you perturb the end states. But then it doesn't matter, because it's single-point energies anyway, even though I would assume that this does add to the fluctuations in your exponential averaging when you do the free energy perturbation steps.
Yeah, but then we have periodic boundary conditions also in the QM calculation... no, not in the QM calculation. But that depends on the thermodynamic cycle you do. It doesn't add.
Okay, I cannot see it now in front of my eyes, but I trust
you. All right, does anyone want to comment on this further? Are there any questions related to this, Dmitry, Emiliano?
No, no questions; I don't see any.
Okay. So now, I actually forgot what the next point on the agenda was. Yeah, so the QM/MM boundary: I think we have now also dealt with that. One has to be very careful and show that one can really cut where one is cutting. Then the sampling. This is of course always a key choice: do I just want to optimize, for example using a nudged elastic band, to get an energy barrier, and am I happy with that, or does the problem that I want to address require me to get a free energy barrier? And if I want a free energy barrier, the next step is how to choose the collective coordinates for that problem. And if I look at the comments, for example Adrian's comments: how can we combine levels, if we want to sample at the low level but in the end would like to have the energy at a high level? Every one of you mentioned sampling, so I think it is an important panel discussion point. I would like to start: who's not doing sampling?
I can... well, I often don't do it. But why? What is the reason for me to do the sampling?
I think Carme wanted to say something; she's been wanting to say something for a while. But I can say this afterwards.
Yeah, I was just... I can start now, as you wish, Gerrit; it's your panel discussion.
We're supposed to start arguing at a certain point, so I think it's fine.
Okay, carry on, Carme; I'll go after.
I think Gerrit said that if one wants to compute the free energy barrier you do molecular dynamics, and otherwise you just want a potential energy barrier. But I think it's much more than this. Proteins move; this is what they normally do. If you do potential energy, you are doing things at zero kelvin and you are missing the movement, so you may miss some part of the mechanism
because it's not just your reaction coordinate: all the amino acids, all these things fluctuate around and accompany the active site during the reaction. It's not just a question of getting two kilocalories more or less because you are computing a free energy instead of a potential energy, because the ultimate result is not the barrier. The barrier we already know: you have the experimental energy barrier. What you probably want is the mechanism. So I don't consider doing dynamics just because I want a free energy barrier; that's not the point of the simulation, I think. I don't know what the others think.
That's a very good point, Carme, and you're right. If it's just about reproducing an experimental result, then we don't provide much more insight. So yes, we want to know mechanistically what the role of each of the specific amino acids is. But of course one would still want to be able to convince your experimental co-workers that your model makes sense by being able to predict the effect of a mutant in which a specific amino acid has been changed, and then you at least want to have these barriers in the same ballpark, and not have each of them give the same barrier, for example. So it's not only that you want to get the right barrier; mostly you're interested, from your analysis, in what is causing the enzymatic effect. But for that, maybe you would argue that the reaction path would be enough to get the enzymatic effect, and you don't need to do the expensive sampling.
Yeah, exactly. I think you can do both types, as long as one is smart enough to interpret the results and to know that your starting configuration is good enough for that.
Fine, okay. I agree with what you said; actually I'm not disagreeing at all. So we didn't do dynamics, and when I'm talking about dynamics I'm talking about QM/MM MD, so introducing it as a full calculation, for sampling or whatever. I mean, I know
Carme does CPMD. But the reason was that we couldn't afford it; we just didn't have enough computing power. It was impossible for us to do it. Now we can, and we started doing it as soon as we could, basically because we can now access the European supercomputers, and for us it was very good to do that. Carme has been able to run all her calculations, I think, on MareNostrum, so she's got a lot of computing power, but not everybody started like that. For some years we got used to running Gaussian, and I really like it; it makes me think. We also use CPMD, sorry, CP2K, and there we don't think so much, because everything just runs. That's my point of view, comparing the two things. With Gaussian we actually have to get to know the structure very well, the system very well, the experimental part very well, and it makes us do a whole lot of things. We had to do it, basically, because we couldn't cut corners, and so it gives us a very good knowledge of the system that we're dealing with. On top of that, as I already said, we need very exact transition states. We can't get that out of an umbrella sampling experiment; we can get approximate geometries, but I want good ones, because it makes my life much easier in drug discovery. So that's my point as well: it all depends on what you want. Static methods, and I'll call them static just because dynamics is not involved, give you a pretty good idea of what the mechanism is, with all the steps that you go through and all the checking that you have to do, which you also have to do when introducing dynamics, obviously. So we started introducing dynamics by dividing the two things: first doing the mechanism, and then working on it by introducing dynamics. What Carme said about the conformational space is absolutely true. If we run into an enzyme that has a conformational
problem, then eventually you notice it: you start getting very wrong results that do not agree at all with experiment. Believe me, we've done that hundreds of times. So eventually you go on to dynamics, which can be classical dynamics, and you find out about your conformational step, and then you overcome it. Again, when we started doing QM/MM MD, what we found is that we learn far more by doing that, obviously, about the dynamics of the system, the reality of the system so to speak, which is great. So it very much depends on many things. And why did we start using CP2K? I'll tell you: when we started using the high-performance computers, we had to rely on the software that they give us, because sometimes it's very difficult to suggest new software, because of this, because of that, because we don't have a license or whatever; there are millions of cases. So we have to compromise. Life is a compromise, and so is science to a certain extent. So basically that's it, sorry.
Good. And QM/MM is always a compromise, right?
Yeah, by definition.
I completely agree. There are at least two axes to consider: how much sampling one does versus the level of theory that you use. Ideally, yes, you'd be up at the top right-hand corner, but you can't be. So it's finding the point of maximum insight for the minimum computational effort, and there we have to be pragmatic.
Yeah. I think I agree with Maria João when she said that you immediately see when something is wrong, because it doesn't agree with experiment or with what you expect. But I think this is something that beginners underestimate, because if you're a beginner you don't see it immediately, because you don't look, because you just push the button and then
you get the result: "Oh, the result is not good, this code doesn't work." That's the typical answer: "This code doesn't work, I get crap." Okay, did you look at the trajectory? What did you do? Did you do the heating properly? Did you raise the temperature in two MD steps, maybe? That's typical, because you don't look, or maybe because you don't have enough experience. It also seems to me, from the questions that I get, that beginners in this field are very impatient to get good results, and it's a complicated procedure. You cannot get into QM/MM in two weeks: you need to learn a lot about MM first, and if you come from the MM world you need to learn a lot about the QM world. So I just want to highlight that getting experience matters; it's only with experience that you know when something is wrong and when it is correct.
And concerning the transition states, there's also a clarification, Maria. I don't know why you say that in static QM/MM you get an exact transition state, because in dynamic QM/MM you can also get an exact transition state. I don't know what you mean by this. I think you can get an exact transition state whatever your method is; within your method you get the exact transition state. Even if you do semiempirical methods, you also get an exact one within that level of theory.
Well, we do the frequency calculations, if we are with Gaussian, and make sure that we are on the...
Yeah. If you do a transition state, sometimes the transition state with potential energy doesn't need to be the same as the transition state with free energy. In most cases maybe it is, because it's a simple reaction, but it doesn't have to be. The way to test that something is a transition state on a free energy landscape is to do a committor analysis: you run a lot of simulations from that point, and if 50 percent go to products and 50 percent go to reactants, you have the exact transition state. So you get
the best way to get the transition state, but, yeah, what about the computational power? Very simple. And when you get a potential energy transition state, you get one of the many possible transition states; there's an ensemble of transition states. But I think for many purposes this is fine.
Yeah. And thanks to this panel discussion we also managed to answer one of the users' questions, about what we mean by an optimized structure; I think this pretty much nailed it down. So a single state makes sense only if you have a potential energy surface; on a free energy surface, of course, it is an ensemble and not a single point.
Yes. So one important point that many of you brought up is: how do you know if you have converged? If you decide that for your problem you need a free energy barrier, and you don't want a potential energy surface, how do you know that you have converged? If you use an umbrella sampling based method or a metadynamics based method, in all these cases you need to ensure that all the degrees of freedom which are not the reaction coordinate are minimized, or sorry, sampled completely. I think we can agree that this is the premise of the methods. So how do you check for this? What can we do if I want to publish that this is a converged free energy profile? Error bars, or what do you suggest?
I think that we can calculate some more trajectory. For example, we can stop at ten picoseconds for each window, I mean the amount of sampling, and then extend it to, for example, 15 picoseconds for each window and check whether the energy profile changes. If it is the same, then we have already converged. So if further sampling doesn't change the profile, then it's okay. But of course you never know whether some conformational change, for example,
happens in the next 15 picoseconds, right?
So this is something where best practices are very hard; it's impossible to answer this question, I think. But maybe Janez?
Yeah. Again, according to our experience, you need a nanosecond to have convergence to one kilocalorie or so.
But that is of course still assuming that you're in a minimum which is separated by barriers that take more than a nanosecond to cross, right?
Yeah. But still, at the very end, if you have a very complicated reaction with very large, gross changes of the charge distribution, then the protein also responds to this, and that's not fun, since during the enzymatic reaction the protein is heavily responding. So it also includes a little bit of the problem of protein folding and unfolding, and we know that timescales in proteins are pretty long.
I agree. But maybe to get the answer you want, you might not always need full convergence, in the sense that if you want to compare two amino acids in the active site, where the reaction pathway is more or less the same, or at least you can show it to be the same, maybe there you don't need your nanosecond to get the answer correct. But I guess that if you want to study mutations at the surface, like what Åqvist did for cold-adapted versus heat-adapted proteins, there you definitely need to go to many nanoseconds, because otherwise you're simply not going to see the effects of this entropic compensation.
But then you can run only MM simulations, not QM/MM. I mean, if you also want to run picoseconds at an ab initio level, that's also clear. And also, maybe two or three kilocalories per mole is within the error of the DFT method; therefore it's not that important to get really nice convergence. I mean, it can be good from the point of view of these statistical distributions, but still the
problems with the potentials, well, we cannot overcome these problems that way.
So the advice now, Maria, is that if I have calculated an energy barrier and have an error in that range, then I should suggest Maria as a reviewer. Okay, very good. So now, we're actually about to end this panel discussion, and as Adrian suggested, it's maybe a good opportunity now to talk about... Do you want to ask the question yourself, Adrian?
I think it would be nice to hear from everyone on the panel what they would like to see in the future. I think we've seen the state of the art, and one thing that is clear to me is that QM/MM still requires a certain amount of expert knowledge to be done effectively; that's come through from the panel. So part of what I would like to see is ways that we can help to automate that process, perhaps through machine learning or in other ways, so that simulations become more of a tool and less of a technique in the future. But it would be interesting to hear from everyone on the panel what they would like to see in the short, medium and long term for the future of QM/MM.
Yes, okay. So who dares to start answering what he or she would like to see on these three terms?
I don't know; I can make some naive comments from my side. What I would like is similar to what I read in the document that you passed us, Gerrit, about the survey of QM/MM users. I think many of them considered that there's a lack of documentation on QM/MM: tutorials, examples, tests. And I would also like to see more user-friendly codes, better documented codes than what we have now for the main codes that are available. And of course more computer time to be able to test more things; if I had more computer time, I would like to test a lot of things, for my own education.
Well, computer time will
definitely become more available, right, this European computer time.
Okay, but then we also need better codes, parallel codes.
Exactly. You need to demonstrate that your code scales up to 2000 processors. Tell me a QM/MM code that scales to 2000 processors.
Well, maybe your CP2K/GROMACS interface?
I would say no, not CP2K. You probably have to go to quantum Monte Carlo if you really want to scale up to that number of processors. But it is a problem; we have to rethink very carefully how we do our calculations. String methods, if you want to calculate free energies, are of course excellently suited to the new architectures that are being developed, even though that is not what these architectures are meant for. So now we're going to have the first three pre-exascale computers installed in Europe, and the first one is becoming available this year. What are we going to do with it? Is it going to be better QM/MM, or will we just get things faster?
I mean, it's necessary to do something, because otherwise the QM/MM community will not benefit from this. Mainly the MM community benefits from these big steps towards exascale and so on, but the QM/MM community seems to be much behind, because the QM/MM community is actually not developing the QM programs. Do you think that could be the reason? The parallelization, the transfer to GPUs, the use of GPUs, etc.
Plus, these things in general are difficult to get funding for, because you get funding for a project, not for software development, right?
Yeah, we discussed it. I think I mentioned on that occasion that if it doesn't cure cancer you're not going to get funding for it. But okay. Yes, so that is indeed what many of us expect: better documentation. That is also what comes back in the survey, that people simply lack the ability to independently get into using these codes. Then of course the challenge will be that
with improving hardware, if your code does not keep up and you cannot demonstrate scaling on those machines, you probably won't get computer time. So then we pour I don't know how many millions of euros into making fancy hardware, which is probably going to be used by group leaders and assistant professors who definitely have no funding available to hire engineers to make things scale. So what do we win in the end? Okay. But I also expect, for the long-term future, that we will be able to go to larger QM subsystems than we can now. That requires work on the QM codes in order to be able to make use of the latest hardware developments. What about the others?
I mean, what about structure, the revolution in structural biology? We now have cryo-EM, so we're going to have many more proteins available to actually use QM/MM on. So I see that as a huge, nice thing about working right now: we're going through the structural revolution, in my opinion, making it possible to study a lot more interesting mechanisms than we could before, because we simply lacked the structures. For molecular motors, which are often membrane bound, crystallography does not give us the structure but cryo-EM does, and that provides the starting point for doing QM/MM and all kinds of other analysis with computer simulation.
Yeah. Can I just...?
Yes, please.
Just to answer Adrian's question. Well, I know what I would like to see, but I don't know how: I would like to be able to predict the catalytic power of enzymes. Maybe machine learning is the way to go, but I think enzymes are so different from each other that I'm not quite sure that machine learning is there yet to be able to solve that. It will definitely play a big role in the future, I think. Meanwhile, I think we're going to have to make our simulated systems as real as possible, so dynamics will definitely have to be involved,
and systems will become bigger and bigger; the QM part of the QM/MM will become bigger and bigger. I can't predict the future, but that's what I would like to see. And Carme is absolutely right: we'll have to have more tutorials and explain better what this is all about, which at the moment is just something with no rules, and that is very difficult for the newcomer. It's all very well to have a student and tell him or her, "Well, this has got no rules", like I often say to my students, and I just see a blank look on their face: "Oh yeah, good, now what?" So anyway, I didn't answer the question, I know, Adrian, I'm sorry.
And this is just a license to speculate, isn't it?
What about the other way around? Often when you read about ideas for enzyme design, it basically means that the user has to think about which amino acid he wants to change, and then you can do a quick automated QM/MM calculation to see the effect. But you can turn this the other way around: you can make an algorithm where you put in what you want, "I want the barrier to be this low; which amino acids do I have to change?" I think those are the future, those applications where the QM/MM is just a scoring function of a larger program, which will benefit largely from hardware becoming available that is as massive as the new LUMI computer, which will be installed very soon; it's already being installed at the moment. So I think we probably have to think about other questions than what we have been doing so far in order to make use of the resources. In that sense I envy many of the attendees, because they are still at the beginning; they still have all these things to do, whereas I'm already feeling like, well, I have to start thinking about rounding this off somehow. But okay. Maria, you wanted to say something? I'm sorry I mixed you up, sorry, how could I.
No. Actually, yeah, I wanted to... I totally agree
with Maria that, yeah, there are really no strict rules in QM/MM simulations, and it's still more like an art. Everyone who's involved in this field has their own small rules, some know-how, tips. Of course it would be really nice if we could somehow generalize these rules and give some straightforward directions for performing at least some typical QM/MM simulations. It would be really nice, I think.
Yeah, thank you. And now, Janez, sorry, it's your turn.
Yeah. I think that we shouldn't forget about the amount of information concerning mutants. Genomic medicine is these days producing enormous amounts of data, and still our understanding is close to zero, and we definitely need multiscale simulation for that. So our bright student Alja Prah just started studying monoamine oxidase A, which catalyzes the decomposition of serotonin. If you have a couple of mutations, then this enzyme becomes ineffective, and that's the, you know, Brunner syndrome, and you have severe cases in psychiatry. So I still think that in the future, when it comes to this sort of data, it's essential to combine this QM/MM, what we are doing here, with some sort of machine learning, and this is still a relatively empty field.
Okay. So, a little bit to my own surprise, we managed to fill the time; I was kind of concerned when Anup booked it for two hours, but it's fine. Because I think people have other appointments, including myself, I think it would be nice, as Anup suggested, if each of us gave a short closing statement on what we have discussed, from his or her own perspective, and we use that as a kind of wrap-up. All right, so how do we want to wrap up? I would actually consider that we have reached quite some consensus on how we set up a starting structure. We also concluded that starting structures
on their own, in particular considering protonation, tautomerization, and heterogeneity, would require a whole panel discussion and perhaps a whole webinar series of their own. These things are absolutely not trivial, and I think what is important is that people mention, when they write a paper, what the motivation was for choosing the structure that they have used and why they set it up the way it was set up. It would also be good if people make a common habit of sharing those structures. So let your input structure be part of the supporting information, so that others can do something with it if they wish.
We talked about the Hamiltonian, and I think there the consensus is also on how we can validate our choices when we choose the QM subsystem and the QM boundary. There was not much conflict on how to do that. We all kind of agreed that one has to be careful there, and that one has to demonstrate beforehand that the QM/MM setup, the QM/MM division, makes sense and is chemically relevant, and not just that it happened to work for this QM subsystem so we go ahead with it. That is how we used to do it, that is how we've done it in the past, simply because there was no way of extending the QM subsystem back then.
We have talked about sampling: depending on the question you wish to answer, you may or may not want to incorporate the dynamics of the protein. Sometimes it's enough to get an answer by just optimizing a transition state, as Maria is doing in her work, for cluster models or ONIOM models. Sometimes you prefer to run free energy calculations and then get the free energy barrier. It really depends on the question you wish to answer. And, as Carme correctly pointed out, what matters is not whether or not there is an error in, for example, your QM model or your DFT model. I mean, the answer is going to be wrong anyway. The question is: is the error small enough to answer the question that you wish to address? And if that answer is
a yes (and of course you need to convince yourself and the reader of that), then I think you're free to go ahead.
We did not really touch upon software and hardware, because I think people are also very picky there: everybody has their own favorite software. But what I think we do agree on is that these software packages require further documentation and further tutorials, so that beginners in particular would have an easier time getting into this field. But I must also say that this is not an easy method, in the sense that you need to understand not only molecular mechanics; you also need to understand quantum mechanics, or quantum chemistry at least, in order to get started. So a little bit, no, not a little bit, quite some homework, quite some pre-work, is required before you can go ahead with QM/MM. And once you have done all the pre-work, then yes, maybe a bit more documentation from the side of the QM/MM developers would help a lot.
So let's give the attendees a chance once more to ask a question, if they still have one. So I see... oh, wait a minute: "It would be good for all the speakers to write a detailed review on QM/MM together." Oh, who wants to help me? Okay, well, I can say yes, that is a good idea. Whether we will do it is, of course, a different matter, because writing reviews is quite a lot of work. But yes, we can consider that, maybe. By the way, for the attendee who asked this question, what I can answer is that we will post a best practice guide, based on these discussions and on these webinars, on our BioExcel website. It's one of the deliverables that we proposed to deliver, so there will be something written. Whether or not it is of review quality, we'll see, but something will be done, and that could be the basis of a review at a later stage. We'll see.
And yeah, there is another one: "Can you suggest resources to learn QM/MM as a beginner?" Yeah, this comes back to what Carme was already pointing
out, which is that in general we lack a bit of documentation. At least, there is a lot less documentation and there are far fewer tutorials available for QM/MM than there are for normal MM. But this is hopefully going to improve, and do check our BioExcel website for tutorials on QM/MM with GROMACS and CP2K. And then the last question is: "Thanks." Okay, you're welcome.
So we should probably wrap up there, and I'd like to take the opportunity again to thank everybody, because I know people need to go. So, on behalf of BioExcel and all the attendees, thank you to all the speakers for your webinars and for this panel. Thanks very much.
Thank you. Thank you. Okay, thank you. Bye.