Hello everyone, and welcome to this BioExcel virtual workshop on best practices in QM/MM simulation of biomolecular systems. So why are we organising this workshop? Well, we know that there is significant interest in the use of hybrid QM/MM simulation methods for biomolecular modelling and simulation. In fact, BioExcel recently conducted a survey which showed that 50% of respondents were significantly hindered in their use of QM/MM by not knowing how to choose suitable QM parameters. So although we know that there is significant interest in the use of QM/MM, there are also barriers to people being able to use it. And these barriers relate not only to the use of software to perform QM/MM simulation, but also, as we've identified, to knowing how to choose a good QM treatment. There is a caveat to doing QM/MM simulation: a simulator must be aware that it is dangerous to use out-of-the-box QM functionality and software without knowing what it entails. So it is dangerous in general to use QM/MM software as a black box. So we asked ourselves: how can BioExcel help in this area? To do so, we organised this workshop. We originally intended to have an in-person workshop, but of course, like everything else, we had to take an alternative approach. So we've decided to have a series of webinars followed by a panel discussion. The goal of the workshop is to enable experienced QM/MM biomolecular simulators to share their insights and experience of what is best practice for doing QM/MM simulation of biomolecular systems. That refers to the underlying methodology, including theoretical aspects, but also a lot of the tricky practical aspects which are often not described in the literature. The idea is that this would help share experience by warning of pitfalls and highlighting things to do as well as things not to do, both in general terms and based on the specific research experience that each speaker has in applying QM/MM in their particular area of study.
That is, for their particular biomolecular systems of interest and the particular research questions they are interested in asking. Now, the aim is that this would benefit the wider community of computational biomolecular researchers, in particular those who might be less experienced in using QM/MM. And we want to use this as an opportunity to also identify common perspectives on applying QM/MM in different areas. The format of the workshop, as I said already, is going to be a series of webinars by six invited speakers. Today's speaker, who is also the co-organiser, has been invited to give the kick-off, so I've not listed him here. These webinars will take place over the next few months. There is now an event page for the workshop on the BioExcel website, so you can find more information there, and the full schedule of the webinars will be up in the near future. Before I hand over to Gerrit, who is our speaker today and who will kick off the workshop as a whole with his webinar, I want to tell you a little bit about the session and how to ask questions. A key part of the workshop is that we want to allow attendees to ask the speakers to reflect on their advice on best practice based on their experience. So to that end, we really want to encourage you to ask questions at the end. To do so in GoToWebinar, if you look at the GoToWebinar control panel that's pictured here, there's a questions box. Feel free at any point during the webinar to enter your question there. We will deal with them after the main presentation. By default, I will give you the opportunity to ask your question by audio. If you don't have audio, please mention it, because that will speed things up; then I will just read out your question to Gerrit. So today's kick-off webinar for the workshop will be presented by my BioExcel colleague Gerrit Groenhof, who is based in the Department of Chemistry at the University of Jyväskylä in Finland.
He focuses on the application of QM/MM and other approaches, and on the development of simulation techniques, for investigating reactivity in enzymology and photoactive processes. So I will hand over now to Gerrit. Okay, thank you. So the title is a little bit boring, but that is just the title of this kick-off webinar. And the general idea has already been announced: we have a number of excellent speakers, all very experienced QM/MM users, who all, I hope, have their own vision on what best practice is. And then after these webinars, or seminars, I hope that during the panel discussion we can all come together and formulate some practical advice on how to do QM/MM. Because, as Arno said, we could identify some severe problems, in particular for beginning users, in applying QM/MM. So, as Arno said, I'm Gerrit Groenhof. I work in Jyväskylä, in the top picture on your slides; you cannot see my arrow, but that is where I work, and I'm actually currently standing approximately where the arrow is. Now, it's not always like this. It's not always winter wonderland; we also have summer. For example, this year it was on a Tuesday, and it was great. But most of the time we're inside anyway, performing computations or experiments. All right. So to get started: why this best practices workshop? I have already answered that question. We conducted the survey, and from the survey we inferred that there is a need for such a workshop. So rather than focusing on what you can do with QM/MM, we focus on how you can do it, and, more importantly, how you can know that what you have done is the right thing. But before I start that: why QM/MM simulations at all? Well, QM/MM simulations basically allow you to do an experiment, a real chemistry experiment where you're changing chemical bonds and making new products, without test tubes. So you don't get your fingers dirty. Well, depending on how clean you keep your keyboard, of course.
And it allows one to observe the famous wiggling and jiggling of the atoms which underlies, for example, all of life. The latter is very hard to do experimentally. So I show here examples of two typical experiments that are used a lot in the biological community. The first is crystallography, and I hope you recognise the diffraction pattern here. It gives you unprecedented spatial resolution: you can see where the atoms are sitting. So in terms of resolution, in terms of seeing the atoms, we don't really have a problem. The second is spectroscopy. With femtosecond laser pulses, we can probe precisely the chemically relevant time scales at which chemical reactions occur, at least at which the bond breaking occurs once you have a transition state, for example. So we can also reach the time resolution: we can probe how the system is changing over time after the reaction is initiated with femtosecond laser pulses. The problem is, of course, as you may or may not know, that in spectroscopy the only thing you are sensitive to is changes of energy levels. So you see how the energy levels change as a function of time since the reaction started, but it doesn't give you any spatial insight into what's going on, because you don't know which atomic motions correlate with the change in the energy levels. Conclusion: experimentally, most of the time, we can either get high spatial resolution, but then we don't see much moving; or we can get high time resolution, but then we still don't see much moving, because we don't see the atoms. So this is where computer simulations can provide an alternative and complement both types of experiment, because you can actually compute the motion of the atoms, and even chemical reactions if you do QM/MM. But as these two little Martians experience, sometimes the computer simulation is also not working. Now, I'm sure they're going to debug the segfault while I give this presentation.
But nevertheless, we have computational techniques at our disposal to complement experiment and provide a dynamic picture linking up these two experimental regimes. Why this workshop? As Arno said, QM/MM is a very powerful technique, but it is not a magic bullet; it is not the solution to all of your problems, because there are a lot of little details that are often not discussed. In particular, when you present your QM/MM work, you want to focus only on its success. You're not going to talk much about the details, about how much time you spent validating the method that you have chosen. For example, and I make a rough estimate here, I would say that of all the time we invest in a project that leads to a publication, 90% of that time actually ends up in material in the supporting information that nobody reads. In preparing this talk I was actually picking up some old papers and looking at the website to see how many times the supporting information was accessed. I almost had to cry: you do your best to make this nice supporting information, and the number of accesses was one, after four years. So people are not reading the supporting information. Nevertheless, that is where most of the effort actually goes, because that is where you demonstrate to the reader, and the reviewer, but of course mostly the reader, how the method was validated, and what the motivations were for choosing the level of theory that was used in the work. So that is what I hope this workshop is going to lead to: we're all going to openly discuss all these details and make them common knowledge. Then, very important for those of you who are considering using QM/MM in your work at some point: can QM/MM do something for you?
I can imagine that you're working together in a team, and there is an interesting, let's say, enzyme with an interesting mutation or interesting properties, and you would like to know more. And of course then: ah, maybe we can do QM/MM. Deciding whether that "maybe" is a yes or a no is one of the things this workshop should try to address. And then, finally, there's also helping you decide, once you have concluded that QM/MM is the method for your problem, how to choose the right QM/MM parameters: how to choose a simulation model that is actually going to have value. So these are the targets, these are the goals of this workshop. And to address these questions, well, we have invited six speakers, and of course they are not forced to address these questions, but I hope that in the panel discussion at the end we can actually discuss these things based on the webinars that you're going to see. Okay, so what was QM/MM again? I hope I'm not telling something that will make everybody fall asleep, but nevertheless, what is it again, just to get started. You try to join the best of both worlds. Of course, in practice you may also end up with the worst of both worlds, but you try to join quantum chemistry, which is in principle first-principles, so you don't have to make any assumptions or force-field parametrizations, and which can describe chemistry, describe molecules, describe bond breaking, but which is expensive, so you can typically only do small systems. And you want to combine this with the power of molecular mechanics, which is heavily parameterized, is very efficient, and provides a good model for a large system.
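For reference, this joining of the two descriptions is usually written, in the common additive scheme (a textbook form, not tied to any particular code), as

$$
E_{\mathrm{QM/MM}} \;=\; E_{\mathrm{QM}} \;+\; E_{\mathrm{MM}} \;+\; E_{\mathrm{QM\text{-}MM}},
$$

where $E_{\mathrm{QM}}$ is the energy of the quantum region, $E_{\mathrm{MM}}$ is the force-field energy of the environment, and $E_{\mathrm{QM\text{-}MM}}$ is the coupling term containing the electrostatic and van der Waals interactions between the two regions, plus the treatment of any bonds cut at the boundary.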
So this way you can actually get the best of both worlds: describing the large system, that means the rest of the protein, with the force field, and only that part where the chemistry is changing, where the electrons play an important role, with quantum mechanics. And then you have lots of choices: you can use ab initio methods, density functional theory, and a lot of other methods. This idea, I should mention, is almost as old as I am; it was coined many years ago and led its originators to receive the Nobel Prize in recognition of this contribution. So basically, from the survey, from talking to people, and also from the emails people send me when they are interested in using GROMACS for QM/MM, what people want is something that can read in the enzyme structure from their collaborators or from the PDB data bank, and then press a button to calculate a free energy profile with which they can answer questions. Because once you have the transition state barrier, you know something about the efficiency of the reaction, the rate of the reaction, and then you can start looking at the effect of the protein environment, and maybe even tell your collaborators, or maybe even yourself, to create a mutant that has a higher performance, and then publish a paper. And in particular when you're at the beginning of your PhD, getting these papers is important, because they will give you the PhD. So I understand that this is what people want, and I fully agree that that is what you would want. But what you get in practice is something else. What you get is that you're ending up with an incomplete structure. If you collaborate and these structures are super-duper fresh, they were just refined yesterday, there might be a lot of missing residues. You have to fix that; the software will not do that for you.
But even more important are the protons, in particular in the active site, which is where the chemical transformation occurs and where you really need the QM. In general, in order to stabilize the transition state of a chemical reaction, these catalytic sites are heavily strained; they are in quite unhappy conformations, I would like to say. And that is not a problem, because the protein, the enzyme, is very big, so you can compensate for these unfavorable interactions in the active site by folding a very large piece of protein around it and overall still have a folded enzyme. But because of the strain in the active site and because of the unfavorable interactions, the pKas, the proton affinities of the residues, may be quite different from what you would expect at the surface of the protein or in solution. And because a residue can be protonated or not, you have two possibilities, and you don't always know what to do; and if there are many residues where you have to choose the protonation, you end up with a combinatorial problem. So already in preparing the structure for a simulation, you need to do a lot of thinking, and perhaps a lot of wishful thinking as well. Then, once you have the structure, you need force fields for the enzyme. Now, while we have good force fields, well, "good", we can argue about that, but that is not the topic of this lecture, while we have force fields for proteins, amino acids, nucleic acids, water, membranes, the typical biological building blocks, enzymes often contain cofactors, and cofactors are not common enough to have been parameterized by default. So if you're taking a force field, maybe the cofactor that you need in your enzyme, or the substrate, is simply missing. That means you have to parameterize it, and that is a task on its own.
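As a toy illustration of the combinatorial problem just mentioned (the residue names here are made up for the example, not taken from the talk), enumerating all protonation assignments quickly becomes intractable:

```python
from itertools import product

def protonation_states(residues):
    """Yield every protonation assignment: 1 = protonated, 0 = not."""
    for state in product((0, 1), repeat=len(residues)):
        yield dict(zip(residues, state))

# Three titratable residues already give 2**3 = 8 candidate structures;
# ten would give 1024, and each one in principle needs to be assessed.
residues = ["Asp34", "Glu46", "His108"]  # hypothetical example residues
states = list(protonation_states(residues))
print(len(states))
```

In practice one does not brute-force this; pKa-prediction tools and chemical reasoning narrow the choices, but the exponential growth is exactly why the structure-preparation step demands so much thought.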
Subsequently, after you have been able to generate a good model, equilibrate it, and have done whatever is needed to prepare the input for QM/MM, you need to decide what is going to be your QM region: which atoms are going to be part of the QM region and which atoms are not. And if you have done that, with whatever argument you can come up with, you need to choose the level of QM theory. Again, a lot of choices, a whole forest of choices out there. And then you're still not done, because in an unbiased MD simulation of an enzyme reaction, getting to the transition state typically takes a long time. It's not the kind of time scale you can cover in a normal MD simulation, and particularly not when you do a QM/MM calculation, taking into account that the QM calculation is 1,000 to 10,000 times more expensive, or even worse. That means you have to bias the run, and biasing the run means sampling along some predefined coordinates. Now, these predefined coordinates require a lot of understanding of the chemistry: to know where the reaction is going to happen, which bonds are going to break, which bonds are going to form. If you talk to an organic chemist, a synthetic chemist, they can tell you by looking at the structure, by looking at the two-dimensional picture of the structure: oh, if you now add a water molecule, it will react there. They know what the reactive modes are, because of experience, because they're very smart, I don't know. But for me, being a bit more of a physical chemist, I don't know that a priori. So for me, choosing a reaction coordinate is highly non-trivial. And then you have to decide what you want: a potential energy surface, or, if you think entropy plays a role, a free energy surface, so you may have to change the method that you want to use. And then, as I said before, you want to know the role of the protein environment.
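To make the biasing idea concrete, here is a minimal sketch of an umbrella-style harmonic bias along a chosen reaction coordinate; the force constant and window spacing are arbitrary illustration values, not a recommendation:

```python
def bias_energy(xi, xi0, k=500.0):
    """Harmonic restraint U(xi) = 0.5 * k * (xi - xi0)**2 that holds the
    reaction coordinate xi near the window centre xi0."""
    return 0.5 * k * (xi - xi0) ** 2

def bias_force(xi, xi0, k=500.0):
    """Force along the coordinate, -dU/dxi, added to the physical forces."""
    return -k * (xi - xi0)

# A series of windows along the coordinate; each window is simulated
# separately, and the histograms are later recombined (e.g. with WHAM)
# into a free energy profile along xi.
windows = [round(0.1 * i, 1) for i in range(11)]  # 0.0 ... 1.0
print(bias_energy(0.35, 0.3), bias_force(0.5, 0.3))
```

The hard part, as the talk stresses, is not the restraint itself but choosing a coordinate xi that actually captures the bond breaking and forming.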
Well, then you have to do something numerical, something that gives numbers, so you have to do some energy decomposition. And finally, you get your paper rejected because the reviewer wants further validation. So there is a bit of a contrast between what you want and what you get. And maybe we can help you a little bit to bring what you get closer to what you want. It would be nice if the workshop can make one step in that direction. In my group, as Arno already introduced, we develop QM/MM methods, and the reason we do that is that I'm interested in photobiology. Now, this is a webinar and not a funding application, so I don't have to motivate why photobiology is so important; I just find it interesting. And what photobiology is about is how biological systems, proteins, interact with light. For example, they absorb light and, as a consequence, do something with that energy. A little illustration: a switchable fluorescent protein, which can switch between an on and an off state, where the on state is fluorescent, illustrates this nicely. This was actually a cover picture for a paper we did some years ago. So the photon comes in and changes the conformation of the chromophore, which is the molecule that can absorb light and do something with it; it goes from an on to an off state. Now, such a protein is a large system, so that already says that we can never treat the full system at the quantum mechanics level; we need QM/MM for that. Electronically excited states have to be described, because the process starts by absorbing the photon. So the photon comes by, as is illustrated here. In the ground state, and I think most of you are familiar with molecular orbital theory, we nicely pair up all the available molecular orbitals with spin-paired electrons. Then, if the photon is absorbed, the energy of the photon is transferred to the electrons.
So the electrons get a higher energy, and as a consequence an electron has to occupy a higher energy level. However, this picture is too simplistic, because there are many combinations of excitations I can make, all fitting in the same energy band. So what I need instead is to take into account a linear combination of all of them, coupled. You need to take all of them into account in this molecular orbital picture in order to get a well-described excited state, a good model for the excited state. So you can already see that, in addition to calculating just the ground state, you need to calculate a whole bunch of configurations. And for that we use methods which are called multi-configurational, to take into account the multiple configurations of the electrons. One of these methods is complete active space SCF, CASSCF. Now, the rule of thumb in computational chemistry is that the longer the acronym, the more letters the acronym contains, the more accurate the method is. And CASSCF is only six letters, so it's not so accurate. But it provides you with a consistent description of excited states and ground states, although it is a lot more expensive than, say, your ordinary Hartree-Fock or your ordinary B3LYP ground state calculation. And it scales very unfavorably with the number of atoms. In addition, nowadays, fortunately, time-dependent DFT is becoming accurate enough for certain systems. A colleague of mine, Andreas Dreuw, always says that DFT is a very promising method because it promises you much, and I fully subscribe to that. But things are improving, and if you know the system well and you can compare to a higher level of theory, TDDFT becomes an option. So there are two approaches to describe the excited state. But then, these photochemical reactions start by absorbing a photon.
The photon, because it changes the electronic configuration, of course changes the forces that the nuclei experience; or, in other words, in the Born-Oppenheimer picture, it changes the potential energy landscape of the nuclei. That is illustrated in this plot here, where we have a ground state surface with a typical reaction, like a trans-to-cis isomerization of a double bond, like the one that takes place in this little protein. And then the excited state surface is different, because the electronic configuration is not the same; it is illustrated by the red surface. Now molecules get promoted to the excited state surface, and then they start moving. In reality this would be a wave packet; in our work it is a trajectory. They move until there is a point where the two potential energy surfaces are very close in energy, and there you have what we call non-adiabatic effects. This is where Born-Oppenheimer actually breaks down. In order to compensate for that breakdown of Born-Oppenheimer and still continue running a classical trajectory, we use techniques called surface hopping, and as the name suggests, you just jump from one surface to the other whenever the quantum mechanical probability tells you to do so. I'm not going to talk about these things in any detail at all; it is just a couple of caveats, a couple of additional things you need to take into account if you want to use QM/MM for photobiology. All of this is implemented in the code, so one can use it out of the box. An example, because it is always nice to show movies, and this is quite old work, but the movie is still nice: here we have a piece of DNA with two thymine bases in there, and if this absorbs UV light, we can actually do dynamics and see what happens.
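Very schematically, and with a made-up constant hop probability standing in for one computed from the electronic amplitudes, the surface-hopping logic looks like this:

```python
import random

def propagate(n_steps, hop_probability, rng, state=1):
    """Track the active surface (1 = excited, 0 = ground) along a trajectory.
    A hop occurs whenever a random number falls below the hop probability,
    which in a real simulation comes from the quantum (electronic) dynamics."""
    history = [state]
    for step in range(n_steps):
        if state == 1 and rng.random() < hop_probability(step):
            state = 0  # decay to the ground state surface
        history.append(state)
    return history

rng = random.Random(42)  # fixed seed so the toy run is reproducible
trajectory = propagate(100, lambda step: 0.05, rng)
print(trajectory.count(0))  # steps spent on the ground state surface
```

A real fewest-switches implementation propagates the electronic wavefunction alongside the nuclei and derives the hop probability from it; this sketch only shows the stochastic-jump bookkeeping.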
So we start now in the excited state, shown in purple; UV light has been absorbed, and we see, in the movie at least, that after a while we fall back to the ground state, and while doing so, we form new chemical bonds. We form the so-called DNA lesion. So at this point we have a stable ground state product where the two thymines have become covalently attached. Now, this may have implications for DNA replication or for DNA transcription. It is just an example that with MD you can actually follow this type of process. But the example I want to talk about in this talk is the photoisomerization of biological chromophores. Isomerization means we go from trans to cis: we have a double bond which goes from one configuration to the other configuration with the help of light. So here is the system that I've been working on since my PhD, so that's a long time ago, and I think we're still sometimes working on it. This is the photoactive yellow protein, a blue-light receptor. It is, so to speak, the eye of the bacterium, steering its phototaxis. This is the depiction I usually show when I talk to physicists, which is what I do most of the time, and they don't really have much affinity with this kind of representation, but I think this audience has. In this representation you can nicely see the chromophore exposed. This is the chromophore, the molecule that absorbs the light. When we started this, many years ago, we already knew that this protein absorbs blue light and changes its conformation: it goes from an inactive to an active conformation, activating some other signalling protein in the cell. And we still don't know which the other signalling protein is, but nevertheless, we know very well how this protein works, because of all the experiments that have been carried out.
But the first step everybody agreed on, when I started working on this many, many years ago, is that there is an isomerization of the double bond of this chromophore, which I am pointing at here with my arrow, when light is absorbed; and that is what we wanted to understand. At the time we started, there were no intermediate structures, at least no reliable ones, so we did not know what the structure of the chromophore would look like immediately after photoexcitation. Nor, comparing to the chromophore without the protein, did we know what the effect of the protein environment is in controlling this process, or whether it controls the process at all. So what we did at the time is a QM/MM simulation. We used this CASSCF method to be able to describe excited states. And this is, oops, a very low level of theory. Now that we talk about best practices, well, what I keep coming back to in terms of best practices is that you validate the QM method very carefully, for example. That is not something I did then; but of course, this was 15 years ago, and the standards were a little bit lower at the time, in my opinion. Nowadays, I think, if I had to referee my own work, I would probably have insisted on a little bit more validation of the level of theory used, but that was not a problem then, so we got away with it. So what we did is we used CASSCF for the chromophore, plus a very small subsystem which is directly hydrogen-bonded to it, while the MM subsystem we described with a force field; I will not go into the details here. That is what we used at the time. And with that simulation model we could actually follow the isomerization process quite nicely. So the system is now in the excited state, and we see, very fast, on the femtosecond time scale, rotation here, followed on a slower time scale by the breakup of this hydrogen bond.
And in the end we get a stable configuration where the chromophore is clearly in a so-called cis conformation, where this bond is cis and the oxygen is there. Of course, at the time, without any backup from time-resolved crystallography or from cryo-trapping crystallography, this was met with skepticism: by the community that does theory, because of this very low level of theory, and by the experimentalists, because it is just a model. So, yes, I was proud of it, but as always, you are the only one who believes your own work and nobody else does. That's how it is sometimes. Nevertheless, over the years, more and more evidence accumulated that this hydrogen bond break is indeed an important step at the entry of the photocycle, rather than these hydrogen bonds staying intact. So overall there was more and more indirect evidence that we were not so far off after all. To validate that, which we did a posteriori, we have repeated the calculation: basically every time the computers become better and the algorithms become better, we repeat the calculations. The latest repetition we did was using a much larger active space. These numbers I haven't talked about yet basically mean the number of electrons that you correlate and the number of orbitals. So CAS(12,11) means that all the pi orbitals of the chromophore are now correlated with each other, described collectively with 12 electrons in 11 orbitals, so you get a better description of the excited state. We used a better basis set, and we used the Amber force field. And this, by the way, was possible thanks to Alex Granovsky, who unfortunately is no longer among us, but who succeeded in making this expensive CASSCF method scale extremely well on parallel computers. So that allowed Dmitry, oh, I thought he was here, but he's not, to perform the MD simulation again. And the result is, fortunately, the same. Good.
Then, to validate this: that is the best we can do at the moment, well, there is something else we can do, but I don't think it is better. What we can do is keep repeating the same simulation at higher levels of theory. But nowadays, fortunately, we can also try to use experiment to validate this. And the experiment that I'm referring to here is time-resolved crystallography using femtosecond x-ray pulses. Now, that type of experiment you don't typically do in your own lab. Instead you go to Stanford; nowadays you can go to more places; and you use a free-electron laser, which is a machine, a piece of equipment, that basically allows you to probe structure with femtosecond x-ray pulses. The place itself, the LCLS at Stanford: you enter the CXI experimental hall through this little door here and you go underground, and this is basically the experimental setup. This is the end station of the accelerator, which I will show in the next picture, and everything happens in this chamber here. It is a vacuum chamber where you jet the sample in, and here it interacts with the x-ray laser. This is the rest of the machine. It starts kilometers upstream, where electrons are accelerated by an electric field. The electrons enter an undulator, and the undulator is where the x-rays are actually generated; these x-ray pulses then enter the experimental hall where we perform our experiment. Now, very briefly, for those of you who are not yet familiar with this technique, a free-electron laser: the accelerator depicted here spits out bunches of electrons that are accelerated to almost the speed of light, so you have to take into account all the relativistic effects. These electrons enter the undulator, and the undulator is what is shown here: it is just magnets which are paired and which alternate in polarity, north-south, south-north, north-south, and so on.
And as you may recall from your physics classes, whenever a charge is moving with a velocity in a magnetic field, well, there were these things with the right-hand rule, where you have the velocity, I think, here, the magnetic field here, and there is a force in this direction, or maybe it was one of the other three combinations, I don't know. There is a force that causes an acceleration. But because the magnetic field changes polarity all the time, all the electrons start undergoing a sinusoidal motion. When an electron is accelerated, like here, the electron emits its own electromagnetic field. That means that the electrons are not only feeling the influence of the magnets, but also the influence of their neighbours, their friend electrons a little bit further along. And as a consequence, in this equation that somebody solved a long time ago, fast electrons get slowed down and slower electrons get accelerated, so that eventually all these electrons move in slices, precisely a certain fixed distance apart, where the distance is controlled by the speed of the electrons and the distance between the magnets. And what you get is coherent emission: all this radiation is now emitted in phase. If you take into account the fact that the electrons move at almost the speed of light, there is a contraction: the electrons don't see these magnets centimeters apart, like we would, but much closer together. And at the same time, we are standing here, so there is also the Doppler effect. So what we see is that this radiation, these pulses, come at us with wavelengths in the angstrom range, and because of that we can use them for doing x-ray crystallography. Now, because you put so many electrons in this accelerator, your x-rays are actually very intense: you get on the order of 10 to the power of 12 to 10 to the power of 14, I think 10 to the power of 12, photons per shot.
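The combined effect of the length contraction and the Doppler shift just described is usually summarised in the standard on-axis undulator resonance condition:

$$
\lambda_{\mathrm{x\text{-}ray}} \;\approx\; \frac{\lambda_u}{2\gamma^2}\left(1 + \frac{K^2}{2}\right),
$$

where $\lambda_u$ is the period of the magnet array, $\gamma$ is the relativistic Lorentz factor of the electrons, and $K$ is a dimensionless parameter for the magnetic field strength. With $\gamma$ of order $10^4$, a centimetre-scale magnet period is compressed down to an ångström-scale x-ray wavelength, which is exactly why these pulses are usable for crystallography.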
So in a few femtoseconds, you shoot 10^12 photons at your crystals. This is enough to get a diffraction pattern, or at least a partial diffraction pattern. So instead of exposing a crystal for ages at a synchrotron, you can now use a single pulse and get the diffraction pattern from that. Of course, because most of the interaction with the x-rays actually leads to inelastic scattering, stripping off electrons, your sample charges up and explodes. That is why this technique is called diffract-and-destroy: diffraction before destruction. The way it works, at least in the experiment we did five years ago, is that we have a liquid jet with protein crystals in it, nanocrystals or microcrystals. They are jetted as a stream into the vacuum chamber, where they interact with the x-ray pulse. The x-ray pulse hits these little crystals, and each crystal gives a diffraction pattern. And because the crystals are randomly oriented, you basically get all parts of the Ewald sphere covered. You just keep on collecting snapshots; you have to sort them, index them, and put them all together to get the full diffraction pattern. Of course, all your proteins are destroyed, but that doesn't matter: you get the diffraction pattern. And what is nice is that because this is a pulsed technique, you can combine it in a pump-probe scheme. So you can pump with an optical laser, in the case of a photoactive protein tuned to induce the photoisomerization, and then wait a certain time delay. I'm not going to talk about how that works; the idea is very simple, though in practice it's very hard. You can delay the arrival time of your optical pump laser, which initiates the reaction, relative to the probe, in this case the x-ray pulses that then produce the diffraction pattern.
And this way you can record diffraction patterns as a function of the time since photoexcitation with the blue light. It is very similar to the normal pump-probe spectroscopy that you do in your own lab, but now you use femtosecond x-ray pulses to record the response of the system to the initial photoexcitation. What you get in the final result, after a lot of steps, is that you can look at how the electron density in the protein changes in response to photon absorption at time zero, as a function of time. Now, there are many snapshots, and the resolution is also quite low, so let's take out just three of them at different time points. That is shown here from two sides: from the top, or whatever the top is, and from the side. The way it works is basically that you take the ground-state structure, the resting-state structure, for which we already had an x-ray structure obtained at a synchrotron; that is shown in yellow. We then take the structure factors measured at certain time delays, extrapolate them, and Fourier-transform them back with the phases of the original dark state. What you then get is basically the electron-density differences in response to photon absorption. Red means that electron density has disappeared with respect to the resting state; blue means electron density has appeared. And because the electron density is carried by the atoms, this can only mean that atoms have undergone displacements. You can then extrapolate these difference densities to full densities. That is also a bit fuzzy and a bit complicated, but you can do it, and then you can even refine. So what we show here in pink is the excited state and in green the ground state, because we know from standard spectroscopy experiments that somewhere between these two time points the excited state decays. And then we can refine these structures.
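The difference-map recipe just described, difference structure-factor amplitudes combined with the dark-state phases and then Fourier-transformed, can be sketched with synthetic arrays. Everything below is a hypothetical stand-in for illustration, not the actual data pipeline:

```python
import numpy as np

# Toy reciprocal-space grid; real data would come from indexed, merged
# serial-crystallography snapshots.
rng = np.random.default_rng(0)
shape = (8, 8, 8)

F_dark_amp = rng.uniform(1.0, 2.0, shape)                # |F| of resting state
F_light_amp = F_dark_amp + rng.normal(0.0, 0.05, shape)  # |F| at time delay t
phi_dark = rng.uniform(-np.pi, np.pi, shape)             # dark-state model phases

# Difference amplitudes carry the time-dependent signal;
# the phases of the dark state are reused, as described in the talk.
dF = (F_light_amp - F_dark_amp) * np.exp(1j * phi_dark)

# Fourier synthesis gives the real-space difference density:
# negative regions = density lost relative to the resting state,
# positive regions = density gained.
delta_rho = np.fft.ifftn(dF).real
```

The sign convention of `delta_rho` is exactly the red/blue coloring of the maps: atoms moving away leave negative features behind and create positive features where they arrive.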
And then basically what we show are the refined structures and how, based on the measured electron-difference densities as a function of time since photoexcitation, we interpret the signal and the ensuing structural change of the chromophore. This is something we can now compare. I showed you the movie of the isomerization process; we can take snapshots from the movie and compare them one to one. Here, for completeness, we also show in yellow the structure of the resting state to show the differences, and we use the same colors. And what you can see, which is important here, is that the main structural change, the conformational change of the chromophore going from trans to cis, is perfectly captured, in my opinion at least (we can disagree), perfectly captured by the MD simulation. Keep in mind also that the experiment is still an ensemble measurement, with many molecules excited at the same time, whereas this is just a single trajectory. So there is still a bit of a mismatch, on the order of 10^9 to 10^10 in numbers of molecules, I estimate. But the main chemical driving force is causing this isomerization to happen exactly as it happens in our experiment. So this gave us some confidence in the trajectory: the experiment does not speak against the model that we have proposed. You can also take other x-ray structures, which were all obtained after the MD simulation but still prior to our own crystallography study, and what you can see is that we also capture the displacements in the chromophore backbone quite nicely in the MD simulations. All of this, of course, does not mean that our model is right, but it does not say the model is wrong either. So we take it from the positive side and conclude that, at least for this photoreceptor, QM/MM MD does a decent job.
Whether it also does a decent job for other systems, I don't know. When I write a grant application I say yes, we validate everything that we do, but it probably depends on a case-by-case basis. In this case, at least, we got away with the model that we have. Now, there is one caveat, or not a problem, but one issue, and that issue is that in addition to the wild-type protein, a lot of experiments have been done on mutants, for example this mutant where the arginine is replaced with a glutamine (I'm not going to go into much detail), and on the chromophore in water and in vacuum; in water, several analogues have been studied. And in these simulations, which is what these papers refer to because it is about the QM/MM, we actually see that in addition to the double-bond pathway there is also a single-bond pathway, which means that we have a rotation around this bond. That is shown in these two movies. The double-bond rotation in water happens just like in the protein: we have rotation here. But look at this: we also have a rotation around the single bond. I call this isomerization in quotation marks, because it is not a real isomerization: you don't get a different photoproduct. But we see that most of the reactivity involves this process and not that one. And this is something that my colleagues who have worked a lot on this photoactive yellow protein never wanted to believe. They always questioned the validity of the single-bond channel, and that gets a little bit frustrating, because it's very consistent: it doesn't matter what level of theory we use, even if we throw in a completely polarizable water model, we always get this single-bond isomerization as the main channel. But still, it could be an artifact of the method. So how do you proceed?
One way to try to convince my colleagues who are interested in the same protein is to show that we can compute spectra very accurately. For example, here: this may not look too good, the dotted line may not fit the absorption very well, but it is off by only a few nanometers, so it is pretty good in my opinion. Still, that is not telling you anything about the dynamics; it just means that the ground state is sampled quite accurately. What about the excited state? The only thing we can do is try to compute stationary points at different levels of theory. So we calculate transition states for isomerization around this bond and around that bond at the level of theory we use in the MD, and then compare that to a higher level of theory that our theoretical and computational chemistry colleagues agree on. Well, here is the table, and a very unsettling result in a way, because what I'm showing here is that depending on how many electrons you include in your active space, or what basis set we use (and this is just a selection), things change quite a lot: the barrier changes quite a bit. The only thing which is consistent is that no matter what level of theory we apply, we always find that the barrier for single-bond isomerization is lower than that for double-bond isomerization. So that is consistent: we have qualitative consistency among the different levels of theory, and in the experiment you would find both channels, at least that's what we conclude based on our simulations. So we suggest that there are both isomerization channels, but the barrier heights, of course, will be critical for the branching. The quantum yield you find at the end of the day will depend on these barriers, and there is no hope, at least in my opinion, maybe you disagree, but there is no hope of getting the branching ratio correct. So we need something else.
So what we then thought of, and it is actually a bit simple, but I still think it's fine, is: how do we prove that this is possible? What we could think of is: why don't we try to lock this thing? This has been done a lot in the past, and it is still being done a lot: you just lock one of the bonds so that it can no longer isomerize, in this case with a five-membered ring to prevent isomerization here, so that the only remaining channel is this one. So what we did is we synthesized this locked compound and performed transient absorption spectroscopy on it. But first we again ran a simulation to show that nothing is changing, and here, while the simulations are running and I'm talking, this thing will start isomerizing around that bond, just as we predicted and just as it would if there were no lock. Of course we knew this already; these simulations were only done to make sure that we do not change the electronic structure too much by adding this covalent lock. Then here are the results of our experiments. What we see is that if you excite this system, you get ultrafast picosecond dynamics of this isomerization process, which brings the molecule back to the ground state. Look at the blue blob in this difference spectrum: here we have the wavelength, here we put the time, and this is basically how much signal we measure. Negative means that molecules are missing: we excite at time zero and then measure how the absorption of the sample changes, and a negative signal basically means that something is missing. That is logical, because we have promoted some of the molecules to the excited state. However, over time we see that this bump in the 2D spectrum disappears and is gone. That means that at this point the molecules are all happily back in the ground state and can be excited again.
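As a toy illustration of the ground-state-bleach signal just described: the negative difference signal at the pump wavelength decays back to zero as molecules return to the ground state. The ~3 ps recovery time below is an illustrative assumption, not the measured value:

```python
import numpy as np

# Ground-state bleach in a pump-probe difference spectrum:
# dA(t) = A_pumped(t) - A_unpumped is negative right after the pump
# (molecules are "missing" from the ground state) and recovers to zero
# as the excited population returns. All numbers are invented.
tau = 3.0                           # assumed recovery time constant, ps
t = np.linspace(0.0, 20.0, 201)     # pump-probe delays, ps
dA = -0.1 * np.exp(-t / tau)        # bleach amplitude at the pump wavelength
```

A recovery this fast is what rules out a long-lived photoproduct: after a few time constants the difference spectrum is flat and the sample can be excited again.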
That is why the difference spectrum with respect to the normal absorption becomes zero. And on the basis of this behavior, we can only conclude that the dynamics after photoexcitation of this molecule is over within a few picoseconds, and that is fully consistent with the single-bond channel. So we think that we have now been able to prove, or at least provide strong evidence, that single-bond isomerization is possible. However, as I write here, experiments always require interpretation. We interpret these data with this picture in our minds, so there is always a risk of circular reasoning; we don't really know. And the other disadvantage is that the experiment is more expensive than a computation. The purpose, of course, is that eventually we would like to be able to predict an experiment so that we no longer have to carry it out. That is why using an experiment to validate what you've just been simulating is maybe not always the best solution. Okay, we stay a little bit longer with our protein. There is one controversy that was raised in 2009, when a neutron diffraction study came out. Neutron diffraction is able to see the positions of the protons, or rather the deuterons, because they have a higher scattering cross section. And what the authors of that work concluded was that this proton is sitting exactly in the middle: it is not here, it is not there. This is a hydrogen bond, a very important one, but the proton is shared, so the distances are long and equal, about 1.2 Å on each side. And that was great. This is what the literature calls a low-barrier hydrogen bond, and such bonds have been implicated in chemical reactivity; there is a huge debate going on about whether they are real and, if they are real, whether they are important. So this paper came out as the first experimental evidence of a low-barrier hydrogen bond, because neutron diffraction is able to spot these protons. And of all proteins, it happens in my favorite protein.
So that was interesting. We wanted to look at this, because in our model there is no such hydrogen bond: the proton is just sitting here, not shared. So that was something of a concern, because do we model the dynamics correctly if we cannot model this correctly? What we then did is compute the QM/MM potential energy surfaces for the proton, or for the deuteron, for all these positions here, and that is an example of such a potential. We used various DFT methods for the QM part, and for everything which is not shown as sticks we used the Amber03 force field. This way we can compute potential energy surfaces, and with these surfaces we can solve the Schrödinger equation for the nuclei in order to get the wavefunction of the proton, or of the deuteron; for the deuteron you just change the mass. The reason for doing this was that we wanted to see: does theory support this short hydrogen bond? Is theory able to predict, or in this case reproduce, the position of the hydrogen atom? And with that, can we maybe provide insight into why this proton sits in the middle and why this is not a normal hydrogen bond? What we did, in contrast to what is still customary in QM/MM, is that we did not take a single protein from the unit cell and put it in vacuum, which is unfortunately common practice. Instead we tried to model the experiment as closely as we could. So we tried to build a model that takes into account the fact that the protein exists as a crystal: we take the unit cell, fill it up with proteins, six per unit cell, then fill it up with water and ions. We equilibrate it, a lot of work, but in the end we get a crystal, and we can then perform these QM/MM calculations for the proton position.
So we calculate the potential energy surfaces for these protons and then solve the nuclear Schrödinger equation; sorry, for the deuteron. Here are the results. It's a big table and I'm only going to go through it very quickly. Here we have the different functionals we tried, and this is for the structure of the neutron diffraction study in vacuum. That means you take one protein, put it in vacuum, and forget about the fact that it's a crystal; the total protein has a plus six charge. Then you calculate this potential, and only with these functionals do we get a significant extension of the hydrogen bond. In addition, we have to assume that the arginine is deprotonated, which we also show is a wrong assumption, but nevertheless, only one model is able to describe this extension. So only one functional, in combination with a questionable representation of the protein, gives you an indication that this hydrogen bond is such a low-barrier hydrogen bond. Now, that was also concluded by other people; in particular the group from Barcelona concluded the same thing, and they discussed it in this paper. However, this can mean a couple of things. It can mean that vacuum models are indeed good enough to represent the crystal and that CAM-B3LYP is a very good functional, even though it was designed for excited states. But because we have built a crystal model and tried all the other functionals, we are much more conservative in our conclusion. We say that actually the QM models scatter too much: there is no direct support for this low-barrier hydrogen bond. So there are two opposing views. I think there is no evidence based on what we have calculated here. But again, best practice would mean, in my opinion, that you try as many methods as you can, validate these methods, and then see if the trend is consistently observed.
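The nuclear Schrödinger step described above, a one-dimensional potential for the proton solved once for H and once for D by changing only the mass, can be sketched as follows. The double-well potential and all parameters here are invented for illustration, not the computed QM/MM surfaces:

```python
import numpy as np

# hbar^2 in eV * amu * angstrom^2 (approximate conversion constant)
HBAR2 = 4.18e-3

def ground_state(mass_amu, x, V):
    """Finite-difference 1-D Hamiltonian; returns E0 and normalized |psi0|^2."""
    dx = x[1] - x[0]
    n = len(x)
    t = HBAR2 / (2.0 * mass_amu * dx ** 2)   # kinetic-energy prefactor
    H = np.diag(V + 2.0 * t) - t * (np.eye(n, k=1) + np.eye(n, k=-1))
    E, psi = np.linalg.eigh(H)               # eigh: eigenvalues in ascending order
    p = psi[:, 0] ** 2
    return E[0], p / (p.sum() * dx)

x = np.linspace(-0.6, 0.6, 401)              # proton displacement, angstrom
V = 20.0 * (x ** 2 - 0.09) ** 2              # toy symmetric double well, eV

E_H, p_H = ground_state(1.0, x, V)           # proton
E_D, p_D = ground_state(2.0, x, V)           # deuteron: same potential, double mass

# The heavier deuteron has the lower zero-point energy, so H and D can sit
# differently in the same well; this is the point of repeating with both masses.
```

For a barrier comparable to the zero-point energy, the ground-state density delocalizes over the well center, which is the quantum-mechanical signature of a low-barrier hydrogen bond; for a high barrier it localizes on one side.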
So unless you have a good reason to dismiss a certain functional or basis set: if you can show that the effect is there across the whole range of functionals, then I'll start believing the effect is real. If the effect is only there for one specific combination of parameters, it might be an artifact. I'm not saying it is an artifact; it might be an artifact. But why is there this discrepancy? To answer that question we went back to Yamaguchi and coworkers, and they shared with us their structure factors from the neutron diffraction experiment. We did the refinement ourselves and computed the F-observed minus F-calculated difference maps. And what you can see is that if you put the proton very close to the glutamic acid, you see a huge difference signal. Basically, this map should show no signal at all if the positions are right; if the positions are wrong, it should show features like the ones we see here. And what we see at the bottom is that for a wide range of proton positions, the experimentally measured structure factors fit the model equally well. So our conclusion is that there is actually no reason to favor this position over that position. Of course, this is easier to publish, perhaps, because this is something remarkable and this is something boring. But according to us, the experimental data does not allow you to conclude this over that, and that would be good, because it is consistent with all the DFT methods that we have applied. Then finally, in the last few minutes, enzyme catalysis, because I suppose that most of the people who are looking at QM/MM as a method for their field of study are interested in some enzyme that is catalysing a reaction. And a very important reaction is nucleotide hydrolysis, nucleotide triphosphate hydrolysis.
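Reactions like this are far too slow for unbiased MD, so the free-energy profile along a reaction coordinate is obtained instead by constrained simulations and thermodynamic integration, as described in what follows. A minimal sketch of that bookkeeping step, with an invented mean-force profile standing in for actual simulation output:

```python
import numpy as np

# Thermodynamic integration over a constrained reaction coordinate r
# (here, the attacking-water-oxygen to phosphorus distance):
# at each fixed r the mean constraint force <f> is collected, and the
# potential of mean force is PMF(r) = -integral of <f> dr from r[0].
# The Fixman (metric) correction is neglected, as in the talk.
r = np.linspace(3.5, 1.8, 18)            # constrained distances, angstrom
mean_force = 25.0 * (r - 2.6)            # toy <f> profile, kJ/mol/angstrom

# Cumulative trapezoid integration of -<f> along the path r[0] -> r[i]
pmf = np.concatenate(
    ([0.0], -np.cumsum(0.5 * (mean_force[1:] + mean_force[:-1]) * np.diff(r)))
)
barrier = pmf.max()                      # transition-state free energy, kJ/mol
```

With this toy linear force profile the mean force vanishes at r = 2.6 Å, so the PMF peaks there: the point of zero mean force along the coordinate is the transition state.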
So for example ATP, which is the energy carrier in our cells: when you hydrolyze it, quite a lot of energy is released, and you get ADP, with two phosphates, plus this inorganic phosphate; I don't know exactly what to call this species, it is a phosphate with some protons. The system we worked on is one that my colleague Lars Schäfer from Bochum is interested in: an ABC transporter, shown here. What these transporters do is, well, there is a lid kind of thing which binds a substrate molecule that needs to be transported; it can bind to the transmembrane part on one side and release the substrate it has found into this channel, which can open and close thanks to ATP binding and reaction over here. It's a dimer, so there are two sites here. We know this is a molecular machine which opens and closes, and that this depends on ATP: the ATP goes in, gets hydrolyzed into ADP, and somehow it works. The question Lars had, and that we wanted to answer together with Marten Prieß and Hendrik Göddeke, was: does the ATP hydrolysis in the active site, which is shown here (here is part of the ATP, and the magnesium ion that is very typical for ATPases), provide the energy, the power stroke, for this machine to open and close and let the substrate through? Now, this is a slow reaction; we knew that because it happens only a few times per second. So we know that we cannot do this with unbiased MD: I cannot just run an MD simulation and see what happens. Instead, we need to use biased simulation: we need to calculate so-called free energy profiles, or potentials of mean force. The way we do that is we decide on a reaction coordinate; in this case it is the distance between the attacking, nucleophilic water molecule and the phosphate.
We constrain that distance, so we fix it, and then simulate the system; we run this for multiple different distances, and we monitor the constraint force, the force needed to maintain the constraint. We store it over the trajectory and calculate the ensemble average, or rather the time average, for each of these positions, and then we integrate that force to get the free energy profile. Strictly we should also correct with the Fixman term here, but I'm not going to talk about that, because we ignore it: it is typically on the order of a kilocalorie per mole, within the errors we have anyway. So ignore this part, but in principle you should take it into account. Now, the reaction coordinate is of course where the difficulty starts, and I'm actually going to give you an example of where we chose the wrong reaction coordinate; fortunately, the reviewer pointed it out. For the first step there is no doubt: every chemist can tell you it must be this distance, so that's nice. Then we need to choose a QM region; we used two QM regions, a bigger one to check whether the conclusions reached with the smaller one are valid. It is basically all the residues shown here in stick representation, except for the adenine part of this ATP molecule. Then the level of theory; this is where you have to do your homework. You have to choose the level of theory, in this case the DFT functional, this one over here (I don't even know what the acronym stood for) and the basis set. And this was nice, because here we could rely on some work of one of our other speakers, Maria Ramos, who had performed a benchmark of 52 functionals for this hydrolysis of phosphate, and we took the one they concluded was the best. We actually added something here: these D3 corrections by Grimme, to also describe dispersion better. And thanks to them again, they shared with us the input
files for their study, so that we could very quickly rerun the validation with these D3 corrections added, and with them it is even better, I think, was our conclusion. So now we have a good model that has been validated, not by us, but that's how homework works: if your classmate does the homework and you can look at what they have done, it helps you too. And this is the whole point of publishing papers: you share your results so that others don't have to redo the whole process. For the force field we used Amber, I think the version with the improvements from the D. E. Shaw group. This is our model, and then we start doing these potential-of-mean-force calculations. In the first step of the process, we let this water molecule attack the phosphate, and we see that while we drive this distance to smaller and smaller values, this bond breaks up, and around the transition state, which is this configuration, the proton of that water molecule has actually gone to the glutamic acid. So the glutamic acid here acts as the base during the process. That is nice, and we get a free energy profile which is very much in line with many other ATPases: we have a transition state around 14 kilojoules per mole; kilocalories per mole, sorry. I don't know why my colleagues from Germany use kilocalories; I only noticed that when I put this presentation together. 14 kilocalories per mole. Then the second step. What is important to realize is that at this point the proton is on the glutamic acid, and in this configuration the enzyme has not been regenerated; we also know that we need two protons on this leaving phosphate group. So then we have to think: how does the proton get from this glutamic acid onto that moiety? In a first try, we thought: well, in classical MD simulations of this state we
see water molecules going in and out, so there is probably a water molecule there at some point; let's put a water molecule there. As you can see, the distance is way too large between here and there and there, so we just assumed there is probably a water molecule, and it made a lot of sense to us at the time. So we insert a second water molecule, and we assume the proton transfer first goes to the water and then on to the phosphate. Our reaction coordinate is now this proton-oxygen distance, which we diminish over time, and we calculate the free energy of that process. And there is the process: here is the proton, it moves towards the water molecule shown here, so we have a hydronium-like intermediate, and it then gives a proton to the phosphate. Indeed we get this H2PO4-, the water is restored, and the glutamic acid is restored again. So it looks good, and we tried to publish it: we get a barrier which is really small, so this step adds nothing to the rate-limiting step, and we were basically very happy with that. But the problem, as the reviewer pointed out, is that you need to add the free energy of inserting that water. Okay, no problem: the reviewer suggests something, we do it. And then we found out that the free energy of pulling in and recruiting that water is so high that this step would no longer be comparable to the rate-limiting step; it would make a much too slow rate-limiting step. Okay, so this is wrong. Even after we submitted it, we realized, no, not realized, somebody told us, that this can't be right. So we had to do something else, because of course you don't want to give up. What we then did instead is look for an alternative pathway, and we found one, or at least we could imagine one: we tried to transfer the proton to that oxygen, sorry, to that oxygen over there. It's a longer distance, and that oxygen is actually ligating the magnesium, but let's try; that is the nice thing about
simulation: you can try whatever you want. So the second reaction coordinate now involves this proton transfer, and we drive that proton transfer. While we do it, we indeed see that the coordination of the magnesium is broken, but the proton is transferred and we have created this intermediate. The problem is that the product of this step, which we call ES2, intermediate state 2, is relatively high in energy with respect to the state we started from. So we argued that there probably has to be some relaxation, and the next step, the third step, is the rotation of this phosphate moiety to rebuild this ligating bond with the magnesium. That is shown here: we rotate this phosphate over a very small transition state, hardly worth mentioning, and then we have a new coordination bond over there. And as I said, the barrier is hardly worth mentioning, it's really small, and then we gain a lot of free energy by going down into this final conformation. Based on these calculations, we conclude that the free energy associated with this three-step ATP hydrolysis process is about plus 1.8 kilocalories per mole, which argues against ATP hydrolysis being the power stroke. So we speculate that it is actually the ATP binding that causes the conformational changes of this protein, and that the hydrolysis is needed to get the products, in this case ADP plus the phosphate, out of the active site again. I chose this example not to discuss ABC transporters, even though Lars Schäfer's group published a paper this year where they look at the transport process, so maybe the answer about the power stroke is in there; I just haven't read it yet. The point of showing this work was that the choice of reaction coordinate is critical, and it is very easy to make a mistake there. Or maybe it's not, maybe it's just us, I don't know, but we made a mistake and the reviewer pointed it out, which is
initially embarrassing but ultimately nice, because at least now we have something that we can have more confidence in, even though it still remains a model. Okay, I would like to end now. There are a lot of challenges associated with QM/MM simulations, and you have to think about these things before you even start, to avoid disappointment and frustration. One is the QM/MM model: the level of theory needs to be validated, and the QM region should also be validated. I don't know what the current state of the art or the currently accepted practice is there, but hopefully it comes out during these webinars and during the panel discussion. The sampling method is also something that needs to be validated: why do you choose this reaction coordinate, and how do you convince yourself (not even talking about convincing others) that this is the sampling method that you need? And finally, I've briefly talked about validation, which I think is very important. Ideally you would like to use an experiment, but experiments are indirect, and you cannot always do one. For the ATPase reaction, for example, what kind of experiment should I do? Make a mutant? But that changes a lot of things at the same time, so that's very difficult. For photochemistry it is actually a lot easier, as you have hopefully seen in this talk. And the other way to validate, which is what we have used most of the time, is to recalculate the same processes, or points on the potential energy surface, at a much better level of theory. But how reliable is that, and when are you done? We don't know. For those of you who still feel they want to do QM/MM, I wish you good luck with that. I hope you're going to find interesting results and promote these methods as good methods, with the caveat that you need to be very careful about what you're doing. And before I really end, I have to acknowledge the funders, BioExcel and the Academy of Finland, and most of the computations, not all of
the computations I discussed today, were run on computers provided by PRACE or by CSC, the Finnish national computing centre. Thank you. All right, great, thank you Gerrit. We already have a couple of questions which I will follow up on. The first question is from Bagari, who said I can ask it directly. His question is: what do you mean by validation of the level of theory? Now, this was asked earlier on, and you already discussed this to some extent when you talked about the benchmarking of 52 functionals, but yes, that is the question. Okay, if this question has not been answered, I can still answer it, or answer it again. Validation of the level of theory is indeed that you show that what you observe, let's say it's a barrier, a process, the position of a proton, does not critically depend on the level of theory. That's very hard. With ab initio methods, if you start with Hartree-Fock, you can do perturbation theory or use coupled cluster; we know the ladder of increasing accuracy that you can expect, so there we know quite well how to improve results. You may not always be able to do so in practice, but we know quite well what can be improved. This is of course hopeless for DFT. Well, maybe I'm insulting some people now, maybe I'm picking a fight, that is not my intention, but for DFT there is, in my opinion at least (and maybe I'm wrong), no way of systematically improving or choosing a better functional. So the only way to do it is to compare to coupled-cluster results and then see which functional works for which system. That is what I mean by validation. In the work of Maria Ramos and others that I actually referred to much earlier in the talk, they had indeed taken a standard hydrolysis reaction, both with water and, if I recall correctly, with
that is, hydroxide, and compared the reaction barriers for all these functionals to the coupled-cluster results. I hope I'm describing correctly what was in that paper, but that is what I mean by validation: you get confidence in the level of theory, confidence that what you see is not just one specific combination of assumptions giving you the result you want. That is what I mean by validation. Has this question now been answered? Is there a way to get feedback from the asker? I think the asker can say in the chat whether it's been answered or not, but meanwhile I will move on to the next question. The next question, reading it out, is: why did you use such a high level of theory as CASSCF? This was asked earlier on as well: is DFT or MP2 not good enough for this case? The specific problem that we tried to address was what happens after photon absorption, so what happens in the so-called electronically excited state; you're no longer in the ground state. Can people still see the slides? Yes. Okay. So, what I mean is: if you want to improve on this single configuration, because the molecular orbital picture is a mean-field picture and reality is different, you can do perturbation theory, and then you start mixing excited configurations into the ground state; but in DFT you don't do that, you use this single-configuration picture as a model, because you have this mapping. Now, if light is absorbed, then you get into an excited state, an electronically excited state, and that is modelled by, for example, this configuration, or this one, and all of these are close in energy. In order to get a decent, an accurate, model of that excited-state electronic structure, and with it the excited-state forces on the atoms, you need to go to methods like CASSCF, as here. Now, TDDFT, linear-response methods, are also being used a lot, but there are a lot of issues with the density functionals being used for this. The Runge-Gross theorem:
they proved that you can indeed use linear response to get the excited-state properties of the molecule, but the functionals are the problem. So it's very difficult to know whether a functional is going to give you a decent description of the excited state. And another problem is that once you get into this region where you have this non-adiabatic coupling, where two potential energy surfaces are close together, two electronic states which are almost degenerate, it is intrinsically a multi-configurational problem: you have multiple occupations of the orbitals that you need to consider when you want to describe the electronic structure there. So that is why, for photochemistry, we can't use DFT and we can't use MP2. You can improve the CASSCF result by adding perturbation theory on top, CASPT2 (too many acronyms), where you basically take this complete-active-space wave function as the zeroth-order wave function and then perturb it to get better energies. Okay, thanks, that sounds like a comprehensive answer. Josep, who asked this question: if you want to ask anything else to follow on, please enter it in the box, but I will carry on, because there are a few more questions. The next question, from Batul, who doesn't have audio, is about the reproducibility of QM/MM simulations. Okay, so let me first take the question as it is, without rephrasing. Reproducibility, meaning: if I repeat the simulation with exactly the same software, on exactly the same architecture, with exactly the same initial conditions, do I get the same answer? And the answer is yes, I get the same answer. However, what we actually do in practice is that we take a ground-state ensemble, so we perform MD simulation to get an equilibrium ensemble, and we take snapshots from the ground-state trajectory to start the photochemistry. And what you then get is, of course, a distribution. So when we look at excited-state trajectories, for the last paper I think we did 200 or
100, I don't recall, it doesn't really matter, but in my recollection we did 100 simulations, and each of these simulations starts from a different initial condition, using the same software, obviously, on the same architecture. And then you get a distribution of lifetimes: one at, say, 200 femtoseconds, another at 700, and so on. So in the end you can make a nice exponential decay plot that you can actually compare to experiment. That comparison is not so good, but that doesn't matter; what matters is that you try to get a statistically converged answer. And whether you really have convergence, I don't know; there is no way of telling for sure. But yes, it is reproducible in the sense that each of these trajectories is the same chemistry; they are not exact copies of each other, of course, because if every trajectory were the same it would be silly to calculate them all. Now, if by reproducibility you mean that I share with you my input from 15 years ago and you run it again: no, I know you will get similar results, but not exactly the same. That is because you will use a different computer. I mean, the machine we ran it on is one of those things you can find in a museum, and I doubt they would allow you to start it up to rerun a trajectory to check whether things are reproducible. So if you run it again you get the same outcome, but you don't get exactly the same dynamics, because it's a different architecture, the machine arithmetic is different, and perhaps also, for sure actually, the quantum chemistry program will have been upgraded in the meantime, which will always lead to some changes. Okay, thanks. There are already more questions. There are a few different questions all relating, basically, to: what software did you use? Ah, I didn't even mention that. So much for reproducibility! This was all done with GROMACS, using its QM/MM interface.
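As an aside for readers of this transcript: the statistics just described, many independent trajectories each yielding its own excited-state lifetime, pooled into a decay curve, can be sketched in a few lines. This is an illustrative toy analysis with synthetic numbers, not the actual data or scripts from the work discussed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical excited-state lifetimes (fs) from 100 independent
# trajectories, each started from a different ground-state snapshot.
lifetimes = rng.exponential(scale=400.0, size=100)

# Survival fraction S(t): how many trajectories are still excited at time t.
t_grid = np.linspace(0.0, 2000.0, 50)
survival = np.array([(lifetimes > t).mean() for t in t_grid])

# Fit S(t) ~ exp(-t/tau) by linear regression on log S (crude but simple).
mask = survival > 0
tau_fit = -1.0 / np.polyfit(t_grid[mask], np.log(survival[mask]), 1)[0]
print(f"fitted decay time: {tau_fit:.0f} fs")
```

In the webinar's case the analogous fitted decay is what gets compared to the experimental transient-absorption signal; here the point is only the shape of the workflow.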
And that QM/MM interface coupled to various programs. Most of the time we used Gaussian for the QM calculations, with one exception: the large active space, with twelve, eleven electrons, in one of these studies over here, Dimitri's. That was done with a program called Firefly, because it has a very efficient CASSCF implementation, so that we could actually do this. But all the others were done with Gaussian, as far as I recall. Yes, I think so. Oh no, maybe this one was also done with Firefly; I'm not 100% certain, it may have been GAMESS. But anyway, in general it was always GROMACS, with the quantum chemistry program varying. Somebody asks: what about CP2K? So CP2K is something we have developed an interface for within the context of BioExcel, to integrate these two codes. That is available, but none of the work I have shown here used it; this was older work, not our latest. At the moment we are using it in our current work, and there will be, I think, I'm not sure if this is still on schedule, a webinar that we give about the CP2K-GROMACS interface, Anna?
Yeah, we will announce that later. Okay, so there will be one: Dimitri will give a talk about the capabilities of the new interface. And the nice thing about that interface is that both codes are open-source software, so there is no barrier there, because, as I know, Gaussian has to be purchased; CP2K does not have to be purchased. So it's available for everybody, and we hope that speeds up the uptake of QM/MM a little bit. But one has to keep in mind: before you do QM/MM you need to know, you need to think, you need to have decided already how you're going to do it. Don't expect that if you download these two programs, connect them together and start running something, you immediately get an answer. You need to make those decisions before you do that. Okay. There's a question: for the enzyme catalysis example, the validation of DFT: was a comparison to semi-empirical methods such as empirical valence bond considered, and if so, does the validation process change? Okay, so first of all, let me rephrase the question: have we validated the DFT against EVB? And the answer is no, because EVB is itself an approximation. EVB is empirical valence bond; you parameterize it based on higher-level QM/MM data, so it's a semi-empirical, an empirical, method. It's almost like a force field, and it therefore should not be used to benchmark DFT functionals, for example. The benchmark in this study was done by comparing to coupled cluster with singles, doubles and perturbative triples, CCSD(T), using, I think, a complete-basis-set extrapolation, but I'm not 100% sure anymore what the basis set was. That is the sort of gold standard, which is also only barely feasible. I think what I would do, and I don't know whether others do this, is that I would also parameterize the EVB parameters, in particular the off-diagonal elements and the diagonal elements, on such coupled-cluster data sets. So let's see whether that answers the question; if it
doesn't, yeah, Barton, who asked the question, please go ahead. That was a personal opinion, not general best practice. Parameterization is a hard problem and people have their own preferences there, but this is how I would do it: I would use coupled cluster to parameterize EVB models, and not the other way around. Okay. Then a question: in your last example, when you do the PMF calculations, is the whole protein allowed to move, or do you freeze part of it? Okay, so everything is allowed to move; nothing is frozen, because of course once you start freezing coordinates you also need to calculate the effect of freezing. So in this case these are completely free. Indeed, the only constraint in the simulation is the distance between the two atoms, in this case, as you can see from my arrow, between this oxygen and that phosphorus. Okay, in slide 67 (wow, somebody has kept track), the question is: how is the reorientation of the H2PO4 group modelled? Aha, I went a bit fast over that because I was looking at the time and thought it was getting late. We put a distance restraint, a distance constraint, between the O2-gamma, this one over there, and the proton... sorry, the magnesium. So basically what we do is we want to re-ligate the magnesium, pulling the O2-gamma oxygen towards the magnesium to form this bond again. That was the reaction coordinate at the end; that's what you see here, the magnesium to O2-gamma distance, starting from a large distance and going to a short distance where you again have coordination. Okay. Somebody asks: how much computational resource and time did the ABC transporter PMF simulations take? Yeah, a lot. I remember we had a so-called grand-challenge proposal awarded for that, to be able to use very large numbers of CPU hours. It was very expensive; that is all I recall, not exactly how expensive, but we can look it up, and maybe we even mention something about it in the paper, which is cited here. It was a lot.
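Stepping back to the EVB point from a moment ago: empirical valence bond describes the reacting system with two coupled diabatic states, and the off-diagonal coupling is a fitted quantity, in Gerrit's preferred direction fitted to coupled-cluster reference data. A toy one-dimensional sketch, where the diabatic profiles, the units and the target barrier are all invented for illustration:

```python
import numpy as np

def evb_ground_state(v1, v2, h12):
    """Lowest eigenvalue of the 2x2 EVB Hamiltonian [[v1, h12], [h12, v2]]."""
    return 0.5 * (v1 + v2) - np.sqrt(0.25 * (v1 - v2) ** 2 + h12 ** 2)

# Two hypothetical diabatic states along a reaction coordinate x:
# reactant-like and product-like harmonic wells (arbitrary energy units).
x = np.linspace(-1.0, 1.0, 201)
v1 = 50.0 * (x + 1.0) ** 2          # reactant state
v2 = 50.0 * (x - 1.0) ** 2 + 5.0    # product state, 5 units uphill

# Suppose a coupled-cluster scan gave a barrier of 30 units; pick the
# constant coupling h12 that reproduces it (simple grid search).
# Barriers are measured from the reactant end of the grid.
target_barrier = 30.0
couplings = np.linspace(0.0, 40.0, 401)
barriers = np.array([evb_ground_state(v1, v2, h).max()
                     - evb_ground_state(v1, v2, h)[0] for h in couplings])
h12_fit = couplings[np.argmin(np.abs(barriers - target_barrier))]
barrier_fit = (evb_ground_state(v1, v2, h12_fit).max()
               - evb_ground_state(v1, v2, h12_fit)[0])
```

Real EVB parameterizations fit coordinate-dependent couplings and diagonal shifts against many reference points; this sketch only shows the direction of the fit, high-level data constraining the empirical model.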
It was a lot more than we actually had: we had to ask for even more resources, because we had to redo this last step, as you know, because this step here had been done wrong. So we actually burned a lot of CPU hours on something wrong. But that's how it goes sometimes in science; not every experiment is successful either. So, sorry, no precise answer, because I don't know. Okay, thanks. Do you have time for a few more questions, or shall we leave it here? I have time; I don't know about the participants. Well, it's being recorded, so anybody who can no longer stay can watch the recording and hear the answers later as well. So what I say can be used against me! Another question: did you use electrostatic (electronic) embedding or mechanical embedding? Very good question, because I haven't mentioned that. All of these works, ah, all works except one, were done with so-called electronic embedding. That means that all the atoms in the MM subsystem that carry a charge enter the one-electron Hamiltonian of the QM subsystem. It was not done that way in the case of PYP, when we repeated it, because that implementation does not allow adding additional elements to the one-electron Hamiltonian, so there we had to do so-called mechanical embedding, or ONIOM, as it is called in Gaussian. Does that answer your question? I assume so; we'll hear back from the asker if it doesn't. There's one more: thanks for the inspiring talk; in the example of ATPase, do you include large-timescale protein flexibility, maybe by taking several configurations from an MD trajectory? This could make the active site thermally accessible to water molecules. Yes, that's a very good question. In this case, I think each PMF window was run for a few picoseconds only, so there are no large conformational changes during the PMF calculation. But in addition to these QM/MM simulations we also ran normal, standard force-field MD
simulations. In those force-field simulations we have observed water entering in that state, but not in this one; so water does enter, as far as I can tell. In simulations of this intermediate state 1 we do also observe water molecules going in, but not at the position required for this two-water mechanism, and I recall that pulling a water molecule into that position involved a very high free-energy barrier, so this is not something that would happen spontaneously. So we got, how to say, a little bit too enthusiastic about that idea. And the problem, of course, is what happens then: this profile ends here, at intermediate state 1; then a water molecule comes in, which of course changes the free energy somehow, and most likely brings it up quite a bit, so this profile would start too high here. That means this point is no longer connected to that point. Whereas in the other approach, where we simply start from intermediate state 1 and then see what happens to reach intermediate state 2, these are directly connected, so we can join both free-energy profiles to get an overall one. Is that an answer to the question? I hope it is, and if not, please say so. There's a thanks for the answer, and thanks for some of the previous answers as well, because they address the question. There's another question: what's your opinion about the use of polarizable force fields, like Drude, together with functionals including dispersion corrections (D3), compared to others, for example CHARMM36? So the question is: what is my experience in using QM methods which incorporate these empirical dispersion corrections, say D3 corrections, in combination with a force field that is polarizable? What I will do is let the person who asked the question unmute. Yasun, if you
are able to speak, then you can maybe clarify the question. So, Yasun, I'm going to unmute you now, and if you don't speak, then we'll just let Gerrit answer anyway. Yeah, thank you very much. Yes, my question is basically because I have seen in the literature some kind of debate about the use of the Drude polarizable force field: that it does not add much advantage when compared with, for example, CHARMM36, when it's used with functionals that already include polarization. So my question is basically: what's your opinion about that? Because there is some work saying that in order to do that, you have to actually validate it depending on which method, functional and basis set you use. And also, Drude is already computationally more expensive than the other force fields. So, just from a philosophical point of view. Yes, okay, I see the point; it is an interesting, an important, discussion. The question basically comes down to: would the results improve if you used a polarizable force field instead of a standard force field? And this is a debate which is ongoing, for a number of reasons. First of all, force fields have not been parameterized for doing QM/MM simulations. The pairwise potential that we use in all the force fields is just an approximation: you try to capture many-body effects, and polarizability is a many-body effect, with a pairwise potential, and that means you have to make compromises on the values of the parameters, the van der Waals parameters, sorry, of the energy model, and the charges of the atoms. These partial charges often do not reflect the actual charge distribution that well, but they are a trade-off in order to be able to build a good model. And then you parameterize that model, or, in machine-learning language, you train that model, on a test set; that can be free energies, that can be structures, many
different possibilities. The point is: these point charges have not been optimized, and the van der Waals parameters have not been optimized, for work with QM/MM. So the moment you treat one part of the system with a QM method, it is actually more amazing that this works than that it doesn't. Now, intuitively, if you added polarizability and then did a QM/MM calculation with a polarizable force field, I would expect the results to improve, and in fact, I did not talk about it, but when we do these calculations with a polarizable model of water we actually get very good results, for example for excitation energies. So polarizability is very important for excited-state properties; we kind of know this. But it is also not really workable to use a polarizable force field at this point for QM/MM; there are a lot of problems with that, because of the SCF. The nice thing about point charges is that they don't change: you converge your wave function, you find the lowest energy, and you're done. But the moment your wave function is converged, it will of course change the polarization of the atoms around it, so you have to update the induced polarization as well. You have to build a self-consistent-field routine which updates not only the electronic degrees of freedom, the coefficients of the wave function, but also the polarization. So it's a much more difficult thing to do. I'm not saying it's impossible; people do this, there are many groups developing polarizable QM/MM models, but my experience with it is essentially zero. The only one we have done is here, Dimitri's: EFP, an effective-fragment-potential-based polarizable model, in QM/MM calculations. Now, to the question of polarizability in the QM region: if you add polarization functions to the QM region, what can happen in general, if you use point charges, is what is called the over-polarization problem.
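The extra self-consistency loop described above exists even in a purely classical polarizable model, where the induced dipoles must be solved against each other before any QM step. A minimal sketch of that inner loop, with point dipoles, no damping, and all positions, charges and polarizabilities invented for the example:

```python
import numpy as np

def induced_dipoles(positions, charges, polarizabilities,
                    max_iter=200, tol=1e-10):
    """Iterate mu_i = alpha_i * E_i to self-consistency, where E_i is the
    field at site i from all permanent charges plus all *other* induced
    dipoles. Atomic units, point dipoles, no damping or cutoffs.
    """
    n = len(positions)
    mu = np.zeros((n, 3))
    # Static field at each site from the permanent charges of the others.
    e_static = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i != j:
                r = positions[i] - positions[j]
                e_static[i] += charges[j] * r / np.linalg.norm(r) ** 3
    for _ in range(max_iter):
        e_tot = e_static.copy()
        for i in range(n):
            for j in range(n):
                if i != j:
                    r = positions[i] - positions[j]
                    d = np.linalg.norm(r)
                    # Field of point dipole mu_j: (3 (mu.r) r - mu d^2) / d^5
                    e_tot[i] += (3.0 * np.dot(mu[j], r) * r
                                 - mu[j] * d * d) / d ** 5
        mu_new = polarizabilities[:, None] * e_tot
        if np.abs(mu_new - mu).max() < tol:
            return mu_new
        mu = mu_new
    return mu

# Toy system: a unit charge plus two polarizable sites along the x axis.
pos = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [8.0, 0.0, 0.0]])
q = np.array([1.0, 0.0, 0.0])        # only the first site carries a charge
alpha = np.array([0.0, 1.5, 1.5])    # the other two are polarizable
mu = induced_dipoles(pos, q, alpha)
```

In a polarizable QM/MM scheme this loop has to be interleaved with the electronic SCF, which is exactly the extra cost and complexity being described.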
The problem is that you have too much flexibility. And again, this is a best-practices workshop, so this is my opinion, from my experience: using somewhat more limited basis sets, so not too many basis functions, in particular not adding too many polarization functions, and especially not diffuse functions, is something I think you should do. When you do QM/MM you should not include, say, diffuse functions, because they will lead to too-strong interactions with the point charges. But it is a personal opinion; I have not seen a clear comparison where people look into properties you can actually validate, where you know the answer, and study how it changes when you start including diffuse functions, or more polarization functions, or these dispersion corrections. So, bottom line: my experience here is relatively limited, and from what I know others have done, the jury is still out. It is not clear whether polarization, with the currently available polarizable force-field models, improves QM/MM results. For certain things it improves them, and of course there is also publication bias here: if you're improving a result, it's much easier to publish than if you show it works the opposite way. So it's difficult to judge from the literature alone. This is still an important debate, not even merely a philosophical one, which I think is not even close to being settled. Now I've said a lot; are you happy with what I said? Thank you very much, yes; that was very insightful, thank you very much. Okay, we have a question from Mohammed. I will unmute you so you can ask the question yourself; otherwise I will read it out. Hello. Well, first of all, very nice talk; I really learned a lot from it. I had a question about using CAM-B3LYP for the protonation and deprotonation: it showed a good result, or what you were expecting experimentally, and is this
why you decided that this was the best method in the benchmarking? My question is: the rest of the methods are not doing well, so did you consider that this might be an error? Alright, so I understand the question, but I also realize that I failed to be clear, because your question is actually the opposite of what I was hoping to convey. Basically: if you use CAM-B3LYP in combination with the isolated protein, so you take the protein out of the crystal, you put it there in vacuum, and then you do a QM calculation with CAM-B3LYP for the QM region, then you find the proton at that distance. With any other functional, or any other model, even if you energy-minimize the model, you lose this observation; it becomes a normal hydrogen bond again. On the basis of that you can do two things. You can either conclude: I trust the neutron diffraction study, and I therefore conclude that everybody who tries to model a crystal environment with anything but CAM-B3LYP does it completely wrong; this is the ultimate evidence that CAM-B3LYP, in combination with the vacuum model, is the ultimate chemistry model. That is one conclusion. The other conclusion, the one we draw, is that, based on the big scatter between different methods and the big scatter between different models, we can't conclude anything. That is the most likely interpretation: only one specific combination, in this case model plus parameters, gives something that is somehow, slightly, in agreement with that experimental result. Now, if you take into consideration on top of that how X-ray diffraction, sorry, any diffraction study, works: what you measure in diffraction is intensities in reciprocal space; you do not measure phases. You have to do some computational tricks to get the phases, so there is always some issue there. And what we show here is that if we calculate the phases for models where we put the proton at the normal
location, we actually get as good a density map, sorry, as good a representation of the structure factors, as if we put the proton in the middle. So this basically means that, for the same experimental data, you can put the proton in many different positions and still get acceptable agreement; there is nothing to favour this structure over that structure. So my conclusion, our conclusion, here is also that the experimental data do not support this proton-in-the-middle hydrogen bond over a normal hydrogen bond. And I think, if you have to choose (there is a famous quote here, I forget exactly how it goes), there is no reason to favour this over that if this fits all the other data, including the NMR data. I haven't talked about that, but the chemical shift of this proton should change dramatically depending on whether it is like this or like that, and actually the chemical shifts look very normal in this protein. So: all the experimental data that we have point towards this; the computational data, except for one combination, point towards this; and the neutron diffraction analysis points towards that. Then I take this. But that is my personal opinion, and as you can see, if you go to the literature you find different conclusions based on different methods. That is why we refrain from a strong conclusion; we just say that there is no evidence from computation for such a hydrogen bond, which doesn't mean it is not there; it can still be there, but with the data we have today it is not possible to conclude that it is. Does that answer the question? So it's the other way around: we don't use this as evidence for CAM-B3LYP being the ultimate functional. Actually, far from that: based on this, I'd say CAM-B3LYP is probably not. That makes sense, well, thanks so much. There's quite an interesting, somewhat general question from Varun. Hello, can you hear me? I hear you. Okay, my question is
in regards to the enzyme system. From what I understood, you were saying that the remaining protein is not frozen outside of the active site, of the QM region. If that's the case, and from what I saw in a couple of the free-energy profiles only the bond distance was shown as the reaction coordinate, how are the protein configurations reflected in this? How are you taking the solvent, or let's say the protein environment, into account in the reaction coordinate? Yeah, good. So basically what you do is you constrain the distance, in this case between the oxygen and the phosphorus, and you vary that distance. You run a few picoseconds of QM/MM molecular dynamics; at the same time, on different processors, you run a simulation with the distance a little shorter, and so on. So in the end, after a couple of days, you have a lot of picoseconds for each of the different distances, and with that information you can perform the integration. Everything else in the system (you probably can't see my hands) is just moving around like it normally does, and that is reflected, for example, in these error bars on the profile. So the water molecules, and everything else, are just moving around. Of course, as in a question asked earlier: conformational changes, large changes in the protein conformation, cannot be covered, because you only have a few picoseconds per window. For that you would have to simulate maybe microseconds per point in order to model whatever conformational changes can happen. And even then we don't know, because these proteins typically have a whole hierarchy of relaxation timescales; some things happen very fast, some things happen very slowly, and it's difficult to cover all of that. Okay, thank you. Okay, we have what could potentially be the final question.
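The recipe in that answer, constrain the distance in a series of windows, run dynamics in each, then integrate, is constrained-MD thermodynamic integration: the PMF is minus the integral of the mean constraint force over the coordinate. A toy version with synthetic mean-force data; the profile shape, units and noise level are made up for illustration and are not from the actual study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Window positions along the reaction coordinate (an O-P distance, Angstrom)
# and a made-up "true" mean-force profile with sampling noise on top,
# standing in for the per-window time averages of the constraint force.
r = np.linspace(1.6, 3.4, 19)
true_mean_force = 40.0 * (r - 2.5) * np.exp(-((r - 2.5) ** 2))  # kJ/mol/A
sampled_force = true_mean_force + rng.normal(0.0, 1.0, size=r.size)

# Thermodynamic integration: A(r) = -integral of <f> dr' (trapezoidal rule),
# using <f> = -dA/dr for the mean force along the coordinate.
pmf = -np.concatenate(([0.0], np.cumsum(
    0.5 * (sampled_force[1:] + sampled_force[:-1]) * np.diff(r))))
pmf -= pmf.min()  # reference the profile to its minimum
```

The error bars mentioned in the answer would come from the spread of the force within each window; with only picoseconds per window, slow protein motions are simply not part of that average, which is exactly the limitation discussed.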
From Renata: you showed the pump-probe spectrum for the PYP protein, and I was wondering to what extent you use it. Do you use it just to look, okay, the bleaching signal lasts until this time, and that means the process lasts over this interval, or do you also use it to look at the different dynamics you can find? For example, whether you have the fingerprints of one isomerization process or another? Yes, I understand the question. So let me first go through this fingerprinting. We actually started out using time-resolved infrared, where we know that a single bond is isomerizing in a solvent which has a very high viscosity; we did it in, what is it called, an alcohol with a long alkane chain, 16 carbons and an OH group; it's very viscous, but the isomerization still happens there. The plan, the hypothesis, the hope, was that the single-bond isomerization channel would form an intermediate which was 90 degrees twisted, and that this should have a measurable (not large, but measurable) effect on the C=O stretch, and the C=O stretch is always easy to identify. So we tried to use time-resolved infrared to show a fingerprint change in the C=O stretch indicating that we were halfway through the single-bond isomerization, and that failed, because the signal-to-noise was way too poor. So we resorted to very standard time-resolved transient absorption, where you just measure the changes in absorption after you initiate the photochemistry. This is not really fingerprinting, because in the optical range all the fingerprints, the vibrational fine structure, are hidden under these broad bands. Instead, the only information you can get from this is that the blue indicates that something is
missing. It means that you have a whole ensemble of molecules, say 10 to the power 40 molecules, and you may be exciting 10 to the power 2, or maybe 10 to the power 5, molecules, whatever; the majority of the molecules are still not excited. If you then look after some time, you will see that some molecules are missing: the total absorptivity has gone down, and that is what this blue reflects. So we see that molecules are no longer present in their ground state, because they are now doing their photodynamics, their excited-state dynamics. But we also see that after some time the bleach disappears: the absorption spectrum from well before time zero, say at time minus 10, is the same as the absorption spectrum now. That means the whole system is basically back, and that can only happen if the molecule undergoes a transformation back to the ground state, reforming the original configuration. We interpret this, because it matches our MD simulations, as an ultrafast isomerization of the single bond. But I cannot rule out that maybe a bond breaks and the molecule re-forms, and that causes this ultrafast recovery; we don't see it in the simulations, but that is not evidence that it is not there. So what we have is basically inference: we infer it from the fact that in the simulations we have this ultrafast repopulation of the ground state by the isomerization, and we see ultrafast repopulation of the ground state in experiment, but of course we don't know exactly what happens there. These two things together say: we think it's this, and we have experimental data that supports such a mechanism. But these are not fingerprints of anything; there's too much noise in this data, and you would have to go to the infrared, which we tried, but it was very difficult. Does that answer your question? Great. Okay, shall we leave it there? There are a few more questions, but we can leave it there, and maybe ask that anybody who would like to
still follow on could perhaps get in touch. Is that fine? Yeah, people can write me an email. Okay, so they can find your email on the BioExcel website, bioexcel.eu, and also via your institution, and also via Google. Yeah. Okay, well, thanks everyone for attending; we hope this was useful. The other webinars in the workshop will be advertised on the event page on the BioExcel website. We look forward to seeing you at future workshops, at future workshop webinars, I should say, and thank you very much. Okay, goodbye.