 Okay, so welcome back everyone to our final practical session. In that session, our main idea would be to go through basic workflow, how we usually work with the protein systems and starting from that PDB structure which usually almost the same as you could download from the PDB file, could make a protein system and then convert it, then make a normal classical MDE equilibration and then convert it to QMM and calculate some properties for that. So let's, I will go, ah yes, also in that session, we will go through the, how you can set up the non-standard CP2K parameters. The one which for example, Holly told you about in a previous session, yeah, like using DFTD3 corrections using non-standard basis set and plane wave cutoffs. So I will show you, I will give you an idea how it should be done on the example of calculation of absorption spectra for the EJFP protein. Okay, so let's just go through the basic workflow which you usually, which one usually apply when you do in some KMM simulations. So in reality, what you do, you first start with some input structure, usually it crystallographic structure, sometimes ZMR structure, sometimes it's cryo-M structure, whatever. So you start from some coordinates, then what you should do, usually it's from PDB4, PDB data bank. Or if it's membrane, you will membrane yourself for example, but first of all, what you should do, you should treat your PDB file. So you should add any missing residues, you should check the protonation states of your histidines for example, you should then prepare your, define box size in which you want to solve this here or how you want to calculate this. Then you need to basically do PDB2GMX to generate Rommax topology. So of course you need, for any non-standard residue you will need parameters, like for chromophore in that case, you need parameters for standard residue. So and after that, you basically will get a working system, but I suppose that as almost all of you are, well know what Rommax it is, so you can go through the steps yourself and you are going through that in your usual life. And on the next stage, you perform your MD, classical MD simulations. So you do first energy minimization, then you do equilibration, then you do usually production run. That's how usually they work for MD, for classical MDs working. And then you can do whatever you want. So then you can do a FEP calculation or a replica exchange or whatever to such confirmations. Okay, but then how then to switch to QMM? You basically need to do what we already done last day, in the brief day. So in the index file, you need to define the group which will be your QM atoms. Then you need to put that MDV parameter switch I showed you already yesterday. Or there is another option, which we'll talk about today. You can also provide an external CP2K input instead of the automatically generated one. And then you can use whatever parameters of CP2K1. So and with that, you can perform QMM calculations. So in geometry optimizations, like a computational spectra doing QMM molecular dynamics like porno and Heimer dynamics in such case and calculating some properties like charges, social spectra, that's all. Okay, so here today we'll work with a system called TGFP. So what is TGFP? It stands for enhanced green fluorescent protein. So it's a mutant of the wild type green fluorescent protein which have cell like border or so higher fluorescence intensity. It's one of the first mutant which has been made for that. And usually if you're talking today about some GFP labeled proteins in the biology, usually they are labeled with that EGFP instead of wild type GFP. But nevertheless, so the crystal structures that have been solved already a long time ago and it is available in several variants in the PDB data bank. I have two, just one of them. The one, the most complete one. And inside that protein, there is no standard residue called chromophore which can absorb light and then fluoresce. Yeah, these with very distinct green fluorescence. So this residue, yeah, the cold chromophore and this is a, it's formed by after catalytic reaction between glycine, tyrosine and in that case, I think it ran in. So it's some after catalytic reaction but you can read about that how it works but it's not so easy. Okay, but anyway, this non-standard residue can absorb light. And what we want to do in our final exercise, we want to calculate that first spectra of that protein in solution using the EGFP. But first we need to make a suitable model for that. So, and that's why I suggest you to start with exercise five steps from one to five. So in steps one to five, you will perform basically the first buildup of your system from the PDB file. And then you will do the molecular or mechanical equilibration of that. So first you will do energy minimization with classical force fields. And then you do NVK run one and a second only but it should be okay for that system. And yeah, so please do that. And once you've done that, please rise your hands. Once you submit the second one, the second once you submit the NVT molecular dynamical trajectory because it will take some time for it to calculate. Why does this happen? Because probably you have incorrect charges for the chromophore, for example, in that case because charges of the chromophore could be parameterized as such they are nonzero in total for the chromophore atoms. So usually when you parameterize some amino acid residues you have a zero or plus one or minus one charge for the whole residue. And that is your constraint. Yeah, but in that kind of system it depends how you parameterize that constant that residue. So for example, this chromophore which I showed in the balls here, it is a minus one charge. But I think in the force field that some of that charges on that atoms will be non minus one. No, no, no, this could be, but it's very rare. But it depends if you have a very small deviation from the charge then it's probably just rounding graphs of your scripts with the issue convert. But if it's not, then it is probably that you are missing some parameters for some residues in another force field. And what happens that basically this, for example, zero solid, it's a top automatically zero charge on that kind of standard residues and that's why you see some integers just. Yes, usually you need to check your charges in the parameters so that they will be integer because you don't have, you cannot have an integer charge in classical usually this means that something went wrong. In QMM however, you can see that, you could see that as you saw yesterday. Just checking that we don't expect that minimization convert just complete one thousand steps. Yes, it should not converge. We don't need, you don't need usually to wait for convergence. Minimization in case of classical MDE, its main idea is that your bonds angles and the hydrals will be more or less in the right, in the correct average value. You'll have more correct average value so that once the MDE will start, they will not, it will not break your system into the parts. Sometimes it's also happens when you use constraints for production run, if you use it with constraints, then usually if you do not do minimization beforehand, your constraints will start complaining and it will blow up your system. So minimization needed just for, so that your starting structure for MDE will be more or less in the right, in a with the correct bonds length and angles. Yes. So for example, that's why in that, also in that exercise, you start your MDE equilibrium with the optimization trajectory. So you're starting with the last frame of optimization. So basically that's the idea that you first optimize then you do a equilibration and here on and then you do production and here on. Why do we set option during salutation shell 10? Okay, it's one of the options how you can fill up your system. So if you are familiar with Gromax GMX Salvate tool, so shell 10 means that the Gromax will salvate the system such there will be at least 10, oh, not please, but how it's called. So it will try to salvate your system, which is protein in that case with a range of 10 nanometers. But if that 10 nanometers will go over the box of course sites, because our box here is 10 nanometers by 10 nanometers by 10 nanometers, then it will cut it. So it just a way how you can salvate the whole box. That's how I usually do it. There is a hundreds of way how you can do it. But with that shell option, probably the easiest way how you can. So it's basically the radius in which it will add a solvent around your system, but it is not bigger than your box size. Yes. And no, no, no, it's not specific. It's just how I'm usually doing, but you can do it whatever you want. You can do it even in VMD. You can salvate and you generate items in VMD and then just pass it to the Chromax, PDV to JMX. Yes, that system, but just how the Chromax works. Okay. I see that many people have already managed to do so. So let's continue. I will put back your hands. Okay. So now we will switch our system to KMM. Once the calibration for you is finished, I will check quick. Okay. I see that some people are still running. Okay. But no, anyway, wait until your job, equilibration or classical will finish and then continue. But so for the MTV parameters, we will use the following parameters for initial KMM equilibration again run. We will do first a short KMM equilibration again. So our KMM group will be KMM items, but in that case, you will need to generate index value yourself. So our KMM group will be that items, which I highlighted here. So it is the one which is both here. So it's Chromax for basically items. It's not a big system, but still. So and the chart shows that system as I already told it's minus one. So this is a new one. And the multiplicity is again one. So you can look into that parameters in the MDP file and respect the MDP file. But first let's start doing step six to nine in the step number six. I suppose you will need to generate index file. Yes. And this is how you define that. You also can check yourself if you render the Conf.Grow file that these items are actually Chromax for items. Note that they may be, they may have index plus minus one depending on the starting index of the first item. In your particular program, for example, I don't remember. In VMD they will have index minus one because VMD starts indexes from zero. Chromax starts indexes from one. So, but you can render them in VMD or PyMol and basically check that indeed these items which are set as a QM items in the make and index utility is that items which I showed you. I had to rename my QM items group in index. Yeah, the group you will create will be called A underscore and number range of items. Then you need to rename it to QM items. That's what the second line is doing. When you do name, name 18 QM items, you rename it to QM items. Or you can leave it like that and then put that index name into the MP file. It's up to you. It's up to you. What is also important here, I'm getting 19 items in my index. Let me check. I guess it should be a 18. Yeah, correct, it's 18 items. But do not forget to rename it to QM items. So, idea is that this charge, the Chromax warning you that after the transfer in topology to the from purely classical amount to QM amount, you don't have a zero charge in the total system. The total system is already half a thousand zero charge now. Why? Because the QM sum of charges of the QM item is not classical charges on the QM items is not necessarily should be the same as your QM charge. That minus one. For example, in that case, it's something like minus 75 or minus 74 and seven. And that's why it warns you that you have access 0.25.3 charge. How it suggests that usually how it should be done, how it dealt with. Usually it is suggested that this excess charge and also Chromax prints on the next line that you have, you consider to redistribute that excess charge onto the nearby MM items. And that's actually what you need to do. So basically you can modify the charge of that nearby items here and there, such that they will be increased or decreased by the number. So that's the total charge of the system will be again zero. Only you can do that, but in reality it is not so important because the charge is really small. And usually no one care about that, but sometimes it can be important. For example, if you do a membrane system, it's even the membrane. So that's why the Chromax just suggests you how much excess charge you have and how you should be, but how you should redistributed itself up to you. You can even leave it like that. So another problem is that there is no unique solution how you should redistributed. So everyone is doing it like how they want mostly. Yes, that's why it warms you. But for example, if your total charge of your Chromatoms was zero, like yesterday in still band case, still band neutral molecule and both all atoms have zero sum of classical charge from zero, then it will be zero excess charge, of course. But it's usually happens in the protein systems when you break your bones and that's a new total charge should be not zero. Okay, so let's continue. Let's continue. So as the outcome of your QMD equilibration, I should have something like that. So this is just a render of the trajectory over 100 femtoseconds. Yeah, this is a temperature. So it's not really well-equilibrated or maybe to break more, but we don't have much time to do that today. But if you look at it like in several picoseconds, for sure. Okay, let's finish with it seems like it's going to run like 30 minutes. Who is going to run 30 minutes? I'm sorry. This one? No, it should not. If it tries to run for 30 minutes, then ah, spec, but QMM spec is the next steps. Yeah, you're already doing the exercise six. Yes, it will run approximately that. So exercise six, how we want to modify, we want to use non-standard in our system. We want to go with the spectrum of that system. It should just equilibrate it. And we want to, how we want to do it, we want to use the EFT. And to do that, you should modify CP2K input. So how it should be done in general, instead of that, you use a QMM method called input. And then you need to provide the additional string, which is called QMM QMM input file. And it should be a name of the CP2K input file in the same directory as the key parents will be. So in that case, you can provide whatever options you want in that file. And just remember that this file should still be, should still represent the same system. So it's better to use a reference CP2K to generate it by the interface and modify it and then provide here. Because then you will have no problem with the system. Okay, so this is our objective. So now we want to calculate absorption spectra. How we want to do this? We will do the molecular dynamic trajectory. And along that trajectory on each step, we will calculate also excitation energies using TDDFT. And then we convolve that excitation energies into spectra. So how you should do this? Please do the steps one to five of the exercise six. So here it is a simple, simplest case, how you modify your input file. You basically copy already the input file, which have been generated in the previous equilibration run with PDB also. And then you modify whatever you want in that unit. So you can also change here basis set. You can also change here cut-offs. You can also change here. You can also add here dispersion corrections if you want. You can also even change here to the bit relief if you want. So whatever you want to do, you can do like that. But here we just add a property section. We will add a property section to calculate which informs us to keep on each, during the after the calculation of the force to also perform TDDFT calculation for five states, excited states with my e to the power of minus three convergence of like in electron of the excitation of energy. It is also easy to run simulation from excited state of the molecule in the QM part of the case. Well, with that kind of things, yes, you can, but what you should be warned, yes, you can, but for now interface does not support surface hopping dynamics of course, so you can deactivate. So you will be stuck in one state always. Yeah, but in practice with that, yes, using non-standard parameters, you also calculate the gradient of the excited state. If you know how to do it in CPTK for TDDFT and then you can perform dynamics on that excited state using that non-standard input and perform optimization of that excited state. So, for example, if you want, yeah, okay, please. Once you will run the spec, that spectra calculation will take quite some time, over 20 minutes as already have been noted by the people. And yeah, probably we can discuss a bit further what will happen. And next, it seems like you are reading wrong because, ah, yeah, yeah, it's an error. Yeah, I saw your thing. Yeah, of course here, because you are providing the external input as the gromax does not know that your QM system have minus one charge. So how you can solve that, you need to add this again into the MP file, this QMMM, QM charge, equal to minus one option into the MP file and then it should be correct. Yeah, I need to think about how to fix this. Probably it should not print at all that warning if you're providing external input because then it's up to you how you do this. Yeah, indeed, sorry for that. Actually I need to write it down because in that case indeed the output was a bit confusing. Yes, that needs to be fixed in the future versions. Yes, I will write this. Yes, probably just never mind about this. This is just an error, not an error, but inconsistency for now because it's alpha version. Yeah, if you provide your own output file or input file then the gromax basically should not complain about this kind of stuff. Because when it's up to you, it doesn't know what is a charge of your QM system. Okay, you will still have a time to finish whatever you want. Here I just want you to show what we're gonna do next after how we can walk to the actual spectra from our simulations. So this is how your MDP file looks like. So yeah, here you add this external input file. And how we want to calculate the absorption spectra of your system in the following way. So if you do a less on that output file, you can search for it and find that kind of things, that kind of tables after each step, after each calculation of energy. So this is a result of the TDD calculation. Basically, this is a state's number, excited state number. So we request five states, so there is five states. This is excitation energies of that in the electron walls and this is a transition dipole from the ground state and this is a completeness later strength, which is just a function of excitation energy and transition dipole moment. And states, I don't know, typically several states, like five to 10 will be fine. Yeah, I mean, the energy of individual states is not dependent on how much states you requested. It's just a completeness of spectra will be different. Okay, so and those later strengths and using that two parameters, this one, this and this, you can convolve the absorption spectra. Yeah, okay. So here, how we will do that? First, we will grab out of this output file, the column number one, two, three and seven, yeah, so column three and seven, okay. Using the Oc tool, it's a bit of the bar scripting, so everyone who know what it is, probably already understand what it is doing. And we'll create this, you'll create a file excitations, which will contain or adjust only this and these parts of your output file. So, oops, is this missing something? Yeah, so do basically step seven. I think this is excitation generation. I guess. And after you've done that, it's pretty easy. I mean, just one option. So if you look into that excitation file, there will be a two columns, first is excitation energy, second is slater strength. Grip from that output file. So first one here is excitation energy of electron volts, second is a slater strength with some arbitrary units. It's actually unitless, but yeah. Okay, so now how we will deal with that, so how we'll control the spectrum. We will use Gaussian functions. So we'll basically, in each point of that energy, we'll place a Gaussian function with the heights equal to the slater strength and with the energy equal to this one and with sigma, it is some parameter, it's just a half width of your Gaussian. It will be 0.1 electron volt. Why? Because typically the TDDFT gives you a 0.1 to 0.3 EV error and yeah, 0.1 is just a less meaningful number for that. So yeah, so here is what we can do. So here is UVF, so here it is. E is your final energy which you want to convolve. So this is just a parameter as it's just like X and you want to convolve that I function. So it is done in step four eight. So you need to module load Cray Python to use the Python actually on the Archer so do not forget to do that. And then you can convolve your spectra. So first parameter is your name of your file with excitation. Second parameter is, I will show you probably here. So first parameter is file with excitation which you just generated. Second is a sigma in electron volts, 0.1. And then there is minimum and maximum energies in electron volts between which you want to calculate the spectrum. So basically the boundaries of the spectra. So after doing this, it will be generated a spec.xvg file which you still can open with any software you want. I mean, in a way you don't need necessary to wait until the QMM spec run will finish. So after 10 minutes, there will already like 20 to 30 frames which from which you can put some spectra. But yeah, basically you can do it at any point. But what is very important that if you look into the results that after approximately 100 femtoseconds which I generated like yesterday or several days ago. In my case, in your case, it might be different but the spectra looks like that. So you see that there is really two bumps. However, if you extend your sampling and you increase and you calculate the spectra over 3,000 frames. So after the three-picker second simulation it will stabilize and it will be always like that. So in reality, this is like one huge bump and a small shoulder on the side due to the higher excited states. So basically this is one peak, usually main peak and this is a side peak or from a stool from the higher excited state. Also it should be noted that this peak is not very well reproduced the experimental peak but this is mostly a problem of the EDFT for really reproducing exactly the experimental peak. Usually if you're doing this EDFT, you need to shift it. So you basically need to apply. Now you need to check what is the experimental peak here and you need to apply that shift to get it correct. However, the EDFT is generally not so suitable to reproduce from the scratch the exact spectra. There is much more elaborated methods to do so. But what you can do is the shape of the spectra and you can reproduce very well the shape of the spectra and you can very well reproduce the, for example, if the EDFT you quite well could reproduce the what's called stock shift. So basically the difference between excitation and absorption of the mission spectra. Okay, so I think for today this kind of tutorial is finished. So what I want to kind of inform you that we have already calling noticed the best practice guide for that non-standard parameters for all the K2K parameters relevant to the biological KMM. So the second one, there was already a workshop which you can about the, usage of the KMM in real for real scientists, from real scientists, yeah. There was series of webinars which you can pick up in the, which are available through the YouTube channel. So you can see how the people are using the KMM in the real job and the examples of that. Also the CP2K, for example, there's a couple of people who are using it was here giving the talks. And what I want to you also kind of promote is the Excel YouTube channel where you can find a lot of interesting stuff and a lot of interesting webinars regarding the modeling of biological systems, not with KMM, but with anything, yeah. Not only with KMM, but with many of other stuff. So yeah, from that I think we are kind of finished. Yeah, so any questions? Yes, so the question is, what is the minimum solution type for an accurate spectral question? How it depends a lot on the system. I can say yes, so how I suggest you to do it. So first of all, you will need to do a long equilibration run with classical MN. So basically do like tens of nanoseconds yet to be sure that your system is equilibrated. Then from that 10 nanosecond, like let's say 10 to 100 nanosecond trajectory, yeah. Then from that trajectory, I will suggest you to pick up frames each several nanoseconds. And from that frames start a KMM trajectory like we done today. And do that trajectory for several hundred seconds and gather the spectra from them and then convolve the spectra over your system. So starting from several MN structures from which you do the KMM simulations and from which you convolve the final spectrum. So basically put all everything into one slide. Why you need a long MTC equilibration? Because typically your protein could live in several conformations. And transfer time between this transition, time between these conformations could be in nanoseconds or even 100 nanoseconds time scale and you never know. And they could have a bit different spectra. So that's why you need a long classical MDA equilibration sampling. And from that sampling you then pick up frames from which you start KMM simulation and convolve spectra. Yes, so in that you need to have a good sampling, yes.