 Hello, good afternoon, and welcome to the latest webinar in the BioExcel webinar series. My name is Adam Carter, and I'm going to introduce today's webinar, which will be given by Andrew Proudfoot from Novartis. The title of the talk is "High-Confidence Protein-Ligand Complex Modelling by NMR-Guided Docking Enables Early Hit Optimization". So I'll hand over shortly to Andrew to tell you all about that. But just while we get started, I'm going to give you a very short overview of BioExcel, just for a couple of minutes. And before I get started, I should let you know that this webinar is being recorded. Subject to some checks with compliance, it's likely to go on YouTube and the BioExcel website after the webinar, so the chances are that you will be able to watch it again there afterwards.
All right, so for those of you who are not so familiar with what BioExcel is, we are a centre of excellence for computational biomolecular research, and we're focused around three main areas. The first is biomolecular software. We have three core codes that we work with in the project, and we actually have some of the lead developers of these codes involved as part of the project. So we have developers from GROMACS and HADDOCK, and also people working on the QM/MM interface and CPMD. The development of these codes, improving their performance, their efficiency and their scalability, is one of the important aspects of what the centre does. Another important aspect is usability. We're trying to make it not only possible to use these three codes, but also to think about how they are used in wider workflows, to make them easier to use by thinking about the whole process in which they are used. So the project is looking at different workflow environments and the building blocks for workflows so that these things can be used more easily.
Finally, the project is also involved in consultancy and training, so we're trying to promote best practices and train end users in how to use these pieces of software and in other aspects of biomolecular research using HPC and high-throughput computing. The project has a number of interest groups that you're welcome to sign up to. At the moment we have the interest groups that you can see on this slide. If you're interested in any of these subjects, you can go to the BioExcel website and sign up there to be a part of these interest groups. The different interest groups work in slightly different ways, but in some cases they have forums, for example, and there are other services like code repositories and video channels that we can offer to interest groups if they want them.
We will have time at the end of the session today to take questions for the speaker, so I hope you'll feel free to ask some questions. We'll normally wait till the end. It just makes things flow more easily, so you can save up your questions until then. You can type them at any time into the questions box in GoToWebinar, and I'll invite you to do that at the end as well so we know which questions we have for the speaker. If you have a microphone and you're able to talk, I can open your microphone and you can ask your question directly; otherwise I will read out your question to the speaker. If you're watching the recording of this later on YouTube and you do have any follow-up questions, you can send your questions to the forums at ask.bioexcel.eu.
Now I'm going to hand over to Andrew shortly. Andrew completed his PhD in 2012 at the University of Sheffield, where he was working with Professor Michael Williamson. He was using NMR for structural studies of certain enzymes, and then after his PhD he moved to the U.S., where he's been doing postdoc research in Kurt Wüthrich's laboratory, apologies if I've pronounced that wrongly, at the Scripps Research Institute and at the Novartis Institute for Biomedical Research. So you can read more about Andrew here and on our website, but I'm now going to hand over to Andrew to allow him to go through the rest of his presentation, and I'll be back at the end to take any questions. All right Andrew, I will now make you the presenter and you can take it from here. You should be able to share your screen now.
Thank you, Adam.
That's good, great, thanks. Perfect.
Well, firstly, thank you very much for the kind introduction and thank you very much to everyone else who's logged in to listen to me today. I would like to talk to you today about a common problem that we have in pharmaceutical work. We generally do not have problems identifying ligands that can interact with our targets of interest, and these are usually identified using screening methods like NMR, SPR or DSF. But a problem that we have is that we can only generate structural information for about 20% of these binders, and this is normally done through X-ray crystallography. X-ray is hindered to a certain extent when working with dynamic targets or when working with low-affinity ligands. Normally you need to saturate the binding site to generate interpretable electron density. So on average you need a ligand concentration of roughly 10 times the KD, which means that generally only your more potent fragments will allow you to get co-crystal structures. This results in a deprioritization of compounds. If we have a structure, then we will follow up with that scaffold rather than investigating the scaffolds that we cannot get co-structures for. So this leaves a problem. Roughly 80% of the fragments that we have identified interact with the target, but we can do nothing with them, and often these fragments have different features which may be beneficial for a drug-like molecule.
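The "roughly 10 times the KD" rule of thumb mentioned above can be made concrete with a minimal sketch. It assumes simple 1:1 binding with the ligand in large excess over the protein, so that the free ligand concentration is approximately the total; the function name and the example concentrations are illustrative and are not taken from the talk.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of binding sites occupied for simple 1:1 binding.

    Assumes the ligand is in large excess over the protein, so the free
    ligand concentration is approximately the total concentration.
    Both arguments must be in the same units (for example mM).
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (ligand_conc + kd)


# Occupancy at a few multiples of the KD (illustrative values only).
for multiple in (1, 3, 10, 30):
    occupancy = fraction_bound(multiple * 1.0, 1.0)
    print(f"[L] = {multiple:>2} x KD -> {occupancy:.1%} of sites occupied")
```

Under these assumptions, a ligand concentration of 10 times the KD corresponds to about 91% occupancy, which shows why weak, poorly soluble fragments are hard to saturate.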
So we then start to look at other methods to generate this structural information. There are NMR-based methods. We can do a complete structure determination, but this takes time and is fairly labor-intensive. There are other methods that can be used which use ambiguous distance restraints or chemical shift perturbations. There are other methods being developed, like the NMR molecular replacement method, or NMR squared as it's been called. One of the techniques which I decided to focus on and develop was to use HADDOCK to generate docking models. That is what I would like to talk to you about today.
So this is the workflow that I'm going to go through in my presentation, and I'm going to address each one of these steps, as they are all important parts in being able to generate the end result. The first thing I'll talk to you about is the selective labeling of your protein of interest. When working with small soluble targets, uniform carbon or nitrogen labeling is often sufficient to analyze these proteins by NMR. But as your target gets larger, you need to think about possibly deuteration if you're working in the 20 to 30 kilodalton range. And as you get to an even larger target, you need to start thinking about selective labeling of side chains. The target that I was using is an antibacterial target called CoAD, which is an enzyme. It binds ATP and 4'-phosphopantetheine and results in the production of dephospho-CoA and pyrophosphate. CoAD is a 115 kilodalton homohexamer, so it's not amenable to uniform labeling, and we need to start thinking about more elaborate labeling strategies. As you can see in the bottom panels here, CoAD is functional as a dimer and the enzyme active site is made up of two protein subunits. On the panel on the right, you can see that the enzymatic pocket is made up of two sub-pockets, an ATP binding pocket and a 4'-phosphopantetheine binding pocket. And these pockets are surrounded beautifully by a number of methyl groups.
In fact, methyl-containing amino acids constitute about 44% of the polypeptide sequence of CoAD. So we decided to use methyl labeling. There are protocols in place for the selective labeling of each of the methyl groups individually. And here I have spectra of CoAD that has been selectively labeled for each individual amino acid. However, to maximize the amount of information that we could obtain from a single sample, I needed to develop a protocol that would allow the simultaneous labeling of all methyl groups. Here you can see on the left-hand side a superposition of the six spectra that I previously showed you. And on the right, there is an NMR spectrum of a MILVAT-labeled sample. This is where all six methyl-containing amino acids have been simultaneously labeled. You can see that the spectra are very comparable. And the nice thing with CoAD is that there's very little overlap observed between the different methyl groups. There is always a concern that as you introduce more NMR-active nuclei back into the protein, you're going to have adverse relaxation occurring. But as you can see in this case, the simultaneous labeling of all methyl groups does not adversely affect the spectral quality. So this is good. We can now use our MILVAT-labeled protein and start to analyze some of these small molecules that don't crystallize.
Another consideration that needs to be taken into account is the type of carbon source that you use whilst expressing your protein. Here I have a 1D NMR spectrum of MILVAT-labeled protein which has been expressed with carbon-12 protonated glucose. As you can see, some of the protons from the carbon-12 protonated glucose have been incorporated into the aliphatic side chains and the aromatic side chains of the amino acids. This is done primarily in a deuterated environment, so all of the protons that we see here have come either from the carbon source or from the selective introduction of the methyl groups.
Now this can be problematic because, as you can see, the signals for a small molecule come roughly around the regions where the aromatic-bound protons or the aliphatic-bound protons are. So if you're trying to generate intermolecular NOEs between your labeled side chains and your unlabeled small molecule, you're also going to generate NOEs between your labeled side chains and the unlabeled parts of your amino acids. When you express the protein with carbon-12 deuterated glucose, you can see that there is a significant reduction in the number of protons that are incorporated into the aliphatic and the aromatic regions. So this will allow you to get much cleaner data and ensure that the only intermolecular NOEs that you observe are between your protein side chains and the small molecule of interest.
So now we've identified the labeling scheme that we're going to use. The next thing we need to do is to assign the methyl peaks of the protein that we are interested in. A standard protocol for assigning methyl groups is to make point mutations. Here you mutate out the methyl-containing amino acid of interest and you acquire a 2D spectrum. By comparing the apo spectrum to the mutant spectrum, you see peaks disappearing and you can start to assign methyl groups like that. As you can see on the panel on the left, it's a very nice clean data set where a single peak has disappeared after mutating threonine 10. So it's quite clear which peak belongs to threonine 10. On the right-hand side, however, we have another example where mutation of leucine 102 has resulted in global changes in the NMR spectrum, making it impossible to assign this peak. It's not just global changes in chemical shift which can inhibit this protocol. Often mutations can significantly reduce the amount of protein that can be expressed, to the extent where you get no expression.
So this method is highly variable and you can invest a lot of time with no guaranteed results. An alternative approach is to use what they call a methyl walk. In this case, you acquire a high-dimensional NOESY data set. In the example shown here, we have acquired a four-dimensional HMQC-NOESY-HMQC, and this experiment allows you to obtain information about the spatial proximity of all methyl groups surrounding the methyl group that you are looking at. In the panels shown here on the left-hand side, we have focused in on all of the methyl groups which are observed by isoleucine 36. By comparing the observed NOEs to the methyl NOE network expected from a crystal structure, you can then start to make assignments and walk through your protein to assign a specific region. Often when we're doing these studies, a full protein assignment is not required. You're generally only interested in a region around an active site or in a specific part of the protein. So anything that we can do to reduce the amount of time that we spend assigning peaks is highly beneficial. Now this can be done in a manual way, but there are now also groups that have developed automated methods for the assignment of methyl groups using similar techniques. One of these is a group in Oxford who have developed a process called MAGMA. They will take the spectra and peak list that you have produced, automatically walk through and compare them to a crystal structure, and provide you with an assigned peak list at the end. So if this is something that you're interested in looking into, these automated techniques may be beneficial.
So we've now labeled our protein and we've assigned our methyl peaks. Now we need to start looking at using HADDOCK to dock these small molecules into the protein active site. The docking protocol that I used was to titrate the small molecule into a MILVAT-labeled protein sample and then monitor changes in chemical shift.
Now this does two things. It allows me to transfer my assignments from the apo protein to the saturated, compound-bound form of the protein. I can also, if the peaks are in fast exchange, fit the chemical shift data to generate a binding affinity. This is important because when I'm running my NMR experiments, I need to ensure that I have saturated my protein binding site with small molecule, so I need a concentration sufficient to do that. I then need to assign the protons on the small molecule. Once I've done this, I can conduct a 3D edited-filtered NOESY experiment to identify the intermolecular NOEs between the selectively labeled methyl groups and the unlabeled protons on the small molecule. For my experiments, I used a mixing time of 120 milliseconds for all of my edited-filtered NOESY experiments, but depending on the type of system that you are working with, you may need to adjust that parameter accordingly.
The benchmarking of the docking protocol was done with these three small molecules. These compounds varied in the number of rotatable bonds that they had, and thus the total number of energy-minimized conformations that they could adopt, in the affinity that they had for CoAD, and also in the distribution of the NOEs that could be observed around the small molecule. The other key factor was that we had high-resolution crystallographic information for all of these targets, so we could compare our docking results to a crystal structure to validate how well HADDOCK was performing.
The first fragment that we used was our optimal fragment. It was a high-affinity fragment that had no rotatable bonds. This is important because it meant that there was only one conformation that this small molecule could have in the bound state. When I titrated the small molecule in, you can see here that the peaks which exhibited a chemical shift perturbation all correspond to peaks that are in the phosphopantetheine pocket.
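The fast-exchange fit mentioned above, where chemical shift changes from a titration are fitted to obtain a binding affinity, can be sketched as follows. This is an illustration only, not the fitting procedure used in the talk: it assumes a 1:1 binding model with the exact quadratic solution for the bound fraction, and it finds the KD by a repeated log-spaced scan so that it needs nothing beyond the standard library. The function names, the search range and the synthetic titration values are all assumptions.

```python
import math


def bound_fraction(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding (exact quadratic solution)."""
    s = p_tot + l_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, ligand_concs, shifts, kd_min=1e-3, kd_max=1e2,
           rounds=6, points=60):
    """Least-squares estimate of (KD, maximum shift, residual sum of squares).

    For each trial KD the maximum shift has a closed-form least-squares
    solution, so only KD is scanned. Each round rescans a narrower,
    log-spaced window around the best KD of the previous round.
    """
    def score(kd):
        f = [bound_fraction(p_tot, l, kd) for l in ligand_concs]
        dmax = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
        sse = sum((dmax * x - y) ** 2 for x, y in zip(f, shifts))
        return sse, dmax

    lo, hi = kd_min, kd_max
    best = None
    for _ in range(rounds):
        grid = [lo * (hi / lo) ** (i / (points - 1)) for i in range(points)]
        scored = [(score(kd), kd) for kd in grid]
        (sse, dmax), kd = min(scored, key=lambda item: item[0][0])
        i = grid.index(kd)
        lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, points - 1)]
        best = (kd, dmax, sse)
    return best


# Synthetic, noise-free titration (all concentrations in mM, shifts in ppm).
protein = 0.05
ligand = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
observed = [0.12 * bound_fraction(protein, l, 1.3) for l in ligand]
kd_fit, dmax_fit, _ = fit_kd(protein, ligand, observed)
print(f"fitted KD = {kd_fit:.3f} mM, maximum shift = {dmax_fit:.3f} ppm")
```

With real data the shifts carry noise and the points should extend well past the KD, otherwise KD and the maximum shift become strongly correlated and the fit is poorly determined.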
So we were confident that the small molecule was interacting in that part of the pocket and not the ATP pocket. As I previously said, the active site is made up of two protein subunits, and in a homohexamer it is not possible to differentiate between the same residues on the different subunits. So the one assumption that we did make was that certain peaks, for example in this case valine 135 and leucine 131, came from the second subunit and not the original subunit. This was important when we set up the docking runs because we needed to define where the NOEs came from. The edited-filtered NOESY experiment that we ran was able to identify the intermolecular NOEs between the unlabeled ligand and specifically labeled methyl groups. Here I have a 2D projection of the edited-filtered NOESY, and you can see that all the peaks that are present in the spectrum correspond to methyl groups that are surrounding the small molecule in the crystal structure. And each of these now contains information on intermolecular NOEs to that small molecule.
So the small molecule needs to be assigned in the presence of the protein. Here I have a superposition of the 1D spectrum of the small molecule with that of the bound form of the small molecule in the MILVAT-labeled protein. As you can see, the peaks superimpose quite nicely, so there's not a significant change in the chemical shift of the peaks between the apo form and the bound form of the small molecule. Now if you're not comfortable with assigning small molecules, there is software out there that can help you with this. ChemDraw is a program where you can import a small molecule and produce a predicted 1D proton NMR spectrum. While such software may not get the exact chemical shifts correct, it will get the fingerprint of the small molecule correct, and normally the distribution of the peaks is pretty accurate. So you can use these programs to assist you with assignments if you're not comfortable with it.
I then looked at all of the intermolecular NOEs that I had generated with my edited-filtered NOESY and was able to integrate these and group them into upper-distance bins for the docking experiment. Here, underneath the 1D spectrum, I've shown strips from the three-dimensional edited-filtered NOESY. As you can see, the chemical shifts of the peaks in the indirect proton plane correspond nicely to the chemical shifts observed for the small molecule, and I have colored them to coincide with the assignments of the small molecule. So from this, we're able to build up a picture of how the small molecule is sitting in the enzymatic active site. Having acquired my edited-filtered NOESY, I manually went through the data set, assigned the intermolecular NOEs, integrated the peaks and also identified the peak intensities. From this, I generated a ratio of the peak intensity to peak height, and I used these ratios to place my NOEs into bins. During the first experiment, I correlated the ratios that I generated to the expected distances based on the crystal structure and then used this as a calibration curve for all other experiments. So the NOEs were placed into bins between three and a half angstroms and eight and a half angstroms based on their corresponding ratios. Now, this isn't a new technique. This has been done in the past and has been published, and we found it quite successful.
Now, there's a lot of information on this slide, and I don't intend to go through it all, but these are important parameters that you need to be aware of, and adjustments to the HADDOCK docking protocol that, in our experience, need to be made in order to successfully dock the small molecules into the active site. There's a lot of conversation out there on the HADDOCK forums about how to generate ligand topology and parameter files. The methods that were suggested were not available to us at our company, so we used a different program called XPLO2D.
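The binning step described above can be sketched in code. The talk gives the range of the upper bounds (3.5 to 8.5 angstroms) but not the calibration itself, so the thresholds below, the 1.0 angstrom spacing of the bins and the convention that a larger ratio means a stronger NOE are all hypothetical. The restraint line follows the CNS-style `assign (selection) (selection) d dminus dplus` syntax used in HADDOCK distance restraint tables; the segment IDs, residue numbers and atom names are placeholders, and the 3.1 angstrom lower limit is the value quoted elsewhere in this talk.

```python
# Hypothetical calibration table: (minimum ratio, upper bound in angstroms).
# A larger ratio is assumed to mean a stronger NOE and a shorter distance.
CALIBRATION = [
    (0.80, 3.5),
    (0.60, 4.5),
    (0.40, 5.5),
    (0.25, 6.5),
    (0.10, 7.5),
    (0.00, 8.5),
]


def upper_bound_for(ratio, calibration=CALIBRATION):
    """Return the upper distance bound of the first bin the ratio falls into."""
    for min_ratio, upper in calibration:
        if ratio >= min_ratio:
            return upper
    raise ValueError("ratio must be non-negative")


def cns_restraint(protein_sel, ligand_sel, upper, lower=3.1):
    """Format one unambiguous restraint allowing distances in [lower, upper].

    CNS reads the three numbers as d, dminus, dplus, giving the allowed
    range [d - dminus, d + dplus].
    """
    d_minus = upper - lower
    return f"assign ({protein_sel}) ({ligand_sel}) {upper:.1f} {d_minus:.1f} 0.0"


# Placeholder selections: a methyl carbon on the protein, a ligand proton.
line = cns_restraint(
    "segid A and resid 36 and name CD1",
    "segid B and resid 1 and name H7",
    upper_bound_for(0.65),
)
print(line)
```

In practice the calibration would be derived, as described in the talk, from a complex with a known crystal structure and then reused for the other ligands.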
And when we were doing this, all partial charges of the ligand were set to zero. The ligand topology and parameter files that were produced using this method were successful for all docking protocols. So that is another method which can be used to generate these files. The NOEs were defined as unambiguous restraints between the carbon atoms to the attached protons. The NOEs which were observed were placed into bins with a lower distance limit of 3.1 angstroms and upper boundaries of three and a half to eight and a half angstroms. As I previously said, the CoAD active site is comprised of two protein subunits. For the purposes of the docking, these protein subunits were treated as a single chain. That meant that HADDOCK could not throw out one of the protein subunits during the docking process. I've also listed here a number of parameters which need to be taken into account when running HADDOCK. I'm not going to go through all of these, but they are here for reference if you want to follow this process in the future.
So how did this work? As you can see on the right-hand side, there's a superposition of the small molecule generated from the docking protocol, in magenta, with the actual crystal structure. HADDOCK was able to produce a cluster, and the best representative of this cluster had a ligand RMSD of 0.84 angstroms to the crystal structure. This was a very good result and gave us confidence going forward that HADDOCK can use this experimental data to dock these small molecules.
So the next part of this validation process was to start using ligands that had rotatable bonds. When you're working with such ligands, you need to identify the conformation of the small molecule in the bound state, and this does not always correspond to the lowest-energy conformation that you have in solution. Now different groups who have worked with docking have made different assumptions.
Some groups have assumed that you can use this lowest-energy conformation. Other groups have made other assumptions. What we decided to do, instead of assuming the state of the small molecule in the bound state, was to dock a library of ligand conformations. We would generate all possible energy-minimized conformations, give all of these conformations to HADDOCK, and allow HADDOCK to use the experimental distance restraints that we generated not only to dock the small molecule but also to select the conformation which best fit the data. Libraries of all possible low-energy conformations were generated using MOE, and then each one of these conformations was individually parameterized and used for the docking. Now there are two approaches that can be used for this. We ran independent docking experiments for each conformation. However, HADDOCK can also work with libraries of small molecules, where you provide it with all conformations and then tell it to produce structures in its it0 phase for all compounds. If you're going to use the latter technique, then you need to make sure that you increase the number of structures that are calculated proportionally to the number of small molecules that you have in your library. By default HADDOCK calculates a thousand structures per small molecule. So as you increase the number of small molecules, you need to multiply that number by a thousand in order to adequately sample each one during that initial docking phase.
So the second fragment that I worked with showed weak affinity for CoAD and had a single rotatable bond. The NMR KD that we calculated was 1.3 millimolar, and the single rotatable bond was around the methoxy group, which gave two conformations. When I docked both of these conformations, comparison of the docking scores showed that there was very little difference between the two.
However, when I looked at the number of NOE violations there was a considerable difference. Structure 2a had four NOE violations while structure 2b had zero NOE violations. So it was clear from analysis of these two factors which of the conformations best fit the experimental data that we had. And when I superimpose the docking structure of 2b with the X-ray structure, we can see that it is indeed the correct conformation, and we have a ligand RMSD of 2.4 angstroms.
Now the final core fragment that we tested was the most challenging fragment. It was a high-affinity compound which had multiple rotatable bonds, but it also had a sparse NOE distribution around the small molecule. In the compound structure on the left-hand side, I've highlighted the groups to which NOEs can be observed, and as you can see there is one side of this small molecule that does not have any groups to which NOEs can be observed. There were 14 different conformations calculated that I had to dock, and as you can see from the scores in the table below, there were two conformations that gave very similar scores and had very similar NOE violations. Based on this data alone, it would be impossible for us to differentiate which of these is the correct binding mode. However, if you look at the superposition of the two docking structures, you can see that the groups to which we have NOEs are located in very similar positions. It's only the groups to which we do not have NOEs that vary in position in the active site.
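The use of NOE violations to choose between docked conformations with similar scores, as described above for structures 2a and 2b, can be sketched as follows. This is a minimal illustration, not the analysis performed by HADDOCK: the atom labels, coordinates and docking scores are invented, only the violation counts of four and zero come from the talk, and ranking by violations first and score second is an assumption about how the two criteria could be combined.

```python
import math


def count_violations(atoms, restraints, tolerance=0.0):
    """Count restraints whose atom-atom distance lies outside [lower, upper].

    atoms maps an atom label to (x, y, z) coordinates in angstroms.
    restraints is a list of (label_a, label_b, lower, upper) tuples.
    """
    violations = 0
    for label_a, label_b, lower, upper in restraints:
        distance = math.dist(atoms[label_a], atoms[label_b])
        if distance < lower - tolerance or distance > upper + tolerance:
            violations += 1
    return violations


def pick_conformer(results):
    """Pick the conformer with the fewest violations, then the lowest score.

    results maps a conformer name to (docking_score, n_violations); a more
    negative docking score is taken to be better.
    """
    return min(results, key=lambda name: (results[name][1], results[name][0]))


# Toy geometry: one methyl carbon and two ligand protons on a line.
atoms = {"ILE36_CD1": (0.0, 0.0, 0.0), "LIG_H1": (4.0, 0.0, 0.0),
         "LIG_H2": (10.0, 0.0, 0.0)}
restraints = [("ILE36_CD1", "LIG_H1", 3.1, 4.5),
              ("ILE36_CD1", "LIG_H2", 3.1, 8.5)]
print(count_violations(atoms, restraints))  # the 10 angstrom pair is violated

# Invented scores; violation counts as quoted in the talk.
print(pick_conformer({"2a": (-20.0, 4), "2b": (-19.5, 0)}))
```

When two conformers tie on both criteria, as with the third fragment in the talk, this kind of ranking cannot separate them and additional evidence such as analog data is needed.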
In this case, in order to confirm which is the correct binding mode, we would need to go through and carefully select two or three analogs which would allow us to tease out which one is the correct conformation, and I will show you examples of this later on. But in this case 3n was the correct binding mode, and as you can see it superimposes nicely with the X-ray crystal structure and produced an accurate and reliable result. So based on these three core fragments, I was now confident that this methodology was working and could be used for fragments for which we didn't have crystallographic information.
So here is a summary of the core fragments. As the molecules get more complicated, the ligand RMSDs increase slightly when compared to the X-ray crystal structure, but in all cases HADDOCK is able to dock the small molecule in the correct orientation in the correct part of the pocket, and the small molecule is oriented in a way that lets us select analogs for structure-based drug design. The protocol can accommodate different types of ligands with different numbers of rotatable bonds, and by using the experimental data to both drive the docking and analyze the docking results we can select the correct conformation.
So now that we've developed the protocol, the next thing to do is to use it in a prospective way and dock small molecules for which we have no crystallographic information. Two hits were identified by NMR screening to bind to CoAD. The goal was to take each of these small molecules, dock them into CoAD, and then develop these fragments through analog selection based on the docking results. Then, if all goes well, we can generate a co-structure with a more potent hit to validate the original docking studies. The first prospective core that I wanted to work with had two different conformations; it had a single rotatable bond, again around a methoxy group.
I titrated in the small molecule and was able to follow the chemical shift perturbations in the spectra, and from this I was able to generate a KD of 2.8 millimolar, so it was a very weakly binding fragment. The edited-filtered experiment elucidated 29 different intermolecular NOEs to protons all around the small molecule, and the docking results shown below showed that HADDOCK positioned the small molecule, again, in the phosphopantetheine pocket, which corresponded to the chemical shift perturbations that we observed, with the methoxy group positioned deep in the pocket. Now if you look at the schematic of the residues surrounding the small molecule, we can see that there is space to grow the small molecule at positions R1 and R2. R1 can only accommodate a single methyl group, but there is space to expand the small molecule much further from position R2. To validate the model that I produced, I also selected an analog that grew from position R3. If the docking results were correct, there is very little space to expand in this region, so any additional chemical matter at this position should have an adverse effect on the KD and also on the other biophysical and biochemical metrics that I was using to analyze these compounds. So I selected a number of analogs, but the three which I'm going to show you here are as follows. The first has a phenol ring attached to the pyridazine core and an additional methyl group. The second analog has a chlorine substitution at the para position of the phenol ring. The negative control that I talked about is the same as the top compound, but it has an alanine group positioned at R3, which will vary into the core and should have a negative effect. And these are the results that we observed with the biophysical DSF data and with the biochemical IC50 data.
As we increased the chemical matter that was positioned at R1 and R2, we could see that there is an increase in thermal stability of the protein, up to compound 9, where there was a 2.5 degree increase in thermal stability by DSF. The biochemical assay also showed that we went from an IC50 which was greater than 8 millimolar to a biochemical activity of 546 micromolar, so that was an order of magnitude increase in biochemical activity, which is quite significant. So at this stage we have gone from a compound that was completely intractable by crystallography to a very highly potent small molecule. Compound 10, which we selected as a negative control, indeed had the adverse effects that we were expecting: it showed no increase in thermal stability and the IC50 was also greater than 2 millimolar. We suspect it would have been higher; however, 2 millimolar was the maximum concentration that we could measure with this small molecule. The KD of that compound was also significantly higher than those of the other analogs which we had selected.
So based on this we were confident that our docking model was indeed correct, but to validate this I generated a crystal structure of CoAD in the presence of compound 9. Through comparison with the docking model, we could see that the pyridazine core of compound 9 is indeed in a very similar position to that of the docking model. The one difference which we observed was that the orientation of the methoxy group had flipped between the docking model and the analog. However, when we were generating the models of the analogs in MOE, we discovered that adding the second methyl group to the pyridazine core should, for steric reasons, flip the methoxy group. So this was a result that we were anticipating, and it was indeed observed in the crystal structure.
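Comparisons between a docked pose and a crystal structure, such as the one described above, are summarised throughout this talk as a ligand RMSD. A minimal sketch of that calculation is given below. It assumes that the two structures have already been superimposed on the protein, that the ligand atoms are listed in the same order in both, and that no correction is made for symmetry-equivalent atoms; the coordinates are invented for illustration.

```python
import math


def ligand_rmsd(model_coords, reference_coords):
    """Root-mean-square deviation between matched ligand atoms, in angstroms.

    Both arguments are sequences of (x, y, z) tuples in the same atom order.
    No fitting is done here: the structures are assumed to share a frame,
    for example after superimposing the protein binding site.
    """
    if len(model_coords) != len(reference_coords) or not model_coords:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2
                  for a, b in zip(model_coords, reference_coords))
    return math.sqrt(squared / len(model_coords))


# Invented three-atom example: the model is shifted by 1 angstrom along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(x + 1.0, y, z) for x, y, z in crystal]
print(f"ligand RMSD = {ligand_rmsd(docked, crystal):.2f} angstroms")
```

Because the value depends on how the structures were superimposed and on which atoms are included, RMSDs from different studies are only comparable when those choices match.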
Below I show a superposition of the docking model with the X-ray crystal structure, and also the predicted position of that small molecule that we generated in MOE based on the docking model. As you can see, the predicted binding orientation and conformation of compound 9 based on the docking model is very similar to what we actually observed in the X-ray structure, and these superimpose with a ligand RMSD of 1.4 angstroms. So we were very pleased that, based on the docking results alone, we were able to prospectively grow the fragment, perform structure-based drug design and confirm this with the X-ray structure.
So to summarize, I was able to use experimental data to guide docking, and I was able to dock a fragment for which I could not generate crystallographic data. Using MOE I could generate analogs, which I then screened using DSF and 2D protein-observed NMR, and confirmed that these were indeed hits and that they increased in potency as I expected. Compound 9, which was identified, had a 100-fold increase in affinity, exhibited the largest DSF stabilization effect and had the highest biochemical IC50, or rather the lowest in this case, sorry. I was able to generate a crystal structure with compound 9, and this was determined at 1.8 angstroms, and we were able to show that the crystal structure was highly similar to the docking result that we had previously generated.
So to confirm that this wasn't just a one-hit wonder, I then took the second prospective core and I docked this. Again, in the top panel we can see the chemical shift perturbations induced. This fragment was slightly tighter binding: it had a 1 millimolar affinity and a biochemical IC50 of 4.7 millimolar. Similar to what I did previously, I used MOE to grow the fragment and identify analogs, and then I also selected a negative control to confirm the docking results.
In this case the negative control was compound 13, which was a different stereoisomer and had the imidazole group flipped to the other side of the unsaturated ring. As you can see in the docking model and X-ray superposition on the top right-hand side, the docking model had the imidazole ring pointing into the pocket. If this was indeed correct, the imidazole ring would now be facing out into the solvent and we should lose affinity, and this is indeed what we observed, again giving us confidence that our docking model was correct. We selected a number of analogs which grew from the imidazole ring. Compound 14 had the difluoromethyl group added on and compound 15 had a vinyl group, and each of these gave improvements in Kd and IC50 and also showed significant increases in thermal stability when measured by DSF. We were able to produce crystal structures with both of these, and again the crystal structures matched up nicely with what we observed from the docking models, with ligand RMSDs of 1.4 and 1.7 angstroms for compounds 14 and 15 respectively. So with that I'd like to conclude the talk and acknowledge Andreas Lingel and Dirk Bussiere, who were my advisors when I was doing this work, and also multiple other people at sites in Emeryville, Switzerland and Cambridge who helped and answered questions on a regular basis. I'd also like to acknowledge here Alexandre Bonvin, who helped with HADDOCK-related questions and with the implementation of HADDOCK at our site. So with that I'd like to open it for questions.
Thank you very much, Andrew, that's great. So yes, as Andrew says, we are now happy to take questions, so if you do have a question and you want to type it into the question area, you can do so. If you're one of the organisers and you have a question and you can't type it into the question box, you can maybe type something into chat, or I can open your mic, or you can just open your mic and ask a question directly. Feel free to cut in if you want to do that.
So while I wait to see if there are any other questions, I should firstly say that this is not an area in which I'm an expert, so apologies if my question is not quite correctly formed. But I was very interested in the part where you were talking about the selection of the parameters for HADDOCK, because I think this is something that we've heard elsewhere, and it's not just true of HADDOCK: in general it can be quite difficult to choose some of the input parameters for these simulations. I was wondering how much time was involved in selecting these parameters, whether you're able to tell how sensitive your results were to the parameters, and how you know that you've got the right ones. In the end, do you have to wait till your end result, like you were talking about on slide 34, to say that you got good results in good agreement, or were you able to perform some earlier tests to check that the parameters you'd selected were the right ones, if you see what I mean?
Yeah, so we're very lucky here in the sense that the compounds that I was benchmarking with all had high-resolution crystal structures, so we had a definitive endpoint that we knew was a correct solution. Now again, what is observed in crystals and what is observed in solution could be two slightly different things, but generally they're in agreement with each other. So we have that checkpoint: at the end of each run I could take the best cluster, take the best structure from that cluster, superimpose it, and see how far off it was. I knew from the NOE data that I had, based on the crystal structure, that if HADDOCK was doing its job properly it should put the small molecule in the correct place.
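The checkpoint described above, comparing a structure against the NOE data, can be illustrated with a small Python sketch. It is not the HADDOCK protocol or the speaker's own script: it is a hypothetical stand-alone check that lists which protein-ligand atom pairs lie further apart than an upper distance limit. The atom names and coordinates are invented for illustration, and the 8.5 angstrom limit is the value the speaker quotes later in the Q&A for the deuterated, methyl-labeled system.

```python
import math


def violated_restraints(protein_atoms, ligand_atoms, restraints, upper_limit):
    """Return the restraints whose protein-ligand distance exceeds upper_limit (angstroms)."""
    violations = []
    for protein_name, ligand_name in restraints:
        distance = math.dist(protein_atoms[protein_name], ligand_atoms[ligand_name])
        if distance > upper_limit:
            violations.append((protein_name, ligand_name, round(distance, 2)))
    return violations


# Hypothetical atom names and coordinates (angstroms), for illustration only.
protein_atoms = {"LEU10_CD1": (0.0, 0.0, 0.0), "VAL25_CG2": (10.0, 0.0, 0.0)}
ligand_atoms = {"H1": (3.0, 4.0, 0.0), "H7": (0.0, 0.0, 0.0)}

# Each restraint pairs a protein methyl group with a ligand proton seen in an NOE.
restraints = [("LEU10_CD1", "H1"), ("VAL25_CG2", "H7")]

violations = violated_restraints(protein_atoms, ligand_atoms, restraints, upper_limit=8.5)
print(violations)
```

In this toy data the first pair is 5 angstroms apart and satisfies the limit, while the second pair is 10 angstroms apart and is reported as a violation. A pose that violates many restraints would be a sign that the docking did not place the small molecule where the NOE data says it should be.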
Now with regards to how long it took, there was a lot of reading on forums. A lot of people have done work with small molecules. It's not so extensive, but there is a lot of information already out there about parameters that work well with HADDOCK, and this is also where talking with Alexandre was very beneficial, because he was able to help us discern which of these were important for what we wanted to do and which were not so important. But to answer your question: yes, it is just a lot of trial and error. It's a case of trying these different parameters, seeing which ones produce the best models, and then ensuring that it's not just a case-by-case specific set of parameters. If you go on to the next small molecule and you find that you need to adjust a different parameter, you will then need to go back to the start and re-run all of the control experiments to ensure that the one change you've made has not affected results elsewhere. Through this iterative process of trial and error you can start to formulate a global set of parameters that work.
Yeah, good. Okay, well, you mentioned Alexandre. We have him watching as well, and I think he has some questions. So Alexandre, do you want to unmute your microphone? I think you can ask your question directly.
Yes. Hi Andrew.
Hi Alexandre, how are you?
Yeah, nice story. I'm of course a bit biased here. So I have a few questions for you. First, the NMR part of it: you measure NOEs up to 8.5 angstroms, because you have deuteration and methyl labeling only?
Exactly. So there are reports out there that in these highly deuterated environments the amount of relaxation is significantly reduced, and you can observe distances up to 10 angstroms away. Now, the reports of 10 angstroms generally come when you're working with stereospecifically labeled leucine and valine side chains. These are the cases where only the pro-R or the pro-S has been selectively labeled and the other methyl group is deuterated. In our experience, when we were assigning our methyl groups using the methyl walk, we were quite confident that we could see distances up to 8.5 angstroms in our MILVAT-labeled deuterated system. So yes, we were confident up to 8.5 angstroms, but you can certainly observe distances beyond the 5.5 angstroms that you generally use as your upper distance limit in a uniformly carbon- and nitrogen-labeled protein.
Okay, thanks. And then, related to the NOEs as well: you show different examples, first the benchmarking and then the real application to a blind case, for which you get a crystal structure at the end, and you reported different RMSDs for the ligands in the different cases. So do you have a feeling of whether these correlate with the number of NOEs that you observe, or with whether NOEs were only on one side?
Yeah, so you picked up the two factors there. It's both the number of NOEs and the distribution of the NOEs around the small molecule, and these are the two important things that determine how well HADDOCK docks that small molecule. In the test case where there were no NOEs on one of the sides of the small molecule, in our results from HADDOCK we could see that there was significant variation in the position of that side of the small molecule. The regions to which NOEs were observed were fixed well in position, but for everything else there was a lot of variation. So if I was to say which was the more important factor, I would say that the distribution of the NOEs around the small molecule was more significant than the number.
So you can know a priori, basically, what to expect based on what you observe in terms of NOEs. So that's a good point. And I guess the fact that you have a large distribution of poses if you have a bad distribution of NOEs also comes exactly because of the lack of electrostatics, of charges. Which brings me to a question about the ligands. So you use MOE to generate the minimized conformations. If you compare those to the crystal structure, are those conformations really close to what you see in the crystal, or does HADDOCK deform the conformations, for example? Related to that would be: should you freeze the conformation during docking?
So this is an important point that I neglected to mention during the presentation. In each case, one of the energy-minimized conformations produced by MOE was the correct conformation. During the docking process the small molecule was fixed; it was not allowed to rotate, and there was no conformational freedom allowed to the small molecule. So the assumption was that, in the library of conformations produced, one of these is the low-energy conformation found in the bound form, and HADDOCK was able to select from the library, based on the experimental data, which of those conformations was indeed correct. But that's a very important point: the small molecule was not defined as flexible during the docking protocol.
Yes. And so MOE must be giving you also some partial charges, or not?
I don't recall if MOE gave partial charges, but when I produced the topology and the parameter files using x2d, all partial charges were removed from the small molecule.
Okay. And I have one question, Adam.
Yeah, go ahead. Actually, we have one other question from the floor. Can I take one from Andrea first, and then we can come back to yours, Alexandre, just in case Andrea needs to go? Andrea, let me allow you to speak, if you've got a microphone. Andrea, would you like to ask your question?
Yes, can you hear me?
I can, yes, that's good.
Thank you. So I have a couple of technical questions. I'm from the University of Hanover. So one is: how does your assignment strategy deal with obtaining stereospecific information? Do you get it from the methyl walk in the NOESY, or do you use an OR statement in the docking? And then second...
Oh sorry, yeah. So to answer that question: during the methyl walk we're able to identify the partner pairs of the leucine and the valine methyl groups. However, correlating those to the crystal structure is a different thing, because even though you can generate samples where the pro-R or the pro-S are selectively labeled, so you can identify in your spectra which corresponds to which, when crystallographers fit the residues into the electron density, often you can't tell the difference between flipping the side chain around if it's poor electron density. So what's labeled as pro-R and pro-S in your crystal structure may not necessarily correspond to what is actually pro-R or pro-S in your NMR spectrum. It's again similar to what we were talking about before: it's a case of trial and error. When you run your docking runs, you set them up with one permutation, but you need to make sure that once you have connected a pro-R peak in the spectrum to the corresponding methyl group in the structure, you use that same nomenclature each time. That's what we did: we identified the correlations, and then I annotated it accordingly and made sure that I used the same nomenclature each time.
Yes. And the second question is whether you have tried using deuterated glycerol as a carbon source for your MILVAT labeling, or whether glucose is absolutely necessary, because this makes a difference in the cost.
It does, it does, and this was something that we did not investigate. We did not look to see whether deuterated glycerol could work. I see no reason why it shouldn't: deuterated glycerol can replace glucose in a normal expression, so there's no reason why it wouldn't work. But I would assume that there would be similar caveats, that it would be slightly slower growing and it may impact your total cell growth or total cell mass at the end. But I see no reason why it shouldn't work.
Okay, thank you.
Okay, thanks very much for your question, Andrea. So Alexandre, I think you had one last question that you wanted to ask. I think we've still got time for one more.
Yes. Out of curiosity, did you try to use some other small-molecule docking software? MOE is capable of docking as well, I would think.
So we tried looking at other software that we had access to at Novartis, but in all honesty HADDOCK was superior when working with the experimental NOE restraints, because HADDOCK would use this data to actually drive the docking. Other programs appeared to want to use other parameters to drive the docking and then use the distance restraints that we put in to filter the results, which wasn't what we wanted. We wanted to use the experimental data to actually drive the docking. So we realized very quickly that if we wanted to do this, HADDOCK was pretty much our only option. We looked at MOE, we looked at GOLD, we looked at ICM focus also, but as I said, with regards to setting experiments up and producing data, we found that HADDOCK was far superior. These other programs were quicker to run their docking runs, but like I said, they didn't run them in the way that we wanted to run our experiments; we didn't find a way of running them in the way that we wanted.
Okay, thanks.
Thank you. Thank you very much, everyone, for your questions, and thank you, Andrew, for your answers. I think we're going to wrap it up there for today, but if anyone is watching later and they have any other questions, do come over to the BioExcel forums at ask.bioexcel.eu and you can post your question there, and we can pass it on to Andrew, or Andrew can connect directly to answer that. So we'll hopefully see you again soon at another BioExcel webinar, and until then, thanks very much for coming along, and we will speak to you soon. Bye.