Hello and welcome everyone. My name is Rossen Apostolov and I will be the host of today's event. Before we start, I have a few announcements to make. First, this webinar is being recorded and you will find the recording on our website, bioexcel.eu/webinars, so that you can listen to the webinar later or share it with your colleagues. At the end of the webinar we will have a Q&A session during which you will be able to ask questions directly to Alexandre. During the webinar you can ask your questions in the chat or in the questions interface of the application, and at the end I will give you the microphone so you can ask directly. If we have problems with the connection, I will ask the question on your behalf. Today's presentation is the first in the BioExcel educational webinar series, and I would like to give you a short overview of BioExcel. BioExcel is the Centre of Excellence for Computational Biomolecular Research and it was established in November last year. The centre works in three main directions. The first one is towards providing excellence in biomolecular science, and we work with three important and widely used software packages: HADDOCK, about which we have today's webinar and which you may know very well; GROMACS, for molecular dynamics simulations; and CPMD, a code that can be used for hybrid QM/MM simulations, for example of enzymatic reactions. Another objective of the centre, and an area in which we provide expertise and services, is usability, and we work towards making various applications and tools more usable. We develop and integrate workflows along with data integration services, with the final goal of devising more efficient workflows for all users. We work with several of the quite popular workflow platforms, such as Galaxy, Open PHACTS, KNIME, COMPSs and Apache Taverna.
In addition to working with the codes and providing workflow environments, BioExcel is working heavily on providing training and expertise to both academia and industry. This includes not only university researchers but also industrial users from the pharma, chemical and food industries. We also work together with commercial software vendors and resource providers such as HPC centres. What may be interesting for you to know is that, as part of the centre, we are starting to establish several interest groups in different areas and domains of computational biomolecular research. We are starting with six interest groups. The first one is on integrative modeling, and we hope that you might be interested to join it; we will send you information about it later. We also have an interest group on free energy calculations, which are very important for areas such as computational drug design. We have an interest group on best practices for performance tuning, since some of our codes, such as GROMACS, are very highly tuned; they run on different architectures and it is sometimes challenging to make the best of their power. Similarly we have an interest group on hybrid methods, mainly for CPMD users, where you can tackle problems at the electronic-structure level. We also have an interest group for entry-level users, where we offer very easy portals for running applications, and another interest group on practical applications for industry, which specifically targets users from companies; we are looking into how best we could support their work. You can find out more about how to get in touch with BioExcel, and more about us, on our website. We provide support forums; there are code repositories where we will keep the code that we develop; we have an open chat channel and a video channel where we host the webinars. So that's all for BioExcel, and without further ado I would like to present today's speaker.
Professor Alexandre Bonvin, whom some of you may know: he is the main developer of the HADDOCK software, and his work on different computational approaches for biomolecular interactions is very popular. I hope you will find today's presentation very useful. So I would like to ask Alexandre now to start. Okay, so if all goes well, you should be seeing my screen now. Rossen, can you confirm before I move on? Yes, we can see it. Thank you. Good. So welcome everyone who is online. I'm going to tell you a bit about the integrative modeling of complexes using HADDOCK, and HADDOCK 2.2 as you see on the title slide. I want to give you a short introduction and then be more specific about what the web server can do for you and what the local version will do for you, to explain the differences. So as probably all of you are well aware, we are living in times where a huge amount of genomic, proteomic and interactomic data is coming out, and we have to try to make sense of this puzzle. When it comes to interactions, it means this large number of dots connected by lines that you see now — the so-called interactome — where proteins are the dots and the connections indicate complexes that they can form; and there is a huge number of connections. There are many more connections than there are players in this. So the structural landscape that you would like to cover is much larger than the protein landscape that we are dealing with, and trying to understand what goes wrong in those connections often requires taking the step of modeling the structure, or solving the structure, of those 3D complexes. But solving those structures is not always easy. Cryo-EM these days sees a huge boom, and there is a lot of exciting work being done, so I guess we will see a lot of new large complex molecular machines being solved by cryo-EM.
X-ray is of course still a major player, but it also encounters difficulties with the more flexible complexes and membrane-associated complexes, and next to that you have NMR, which is also contributing to populating this 3D landscape of complexes. But given the complexity of this landscape and the huge number of interactions, we also need complementary computational techniques, and this is where our approach and our software come in. So if we look at the data — this is taken from a review by Patrick Aloy's group on 3D interactomes, where they have been looking at the structural coverage of interactomes in the PDB — you see here E. coli and you see human, and the interactions listed are only interactions that at the time had been experimentally documented and validated, so we certainly expect orders of magnitude more interactions in human compared to the, say, 45,000 listed here. Now for those documented interactions, if you look in the Protein Data Bank at what information is available in terms of 3D structure — let's just look at human — you see that for only about 5% of the complexes we have a full structure of the complex. You might have five more percent, or even less, for which only domain interactions have been solved, not the full complex, or for which you might be able to build a model of the complex, for example by homology modeling. And then you have this large blue fraction, representing about 50% of those interactions, for which we do know the structures of the interactors — the individual components of the complex — but not the complex itself. This large blue fraction, which is even larger in E. coli, is basically the region where modeling using methods like docking can play a significant role, because we have the starting points of the complex; what we then have to solve is basically a 3D puzzle.
So, molecular docking in a nutshell: you are trying to predict the structure of — in this example — two proteins, using a number of descriptors to measure how good the models are that you generate. Shape will play a role here, and there are algorithms that use mainly shape to model those complexes. Electrostatics is of course also an important component when it comes to molecular recognition, and van der Waals interactions — you see here the Lennard-Jones potential — so these are all, say, physico-chemical energy terms that might be incorporated into docking software. What you see here in the middle — I will come back to that a little bit later — is the kind of function that we use to represent the experimental information. Now, my title says integrative modeling, so we don't want to do modeling just for the sake of it; we want to integrate as much information as possible into the modeling process, and these days there are a lot of different methods that can provide you with pieces of the puzzle. The game is then to take all those pieces together with some computational algorithm to create a model of the complex you are interested in that fulfills all the data. So here you see a number of experiments that might give you access to distance information: PREs from NMR; EPR allows you to measure distances; FRET experiments will also give you distance information. Those distances might not be very accurate in most instances, but it is information. You see here NMR titration: this is the classical NMR experiment used to screen, for example, small molecules — also used in pharmaceutical settings — but also to screen for interactions, especially in cases where the binding is not so strong, so weak interactions.
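As an aside for readers following along: the function mentioned above for representing experimental information — the HADDOCK ambiguous interaction restraint (AIR) — is usually expressed as an effective distance, d_eff = (sum over all atom pairs of 1/d^6)^(-1/6). Here is a minimal Python sketch of that formula; the coordinates and atom names are toy values for illustration, not HADDOCK's actual CNS implementation:

```python
import math

def air_effective_distance(active_atoms, partner_atoms, coords):
    """Effective distance of an ambiguous interaction restraint (AIR):
    d_eff = (sum over all atom pairs of 1/d^6) ** (-1/6).
    A single short contact anywhere in the selection satisfies the
    restraint, which is exactly what makes it 'ambiguous'."""
    inv6 = 0.0
    for a in active_atoms:
        for b in partner_atoms:
            d2 = sum((x - y) ** 2 for x, y in zip(coords[a], coords[b]))
            inv6 += 1.0 / d2 ** 3
    return inv6 ** (-1.0 / 6.0)

# Toy coordinates: two atoms on one side, one atom on the other.
coords = {"A1": (0.0, 0.0, 0.0), "A2": (10.0, 0.0, 0.0), "B1": (3.0, 0.0, 0.0)}
d_eff = air_effective_distance(["A1", "A2"], ["B1"], coords)
print(round(d_eff, 2))  # 3.0 -- the shortest pair dominates the sum
```

Because of the 1/d^6 weighting, the restraint is effectively satisfied as soon as any one of the listed residues makes a close contact, so the docking is guided to the interface without imposing a specific orientation.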
Cross-linking detected by mass spectrometry, for example, has a lot of popularity these days — again distance information. There are information sources coming from NMR which tell you something about the orientations of molecules. You might simply work in a wet lab and do mutagenesis combined with some binding assay — this also gives you pieces of the puzzle. And cryo-electron microscopy and small-angle X-ray scattering might tell you something about the shape of those complexes, although cryo-EM these days reaches resolutions that allow you to solve the structure de novo — the highest-resolution cryo-EM map is currently 2.6 angstrom. And when you don't have anything, you can still go back to bioinformatics and try to predict those interactions. The game is then to make use of all this information and encode it in some way so that you take it into account directly in the modeling process. If you want to read more about integrative modeling in general, here are a number of reviews; the first one is basically the classical review. When it comes to docking, there is CAPRI, a blind competition for the prediction of mainly protein complexes, which publishes a special issue every 2-3 years, as CASP does for structure prediction of proteins, and in those issues of Proteins you can find the current state of the art in terms of methodology. So there will be, later this year — or January 2016, sorry — a new CAPRI issue of Proteins; we just had the evaluation meeting for CAPRI. Well, since the slides will be online later on, you can look up those references if you need to. So now I want to spend some time explaining the machinery behind HADDOCK: how do we do the docking in HADDOCK, how do we model complexes in HADDOCK? Our main way of modeling and incorporating data — data which is very often fuzzy, not very accurate — is to define ambiguous, low-resolution restraints to guide the
docking, and you find again here the entire overview of data that we can incorporate in our modeling process. A large majority of these data might be used as ambiguous distance restraints, and I will explain that later. One of the features of HADDOCK is that we can dock up to six molecules simultaneously, so we are not limited to binary docking, which is the case for quite a few software packages. We are currently working on lifting this limit of six, but of course the complexity of the modeling increases greatly when you go to a larger number of molecules, and this only really makes sense provided you have some good information to drive your modeling, especially when you go to larger assemblies. If there is symmetry in a system you can make use of it: you can define symmetry restraints to guide the modeling process. This has great value actually; it very much limits the interaction space — or conformational space, if you want to call it that way — that you have to cover. We also have ways of dealing with flexibility, so HADDOCK does flexible docking, and I am going to describe the different stages of the modeling process; we do refine the interfaces. Typically the complexes that come out of HADDOCK will have structural qualities similar to what you would find in the Protein Data Bank, so you don't find clashes at the interfaces — they are refined explicitly in water at the end of the protocol — and we have shown consistent performance in the blind docking competition CAPRI over the years. So how do we do all this searching of interaction space? We use a combination of empirical force fields — the classical force fields where you describe bonds, angles, torsions (rotations around bonds), and then the non-bonded interaction terms — and in addition to these energy terms we add a term which represents the experimental information that we have on the system. This can be a distance term, a dihedral angle term, orientational information — and we recently implemented cryo-EM restraints into HADDOCK, so we can actually dock into the density directly. The search is done using a combination of energy minimization and molecular dynamics, so we need the derivative of the energy function — HADDOCK is not a Monte Carlo search, it really uses molecular dynamics, and the forces are driving your search. The protocol itself consists of three stages. In the first stage the molecules are treated as rigid bodies, so they are rock solid. In the second stage we perform a simulated annealing to optimize the interface of the complex, and during that stage flexibility is introduced both along the side chains and the backbone. The final stage is a refinement in explicit solvent to fine-tune the final complexes — and this is quite similar to what is done in NMR structure calculations. The first stage is reasonably fast — there are docking software packages that are much faster than HADDOCK here — but we can typically generate a model in a few seconds. The molecules are treated as rigid bodies, so the only degrees of freedom at this stage are rotations and translations, driven initially by the information that you put in. I already told you we can handle up to six molecules, and typically the sampling is between 10,000 and 100,000 conformations, of which we write about 10% — the top-scoring conformations — to disk; I will explain our scoring scheme later. About 10-20% of the solutions generated at the rigid-body stage are then subjected to a semi-flexible refinement stage. This is done in torsion angle space, not in Cartesian space, which allows us to easily freeze and release torsion angles in a molecule. We start at high temperature and cool the system; flexibility is introduced first along the side chains, and then along side chains and backbone, and by using torsion angle dynamics we can easily freeze parts of the system without fixing the molecules in space.
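The three-stage funnel just described — sample many rigid-body models, carry a top-scoring fraction into semi-flexible annealing, then refine in water — can be sketched schematically in Python. This is pure illustration: the stage functions below are random stubs standing in for the actual CNS-driven protocols, and the numbers simply mirror the fractions quoted above:

```python
import random

# Illustrative stubs -- in HADDOCK itself these stages are CNS protocols.
def rigid_body_dock():
    """it0: one rigid-body orientation with a (random stand-in) score."""
    return {"score": random.uniform(-100.0, 0.0)}

def semi_flexible_anneal(model):
    """it1: torsion-angle simulated annealing of the interface (stub)."""
    model["score"] -= random.uniform(0.0, 10.0)
    return model

def water_refine(model):
    """itw: short explicit-solvent refinement (stub)."""
    model["score"] -= random.uniform(0.0, 5.0)
    return model

def pipeline(n_sampled=1000, frac_kept=0.2):
    """Sample many rigid-body models, keep the top-scoring 10-20%,
    then refine only those -- mirroring the funnel described above."""
    it0 = sorted((rigid_body_dock() for _ in range(n_sampled)),
                 key=lambda m: m["score"])            # lowest score = best
    kept = it0[: int(n_sampled * frac_kept)]          # fraction carried on
    itw = [water_refine(semi_flexible_anneal(m)) for m in kept]
    return sorted(itw, key=lambda m: m["score"])

models = pipeline()
print(len(models))  # 200 refined models from 1000 rigid-body samples
```

The design point is the funnel itself: the cheap rigid-body stage does the broad search, and the expensive flexible stages only ever see the small top-scoring fraction.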
So the molecules can still freely move in space, and only the interface regions are typically flexible. If you look at protein-protein interactions, the amount of conformational change that you might get at this stage is not large — up to two angstrom typically — but you have to realize that conformational changes are still one of the major challenges in the modeling of complexes, as demonstrated by CAPRI. There are benchmarks for protein-protein docking where everything with a conformational change of more than 2.5 angstrom is classified as a challenging system. Of course, if you dock smaller molecules — we have also worked on protein-peptide docking, for example — you can get up to 5 angstrom of conformational change, maybe, on the peptide side. And if you introduce data into your modeling — we recently demonstrated this with cryo-EM — you can again drive conformational changes to a larger extent; so the amount of conformational change that you might expect also depends on the quality and amount of data that you put into the system. Typically, all the models refined through the simulated annealing stage are then subjected to a final refinement in explicit solvent. This is a short molecular dynamics simulation — in Cartesian space this time — where we define a shell of about 8-9 angstrom of water around the molecules; no periodic boundary conditions, no full-blown molecular dynamics. It is a gentle protocol, slowly heating the system to 300 Kelvin and cooling it down, and the total time is on the order of a few tens, maybe 50, picoseconds. So this is nothing like full-blown molecular dynamics — you will hear about GROMACS, for example, in one of the coming webinars — it is mainly to improve the energetics of the complex and the rim of the interface. At this stage not much happens in terms of conformational change — it is rather limited — but the contacts that are made improve quite a lot. In terms of flexibility, I already mentioned that we can explicitly describe flexibility during the refinement stage by allowing side-chain and backbone flexibility on both sides. There are other software packages that allow flexibility to some extent; in the small-molecule docking field, for example, AutoDock usually considers flexibility mainly in the ligand, while the protein is rigid. But what you can also do is start your docking process not from a single conformation but from an ensemble of conformations. You can obtain them from NMR, for example; you could run molecular dynamics simulations and get a sampling of possible conformations; you could think of elastic network models to generate those; and you can give an ensemble of structures as a starting point to the docking. Usually these ensembles should not be too large, otherwise you get what we call a dilution problem, because the number of possibilities explodes: if we take for example 10 conformations on each side of a binary complex, you have 100 combinations of molecules, so if you sample 10,000 models at the rigid-body stage, each combination is only sampled 100 times. So starting a docking from 100 conformations per molecule is not a good idea — this is the dilution problem — and it is better to restrict the number of conformations. But if you know that things are happening, for example loop motions that might be tricky to model, it might be a good idea to provide such an ensemble. Now about the energetics and scoring. One aspect of docking is to generate models; the other important aspect is to be able to recognize the good models among those. Our force field is OPLS — we use the OPLS non-bonded parameters — and by default a united-atom force field to speed up the calculation: we remove all the hydrogens that carry no partial charges and keep the ones that carry partial charges, as these are important for hydrogen bonding. But you have the option to keep all hydrogens if required, for example if you have NOE
data from NMR — then you should keep them all for the restraints to work. As you can see, the non-bonded cutoff is rather short compared to full-blown molecular dynamics simulations: 8.5 angstrom. During the vacuum part of the protocol we scale down the electrostatics by using an epsilon of 10; during the explicit solvent refinement epsilon is back at one, since the water is explicitly present. At the end of the protocol we cluster the solutions, and we have two options for doing that: based on an RMSD calculation, or based on the contacts made at the interface. We then score the solutions on a cluster basis, so the ranking is a cluster-based score, and the score is calculated only on the top four models of each cluster: since clusters might have different sizes, we want to calculate the score on the same number of models for each cluster. When you put information into this kind of modeling, you cannot assume that the largest, most populated cluster is the best one. In an ideal world you would like to see the largest cluster also being the best-ranking one, but there is no guarantee that this will be the case, and this is also the reason why we don't consider the size of a cluster in our ranking, but only the HADDOCK score. The score is illustrated here: you see that we have different scores at different stages of the docking process. At the rigid-body stage you see the experimental information term — this is where we put the experimental information, but scaled down to only 1%, so if you have high trust in your data you might increase that. The van der Waals interactions are also scaled down to only 1%, because there might still be clashes — there is no optimization of the interface at this stage. Electrostatics is important, and there is the desolvation term — an empirical desolvation term, with parameters from Juan Fernández-Recio — and we select for complexes that have a rather large buried surface area; this is this negative term here. At the final stage — and this is where we rank the clusters; the clustering is pretty much only done at the water stage — we still have the experimental information at 10%, full van der Waals, 20% of the electrostatics (but with epsilon equal to 1 in this case), and the desolvation term. This very simple scheme, which looks like it has never been optimized because it has such nice round numbers, has actually proven very robust. In the latest round of CAPRI there was, among others, a joint CASP-CAPRI scoring competition, and because of the time pressure in that round we didn't do anything fancy: we simply applied this score blindly, using this very simple function, and we were the best scoring group in that round of CASP-CAPRI. There is a paper about this round of CASP-CAPRI that just came out in the Journal of Molecular Biology, by Marc Lensink and Shoshana Wodak, if you want to read more about it. So this is our scoring scheme. Now I want to give you just one application example, to give you a little flavor of what you can do, and I chose, in the spirit of integrative modeling, a system in which we used a bit of mutagenesis data but also different kinds of mass spectrometry data to model the system. It has to do with the cyanobacterial circadian clock — basically the internal clock of those bacteria — which consists of a very simple system of three proteins. I don't want to go into the biology and function, but the information that we had for the modeling was this: from native mass spectrometry we knew the stoichiometry of the complex, a 6-to-1 stoichiometry; from hydrogen-deuterium exchange we could map the binding interface; and there was one more piece of information delivered by the collision cross-sections of those complexes, which you can also extract from the mass spectrometry data. This story has been published, so you can read all the details about it. This is just a mapping onto the structure of KaiB of the regions that are protected upon complex formation — these are the blue regions — so this is an
interface information that we give to HADDOCK as a list of amino acids. HADDOCK will basically try to enforce that those amino acids are part of an interface; it will not define what the orientation should be, but it will make sure that those amino acids end up at the interface. If you look at KaiC, the system here is a bit more complex: it is a hexamer which consists of two rings, and in the protection data you actually see — well, it is a 6-to-1 binding — a site on the top of the structure and another site at the bottom, so there is allosteric communication in this complex. For our docking we targeted both the upper region and the lower region, so we generated two sets of solutions, and then — I mentioned the collision cross-sections from MS, which allow you to filter the solutions — we didn't use that information for the modeling, but we back-calculated the collision cross-sections from the models. You see here the two sets of dockings and the clusters that correspond to them, and the back-calculated collision cross-sections for the different models are shown in this plot, where the dotted lines indicate the experimental range. The clusters are ranked based on the HADDOCK score, and you see that cluster 1 fits nicely in the middle of the range; clusters 3 and 4 also seem to be within the experimental range, and these are all clusters that correspond to binding to the top part of the model. All the solutions that bind to the bottom part of the complex result in a much larger collision cross-section, which is not consistent with the data. So our prediction in this case was the best-scoring HADDOCK model that fits the collision cross-section data — and in this case it is also the most populated solution, although there is no guarantee that that is the right one. We can also deal with more complex molecules. This is just one example where, based on NMR data, we worked at modeling the interaction of Lipid II with a fungal defensin, and this is just to show you the complexity of this kind of molecule: you have sugars, a pyrophosphate, you have amino acids, and then you have this bactoprenol tail. This kind of complex molecule cannot really be run through the server — it is too complex, so you need to build topologies — but if you are able to do that, then you can use the NMR information, and here you see in magenta the binding site mapped by NMR; those yellow surfaces are the binding sites to the membrane — this protein actually binds to the membrane and extracts Lipid II from the cell wall. So now a few words about the web server. We released version 2.2 of the web server this year. It is quite heavily used: more than 7,000 registered users worldwide, and more than 120,000 runs have been performed since its opening; in total about one third of the runs have been running on European and worldwide grid infrastructure. What you can also see here is that there are different levels of access to the server: at the easy level you just submit a list of amino acids and your PDB models, and that is all you need to do the docking; at the guru level that you see here you can, in principle, fine-tune up to 500 parameters if you know what you are doing — you can define symmetries between the molecules, you can add additional restraints such as RDCs from NMR — so there is a gradation of complexity across those various server interfaces. This is a map of our user base: it is well represented worldwide, with the majority of users actually in India and the US, next to Europe. What is also interesting to look at here is the usage that our users are making of the server: we developed HADDOCK mainly for protein-protein docking, and we have since been working on protein-nucleic acid and protein-peptide docking, but you see for example there is quite a fraction of users that are using
it also for protein-small molecule docking — so you can do small-molecule docking with HADDOCK using the web server. This is just to show you where the grid jobs coming from the server are running: a lot of sites in Europe, but also in Beijing, in Malaysia, Taiwan and even in the US, so the jobs are distributed around the world depending on where there are resources; in the EGI grid we have access to more than 110,000 CPU cores to run those. So what happens behind the scenes when you submit a job to the server? You can submit different types of experimental data in a given format, you submit PDB files, there is a validation step done by the server on those input data, and then we start the HADDOCK process. We generate the topologies for the different molecules — we basically define the chemistry, define the force field — and at this stage all missing atoms are also built automatically, so you don't have to worry about a missing side chain in your protein; this is done by the server. Once this is done — this part runs locally on our servers — we start the docking process, with rigid-body docking, flexible refinement and water refinement; this is sent to the worldwide grid or run locally, and there is a post-processing analysis, like clustering and scoring, and the results are presented to you. So what does the server do for you that a manual installation of HADDOCK will not do — at least not at this time, maybe in the future? It validates your input PDB files; it checks for duplicate residues, so you have to remove those; duplicate side-chain conformations, for example multiple occupancies in high-resolution crystal structures, might also be a problem that you encounter. We run MolProbity on the input files to correct some issues, for example with side-chain flips, and to define protonation states. The server may also define the restraints automatically, depending on what you input, and if your molecule has gaps — meaning, if you were to dock an antibody, which consists of two chains, we define a number of restraints to keep the fragments together during the high-temperature simulated annealing stage; if you don't do that, the fragments might drift slightly apart because of the high temperature and kinetic energy in the system. The input restraints are validated — the server uses the CNS/Xplor restraint format. Small ligands and cofactors will get topologies and parameters — we use PRODRG as input for that — and the server does all the post-analysis of your docking results, cluster analysis and statistics. This is a snapshot of what the server might return to you: for each cluster you get statistics of the different components of the HADDOCK score, and the HADDOCK score itself, and you also have some visual analysis of the results. We hope in the future, through BioExcel, to build a much more interactive analysis tool for the server, but this requires some work. So, running locally — I think I should probably speed up a little bit to leave some time for questions, so let's go quickly. If you have a local installation of HADDOCK you will have to prepare your PDB files yourself, so you will have to worry about double occupancies, and you need to remember to avoid overlaps in numbering — a question we often get is "I want to dock homodimers but there is overlap in numbering"; yes, then you have to renumber those, and we have a number of useful tools in our repository to help you do this. The server accepts ensembles of input structures; if you run manually, you will have to split them into single files. You have to manually define the protonation states of histidines — the server does that automatically for you, if you want to let it. You will also have to prepare your restraint files — you can use one of the interfaces of the server to do that — and you need to generate some initial files that contain the information; and, more importantly, if you work with small molecules
you have to worry about generating the ligand topologies and parameters so you can use again for drug which is available as a server or AC pipe or even the ATP automatic topology builder from Alan Mark in Australia and you would have to edit the parameter files some of this can be done online using online forms that we have on our web page so in summary a local run will be some manual editing of those files or online you will have to give the other comments in your window to start the run this will create a complete directory structure and then in that directory you will have to edit and change the parameters that you need to change you will have to copy the parameter files for the small molecule if you have them to the proper location and at the end of the run you will have to do all the analysis manually that the server is doing for you again hopefully in some future version we are working on ad hoc 3.0 some of this will be some of what the server is doing now will be integrated directly in the local version so that should make things for your life also easy as a local user so some additional information to wrap up so if you are interested in getting the software you should look at our websites under the software directory you will find here the link to ad hoc where there is information about the licensing scheme it's free for non-profit organization and commercial people should contact us so you find in the ad hoc sites what I mean the online form where you can edit some of your input files even if you are running local migration you will find an online manual describing a lot of the parameters what they are doing describing also how to manually analyze the results the link to the servers are there and importantly there is also this link to ask by our excel we have also ad hoc-related software which is freely available from the ad hoc data propository so this is useful to manipulate input pdv files clustering based on contact if you are interested, these are our 
tools for those kinds of tasks. On our website you will also find some tutorials; for example, last week after the CAPRI meeting we posted a docking tutorial on using symmetry restraints for symmetrical complexes, taking one of the CASP-CAPRI targets for that. There is also the WeNMR website, to which we have been contributing, where you can find documentation and useful information on different aspects of using HADDOCK. Our medium of choice now for questions related to HADDOCK is ask.bioexcel.eu, which has recently been released, where you can post questions and create topics for the specific question that you have.

These are a list of papers that are relevant for HADDOCK in general; a lot of people have contributed over the years to developing HADDOCK and are still continuing to develop it, and we hope also, within BioExcel, to come up with a lot of nice, interesting new tools to make your life easier. And with that I am finished for this part. Again, here are the links: ask.bioexcel.eu is the way to contact us if you are using HADDOCK and have questions, www.bonvinlab.org is the place to look at our tools and to request the software, and the HADDOCK pages give you access to the web portal and some other portals. So now it is time for the question and answer session, and for that I am giving the floor back to Rosen.

Thank you, Alexandre. I hope all the attendees enjoyed the webinar and found it useful. We have one question, asked by Jesse; I am going to give him the mic, let's see if we can hear him. Hi Jesse, could you say something?

Oh, hello, can you hear me? Yes. Fantastic talk, thank you very much. I have a question about running HADDOCK locally: how well does the software scale in a distributed cluster environment, and how does it scale on a large multi-CPU shared-memory server?

The computations that we are doing, except for the pre-processing and post-processing, which are more sequential steps: all the docking
process is embarrassingly parallel, distributed over a large number of jobs. Typically, if you have a cluster with, say, 200 cores (on our internal clusters we typically submit on the order of 100 or 200 jobs in parallel to the system), that is a perfect use case. It is not worth parallelising the single-structure calculation; it is better to run a large number of docking jobs in parallel, and that is exactly what we are doing. It is also the reason why it works on grid infrastructure: each job is independent, and they are not very computationally expensive. So if you have one node (these days you might have a 64-core node), you could run, say, 50 structure calculations in parallel at the same time, and this is going to work perfectly fine. I think an average docking run on an average-size complex, if you have 100 cores at hand, might take maybe half an hour, and of that half hour probably 50% will be post-processing; the clustering takes time, and there are some analyses that do take time.

Yes, and the clustering, does that use multiple cores as well?

No, the clustering does not use multiple cores, so there we could probably gain time. Because on the server, if you do RMSD clustering, you have to read in all the models and calculate the pairwise RMSD between all models, and that is a bit the slow part. The fraction-of-common-contacts clustering is much faster because we do not need to fit the structures; we just calculate the contacts between the molecules, and this is done very quickly. So the fraction-of-contacts clustering is going to run in, say, five minutes, and it also scales much better.

Yes, thank you. You're welcome.

We don't have more questions in the queue; if any of the attendees has a question, you can post it in the questions interface or in the chat. Well, it looks like we don't have other questions yet... oh yes, one just came in, just a second: Nanje Deng. Hello Nanje, can you hear us? Well, I guess the question is: how do I restrain a torsion angle during docking? Yes.
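The fraction-of-common-contacts measure from the scaling answer above can be sketched in a few lines. This is a simplified illustration, not HADDOCK's actual implementation; the contact sets and model names are invented for the example.

```python
# Simplified sketch of fraction-of-common-contacts (FCC) similarity, the
# fast clustering measure discussed above. Unlike RMSD clustering, no
# structural superposition is needed: each model is reduced to a set of
# intermolecular residue-residue contacts, and two models are compared
# by set intersection. The contact sets below are made-up examples.

def fcc(contacts_a, contacts_b):
    """Fraction of the contacts of model A that are also found in model B."""
    if not contacts_a:
        return 0.0
    return len(contacts_a & contacts_b) / len(contacts_a)

# Hypothetical models, each a set of (receptor residue, ligand residue) pairs.
model1 = {(10, 3), (10, 4), (12, 4), (15, 7)}
model2 = {(10, 3), (12, 4), (15, 7), (20, 9)}   # similar interface
model3 = {(40, 1), (41, 2)}                      # different binding region

print(fcc(model1, model2))  # 0.75
print(fcc(model1, model3))  # 0.0
```

Because each comparison is a cheap set intersection rather than an optimal superposition, all-versus-all clustering of a few hundred models stays fast, which is consistent with the roughly five-minute figure quoted above.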
So, HADDOCK is using CNS for the structure calculations, and CNS is software that is used for X-ray refinement but also for NMR structure calculations, so you can define dihedral angle restraints, with a given error margin around the dihedral angle that you want to restrain. This basically requires the selection of four atoms, and you can upload that to the server. Let me check that you are still seeing my screen; I am going to switch now, so you should be seeing another web browser. I am going to go to the Bonvin lab pages, then to the HADDOCK software and the HADDOCK manual; we blow this up, and you see the section on unambiguous restraints. I do not have an example here for the dihedral angle restraints, which is unfortunate, but they are basically a selection of four atoms; the reference paper that you want to check, where we give an example of each type of restraint, is actually our 2010 Nature Protocols paper. What you have to realise if you define dihedral angles is that they are not going to define your intermolecular interactions; they can only work within one molecule. But there is an upload option in the server: if you go to, for example, the expert interface, you see a menu which now allows you to upload a dihedral angle restraint file, so you can give that to the server and it will be taken into account. Of course, the parts of the molecule that you restrain should be flexible, because if they are rigid nothing will happen. This can be used, for example, for small peptides: if you are docking a small peptide and you know that it should have an alpha-helical conformation, that would be the way to impose it.

I see another question. Hello? Yes, so I am doing a protein and RNA docking, and my RNA is double-stranded and a little bit symmetrical. The way I have been doing this is that I would specify the amino acids that I have from NMR data on the protein side, but on
the RNA side we don't have any base-pair data. So the way I have been getting around this is picking one side of the RNA and just specifying those base pairs, but I am finding it is a little bit biased, because there is still the rotation of the RNA, and I am also looking at making the RNA slightly flexible. So I guess my question is: is it possible to only give data for the protein? Every time I have submitted, it always comes back with...

So, the way to do it: we have been working on RNA, although we have not really published any systematic benchmarking there, and we have actually been running exactly the scenario that you are describing. First of all, for RNA you need a good starting structure; if you know your starting structure for the RNA, then it is fine, but otherwise you might be in trouble. The way to do it in this case would be to define your active residues on the protein side, and on the RNA side to define the entire RNA as passive: in the HADDOCK interface you give the list of bases in your RNA, all as passive, so that you do not bias any region. Another small piece of information, which we only recently found out: if you are going to do RNA docking, you should turn off the desolvation energy term in the scoring function. These are unpublished results, but it gives much better sampling, especially at the rigid-body stage. So: define your entire RNA as passive, and turn off the desolvation.

Okay, that kind of brings up the other point, which is the starting structure of the RNA. I got it basically from a PDB entry; the sequence of that RNA is different, but it is how I think our RNA would be shaped, so I made that piece of RNA, even though the sequence is different, as a kind of rigid RNA and then docked this piece. My RNA is double-stranded, and the main structural difference in this RNA is basically just bending of the helix. So, if I
generated my piece of RNA with the actual sequence that we are using in our experiments, would it be better to use the semi-flexible or the fully flexible treatment for this, or to make the middle parts flexible?

For DNA, in principle, the server analyses the structure and automatically defines base-pairing restraints, and also imposes some restraints on the dihedrals of the phosphate backbone; for RNA I am not sure that this is going to work properly. So my advice would be to start by just using the automatic flexibility definition, so that you only put flexibility in the regions that are in contact with the protein and do not pre-decide where that should be. I think that is important, because if you define the entire RNA as passive, the binding could be at any region of the RNA surface, and you only want that region to be flexible. So let the server define the flexibility automatically; it means that each model you generate might have a slightly different region that becomes flexible, depending on the contacts that are made. What I would also suggest is that you could define a few distance restraints, if you know the base pairing, for example, to ensure that nothing bad happens to your RNA during the modelling process. It also depends a bit on the size: if you have something like a tRNA, which has a lot of 3D structure, it is going to be more stable; if you just have a short double helix, it might distort more. So you could basically add intramolecular restraints, given as unambiguous restraints, to enforce the base pairing in your RNA and maintain its structure while it is refined with flexibility.

Okay, yeah, sorry. I also have SAXS scattering data, and you mentioned that you could fit some of these models to such data. Do you have anything for SAXS? If I gave you a curve or a shape, would you be able to model to that, or what is the appropriate way to enter SAXS data?

For SAXS, we are only able to score with SAXS. There is no... well, there are
restraining functions that have been developed for SAXS, which are actually even available in CNS, but they are very expensive in terms of computation, so taking them along during the calculation would be too costly. What you can do is score your models with SAXS, and we have actually published a paper about SAXS scoring; it is in Acta Crystallographica, so if you look on the website you will find it. And if you look here on the WeNMR site (you still see my screen, I assume), we have a topic about how to use SAXS data for scoring decoys. You could run it from here, but if you just want to do the scoring at the end, to discriminate between your clusters using the SAXS data, you can even do that online using CRYSOL; there is a web server where you can do it online, so you do not even need local software. Or you could do a local run and then already do a scoring round at the rigid-body stage: you manually score at the rigid-body stage, before going into the flexible stage, so that you can enrich your solutions with models that fit your SAXS data before you add flexibility.

Okay, thank you very much, I really appreciate it; this webinar was very good. Thank you.

So, these were all the questions that we had. I want to tell everybody that the recording of the webinar and the slides will be available on the website, and to mention again that this is the first webinar of a series that BioExcel will continue to organise on different topics, covering mainly the interests of the interest groups that I mentioned earlier. Our next webinar will be on performance tuning and optimisation of GROMACS, with Mark Abraham, who is a core developer and project manager of GROMACS; it will be on the 11th of May. We have two more webinars already scheduled, which you can find out about from our website. I would also like to ask everyone: in a follow-up email I will send you a link to a very short survey, with just a few short questions, because we would really like to hear from you about your experience with
the webinar. We would really like to make these webinars useful to you and to keep improving them, so in particular let us know about any questions or topics that you would like to see covered in future webinars, so that they are more relevant to the community. And with that, I would like to thank Alexandre for the very nice and useful presentation, and I hope to see everyone again at our next event. Thank you all, and have a good day. Thank you, Alexandre.

Thanks for listening. Bye.